Abstract

In this issue of the Journal of Visual Impairment & Blindness (JVIB), in the article entitled “Environmental Information Required by Individuals with Visual Impairments Who Use Orientation and Mobility Aids to Navigate Campuses,” the authors use a multiple regression analysis and report an adjusted R² value when looking at whether four demographic variables might be influencing the level of importance participants were attributing to a type of environmental information. In a Statistical Sidebar in the March–April 2018 issue of JVIB, I briefly described the basics of regression and the meaning of the R² statistic, but I would like to provide more information on this topic. To briefly summarize, a linear regression fits a straight line through a cluster of data points on a graph so that the sum of the squared vertical distances between the line and the individual points is as small as possible. The line indicates the trend in the relationship between the variable on the x-axis and the variable on the y-axis.
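As a rough illustration of that summary, the following Python sketch fits a least-squares line to a handful of points and computes the ordinary R² for it. The data and the function names `fit_line` and `r_squared` are invented for illustration; none of them come from the article.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Proportion of the variability in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_residual / ss_total


# Hypothetical data, invented purely for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)
print(slope, intercept, r_squared(xs, ys, slope, intercept))
```

For these made-up points the fitted line has a slope of about 0.6 and an intercept of about 2.2, and it explains roughly 60% of the variability in the y values.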
In the case of this article, however, there are four variables being used to predict another variable, which is why it is called “multiple regression.” This relationship is difficult to picture on a graph; thus, a mathematical expression of the relationship is used instead. In this article, three multiple regressions were conducted, one for each of three types of environmental information. The first multiple regression analysis reported in the article indicated “an adjusted R² of .054 (F = 2.618, p < .05, η² = .09)” with a significant individual predictor being age (b = −.214, p < .05). It is this reporting that I would like to unpack a little.
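For readers who want to see the general shape of such a mathematical expression, a multiple regression with four predictors is usually written in the form below. The symbols are generic textbook notation assumed here, not notation taken from the article: y is the predicted variable, x_1 through x_4 are the four predictors, b_0 is the intercept, b_1 through b_4 are the regression coefficients, and e is the part of y that the model leaves unexplained.

```latex
y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + b_4 x_4 + e
```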
In regression analyses, the R² statistic indicates how much of the variability in the dependent variable is explained by the regression model. Note that these authors are reporting an “adjusted R².” Although a normal R² indicates how much variability is explained by the model and increases with each addition of a new potential predictive variable, there is a chance that the increase is simply because more predictive variables have been entered and not because of any additional real predictive value of the added variables. The adjusted R² takes into account this artificial raising of the model’s predictive power and gives a more realistic value of the connections between the predictive variables and the predicted variable. In the first model reported by these authors, the adjusted R² is .054, which means that only about 5.4% of the variability in the dependent measure is explained by the model once the number of predictors is taken into account.
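The adjustment itself is a short calculation. The sketch below uses the standard textbook formula for adjusted R²; the function name `adjusted_r_squared` and the example values (an ordinary R² of .10 and 100 participants) are hypothetical and are not figures from the article. Only the count of four predictors matches the article.

```python
def adjusted_r_squared(r2, n, k):
    """Adjust an ordinary R-squared for the number of predictors.

    r2: ordinary (unadjusted) R-squared
    n:  number of participants (observations)
    k:  number of predictive variables in the model
    """
    if n - k - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    # The penalty (n - 1) / (n - k - 1) grows as predictors are added,
    # so adjusted R-squared never exceeds the ordinary R-squared.
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical values for illustration only (not taken from the article).
print(round(adjusted_r_squared(0.10, 100, 4), 3))  # prints 0.062
```

With these invented numbers, an ordinary R² of .10 shrinks to about .062 after adjusting for four predictors, which shows how part of the apparent predictive power can come from the number of predictors alone.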
The rest of the reporting for this statistical analysis “(F = 2.618, p < .05, η² = .09)” indicates the F value for the significance of the whole predictive model and the fact that the p value associated with this F value is less than .05, which means that the model is statistically significant. (Although, at predicting only 5.4% of the variability, it is not all that meaningful.) This difference between something that is statistically significant yet not terribly meaningful is also highlighted by the effect size of the analysis. The statistic η² (read as eta squared) is similar to R² but is often used as a measure of the effect size. By commonly used benchmarks, an η² of .09 indicates a medium effect size.
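To make the link between these statistics more concrete, the sketch below shows the usual textbook relationship between an ordinary R² and the overall F value of a regression model, along with a simple labeling of η² using Cohen's conventional benchmarks (.01 small, .06 medium, .14 large). The function names and the sample size of 100 are assumptions made for illustration, so the resulting F value is not the article's.

```python
def f_statistic(r2, n, k):
    """Overall F value for a regression model.

    r2: ordinary (unadjusted) R-squared
    n:  number of participants (observations)
    k:  number of predictive variables in the model
    """
    explained_per_predictor = r2 / k
    unexplained_per_df = (1 - r2) / (n - k - 1)
    return explained_per_predictor / unexplained_per_df


def eta_squared_label(eta2):
    """Label an eta-squared value using Cohen's conventional benchmarks."""
    if eta2 >= 0.14:
        return "large"
    if eta2 >= 0.06:
        return "medium"
    if eta2 >= 0.01:
        return "small"
    return "negligible"


# Hypothetical R-squared and sample size (not taken from the article).
print(round(f_statistic(0.10, 100, 4), 3))
# The eta-squared value reported in the article.
print(eta_squared_label(0.09))  # prints medium
```

Turning an F value into a p value additionally requires the F distribution with k and n − k − 1 degrees of freedom, which statistical software normally handles.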
Finally, out of the four possible predictors used in this multiple regression, only one ended up being significant. The predictive variable of age was reported as having b = −.214, which, read as a standardized coefficient (beta), indicates that a rise of 1 SD in the age variable would correspond to a decrease of 0.214 SD in the predicted variable. Of course, one would have to know the mean and standard deviation for both of these variables to know exactly what that meant in real terms. At any rate, the fact that the p value for this beta is less than .05 shows that it is statistically significant. The other two multiple regression analyses reported in this article are reported in a similar manner and so can be unpacked in the same way. And given that this unpacking has taken some time, I think it is time I say farewell for now.
