Detailing Analytical Processes: Exploring the Mysteries of Parametric and Nonparametric Analyses

Abstract

In the article in this issue titled “A Study into the Relationship Between Adaptive Skills and Visual Impairment in Individuals With and Without an Intellectual Disability,” authors van der Aa, Jonker, de Looff, and Didden provide a very clear and thorough explanation of their approach to statistical analysis. Their explanation is clear enough that going through a portion of their results section can show readers how experimenters approach analysis and what is often done but not reported.

In this article, there were four groups of participants with visual impairments: low vision with intellectual disability (ID), low vision without ID, blind with ID, and blind without ID. Group sizes ranged from 34 to 78. There were also data from 2661 sighted individuals with and without ID to serve as a reference. The outcome measures were the eight subscales of the Adaptive Ability Performance Test (ADAPT): basic self-care, household skills, society skills, social alignment, applying school skills, dealing with money (and mail and insurance), daily structure, and making responsible choices.

When planning a study, researchers determine what they will measure and in what way they will measure it. This planning process, to a large extent, determines what kind of analysis they will use. If they plan on having several good-sized groups of participants whose data will be summarized as average group scores, then an analysis of variance (ANOVA)-based approach seems likely. If there are only two groups of participants, or data from two different time periods, locations, or some other construct, then a t-test seems likely. However, these analytical set-ups are only plans. The data themselves can change which types of analysis are actually conducted.

After carefully controlling data collection to reduce the impact of outside effects and ensuring that all trials in a study are conducted as similarly as possible across participants and days, one of the first things a researcher does is to take a look at the dataset in its totality to determine if everything makes sense. Depending on how the data were collected, there could be anomalies that have entered into the dataset. For example, someone might have written down or entered a number incorrectly, data-collection equipment might have malfunctioned, or a participant may have been acting in a way that was not consistent with the methods as explained at the beginning of the study. In these cases, there may be missing data or outliers in the data.

The authors of this article indicate that when they completed this step of data analysis (sometimes referred to as “data cleaning”), they found no outliers in the data for any participants with ID. The authors identified outlier values by displaying the data in a boxplot, a figure that represents a set of data as a box (hence the name) spanning the lower to the upper quartile, with a horizontal line inside the box marking the median score and whiskers running from the top and bottom of the box to show the spread of the remaining scores. A common convention limits each whisker to 1.5 times the interquartile range, which is the distance between the upper and lower quartiles. Any scores beyond the whiskers are shown as single dots and are easily identified as outliers, that is, scores that lie far outside the range of the other scores in the dataset.
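The 1.5 times interquartile range convention behind a boxplot can be sketched in a few lines of Python. This is a minimal illustration and not the authors' procedure: the function name `find_outliers`, the choice of the "inclusive" quartile method, and the scores are all assumptions made for the example, and different statistical packages compute quartiles in slightly different ways.

```python
from statistics import quantiles


def find_outliers(scores, k=1.5):
    """Return the scores lying more than k * IQR beyond the quartiles."""
    # quantiles() sorts the data and returns the three quartile cut points.
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range: distance between the quartiles
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [s for s in scores if s < lower_fence or s > upper_fence]


# Invented scores for illustration only; these are not data from the study.
example_scores = [12, 14, 15, 15, 16, 17, 18, 19, 20, 45]
outliers = find_outliers(example_scores)
```

With these invented scores, the single value of 45 falls above the upper fence and would appear as a lone dot on a boxplot, while the remaining scores sit within the whiskers.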

What is done in reaction to outliers depends on how many there are and how severe they are. Some researchers will elect to exclude these outlier scores from their analyses. In some cases, however, this exclusion can be seen as cherry-picking the data to suit the researcher's preferred outcome, so any removal should be limited to a small number of scores and supported by a logical argument for the elimination.

Once all scores being used for analysis have been identified, if parametric statistics like ANOVA are planned, the researcher should examine the data to see whether they meet the criteria for this approach. The authors of this article plotted the kurtosis and skewness of the data for each group of participants and used Levene's test for equality of variances as well. Skewness and kurtosis describe whether the shape of each group's distribution departs from what would be considered acceptable for a normal distribution, and Levene's test checks whether the spread of scores is comparable across groups. If there are significant deviations from a normal distribution, a researcher might be able to run all of the data through an arithmetical transformation to bring the dataset closer to normality. However, transforming data in this way can have unintended consequences. Thus, many researchers, when faced with data that do not meet the requirements for a parametric test, elect to switch to a nonparametric option for analysis (i.e., a statistical test that addresses the same analytical question as a parametric test, but without the same distributional requirements).
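To make these checks concrete, the sketch below computes moment-based skewness, excess kurtosis, and the Levene W statistic using only the Python standard library. It is an illustration under stated assumptions, not a reproduction of the authors' analysis: the function names are hypothetical, the Levene statistic is centered on group means (many packages default to medians instead), and no p-value is computed, because that requires an F distribution from a statistics library.

```python
from statistics import fmean


def _central_moment(xs, order):
    """Average of (x - mean) ** order over the sample."""
    m = fmean(xs)
    return sum((x - m) ** order for x in xs) / len(xs)


def skewness(xs):
    """Moment-based skewness; 0 for a perfectly symmetric sample."""
    return _central_moment(xs, 3) / _central_moment(xs, 2) ** 1.5


def excess_kurtosis(xs):
    """Moment-based kurtosis minus 3, so a normal distribution scores about 0."""
    return _central_moment(xs, 4) / _central_moment(xs, 2) ** 2 - 3.0


def levene_w(*groups):
    """Levene's W statistic, using absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Replace every score with its absolute distance from its own group mean.
    z = [[abs(x - fmean(g)) for x in g] for g in groups]
    z_means = [fmean(zi) for zi in z]
    z_grand = fmean([v for zi in z for v in zi])
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within
```

Skewness and excess kurtosis near zero are consistent with a roughly normal shape, and a larger W indicates that the groups differ more in spread. What counts as an acceptable value is a judgment that depends on the conventions a researcher follows.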

The parametric ANOVA test indicates whether there are any significant differences across three or more groups, and it is typically followed up with t-tests that compare pairs of groups to see where any significant difference lies. The nonparametric version is a Kruskal–Wallis H test, followed by Dunn's procedure for the pairwise comparisons, which is what the authors of this article chose to do when they discovered that the scores for the groups of participants without ID did not meet the assumptions for using an ANOVA test.
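The rank-based logic of the Kruskal–Wallis H test can be sketched as follows. This is a minimal standard-library illustration and not the authors' code: the function name `kruskal_wallis_h` is hypothetical, only the H statistic (with the usual correction for tied scores) is computed, and neither the p-value nor the follow-up Dunn's procedure is included. In practice, H is compared against a chi-square distribution with one fewer degrees of freedom than the number of groups.

```python
from collections import Counter


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic, corrected for ties, for two or more groups."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    counts = Counter(pooled)

    # Rank all scores together; tied scores share the mean of their ranks.
    rank_of = {}
    position = 0
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = position + (t + 1) / 2  # mean of ranks position+1..position+t
        position += t

    # Compare each group's rank sum with what chance alone would produce.
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # The correction equals 1 when there are no ties; it is 0 (undefined H)
    # only if every score in the dataset is identical.
    tie_correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    return h / tie_correction
```

Because the statistic depends only on ranks, replacing the largest score in a group with a far more extreme value leaves H unchanged, which is why the test does not require normally distributed data.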

One might be led to question why a researcher would not always use the nonparametric version of a test if it places fewer requirements on the data. The answer is that the parametric versions of statistical tests are based on the mean score of each group, whereas nonparametric versions are based on medians or ranks. Although the parametric and nonparametric approaches might describe what appear to be similar sets of comparisons, the fact that all of the distributions of data in the parametric versions are normal allows the results to be interpreted more robustly. Thus, there is less chance that unusual differences in how the data are distributed across the groups are inadvertently influencing the outcome of the statistical test, which is a possibility in the nonparametric versions of tests. Therefore, researchers are best served by first designing and conducting well-organized data collection; second, cleaning and investigating data for anomalies; and third, using parametric statistics when possible.