Abstract
The Movement Assessment Battery for Children, 2nd Edition (MABC-2) is a test of motor development widely used in clinical and research settings. To address which motor abilities are actually captured by the motor tasks in the two age versions of the MABC-2, the AB2 for 7- to 10-year-olds and the AB3 for 11- to 16-year-olds, we examined the factorial validity of AB2 and AB3. We conducted confirmatory factor analysis (SPSS AMOS 22.0) on data from the test’s standardization samples of children aged 7–10 (n = 484) and 11–16 (n = 674) to find the best fitting models. The covariance matrices of AB2 and AB3 fit a three-factor model that included tasks of manual dexterity, aiming and catching, and balance. However, the factor analytic models fitting AB2 and AB3 did not involve the dynamic balance tasks of hopping with the better leg and hopping with the other leg, and the drawing trail showed very low factor validity. In sum, both AB2 and AB3 of the MABC-2 test are able to discriminate between the three specific motor abilities, but due to questionable psychometric quality, the drawing trail and hopping tasks should be modified to improve the construct validity of both age versions of the MABC-2.
Introduction
One of the most reputable motor tests used in clinical and research settings for the assessment of motor coordination and the identification of Developmental Coordination Disorder (DCD) in children is the Movement Assessment Battery for Children—Second Edition (MABC-2; Henderson, Sugden, & Barnett, 2007). The primary function of the test is to categorize children according to the following levels of motor competence: (a) without movement difficulty, (b) a risk of having a movement difficulty, and (c) significant movement difficulty (Henderson et al., 2007). The total test score, converted to a percentile, provides more precise differentiation regarding children’s motor coordination, making the MABC-2 a valuable tool for judging a child’s stronger and weaker predispositions for learning and executing sensorimotor skills in educational and clinical settings and in daily living.
The diagnostic purpose of the MABC-2 is to help identify DCD, a neurodevelopmental disorder defined by substantially subaverage performance and learning of movement activities for a given person’s chronological age, intelligence, and developmental opportunity (American Psychiatric Association, 2013). DCD is manifested by problems executing various types of motor coordination tasks, including manual dexterity (MD; Michel, Cimeli, Neuenschwander, Röthlisberger, & Roebers, 2013), gross motor coordination in aiming, interceptive, and locomotor tasks (Deconinck et al., 2006; Henderson et al., 2007; Przysucha & Maraj, 2013), or balance (Geuze, 2005). Those movement problems can be based in impairments in perceptional functioning (Coleman, Piek, & Livesey, 2001; Wilson & McKenzie, 1998), motor planning (Gheysen, Van Waelvelde, & Fias, 2011), perceptual-motor integration (Williams, Thomas, Maruff, Butson, & Wilson, 2006), and online control of movement action (Van Waelvelde, De Weerdt, De Cock, & Smits-Engelsman, 2004).
The Test Items of the MABC-2 test—Age Band 2.
The Test Items of the MABC-2 test—Age Band 3.
The use of either product or process motor tasks is accompanied by an unanswered question as to what latent motor abilities are really captured by tests like the MABC-2. Motor competency, as a theoretical construct, is generally understood as a general motor predisposition underlying the performance of a wide variety of motor skills (e.g., see reviews by Rudd et al., 2016; Utesch et al., 2016). However, no consensus exists as to which motor abilities underlie motor competency. Motor competency tests typically administer simple movement tasks such as manual manipulations, and various types of locomotive actions (jumping, hopping, walking, running, balancing, catching, throwing, and others). Thus, motor competency is assessed through the performance of fundamental skills involved in these simple movement tasks, with the assumption that groups of similar fundamental movement skill tasks capture the assumed motor competencies. However, due to an absence of consensus about which motor abilities underlie motor competency, various motor tests may assess different motor qualities. This suggestion is strongly supported by low to moderate correlations found between different motor tests (reviewed by Cools, DeMartelaer, Samaye, & Andries, 2009; Utesch et al., 2016).
In the light of these theoretical problems, a specific question arises as to what motor abilities are really measured with the MABC-2, for which the literature reports only poor to moderate concurrent validity of its age bands AB2 and AB3 with other motor tests such as the Bruininks-Oseretsky Test of Motor Proficiency, 2nd Edition (BOTMP-2; Lane & Brown, 2015) and the Motor Coordination and Dexterity Assessment (Cardoso & Magalhães, 2012). A study on the factor validity of AB2 with the German normative sample supported the original assumption that the test captures three specific latent factors (motor abilities) represented by the manual dexterity (MD), aiming and catching (AC), and balance (Bal) components (Wagner, Kastner, Petermann, & Bös, 2011). In contrast, in a study with the U.K. normative sample, four latent factors (MD, AC, and separate static and dynamic balance) emerged as a good model fit for the AB2, whereas the three-factor model, with MD, AC, and Bal latent factors, was a better model for the AB3 (Schulz, Henderson, Sugden, & Barnett, 2011). This decreased number of factors in models fitting tests for children of increasing age does not correspond to an expected greater differentiation of motor abilities with maturation (Gallahue, Ozmun, & Goodway, 2012).
Two considerations might affect dissimilarities in the structural validity of the MABC-2 test examined in these studies. First, there may be cross-cultural variations in motor development, as reported in infants (Cintas, 1989) and preschoolers (Venetsanou & Kambas, 2010). Second, there may be variations in how well particular motor tasks (test items) contribute to the assessment of presumed latent motor components of motor competency. This latter concern relates to the factor validity of the test items, or how much they are each associated with the presumed latent factors meant to represent key motor abilities.
To help users of the MABC-2 (e.g., psychologists, physiotherapists, and paediatricians) better understand which motor abilities are captured by the MABC-2, we examined the factorial validity of its AB2 and AB3 versions. Examination of the real factorial structure of the test is essential for valid interpretation of test results and diagnostic decision-making. Based on the theoretical background used to develop the MABC-2 (Henderson et al., 2007) and the results of two previous psychometric studies (Schulz et al., 2011; Wagner et al., 2011), we examined two hypothesized models of the AB2 and AB3: the three factor model now presumed by the test, and a single, general factor model.
Method
Sample
This study used the two representative samples of children that comprise the Czech Republic’s normative samples: 7- to 10-year-olds (n = 484, 248 boys and 236 girls) and 11- to 16-year-olds (n = 674, 328 boys and 346 girls). A stratified sampling plan had been developed to ensure representativeness with respect to age, gender, geographic region, and size of municipality. Population census data from the Czech Statistical Office (2012) provided the basis for this stratification. Municipality categories were as follows: large (>90,000 inhabitants), medium (5,000–90,000 inhabitants), and small (<5,000 inhabitants). Children were randomly selected from public primary, secondary, and high schools within each geographical region.
Measurements
The normative samples had been tested with the AB2 and AB3 of the MABC-2, respectively (Tables 1 and 2), according to the guidelines and instructions specified in the Examiner’s Manual (Henderson et al., 2007) as transferred into the Czech version of the test (Psotta, 2014). All the children were tested in their schools by a team of trained examiners. Before testing, all the examiners underwent the user’s training program, focused on understanding the theoretical issues and gaining practical skills in administration and scoring of the test. Ethical approval for the whole project was obtained from the Ethical Committee of the Faculty of Physical Culture, Palacký University Olomouc (code No. 22/2012). The project was also approved by the Czech Science Foundation based on both international and national reviews.
Data Analysis
First, the descriptive characteristics of the raw scores on each of the 11 test items were calculated, and the score distributions were examined using the Shapiro–Wilk test (significance level = 0.05) together with skewness coefficients α and kurtosis coefficients β. Second, before the confirmatory factor analysis (CFA), the raw data samples were adjusted. Because the Bal test items departed strongly from Gaussian normality, departures from multivariate normality were reduced using the following procedures.
Performance in hopping with the better leg (Bal 3b test item) in both AB2 and AB3 was almost constant with most sample participants achieving close to the maximum possible score (five jumps), yielding extreme kurtosis and peakedness in data distribution. Therefore, the variable Bal 3b was excluded from the CFA.
In those test items approximating normal data distributions (all MD and AC test items), extreme outliers were identified as scores lying beyond the lower and upper boundaries set at three times the interquartile range below the first quartile and above the third quartile, respectively (Thode, 2002). These extreme scores were then excluded from the CFA. After this exclusion, the data from all the MD and AC test items came close to Gaussian normality and met the criteria of nonsubstantial skewness and kurtosis (−1.00 < α, β < 1.00) proposed by Bulmer (1979), with the exception of the MD 3 test items in both AB2 and AB3 and AC 1b in AB3.
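The three-times-interquartile-range rule described above can be sketched in a few lines of Python. This is only an illustration, not the procedure as run in the authors' software: the function names and the example times are hypothetical, and the quartile interpolation (the "inclusive" method of Python's statistics module) is an assumption, since the text does not state which quartile definition was used.

```python
from statistics import quantiles


def extreme_outlier_fences(scores, k=3.0):
    """Return (lower, upper) fences lying k * IQR beyond the first and third quartiles."""
    # Assumption: "inclusive" quartile interpolation; other definitions shift the fences slightly.
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_extreme_outliers(scores, k=3.0):
    """Keep only the scores that lie on or inside the fences."""
    lower, upper = extreme_outlier_fences(scores, k)
    return [s for s in scores if lower <= s <= upper]


# Hypothetical completion times in seconds (not data from the study).
times = [20, 21, 22, 23, 24, 25, 26, 27, 28, 95]
kept = drop_extreme_outliers(times)
```

With these made-up times the fences fall at 8.75 and 40.25, so only the score of 95 is dropped.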
Extreme scores from the Bal 1 and Bal 2 test items in both AB2 and AB3 were not excluded. Most sample participants achieved the maximum possible scores: 30 s of balancing (Bal 1 items) and 15 correct steps during balance walking (Bal 2), respectively. Thus, eliminating scores lying outside the interquartile range boundaries would cause both variables (test items) to become constants.
After exclusion of the extreme outliers detected in the MD and AC test items, the final sample size for the CFA was 422 (220 boys, 202 girls) aged 7–10 years (AB2) and 623 (301 boys, 321 girls) aged 11–16 years (AB3).
Confirmatory Factor Analysis
CFA, using IBM SPSS AMOS 22.0 version (Arbuckle, 2013), was performed on the raw item scores to examine two alternative hypotheses on the structural model of each of the age bands (AB2 and AB3):
(a) The 11 test items are manifestations of three intercorrelated latent factors (MD, AC, and Bal), making the test a three-specific factor structure. Accordingly, we assumed that AB2 and AB3 would each predominantly assess three specific motor factors (MD, AC, and Bal), and thus the total test score would be a combination of these motor factors.
(b) The 11 test items are manifestations of a general factor model (G-factor), making the test a measure of general motor ability.
In the CFA, the factors were considered as independent latent variables, while performance scores on the test items were dependent observed variables. In the prescribed models, one-factor loading of each test item on a supposed latent factor was fixed. Asymptotically distribution-free estimates of the unstandardized and standardized partial regression weights were obtained from the covariance matrix. Model fit was evaluated with absolute fit indices and one incremental fit index, and the criteria for a good fitting model (in parentheses) were selected according to recommendations from Hooper, Coughlan, and Mullen (2008): the χ2 test (p > .05), relative χ2 (CMIN/df < 3.0), root-mean-square error of approximation (RMSEA < 0.07), goodness of fit index (GFI > 0.95), adjusted GFI (AGFI > 0.95), and Tucker–Lewis index (TLI > 0.95).
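As a rough illustration of how these cut-offs operate together, the sketch below checks a set of fit indices against the criteria listed above. The function and dictionary names are hypothetical; the index values are those reported in the Results for the AB2 models, and only the relative χ2 is recomputed (as χ2 divided by df).

```python
# Cut-offs as listed in the text (after Hooper, Coughlan, & Mullen, 2008).
CRITERIA = {
    "p": lambda v: v > 0.05,
    "cmin_df": lambda v: v < 3.0,
    "rmsea": lambda v: v < 0.07,
    "gfi": lambda v: v > 0.95,
    "agfi": lambda v: v > 0.95,
    "tli": lambda v: v > 0.95,
}


def relative_chi_square(chi2, df):
    """Relative chi-square (CMIN/df)."""
    return chi2 / df


def failed_criteria(indices):
    """Return the names of the fit indices that miss their cut-off."""
    return [name for name, passes in CRITERIA.items() if not passes(indices[name])]


# Values reported for the final AB2 three-factor model and the AB2 G-factor model.
ab2_three_factor = {
    "p": 0.094, "cmin_df": relative_chi_square(40.612, 30),
    "rmsea": 0.027, "gfi": 0.980, "agfi": 0.964, "tli": 0.972,
}
ab2_g_factor = {
    "p": 0.067, "cmin_df": relative_chi_square(38.736, 27),
    "rmsea": 0.030, "gfi": 0.981, "agfi": 0.962, "tli": 0.941,
}

print(failed_criteria(ab2_three_factor))  # no criterion missed
print(failed_criteria(ab2_g_factor))      # only the TLI cut-off is missed
```

A check of this kind covers global fit only; as the Results show for the G-factor models, acceptable global fit can coexist with nonsignificant or very weak factor loadings.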
Models that differed significantly from the data were modified. The main discrepancies between the observed and the fitted covariance structure were identified using modification indices (MI), with MI > 4.0 taken as a significant discrepancy. The statistical significance of all the parameters was verified with the Wald test (α = .05). Before making model modifications, we considered the specific tasks that were found not to fit the model. Factor loadings were classified according to the criteria for practical (clinical) significance of standardized factor loadings (Tabachnick & Fidell, 2007) as follows: <0.32 very poor, 0.32–0.44 poor, 0.45–0.54 fair, 0.55–0.62 good, 0.63–0.70 very good, and >0.70 excellent clinical significance. Tabachnick and Fidell (2007) recommended these criteria for items with differing frequency distributions, as in this study.
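The loading bands above translate directly into a small lookup. The sketch below is illustrative only and rests on two assumptions not stated in the text: the bands are applied to the absolute value of a standardized loading, and values falling between the two-decimal band limits (for example, 0.445) are assigned to the lower band. The function name is hypothetical.

```python
def classify_loading(loading):
    """Label a standardized factor loading using the bands quoted from Tabachnick and Fidell (2007)."""
    size = abs(loading)  # assumption: the sign of the loading is ignored
    if size < 0.32:
        return "very poor"
    if size < 0.45:
        return "poor"
    if size < 0.55:
        return "fair"
    if size < 0.63:
        return "good"
    if size <= 0.70:
        return "very good"
    return "excellent"


for value in (0.15, 0.40, 0.50, 0.60, 0.70, 0.86):
    print(value, classify_loading(value))
```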
The initial rigorous three-specific factor model of AB2 was found to be significantly different from the sample data (χ2 test (df = 32) = 67.158, p = .0003). Two correlations between errors and one relation from the factor of MD to the variable Bal 3o were added. To achieve an acceptable final model, two relations were excluded (from the factor MD to the variable MD 3 and from the Bal factor to the Bal 3o variable).
The initial three-specific factor model of AB3 was significantly different from the data obtained from the sample (χ2 test (df = 32) = 53.961, p = .009). However, only one correlation between errors and one relation from the factor of Bal to the variable AC 2 were added to achieve acceptable significance of the final model (p = .070).
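Both initial models are reported with df = 32. One way to see where such a value can come from is to count free parameters against the number of distinct variances and covariances. The sketch below does this under assumptions that the text does not confirm: 10 observed indicators (the 11 items minus Bal 3b), each loading on a single factor, one loading per factor fixed to set the factor's scale, freely estimated factor variances and covariances, and no mean structure. The function name is hypothetical.

```python
def cfa_degrees_of_freedom(n_indicators, n_factors, extra_free_parameters=0):
    """Degrees of freedom of a simple-structure CFA fitted to a covariance matrix."""
    # Distinct variances and covariances among the observed variables.
    moments = n_indicators * (n_indicators + 1) // 2
    # Assumption: one loading per factor is fixed for scaling; the rest are free.
    free_loadings = n_indicators - n_factors
    error_variances = n_indicators
    factor_variances = n_factors
    factor_covariances = n_factors * (n_factors - 1) // 2
    free = (free_loadings + error_variances + factor_variances
            + factor_covariances + extra_free_parameters)
    return moments - free


initial_df = cfa_degrees_of_freedom(n_indicators=10, n_factors=3)
# Two added parameters, e.g. one error correlation and one cross-loading.
modified_df = cfa_degrees_of_freedom(10, 3, extra_free_parameters=2)
print(initial_df, modified_df)
```

Under these assumptions the count gives 55 − 23 = 32 for the initial three-factor model, and two added free parameters, such as the one error correlation and one cross-loading described for AB3, would bring it to 30.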
Results
Descriptive Features of Data
Descriptive Characteristics of the Performance in the Test Items of the MABC-2 Test—Age Band 2 in 7- 10-Year-Old Children.
Note. MD 1 – Bal 3 = the test items (see Table 1); M = mean, SD = standard deviation; Min = minimal score; Max = maximum score; Mdn = median; IQR = interquartile range; α = coefficient of skewness; β = coefficient of kurtosis.
Descriptive Characteristics of the Performance in the Test Items of the MABC-2 Test—Age Band 3 in 11- 16-Year-Old Children.
Note. MD 1 – Bal 3 = the test items (see Table 2); M = mean, SD = standard deviation; Min = minimal score; Max = maximum score; Mdn = median; IQR = interquartile range; α = coefficient of skewness; β = coefficient of kurtosis.
Correlation Matrices of the MABC-2 test—Age Band 2.
Note. MD 1 – Bal 3 = the test items (see Table 1); r = .098 at α = 0.05 (two-tailed).
Correlation Matrices of the MABC-2 Test—Age Band 3.
Note. MD 1 – Bal 3 = the test items (see Table 2); r = .088 at α = 0.05 (two-tailed).
CFA for the Younger Age Band—AB2 (7–10 years)
After modifications to the initial strict model with three correlated factors with simple structure of the loading matrix and no error correlations, the covariance matrix of AB2 was found to be fitted with the χ2 test (df = 30) = 40.612, p = .094, CMIN/df = 1.354, RMSEA = 0.027, GFI = 0.980, AGFI = 0.964, and TLI = 0.972. This well-fitting model involves additional factor loading of the variable Bal 3o on the latent factor MD (−0.27, p < .0001) and of the variable MD 3 on the latent factor AC (−0.28, p = .009; Figure 1).
Figure 1. Three-specific factor model of the MABC-2 test, age band 2, for 7- to 10-year-old children. χ2 test (df = 30) = 40.612, p = .094, CMIN/df = 1.354, RMSEA = 0.027, GFI = 0.980, AGFI = 0.964, and TLI = 0.972. MD1p–Bal3o = the test items (see Table 1); e_MD1p – e_Bal3o = error variables; latent factors: MD = manual dexterity, AC = aiming and catching, Bal = balance.
All factor loadings of the variables (test items) on the hypothesized latent factors indicated by the standardized regression weights were statistically significant (p < .05; Figure 1) except that factor loading of MD 3 on the MD factor was not significantly different from zero. The intercorrelations between the three latent factors were significant (p < .004; Figure 1).
The G-factor model of AB2 could not be meaningfully constructed. This model fitted the data well (χ2 test (df = 27) = 38.736, p = .067, CMIN/df = 1.435, RMSEA = 0.030, GFI = 0.981, AGFI = 0.962), with the exception of a TLI of 0.941, which fell below the 0.95 criterion. However, three factor loadings on the G-factor were not significantly different from zero (p > .05), and the other four factor loadings were of very poor clinical significance (<0.32).
CFA for the Older Age Band—AB3 (11–16 years)
The data from the sample of 11- to 16-year-old participants showed a good fit to the three-specific factor model after only two modifications of the initial strict three-specific factor model (Figure 2). After these modifications, the good fit was indicated by the χ2 test (df = 30) = 42.081, p = .070, CMIN/df = 1.403, RMSEA = 0.024, GFI = 0.984, AGFI = 0.970, and TLI = 0.958.
Figure 2. Three-specific factor model of the MABC-2 test, age band 3, for 11- to 16-year-old children. χ2 test (df = 30) = 42.081, p = .070, CMIN/df = 1.403, RMSEA = 0.024, GFI = 0.984, AGFI = 0.970, and TLI = 0.958. MD1p–Bal3o = the test items (see Table 2); e_MD1p – e_Bal3o = error variables; latent factors: MD = manual dexterity, AC = aiming and catching, Bal = balance.
All the simple factor loadings of the particular variables (test items) on the relevant latent factors were significant (p < .05) as indicated by the standardized regression weights. The only exception was the nonsignificant loading of the variable Bal 3o on the latent factor of balance (0.15, p = .069). The intercorrelations between the three latent factors were also significant (p < .018; Figure 2).
The G-factor model of AB3 fitted the data quite well, with the χ2 test (df = 23) = 26.605, p = .071, CMIN/df = 1.157, RMSEA = 0.015, GFI = 0.990, AGFI = 0.975, and TLI = 0.984, but four factor loadings did not significantly differ from zero (p > .05), and the other five factor loadings were of poor clinical significance (<0.45).
Discussion
The Global Factor Structure of the AB2 and the AB3
This study showed that the AB2 and AB3 of the MABC-2 have a three-specific factor structure with the test’s presumed latent factors of MD, AC, and balance (Bal). In contrast to previous reports (Schulz et al., 2011; Wagner et al., 2011), the factor analytic models fitting Czech Republic normative data for the AB2 and AB3 excluded the task of Hopping on mats with the better leg (Bal 3b). This test item produced an almost constant score, with 98.8% and 98.9%, respectively, of children in the sample achieving the maximum possible score (five hops). Thus, this test item did not have acceptable validity as an indicator of balance. Hopping is typically developed in children at around 3 to 5 years of age, and some form of skipping might be a more demanding motor task, as skipping is the last developed movement structure in the developmental sequence of locomotor skills (Haywood, Roberton, & Getchell, 2012). Also, hopping for a distance might be a more valid dynamic balance task, since children with DCD demonstrate larger asymmetry in their performance on this task as compared to typically developing children (Armitage & Larkin, 1993).
The significant correlations this study found between the three motor ability factors in both AB2 and AB3 models support the assumption that they interweave with each other (Henderson et al., 2007). However, the relations were rather weak, with a mean correlation of 0.39 in the AB2 and 0.24 in the AB3 (Figures 1 and 2), and thus they indicate higher discriminant validity of the three subtests for the MD, AC, and Bal components. Although the total test score of the MABC-2 is a principal index of overall motor competency, derived from performance on all the tasks (Henderson et al., 2007), this study suggests that the AB2 and the AB3 better provide clinical workers with valid information on the level of specific motor abilities achieved by individual children. The weaker interrelations between the specific motor ability factors in the AB3 as compared with the AB2 corresponded quite well to the hypothesized greater differentiation of motor abilities as children mature (Gallahue et al., 2012).
The Substructure of the AB2 and the AB3 Related to the MD Factor
Placing pegs (both MD 1 items) and Threading lace (MD 2) in the AB2 and Turning pegs (both MD 1 items) and Triangle with nuts and bolts (MD 2) in the AB3 showed good to excellent significance of factor loading on the MD factor. On the other hand, Drawing trail (MD 3) had no loading on the MD factor in the well-fitting model for the AB2, and very low clinical significance in the model for AB3. This finding can be explained by a severe disturbance of data normality when 91.5% and 87.6% of the children in the AB2 and the AB3, respectively, executed the Drawing trail with no error and achieved the best possible score. These findings suggest poor internal validity of the task as a measure of MD. Low reliability of the MD 3 task in the AB2 and its low factor loading on the MD factor in the AB2 and the AB3 were also evident in the study by Wagner et al. (2011) and Schulz et al. (2011). The speed of goal-directed drawing, rather than the number of errors, could be a more sensitive measure of fine visual-motor coordination, especially in older children. It is known that the speed of handwriting develops almost linearly in primary school and continues after age 11 (Sugden & Wade, 2013). In addition, speed of drawing and writing is essential for a child’s further education, despite the modern rise in other methods of written communication using information technology (Feder & Majnemer, 2007).
The Substructure of the AB2 and the AB3 Related to the AC Factor
Catching a ball with two hands (AC 1) and Throwing a beanbag onto a mat (AC 2) in AB2 had fair and good clinical significance, respectively, as indicators of the AC factor (Figure 1). However, the significant correlation between performance in AC 1 and AC 2 was weak (r = .26; Table 3), which suggests that high specificity in the motor control of AC was already evident in younger school-age children. Aiming at a distant target assumes preprogrammed motor control, with detection and use of visual information for control of final movements (Oudejans, van de Langenberg, & Hutter, 2002). Catching the ball is also controlled by a feedforward control circuit, but visual information is continuously processed and sent to the feedforward controller to anticipate the path of the ball and organize the appropriate movement. In addition, feedback control is used once the ball hits the hands for successful muscle action of hands and fingers (Ghez & Krakauer, 2000). Thus, one can deduce that the AC latent factor mainly represents the ability of feedforward control of body movement, including prediction of the trajectory of the flight of the object.
Surprisingly, in the fitting model of the AB2 the Drawing trail (MD 3 task) displayed poor but still significant loading on the AC latent factor (−0.28). One can speculate that possible shared aspects of motor control of Drawing trail and Aiming/catching could be the processing and use of visual information for predictive motor control, in addition to visual information used for the feedback error processing during drawing (Ogawa, Inui, & Sugio, 2006).
In the AB3, Catching the ball with both the better and the other hand (AC 1b and AC 1o) was loaded on the AC factor with excellent clinical significance. In the context of poor factor loading of Throwing a ball at a wall target (AC 2) on the AC factor, and strong correlation between AC 1b and AC 1o test items (0.86), 94% determination of performance in Catching with the other hand by the latent AC factor suggested that this task could be used alone to capture a level of gross visual–motor coordination associated with moving object control in children aged 11–16.
The Substructure of the AB2 and the AB3 Related to the Balance Factor
The static balance task consisting of one-leg standing on a balance board (Bal 1b and Bal 1o) was shown to have excellent factor validity as a measure of the balance ability in the AB2. On the other hand, two items of dynamic balance, Balance walking (Bal 2) and Hopping on the mats with the other leg (Bal 3o), emerged as having poor factor validity, although still significant, in the fitting model of the AB2 (as did Bal 3o in the AB3). Poor factor validity of both dynamic balance tasks was mainly the consequence of most children achieving the maximum possible score (fifteen steps and five hops), so that the data distribution departed strongly from normality. Thus, we deduce that the balance factor in the AB2 could mainly concern the control mechanisms underlying postural stability under static conditions. One-leg standing was suggested as being controlled by sensory feedback on the basis of a closed-loop system (Hatzitaki, Zisi, Kollias, & Kioumourtzoglou, 2002), while balancing under dynamic task conditions requires mainly the use of feedforward control (Horak & Nashner, 1986; Lacquaniti & Maioli, 1989).
In the AB3, the static balance item (Bal 1) and Dynamic balance walking item (Bal 2) were shown to be valid for capturing the balance abilities of older children in contrast to Zig-zag hopping (Bal 3). Walking toe-to-heel backward (Bal 2) seemed to have better differentiation power in older children (11- to 16-year-olds) as compared to the excessive ease of Walking heel-to-toe forward disturbing factor validity in younger children (7- to 10-year-olds).
Limitations
While this study’s strength was its use of standardization samples to examine the factor validity of AB2 and AB3 of the MABC-2, generalizability of these well-fitting models might be limited as the samples originated from mainstream schools. However, 90% of Czech 6- to 15-year-old children attend these schools. Another potential generalizability problem for these well-fitting models is the use of model modifications such as adding correlations between errors and cross-loadings. Because model modification is data driven, it is inherently susceptible to capitalizing on chance characteristics of the data, and thus raises the question of whether model modifications generalize to other samples or to the population at large (MacCallum, Roznowski, & Necowitz, 1992). MacCallum et al. (1992) reported that model modification results may be very inconsistent over repeated samples, and cross-validation results may behave erratically. In the current study, this problem could be reduced with the use of normative data from representative samples of children.
