Abstract
Morphological awareness contributes to literacy development through multiple components, yet few widely available assessments with established evidence of internal construct structure are suitable for whole-class administration and can efficiently characterize elementary students’ suffix-based morphological knowledge in written sentence contexts. This study reports on the development and validation of ROAR Morphology, a brief classroom-based assessment of suffix-based morphological knowledge in written sentence contexts for students in grades 2–5, administered in under 10 minutes to whole classrooms with automatic scoring. Items were designed to capture a learning progression of suffix-based morphological knowledge, varying suffix type (inflectional/derivational) and suffix commonality (common/less common), with careful attention to cognitive processing demands, including the number of derivational distractors. Calibration of response data from 735 students using Rasch modeling yielded high reliability (α = .91) and acceptable item fit (fit indices ranging from .79 to 1.22). Item difficulty analyses confirmed that derivational morphology was more challenging than inflectional morphology, and less common suffixes were more difficult than common suffixes. Cognitive processing demands, specifically the number of competing derivational distractors, contributed additional variance in item difficulty beyond linguistic features. Based on item difficulty modeling, we established four empirically derived learning progression waypoints reflecting proficiency with suffix-based morphological structures of increasing complexity, from foundational inflectional morphology with common suffixes to more complex derivational morphology with less common suffixes. Notably, base word characteristics did not drive item difficulty, confirming that the waypoints capture genuine differences in suffix-based morphological knowledge development.
ROAR Morphology uniquely predicted literacy achievement beyond word reading and sentence reading measures (ΔR² = 7.2%, p < .001), supporting its discriminant validity. These findings demonstrate that suffix type, suffix commonality, and cognitive processing demands systematically influence suffix-based morphological knowledge development in written sentence contexts, and that empirically validated waypoints may inform instructional planning during the critical grades 2–5 window when this knowledge is rapidly developing.
Keywords
Introduction
Morphological awareness, which involves understanding the smallest units of meaning within words (i.e., morphemes), plays a significant role in reading development by supporting efficient word recognition and integrating meaning and syntax for higher-level comprehension processes (Carlisle, 2000; Deacon & Levesque, 2024; Perfetti & Stafura, 2014). One contributor to morphological awareness is morphological knowledge, the internalized understanding of morphological patterns that students draw upon when processing morphologically complex words in written contexts (Apel, 2014; Nagy et al., 2014). Suffix-based morphological knowledge is particularly critical during elementary school, when students encounter increasing numbers of derived and inflected words in academic texts (Anglin, 1993; Nagy et al., 2014). Despite its importance, educators and researchers lack efficient classroom-based tools to characterize elementary students’ morphological knowledge development. This study addresses this gap through the design and validation of an efficient classroom-based assessment that provides meaningful information about students’ suffix-based morphological knowledge in a written sentence context, with particular attention to the critical grades 2–5 window when this knowledge is rapidly developing (Anglin, 1993; Berninger et al., 2010; Nagy et al., 2014).
The Need for an Efficient Classroom Assessment of Morphological Knowledge
Few validated measures of morphological knowledge have established evidence of internal construct structure (see Apel, 2014 for a review of existing measures). While some comprehensive language assessments include morphological subtests, there is currently no standalone standardized measure specifically designed to assess morphological knowledge for classroom-wide online administration (Collins, 2023), and existing assessments present multiple barriers to classroom implementation. Most require individual administration (Berninger, 2007; Foorman et al., 2012; Newcomer & Hammill, 2008), making them time-prohibitive for whole-class assessment. Others target older students in grades 5–8 (Goodwin et al., 2021), missing the critical developmental window in grades 2–5 when suffix-based morphological knowledge rapidly develops and students encounter increasingly complex academic vocabulary (Anglin, 1993; Berninger et al., 2010; James et al., 2021). Additionally, many focus primarily on oral language tasks (Apel, 2014), limiting their utility for assessing the written morphological knowledge essential for comprehending complex academic texts.
Certain morphological features influence processing difficulty: students typically master inflectional forms before derivational forms and common suffixes before less common ones (Berko, 1958; Carlisle & Nomanbhoy, 1993; Grande et al., 2024). Yet most assessments treat morphological knowledge as a unitary skill, failing to differentiate among item types that vary in difficulty along the learning progression. Assessments that provide more detailed information about where students fall along the suffix-based morphological knowledge progression, together with the corresponding waypoints (qualitatively distinct regions situated along the scale), may therefore be useful for instructional planning. Without the corresponding waypoints, a student’s score would indicate relative standing but not what morphological knowledge they have actually acquired (Mari et al., 2021).
Theoretical Framework
Understanding suffix-based morphological knowledge requires situating it within both the broader construct of morphological awareness and current reading theory. This section first establishes the component structure of morphological awareness and our assessment’s specific focus within that framework, then situates morphological knowledge within reading comprehension models.
Components of Morphological Awareness
Morphological awareness represents a multidimensional construct encompassing several distinct but related components. Apel (2014; 2022) proposed a four-component model that includes: (1) morphological analysis (decomposing words into constituent morphemes), (2) morphological synthesis (combining morphemes to create new words), (3) morphological judgment (determining whether morphologically complex forms are real words), and (4) morphological generation (producing appropriate morphological forms to complete sentences or meet syntactic requirements). Each component may develop independently, with students potentially demonstrating proficiency in one while struggling in another (Apel et al., 2022).
The present study focuses specifically on morphological knowledge as demonstrated through students’ ability to select appropriate morphological forms within a written sentence context.
The decision to focus on this morphological component reflects both theoretical and practical considerations. Theoretically, the written sentence format captures suffix-based morphological knowledge as it operates during silent reading of academic texts, where students must process morphologically complex words within meaningful linguistic contexts—the primary setting for academic vocabulary encounters in grades 2–5 (Nagy & Anderson, 1984; Snow & Uccelli, 2009). The focus on suffixes specifically reflects the linguistic structure of English: because all inflectional morphemes are suffixes, and inflectional morphology is typically acquired before derivational morphology (Berko, 1958; Carlisle & Nomanbhoy, 1993), a suffix-based assessment naturally captures the foundational end of the morphological knowledge learning progression while extending to more complex derivational forms. Practically, this format enables efficient whole-class administration and automatic scoring, removing barriers that have limited morphological assessment in classroom settings. It is important to acknowledge, however, that performance on morphological generation tasks may not fully represent performance on other components of morphological awareness, and comprehensive evaluation of morphological competence may require multiple measures targeting different components of the construct.
Morphological Knowledge in Reading Theory
Current theories of reading provide a framework for understanding the dual role of morphological knowledge in the reading process. The Direct and Indirect Effects of Reading (DIER) model (Kim, 2020) and the Morphological Pathways Framework (Levesque et al., 2021) illustrate how morphological knowledge contributes through multiple pathways: directly affecting word reading through decoding support, while indirectly supporting reading comprehension through vocabulary knowledge and higher-level language skills. Meta-analytic findings support this dual pathway, showing morphological knowledge correlates positively with both word-level skills (word reading, r = .49) and text-level processes (reading comprehension, r = .54; Liu et al., 2024).
Perfetti’s (2007) lexical quality hypothesis emphasizes that successful reading requires well-specified orthographic representations linked to semantic, syntactic, and phonological information. Students with strong morphological knowledge can decode unfamiliar morphologically complex words by recognizing familiar morphemic patterns and use morphological analysis to infer meanings of unknown vocabulary words encountered in texts (Carlisle, 2000; Nagy & Anderson, 1984). These connections between morphological knowledge, vocabulary, and text comprehension underscore the importance of understanding students’ morphological knowledge development during the elementary years, when academic language demands increase substantially. This vocabulary-morphology connection is evident even in early elementary students (Nevo et al., 2024).
Learning Progression of Suffix-Based Morphological Knowledge
Morphological knowledge follows a learning progression across several dimensions. This trajectory represents a continuum of suffix-based morphological knowledge development that is particularly dynamic during grades 2–5, when students are consolidating foundational inflectional knowledge and progressively extending their morphological knowledge to more complex derivational forms (Anglin, 1993; Apel & Lawrence, 2011; Berninger et al., 2010). Typically, inflectional forms (e.g., marking grammatical features like “-ed” for past tense) are mastered earlier than derivational forms (e.g., creating new words, often changing word class, like “teach” to “teacher”; Berko, 1958; Carlisle & Nomanbhoy, 1993; Apel & Lawrence, 2011). Common suffixes (e.g., “-ing,” “-ed,” “-er”) are typically mastered earlier than less common ones (e.g., “-ous,” “-ity”; Deacon, 2008), aligning with statistical learning theories suggesting children’s linguistic development is guided by exposure to regularities within language (Erikson & Thiessen, 2015; Saffran, 2020). Despite evidence that both suffix type and suffix frequency impact performance, how these dimensions intersect when developing morphological knowledge remains unclear. A third potentially relevant dimension is morphological transparency—the degree to which base words and derived forms maintain consistent phonological and orthographic patterns—which has been examined in studies of morphological processing (Apel et al., 2023), though its interaction with suffix type and suffix commonality in written sentence contexts remains an area for further investigation.
Cognitive Processing Demands of Morphology
Morphological processing involves multiple mechanisms that operate along a continuum from automatic to strategic (Anglin, 1993). Real-word tasks presented in meaningful sentence contexts, such as those used in the present study, primarily tap tacit morphological knowledge—students draw upon internalized understanding of morphological patterns to select appropriate word forms, rather than engaging in conscious decomposition or explicit morphological analysis (Carlisle, 2010; Kuo & Anderson, 2006; Nagy et al., 2014). Understanding which processing mechanisms an assessment engages is important for interpreting what the measure reveals about students’ morphological competence.
In addition to this tacit processing, cognitive processing demands also influence morphological knowledge. Research in cognitive linguistics suggests that students’ ability to distinguish between competing morphological forms represents an important dimension of morphological knowledge (Carlisle, 2010; Crepaldi et al., 2010; Schreuder & Baayen, 1995; Wang & Zhang, 2023). When students encounter multiple derivational forms simultaneously (e.g., “teacher” and “teachable”), they must distinguish between competing morphological options that share semantic and orthographic features. The cognitive challenge increases with the number of derivational distractors (answer choices containing words with derivational suffixes that create plausible but incorrect alternatives), as each additional morphologically complex alternative requires the processing system to differentiate between similar morphological patterns. Drawing on cognitive load theory (Sweller, 1988) and research on morphological competition effects (Crepaldi et al., 2010; Rastle & Davis, 2008), we hypothesized that this feature would systematically influence item difficulty—a prediction we test empirically in the present study.
Present Study
The present study had two main goals. The first was to develop and validate an efficient, classroom-based assessment for characterizing suffix-based morphological knowledge in written sentence contexts for students in grades 2–5 that could be administered to whole classrooms and scored automatically within a suite of other reading measures. The second was to address the following research questions: (1) Can a brief, automated suffix-based morphological knowledge assessment, administered in a group classroom setting, reliably characterize elementary students’ morphological knowledge along an empirically derived learning progression, as evidenced by internal-structure validity? (2) Beyond the linguistic features that define waypoints, do cognitive processing demands (i.e., number of derivational distractors) contribute additional variance in suffix-based morphological knowledge item difficulty? (3) Does suffix-based morphological knowledge, as measured by this assessment, contribute uniquely to literacy achievement beyond other reading skills?
Additionally, to establish that our waypoints represent genuine differences in suffix-based morphological knowledge development rather than simply grade-related differences, we used explanatory item response modeling (Wilson & De Boeck, 2004) to examine grade-level effects on performance and the distribution of students across performance levels within grades. This analysis, coupled with the Rasch modeling (Rasch, 1960), allowed us to determine whether the waypoints capture meaningful developmental differences along the suffix-based morphological knowledge continuum independent of grade level.
Methods
To accomplish our goals, we developed the Rapid Online Assessment of Reading (ROAR) Morphology. It is a sentence-based assessment in which students select from real-word multiple-choice options to complete sentence stems. This format assesses suffix-based morphological knowledge in written sentence contexts. In developing this assessment, we followed the BEAR (Berkeley Evaluation and Assessment Research) Assessment System (BAS; Wilson, 2023), a framework for developing educationally useful assessments through four building blocks: construct maps, item design, outcome space (i.e., the set of possible response categories and scoring rules), and measurement models (e.g., Rasch-family models). Using this framework, we first consulted the extant literature and developed a construct map depicting a theoretically informed learning progression of suffix-based morphological knowledge. Then, we created a bank of assessment items. We purposefully designed items to assess suffix-based morphological knowledge in written sentence contexts for which students draw upon their internalized understanding of morphological patterns to select the appropriate word form. We calibrated items to provide maximum information about students’ development of suffix-based morphological knowledge through systematic variation of key morphological features, controlling for lexical characteristics to ensure that variance in item difficulty was attributable to morphological features rather than potentially confounding lexical variables. Once the item bank was developed, we collected data with students in grades 2–5 and analyzed the data using Rasch-family models to examine reliability and validity and answer our above research questions.
Construct Map Development
The construct map we developed reflects both qualitative and quantitative changes in the acquisition of suffix-based morphological knowledge from early to later elementary grades (Apel & Henbest, 2016; Berko, 1958; Carlisle & Nomanbhoy, 1993; Deacon & Dhooge, 2010; Ku & Anderson, 2003; Yamashita & Kusanagi, 2024). It integrates both morphological complexity variables (suffix type, suffix commonality) and cognitive processing demands (number of derivational distractors). Figure 1 presents our hypothesized construct map, showing the progression across four distinct waypoints (i.e., qualitatively different levels of suffix-based morphological knowledge that represent meaningful developmental thresholds), from basic recognition of inflectional morphology with common suffixes to sophisticated processing of derivational morphology with less common suffixes in contexts with multiple derivational distractors.
Figure 1. Suffix-based morphological knowledge construct map waypoint descriptions
This construct map provided the theoretical foundation for item development and guided our validation analyses. By empirically testing whether item difficulty aligned with our hypothesized progression, we could validate or refine our understanding of suffix-based morphological knowledge development. The map was specifically designed to support identifying distinct thresholds where students’ suffix-based morphological knowledge might break down, providing information about where in the learning progression a student’s suffix-based morphological knowledge may need additional instructional support.
We focused exclusively on suffix-based morphological knowledge for several practical and methodological reasons. Because all inflectional morphemes in English are suffixes, and inflectional morphology is typically acquired before derivational morphology (Berko, 1958; Carlisle & Nomanbhoy, 1993), a suffix-based assessment captures the foundational end of the morphological knowledge continuum while extending to more complex derivational forms. Existing research on morphological development suggests a developmental progression in suffix acquisition, from inflected to derived forms (Apel & Lawrence, 2011; Berko, 1958; Carlisle & Nomanbhoy, 1993), providing a theoretical foundation for creating a difficulty hierarchy along the learning progression. Limiting the assessment to one morphological structure type ensures that performance differences reflect suffix-based morphological knowledge development rather than varying familiarity across different structural types (suffixes vs. prefixes vs. compounds). This focused approach provides detailed information about suffix-based morphological knowledge but limits generalizability to the full scope of morphological knowledge development, which also encompasses prefixes, compound words, and more complex morphological structures.
Item Design
To operationalize the construct map, we developed assessment items that manipulated key morphological features while controlling for potentially confounding variables. Items were designed to capture suffix-based morphological knowledge as it operates in written sentence contexts, with careful attention to both the linguistic features that define the developmental continuum and the distractor logic that shapes cognitive processing demands.
Morphological and Lexical Features
Morphological Features
Items were classified by suffix type as either derivational (e.g., “teach” → “teacher”) or inflectional (e.g., “walk” → “walked”), reflecting a fundamental distinction in morphological theory (Carlisle, 2000; Dodur & Miray, 2021; Tyler & Nagy, 1989). Derivational morphology creates new words, often changing word class, while inflectional morphology marks grammatical features without changing the word’s basic meaning or part of speech (Carlisle & Nomanbhoy, 1993). Suffixes were categorized as either common or less common based on frequency norms from prior research (Carroll, 1971; Honig et al., 2000; White et al., 1989). Common suffixes (e.g., “-ing,” “-ed,” “-er,” “-ly”) appear more frequently in elementary texts than less common suffixes (e.g., “-ous,” “-ity,” “-ance”), and previous research indicates they are typically mastered earlier (Deacon, 2008; Nagy et al., 2014). Together, suffix type and suffix commonality define the learning progression captured by the construct map, from foundational inflectional morphology with common suffixes to more complex derivational morphology with less common suffixes.
A linguistic constraint shaped our design: inflectional morphemes necessarily employ high-frequency suffixes (common suffixes like -ed, -ing, -er), while derivational morphemes vary more broadly in suffix commonality. This is not a design artifact but rather reflects the linguistic structure of English: inflectional morphology is inherently more frequent in the language input children encounter. Consequently, in our item bank, all inflectional items feature common suffixes, while derivational items include both common suffix forms (e.g., -ful, -ly) and less common suffix forms (e.g., -ity, -ous). To examine whether word-type differences in difficulty are independent of this inevitable confounding between suffix type and suffix frequency, we conducted analyses both with and without covarying suffix commonality for our item difficulty modeling (see Research Question 2 results). This approach allows us to present a comparison of inflectional versus derivational items while also clarifying what aspects of that difference are attributable specifically to suffix type versus to the correlated variable of suffix commonality.
Test Design Features
Test design features that influence cognitive processing demands were manipulated and coded. The number of derivational distractors, answer choices containing derivational suffixes, was systematically varied to investigate how competing morphological forms affect processing difficulty. All items used simple sentences (i.e., one main clause with one finite verb), following Su’s (2009) definition. Two raters independently classified all 37 items for sentence complexity, achieving 100% inter-rater agreement.
Base Word Features
We coded items for base word features known to influence lexical processing. These included: age of acquisition (AoA), the average age at which children typically learn the word (Kuperman et al., 2012); prevalence, the number of occurrences in age-appropriate texts (Brysbaert et al., 2019); phoneme count, the number of sound units in the base word; concreteness, how easily the word’s meaning can be experienced through the senses (Brysbaert et al., 2014); and number of senses, the polysemy of the base word based on WordNet (Fellbaum, 1998). These features were tracked as control variables to ensure that variance in item difficulty was attributable to morphological characteristics rather than word-level factors, an important validity consideration given that base word familiarity could otherwise confound interpretation of morphological difficulty.
Transparency
In response to prior research highlighting the role of transparency in morphological processing (Apel et al., 2023), items were coded post-hoc for morphological transparency, the degree to which base words and derived forms maintain consistent phonological and orthographic patterns. Items were classified into four categories: transparent (no phonological or orthographic shift between base and derived form), orthographic shift only, phonological shift only, or double shift (both phonological and orthographic shift present). Transparency was not a primary focus of item development and was not systematically manipulated; rather, its distribution reflected naturally occurring patterns of English morphology. Of the 37 final items, 25 (67.6%) were transparent, 4 (10.8%) showed orthographic shift only, 2 (5.4%) showed phonological shift only, and 6 (16.2%) showed double shift. To establish coding accuracy, all 37 items were coded independently by two raters. Initial agreement was 91.9% (34/37 items); the three discrepant items were discussed and resolved, yielding 100% final agreement.
Assessment Format
To create an assessment that would efficiently characterize students’ suffix-based morphological knowledge along a developmental continuum, we adapted Tyler and Nagy’s (1989) and Goodwin et al.’s (2021) Real-Word Suffix task format, which employs a cloze procedure in which respondents select the correctly transformed word from four options to complete a sentence stem. This format allows for quick whole-class administration and automatic scoring while providing meaningful information about students’ suffix-based morphological knowledge in written sentence contexts. For example, in the sentence “There is a ______ between mild and spicy food,” answer choices included the correct choice “difference” along with distractors that represented incorrect morphological transformations: “differ” (untransformed base), “differing” (wrong syntactic form), and “different” (wrong derivational suffix). Each item had four answer choices: the target (or correct) word, the base word, and two morphologically complex distractors. Figure 2 presents additional sample items illustrating variation in suffix type, suffix commonality, and distractor composition.
Figure 2. Sample items illustrating morphological feature variations
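The item structure described above, and the way the number of derivational distractors could be coded from it, can be sketched as a small data structure. This is a minimal illustration only: the class names, the `suffix_type` labels, and the counting rule are assumptions made for the example, not the authors’ coding scheme or the ROAR platform’s item format.

```python
from dataclasses import dataclass
from typing import Literal, Tuple

# Hypothetical labels for illustration; the paper's own coding scheme may differ.
SuffixType = Literal["base", "inflectional", "derivational"]


@dataclass(frozen=True)
class Choice:
    word: str
    suffix_type: SuffixType
    is_target: bool = False


@dataclass(frozen=True)
class Item:
    stem: str
    choices: Tuple[Choice, ...]

    def n_derivational_distractors(self) -> int:
        """Count non-target answer choices that carry a derivational suffix."""
        return sum(
            1
            for choice in self.choices
            if not choice.is_target and choice.suffix_type == "derivational"
        )


# The sample item from the text: one target, the base word, two complex distractors.
sample_item = Item(
    stem="There is a ______ between mild and spicy food",
    choices=(
        Choice("difference", "derivational", is_target=True),
        Choice("differ", "base"),
        Choice("differing", "inflectional"),
        Choice("different", "derivational"),
    ),
)

print(sample_item.n_derivational_distractors())  # 1
```

Under this assumed coding, the sample item has one derivational distractor (“different”), because “differing” is treated as an inflected form and “differ” as the untransformed base.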
This design systematically controlled distractor options while maintaining consistent item structure. The number of derivational distractors was systematically varied to examine how cognitive processing demands influence suffix-based morphological knowledge development. As discussed in the literature review, distinguishing between competing morphological forms represents an important dimension of morphological processing (Carlisle, 2010; Crepaldi et al., 2010). Drawing on cognitive load theory (Sweller, 1988) and psycholinguistic evidence that morphological family size influences lexical processing difficulty (Schreuder & Baayen, 1995), we hypothesized that cognitive challenge would increase as a function of the number of derivational distractors present. Therefore, our assessment framework explicitly incorporates this cognitive processing component alongside linguistic features, allowing examination of how these factors jointly influence item difficulty. We focused on written suffix-based morphological knowledge since readers encounter morphologically complex words in academic texts (Nagy & Anderson, 1984) from which they must make meaning in grades 2–5. The systematic variation of morphological features, combined with empirically derived waypoints, allows educators and researchers to understand where students fall within the learning progression of suffix-based morphological knowledge, providing interpretive power beyond a single proficiency score.
Item Calibration
Distribution of 37 ROAR-Morphology Items by Primary Morphological Features
Sample
The sample for this study consisted of 735 students in grades 2–5 from four school districts in Northern California. From an initial sample of 745 students, nine were excluded because they did not complete the assessment, and one was excluded due to an unusually small number of completed items (5 out of 45 possible items). The assessment was administered to students on the ROAR platform (https://roar.stanford.edu/) between April and June 2024, in classroom settings proctored by teachers or reading specialists. Students took an average of 6.86 minutes to complete the assessment. Teachers provided standard assessment instructions and monitored student engagement, but students completed the assessment independently on digital devices, reading items silently and selecting responses without adult assistance. The ROAR platform automatically scores responses, requiring no manual scoring or teacher input beyond standard test administration procedures. Students were assessed using a single form administered across all grades to examine the learning progression on the same suffix-based morphological knowledge continuum, rather than grade-specific versions that might conflate developmental differences with item difficulty differences.
The demographic composition (see Tables 7–10) of the convenience sample used in this study differs notably from national averages in several ways: our sample included higher proportions of English Learners (21.9% vs. 10.6% nationally), Asian students (23.5% vs. 5.4%), and students speaking a language other than English at home (47.2%), reflecting the linguistic diversity of Northern California. The sample was underrepresented in Black students (2.9% vs. 14.9% nationally) and White students (30.1% vs. 44.6% nationally).
External Validation Sample and Measures
A subset of 364 of the 735 students in grades 3–5 also completed ROAR-Word, a measure of single-word reading, and ROAR-Sentence, a measure of students’ ability to silently read and understand sentences quickly and accurately. Their schools provided their Smarter Balanced Assessment Consortium (SBAC) English Language Arts (ELA) scores for external validity analyses. The SBAC is a summative assessment administered annually in grades 3–8 and high school.
Analytical Approach
To address Research Question 1 (Can a brief, automated morphology measure, administered in a classroom setting, reliably characterize varying levels of morphological knowledge in elementary students?), we used the Rasch model (Rasch, 1960, 1980) to calibrate the response data, estimating item-difficulty and person-ability parameters. This analysis generated fit statistics indicating how well items conformed to model expectations and produced a Wright Map (item-person map) showing the distribution of items and persons on the same interval scale. We then used an explanatory item response model, specifically a latent regression model that incorporates person-level covariates (Van den Noortgate & Paek, 2004). To examine the dimensionality of our construct, we employed a between-items multidimensional Rasch model (Adams et al., 1997; Briggs & Wilson, 2003) that treated derivational and inflectional items as two separate dimensions and estimated a latent correlation between them. This analysis allowed us to determine whether inflectional and derivational morphology represent distinct constructs or different aspects of the same underlying suffix-based morphological knowledge ability.
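For reference, the three models named in this paragraph can be written in their standard textbook forms. The notation is an assumption introduced here rather than taken from the manuscript: $\theta_p$ is the ability of person $p$, $\delta_i$ the difficulty of item $i$, $Z_{pk}$ the value of person-level covariate $k$ (e.g., grade), and $d(i)$ the dimension (inflectional or derivational) to which item $i$ is assigned.

```latex
\begin{align}
% Dichotomous Rasch model
P(X_{pi} = 1 \mid \theta_p, \delta_i)
  &= \frac{\exp(\theta_p - \delta_i)}{1 + \exp(\theta_p - \delta_i)} \\[4pt]
% Latent regression: person ability regressed on person-level covariates
\theta_p &= \beta_0 + \sum_{k} \beta_k Z_{pk} + \varepsilon_p,
  \qquad \varepsilon_p \sim N(0, \sigma^2) \\[4pt]
% Between-items multidimensional Rasch model: each item loads on one dimension
P(X_{pi} = 1 \mid \boldsymbol{\theta}_p, \delta_i)
  &= \frac{\exp(\theta_{p\,d(i)} - \delta_i)}{1 + \exp(\theta_{p\,d(i)} - \delta_i)},
  \qquad \boldsymbol{\theta}_p \sim \mathrm{MVN}(\boldsymbol{\mu}, \boldsymbol{\Sigma})
\end{align}
```

In the third form, the latent correlation between the two dimensions is the off-diagonal element of $\boldsymbol{\Sigma}$ divided by the product of the two dimension standard deviations. A value close to 1 would favor treating inflectional and derivational items as a single dimension.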
To address Research Question 2 (How do linguistic features [suffix type, suffix commonality], cognitive processing demands [number of derivational distractors], and lexical characteristics influence item difficulty?), we conducted item difficulty modeling using multiple linear regression (Ferrara et al., 2022). We examined two primary categories of predictors: (1) linguistic features, including suffix type (inflectional vs. derivational) and suffix commonality (common vs. less common), and (2) cognitive processing demand, reflected in the number of derivational distractors. Lexical characteristics (i.e., age of acquisition, base word phoneme count, prevalence, dispersion, concreteness) were examined as control variables to ensure that variance in item difficulty was attributable to morphological features rather than potentially confounding word-level factors. Transparency was also examined as an exploratory post-hoc variable given prior research highlighting its potential role in morphological processing (Apel et al., 2023), though it was not a primary predictor of interest. We tested these predictors both individually and in combination to determine their relative contributions to item difficulty. This modeling approach allowed us to determine which features most strongly predict item difficulty and whether morphological features contribute unique variance beyond general lexical characteristics.
Given the confounding of suffix type with suffix commonality described above, we employed a two-model approach for Research Question 2. Model 1 examined the relationship between suffix type and item difficulty alongside other predictors, and Model 2 included suffix frequency as a covariate to determine what aspects of the word-type effect remained independent of suffix frequency. This approach allows readers to see both the unadjusted comparison (reflecting natural linguistic patterns) and the suffix-frequency-adjusted comparison (revealing word-type effects beyond frequency). The comparison between these models illuminates whether derivational items show higher difficulty purely as a function of less common suffixes, or whether there are additional factors (such as structural complexity or cognitive processing demands) that contribute to word-type differences in item difficulty.
To address Research Question 3: Does morphological knowledge, as measured by this assessment tool, contribute uniquely to reading comprehension beyond other reading skills?, we conducted external validity analyses using data from the subset of students who completed additional reading measures. Specifically, we examined correlations between ROAR-Morphology and other reading measures (ROAR-Word, ROAR-Sentence, and SBAC-ELA) to establish concurrent validity. We then conducted hierarchical regression analyses to determine whether ROAR-Morphology predicted SBAC-ELA scores beyond what was explained by ROAR-Word (i.e., word reading) and ROAR-Sentence (i.e., sentence reading) measures, thereby testing its discriminant validity and unique contribution to reading outcomes.
Results
Research Question 1
Rasch Model Calibration
The Rasch model analysis provided strong evidence for the technical quality of the ROAR-Morphology assessment. All 37 items demonstrated acceptable fit within the range of .79–1.22 (Wu & Adams, 2013), indicating consistency with the unidimensional measurement model. To further evaluate the absolute fit of the unidimensional Rasch model and ensure no violation of local item independence, we examined the Standardized Root Mean Square Residual (SRMR) and adjusted Yen’s Q3 statistics (aQ3). The global model fit was strong, with an SRMR of 0.060, falling well below the conventional threshold of 0.08 for acceptable fit (Hu & Bentler, 1999; Maydeu-Olivares, 2013). Furthermore, the assumption of local independence was strongly supported. The mean absolute deviation of the adjusted Q3 statistic (MAD aQ3) was 0.04, and the maximum observed aQ3 value between any two items was 0.17. This maximum value is well below the standard threshold of 0.20 (Chen & Thissen, 1997), indicating that no meaningful residual correlations exist between items after accounting for the primary underlying morphological construct. These absolute fit indices provide further justification for retaining the parsimonious unidimensional model. The assessment showed excellent internal consistency with a Coefficient alpha of 0.91 and strong WLE person-separation reliability of 0.84, suggesting the measure can reliably distinguish between different levels of suffix-based morphological knowledge within our sample of 2nd to 5th grade students.
The Wright Map (Figure 3) provides a visual and empirical representation of the construct map, placing both items and persons on the same interval scale measured in logits. On this map, persons appear on the left side and items on the right, with the zero-point anchored at the mean of the sample. Items positioned lower on the scale are easier (more students answer them correctly), while higher-positioned items are more difficult. When a person and item align at the same position on the scale, that person has approximately a 50% probability of answering the item correctly.
Wright Map showing distribution of persons and items on the suffix-based morphological knowledge construct with green items having 2 derivational distractors
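The 50% alignment property follows from the dichotomous Rasch model, in which the probability of a correct response depends only on the difference between person ability and item difficulty. The following is a minimal sketch of that relationship. The function name and the example logit values are illustrative assumptions, not estimates from this study.

```python
import math


def rasch_probability(theta: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    theta      -- person ability in logits
    difficulty -- item difficulty in logits
    """
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


# Hypothetical locations (illustrative only, not estimates from the study).
person = -1.0
easy_item, matched_item, hard_item = -2.0, -1.0, 0.5

for item in (easy_item, matched_item, hard_item):
    p = rasch_probability(person, item)
    print(f"item at {item:+.2f} logits -> P(correct) = {p:.2f}")
```

An item located below the person on the map yields a probability above one half, and an item located above the person yields a probability below one half.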
The Wright Map reveals a deliberate concentration of items at the lower end of the scale, reflecting our assessment design goal of providing greater measurement precision for students along the developing portion of the suffix-based morphological continuum during the critical grades 2–5 window. This targeted item distribution allows for finer discrimination among students at developing levels of suffix-based morphological knowledge, making the assessment particularly valuable for characterizing students at various waypoints. Future iterations of the assessment will include additional items at higher difficulty levels to provide enhanced measurement precision for students with more advanced suffix-based morphological knowledge, creating a more comprehensive learning progression.
Dimensionality Analysis
To evaluate whether inflectional and derivational morphology represent distinct latent dimensions, we estimated a between-items multidimensional Rasch model (Adams et al., 1997) and compared it to a unidimensional Rasch model.
Model comparison statistics indicated that the multidimensional model provided a statistically superior fit (Δχ2(2) = 1554.4, p < .001; ΔBIC = 1541.2). However, the latent correlation between the two dimensions was extremely high (ρ = .95), indicating that approximately 90% of their latent variance is shared. In psychometric research, correlations of this magnitude are commonly interpreted as evidence of functional or essential unidimensionality when the intended score use includes broad assessment of morphological knowledge (e.g., Reckase, 1979; Stout, 1987). To verify this, we conducted a Principal Component Analysis (PCA) of the standardized residuals from the Rasch model (Linacre, 1998). The eigenvalue of the first residual contrast was 1.82, accounting for only 4.9% of the variance. Because this value falls below the established 2.0 threshold, it indicates that the residual variance is unpatterned noise, providing compelling evidence of essential unidimensionality. Given the substantial overlap between dimensions and the PCA results, we retained the unidimensional Rasch model for subsequent analyses. All items across both inflectional and derivational categories showed acceptable fit within the unidimensional model (infit MNSQ range: 0.79–1.22), confirming that items from both categories function cohesively to measure a single underlying construct.
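The shared-variance figure is the square of the latent correlation. As a worked step:

```latex
\rho^{2} = (0.95)^{2} = 0.9025 \approx 90\%
```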
This high latent correlation aligns with our learning progression framing. Inflectional and derivational morphology are not fundamentally different constructs but rather represent earlier and later points along the same continuum of suffix-based morphological knowledge development (Apel & Lawrence, 2011; Berko, 1958; Carlisle & Nomanbhoy, 1993). The stronger reliability of the derivational dimension (.87 vs. .83 for inflectional) is consistent with the greater range of difficulty among derivational items, which span both common and less common suffixes, compared to inflectional items which employ predominantly common suffixes.
Exploratory Post-Hoc Analyses
Transparency Effect
We conducted post-hoc analyses examining whether morphological transparency predicted item difficulty. A one-way ANOVA comparing item difficulty across four transparency categories (transparent, phonological shift, orthographic shift, opaque) revealed no significant differences, F(3, 33) = 0.86, p = .467, η2 = .061. This null finding should be interpreted cautiously given that transparency was not systematically manipulated, item distribution across transparency categories was uneven, and the sentence-based cloze format may have reduced reliance on transparency cues by providing semantic or syntactic cues.
Even when transparency was used as a predictor variable in a linear regression analysis with item difficulty estimates as the dependent variable, there was no significant effect and negligible variance explained. Orthographic transparency (β = −.062, p = .857, R2 = .01) and phonological transparency (β = −.069, p = .860, R2 = .01) were both non-significant.
Establishing Performance Levels
Based on our item difficulty modeling results, we used a combinatorial approach to calculate predicted mean locations for each item type represented on our construct map (Blum et al., 2024). Rather than imposing purely a priori cut scores, this approach allowed us to empirically refine our hypothesized learning progression, establishing four waypoints that correspond to naturally occurring clusters of suffix-based morphological knowledge development (see Figure 1): Initial (−2.26 logits and below; e.g., “higher,” “biggest,” “walking”), Emerging (−2.27 to −1.87 logits; e.g., “worried,” “washable,” “scientist”), Developing (−1.88 to −1.04 logits; e.g., “darkest,” “loyalty,” “ticklish”), and Advancing (−1.05 to 0 logits; e.g., “persuasive,” “vocalize,” “avoidance”).
Students scoring above 0 logits (n = 376, 51.2% of the sample) performed at or above the average item difficulty, demonstrating proficiency with all morphological structures assessed. These empirically refined waypoints provide educators with meaningful thresholds for characterizing students’ suffix-based morphological knowledge along the learning progression. It is important to note, however, that the assessment was deliberately calibrated to provide maximum measurement precision at the lower end of the ability spectrum, where distinctions among developing readers are most critical for informing instruction, a design decision reflected in the Test Information Curve alongside the Wright Map. Future iterations will include additional items at higher difficulty levels to provide enhanced measurement precision across the full learning progression.
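The waypoint ranges can be read as a simple lookup from a person's ability estimate to a performance level. The sketch below is illustrative only. The reported ranges overlap by 0.01 logits at each boundary, so the cut points used here are an assumption, as are the function and variable names.

```python
from bisect import bisect_left

# Upper bounds (in logits) for the four waypoints, taken from the reported
# ranges. ASSUMPTION: the reported ranges overlap by 0.01 logits at each
# boundary, so the exact cut points used here are illustrative.
UPPER_BOUNDS = [-2.26, -1.87, -1.04, 0.0]
LEVELS = ["Initial", "Emerging", "Developing", "Advancing", "Average and Above"]


def performance_level(theta: float) -> str:
    """Map a person ability estimate (logits) to a performance level."""
    # bisect_left counts how many upper bounds lie strictly below theta,
    # so a score exactly on a bound falls in the lower level.
    return LEVELS[bisect_left(UPPER_BOUNDS, theta)]


for theta in (-3.0, -2.0, -1.5, -0.5, 0.8):  # hypothetical ability estimates
    print(f"{theta:+.2f} logits -> {performance_level(theta)}")
```

Under these assumed cut points, a student located exactly at 0 logits is classified as Advancing, and only students above 0 logits fall into Average and Above.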
Figure 4 shows the Wright Map with these empirically derived performance levels, illustrating how items with different morphological knowledge features distribute across the suffix-based morphological knowledge difficulty continuum.
Wright Map with item features, waypoints, and test information curve illustrating the distribution of suffix-based morphological knowledge items across the learning progression
Differential Item Functioning
To examine assessment fairness, we investigated uniform differential item functioning (DIF) between students whose primary language is English and those whose primary language is not English, using a subsample of students for whom primary language data were available from district administrative records (n = 667; 58.2% English primary language, 41.8% non-English primary language). DIF analyses were conducted using the Extended Rasch Models package (eRm; Mair et al., 2021) and evaluated according to ETS DIF criteria (Zwick, Thayer, & Lewis, 1999).
Four of 37 items (10.8%) showed DIF, three at the slight-to-moderate level (Category B) and one at the moderate-to-large level (Category C). Two items were harder for non-English primary language students: freed (difference = 0.80 logits, Category C), which requires knowledge of past tense formation for verbs ending in ee, and collection (difference = 0.52 logits, Category B), which involves a derivational transformation with orthographic-phonological complexity. Two items were harder for English primary language students: editor (difference = −0.60 logits, Category B), which uses the agentive suffix -or rather than the more frequent -er, and countless (difference = −0.58 logits, Category B), which combines an abstract meaning with semantic complexity. The bidirectional pattern of DIF, with no systematic disadvantage for either language group and only 4 of 37 items flagging, supports the overall fairness of the ROAR-Morphology assessment while highlighting specific items that warrant attention in future development.
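The category labels for these four items can be reproduced from the size of the DIF estimate alone. The sketch below assumes logit-scale cut points obtained by dividing the ETS delta-scale thresholds (1.0 and 1.5) by 2.35, a commonly used conversion. It ignores the significance tests that are part of the full ETS rules, and it is not the eRm procedure used in the analysis. The function name is hypothetical.

```python
# ASSUMPTION: logit-scale cut points obtained by dividing the ETS delta-scale
# thresholds (1.0 and 1.5) by 2.35. The full ETS rules also require
# significance tests, which this sketch omits.
B_THRESHOLD = 1.0 / 2.35   # about 0.43 logits
C_THRESHOLD = 1.5 / 2.35   # about 0.64 logits


def ets_dif_category(difference_logits: float) -> str:
    """Classify a uniform DIF estimate (in logits) by effect size only."""
    size = abs(difference_logits)
    if size < B_THRESHOLD:
        return "A"  # negligible
    if size < C_THRESHOLD:
        return "B"  # slight to moderate
    return "C"      # moderate to large


# DIF estimates reported above; positive values indicate items that were
# harder for non-English primary language students.
reported = {"freed": 0.80, "collection": 0.52, "editor": -0.60, "countless": -0.58}
for item, diff in reported.items():
    print(f"{item}: {diff:+.2f} logits -> Category {ets_dif_category(diff)}")
```

Because the classification uses the absolute value of the difference, the sign only indicates which group found the item harder.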
Research Question 2
Item Difficulty Modeling
To validate our hypothesized learning progression of morphological knowledge, we conducted a series of regression analyses examining how various item features predicted Rasch-calibrated item difficulty estimates. These analyses allowed us to empirically test whether the factors we identified in our construct map (e.g., suffix type, suffix commonality, and number of derivational distractors) significantly influenced item difficulty in the predicted directions.
Effects of Primary Morphological Features
Regression Models Predicting Item Difficulty
Note. Bold estimates indicate statistically significant predictors. *p < .05.
Test Design Features
Model 3 (M3) examined whether the number of derivational distractors in the answer choices influenced item difficulty. This test design feature had a significant effect (p = .005), explaining 21.3% of the variance in item difficulty. Items with more derivational distractors were more challenging for students, supporting our hypothesis that distinguishing between multiple morphologically complex forms represents a meaningful aspect of suffix-based morphological knowledge. This finding confirms that cognitive processing demands, specifically the need to differentiate between competing derivational forms, contribute meaningfully to item difficulty beyond the linguistic features of suffix type and suffix commonality alone.
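A single-predictor item difficulty model of this kind regresses Rasch item difficulty on the number of derivational distractors and reports the share of variance explained. The following is a minimal sketch using ordinary least squares. The data are hypothetical and the helper name is an assumption, so the printed values do not correspond to the estimates reported for M3.

```python
def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares; return (intercept, slope, R^2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1.0 - ss_res / ss_tot


# HYPOTHETICAL data: number of derivational distractors per item and
# Rasch item difficulty in logits. Not values from the study.
n_distractors = [0, 0, 1, 1, 1, 2, 2, 2]
difficulty = [-2.4, -1.9, -2.1, -1.3, -1.6, -1.5, -0.6, -1.0]

intercept, slope, r_squared = simple_ols(n_distractors, difficulty)
print(f"slope = {slope:.2f} logits per distractor, R^2 = {r_squared:.3f}")
```

A positive slope corresponds to the pattern described above, in which items with more derivational distractors are more difficult.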
Simplified Morphological Model
Simplified Morphological Model
Note. *p < .05.
Age of Acquisition Effects
Age of Acquisition Model
Note. *p < .05.
Controlling for Base Word Features
Models Controlling for Base Word Features
Note. *p < .05.
Grade Effects
To address whether our developmental waypoints reflect genuine differences in suffix-based morphological knowledge development rather than general grade effects, we examined grade level as a predictor of suffix-based morphological performance. While grade significantly predicted ability estimates (β = .213, p < .05), it explained minimal variance (R2 = 2.4%), indicating that grade level alone does not account for the suffix-based morphological knowledge patterns observed.
Distribution of Students Across Performance Levels by Grade
Note. Percentages within each grade show the distribution of students at each performance level within that grade; row percentages sum to 100%.
As shown in Table 6, students within each grade were distributed across multiple performance levels. For example, Grade 2 students ranged from Waypoint 0 (Initial, 12.8%) through Waypoint 3 (Advancing, 29.9%) to Average and Above (39.3%), demonstrating that second graders showed varied levels of suffix-based morphological knowledge development. Similarly, Grade 3 students spanned all performance levels, with 5.3% at the Initial level and 62.3% at Average and Above. This pattern continued in Grades 4 and 5, where students at the same grade level demonstrated proficiency ranging from Initial to Average and Above.
Conversely, students at each performance level came from multiple grade levels. For example, students at Waypoint 3 (Advancing) included second graders (n = 96), third graders (n = 42), fourth graders (n = 24), and fifth graders (n = 23). This overlap across grades further supports the interpretation that our waypoints capture meaningful differences in suffix-based morphological knowledge development that are not simply a function of age or grade level. These patterns confirm that suffix-based morphological knowledge development follows an individual trajectory that, while generally progressing with age and schooling, varies considerably among students and cannot be reduced to grade-level expectations.
It is notable that over half of the sample (51.2%) performed at or above average, reflecting our deliberate assessment design to provide fine-grained measurement of developing suffix-based morphological knowledge. As discussed in the construct map development, we intentionally calibrated items to provide maximum measurement precision at the lower end of the ability spectrum, where distinctions among developing readers are most critical for informing instruction. Students performing at Average and Above demonstrate proficiency with the suffix-based morphological structures targeted by this assessment, though future iterations will include additional items at higher difficulty levels to better characterize advanced suffix-based morphological knowledge. This design choice aligns with the assessment’s primary purpose: providing detailed information about students’ developing suffix-based morphological knowledge during the critical developmental window in grades 2–5.
Research Question 3
External Validity
Grade Distribution Across Samples
Note. Second graders were not included in the external validity sample because they do not participate in state standardized testing. The DIF sample reflects the subset of students for whom primary language data were available from district administrative records; DIF analyses are reported in the Results section.
Race/Ethnicity Distribution Across Samples
English Learner Status Distribution Across Samples
Note. EL = English learner; EO = English only; IFEP = initially fluent English proficient; RFEP = reclassified fluent English proficient.
Primary Language Distribution Across Samples
Figure 5 presents scatter plots showing the relationships between SBAC-ELA scores and each of the three ROAR measures. As expected, all three ROAR measures were moderately correlated with SBAC-ELA scores (r = 0.57–0.66), with ROAR-Morphology showing the highest correlation (r = 0.66). The correlation matrix (Table 11) shows moderate to strong correlations among all measures, providing evidence for concurrent validity. The finding that ROAR-Morphology showed the strongest correlation with SBAC-ELA scores (r = 0.66) among the three ROAR measures suggests that suffix-based morphological knowledge may be particularly relevant to broader literacy achievement.
Scatter Plots showing Relationships between SBAC-ELA scores and the three ROAR-measures: ROAR-Morphology (A), ROAR-Sentence (B), and ROAR-Word (C) (n = 364 students)
Correlations Among SBAC-ELA and the Three ROAR Measures (Morph, Word, and Sentence)
Note. All correlations are Pearson’s r.
Results From the Multiple Regression Models Predicting SBAC-ELA Scores with Three ROAR Measures
Note. B = unstandardized regression coefficient.
As shown in Table 12, ROAR-Morphology had a positive and statistically significant effect on SBAC-ELA scores even after controlling for the other two ROAR measures (Model M10). Adding ROAR-Morphology to the model increased explanatory power by 7.2 percentage points (from adjusted R2 = 47.5% in M9 to 54.7% in M10), a statistically significant improvement (p < .001). These results provide strong evidence that suffix-based morphological knowledge, as measured by ROAR-Morphology, contributes uniquely to literacy achievement beyond word recognition and sentence reading skills, supporting both the concurrent and discriminant validity of the measure.
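The incremental contribution is the difference between the adjusted R2 values of the two nested models. The sketch below reproduces that arithmetic from the reported values and, separately, shows the standard adjusted R2 formula on hypothetical inputs. The function name and the hypothetical inputs are assumptions and are not taken from the study.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for a model with k predictors fit to n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Adjusted R^2 values reported in the text (as proportions).
adj_r2_m9 = 0.475   # model without ROAR-Morphology
adj_r2_m10 = 0.547  # model with ROAR-Morphology added

gain_points = (adj_r2_m10 - adj_r2_m9) * 100
print(f"Gain from adding ROAR-Morphology: {gain_points:.1f} percentage points")

# HYPOTHETICAL illustration of the adjustment itself: an unadjusted R^2 of
# 0.50 with 3 predictors and 100 observations (not values from the study).
print(f"adjusted R^2 = {adjusted_r2(0.50, n=100, k=3):.3f}")
```

The adjustment penalizes each added predictor, so a gain in adjusted R2 indicates that the new predictor explains more variance than would be expected from adding an uninformative variable.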
Discussion
Our study reports on the development and validation of ROAR Morphology, a brief assessment designed to characterize suffix-based morphological knowledge development in students in grades 2–5 along a learning progression from foundational inflectional morphology with common suffixes to more complex derivational morphology with less common suffixes.
The 31.6% of variance explained by suffix-based morphological features represents a substantial and theoretically meaningful contribution to item difficulty. While this might initially appear modest (Embretson, 1996), this level of variance is expected given the multifaceted nature of reading comprehension in authentic contexts. Because our assessment deliberately taps suffix-based morphological knowledge within a written sentence context, multiple linguistic and cognitive factors simultaneously influence performance. Importantly, when controlling for base word characteristics, suffix-based morphological features (e.g., suffix type and suffix commonality) continued to explain significant variance in item difficulty, confirming that item difficulty reflects morphological knowledge rather than word familiarity.
Our assessment demonstrated strong psychometric properties and predictive validity; however, it represents only one component of the broader morphological awareness construct. Specifically, ROAR Morphology measures suffix-based morphological knowledge within written sentence contexts, focusing on students’ ability to select appropriate suffix-based word forms to complete meaningful sentences. This component is particularly relevant for literacy development, as it captures how suffix-based morphological knowledge is accessed when students encounter morphologically complex words in written academic texts.
However, students may demonstrate uneven development across different components of morphological awareness (Apel et al., 2022). A student who performs well on suffix-based morphological processing tasks might struggle with morphological analysis or synthesis. Future research should examine how performance on ROAR Morphology relates to other components of morphological awareness and whether different components contribute uniquely to reading outcomes. Educational applications should recognize that comprehensive morphological assessment may require multiple measures targeting different components of the construct.
The moderate variance explained by suffix-based morphological features is consistent with our assessment design approach and with theoretical models positioning morphological knowledge as one component within an integrated linguistic system (Kim, 2020; Perfetti & Stafura, 2014). Items that assess morphological knowledge in authentic written sentence contexts will necessarily engage other linguistic and cognitive processes, as they do during actual text reading. A finding of near-total variance explained by morphological features alone would suggest items were artificially isolated from the contextual demands that characterize real reading.
Theoretical Implications
While confirming the general inflectional-to-derivational trajectory (Apel & Lawrence, 2011; Carlisle & Nomanbhoy, 1993), results reveal important gradations with significant assessment and instructional implications. Specifically, the learning progression captured by ROAR Morphology, from foundational inflectional morphology with common suffixes to more complex derivational morphology with less common suffixes, reflects the naturally occurring progression of suffix-based morphological knowledge development during the critical grades 2–5 window. The effect of suffix commonality provides empirical support for usage-based theories of language acquisition (Tomasello, 2003), suggesting that linguistic structures become entrenched through exposure and frequency effects and that this entrenchment follows a predictable developmental sequence from common to less common suffix forms.
Our most novel contribution may be demonstrating that cognitive processing demands play a role in suffix-based morphological knowledge application. The finding that derivational distractors explained 21.3% of variance suggests that selecting among competing morphological forms represents a meaningful aspect of suffix-based morphological knowledge, though whether this primarily reflects tacit morphological knowledge or engages broader cognitive control mechanisms remains an open question. This finding extends prior work on cognitive processing demands in morphological tasks (Schreuder & Baayen, 1995; Sweller, 1988) by demonstrating that distractor composition, specifically the number of competing derivational forms, contributes meaningfully to item difficulty in a written sentence context, beyond the effects of linguistic features alone.
The finding that our assessment primarily taps tacit rather than strategic morphological knowledge processing is also theoretically significant. Real-word tasks presented in meaningful sentence contexts, such as those used in ROAR Morphology, engage students’ internalized understanding of suffix-based morphological patterns rather than requiring conscious morphological analysis (Anglin, 1993; Carlisle, 2010; Nagy et al., 2014). This distinction is important for interpreting what ROAR Morphology reveals about students’ morphological knowledge. It captures automatized suffix-based morphological knowledge as it operates during reading, rather than explicit morphological awareness.
The high latent correlation between inflectional and derivational dimensions (ρ = .95) suggests these represent points on a continuum rather than fundamentally different processes, consistent with our learning progression framing and supporting unified models of morphological processing (Bond & Fox, 2015; Rueckl, 2016; Wilson, 2003). Rather than representing qualitatively distinct constructs, inflectional and derivational suffix knowledge appear to reflect earlier and later points along the same learning progression of suffix-based morphological knowledge, with the stronger reliability of the derivational dimension (.87 vs. .83) reflecting the greater range of difficulty among derivational items. The unique contribution to literacy achievement beyond word recognition and sentence reading (ΔR2 = 7.2%) positions suffix-based morphology as a bridge between word-level and text-level processes, aligning with recent meta-analytic evidence (Liu et al., 2024). This unique contribution is particularly noteworthy given that ROAR Morphology focuses specifically on suffix-based morphological knowledge in written sentence contexts, suggesting that even this focused component of morphological awareness contributes meaningfully to broader literacy achievement beyond other reading skills.
Methodological Contributions
This study demonstrates how construct mapping, psychometric modeling, and practical considerations can yield theoretically sound, practically useful measures. The construct mapping approach (Wilson, 2023) required explicit articulation of developmental waypoints before item creation, ensuring our assessment was theoretically grounded from inception. Specifically, by articulating a learning progression of suffix-based morphological knowledge (from foundational inflectional morphology with common suffixes to more complex derivational morphology with less common suffixes) before item development, the assessment was designed to capture meaningful differences along this continuum rather than simply discriminating between high and low performers. This framework facilitated systematic investigation of what makes suffix-based morphological knowledge difficult for developing learners. We explicitly mapped how linguistic features (suffix type, suffix commonality) and cognitive demands (derivational distractors) were expected to influence item difficulty, then tested these hypotheses through regression analyses.
Beyond psychometric validation, construct mapping bridges assessment development and educational practice. The alignment between hypothesized and empirical item difficulties supports our validity argument (particularly in terms of internal structure), while our empirically validated waypoints provide meaningful descriptions that may inform instructional planning. A particularly important validity finding is that when controlling for base word characteristics (e.g., age of acquisition, phoneme count, concreteness, prevalence, and number of senses), suffix-based morphological features continued to explain significant variance in item difficulty. This demonstrates that item difficulty reflects suffix-based morphological knowledge rather than incidental word-level characteristics, strengthening confidence in the construct validity of the measure. The construct mapping approach has also proven valuable across diverse disciplines (Blum et al., 2020, 2026; Brondfield et al., 2021; Narins et al., 2023).
Assessment Design and Validation
By synthesizing research on suffix type and suffix commonality into one comprehensive framework, we provided a structured approach to understanding suffix-based morphological knowledge on a vertical scale, representing a methodological advancement over assessments that examine these factors in isolation. By systematically controlling sentence structures and target word positioning, we minimized syntax demands that might contaminate suffix-based morphological knowledge measurement.
Through item response analysis, we identified specific features that predict difficulty along the suffix-based morphological knowledge continuum. Our construct map explicitly incorporates test design features like derivational distractors, recognizing these as meaningful construct characteristics rather than methodological artifacts. This is an important methodological contribution: by treating cognitive processing demands as a construct-relevant feature rather than nuisance variance, we were able to demonstrate that the number of competing derivational forms contributes meaningfully to item difficulty beyond morphological structure alone. When examined independently, suffix commonality explained 26.7% of variance in item difficulty (M2) and derivational distractors explained 21.3% (M3), demonstrating that both morphological structure and cognitive processing demands independently predict item difficulty. To further isolate the contribution of morphological features, when controlling for base word features (phonemes, concreteness, senses, prevalence), only suffix-based morphological variables provided significant signal (R2 = 17.3%). Adding test design features increased explanatory power to 25.3% (adjusted R2), demonstrating that morphological features, not base word characteristics, drive item difficulty. The finding that item difficulty reflects the morphological structure of items rather than their lexical properties provides strong evidence for the construct validity of ROAR Morphology as a measure of suffix-based morphological knowledge.
The clear difficulty separation between inflectional words with common suffixes, derivational words with common suffixes, and derivational words with less common suffixes supports our assessment framework’s validity and reveals the hierarchical nature of suffix-based morphological knowledge development along the learning progression. Our empirically derived waypoints align with our hypothesized construct map while adding precision through test design feature integration, suggesting that meaningful suffix-based morphological assessment must consider both linguistic complexity and cognitive processing demands. The convergence between our hypothesized learning progression and the empirically derived mean item difficulty ordering provides particularly strong support for the validity of the construct map as a representation of suffix-based morphological knowledge development during the critical grades 2–5 window.
External Validity and Practical Utility
By demonstrating that suffix-based morphological knowledge contributes unique variance to literacy achievement beyond word recognition and sentence reading, we establish concurrent validity of our measure in addition to discriminant validity and predictive utility. The finding that ROAR-Morphology explains an additional 7.2 percentage points of variance in SBAC-ELA scores beyond other reading measures underscores the importance of including suffix-based morphological assessment in comprehensive reading evaluations. Notably, ROAR-Morphology showed the strongest correlation with SBAC-ELA scores (r = 0.66) among the three ROAR measures, suggesting that suffix-based morphological knowledge may be particularly relevant to broader literacy achievement beyond what is captured by word reading and sentence reading measures alone.
Our initial construct map provided a strong foundation but the regression analyses allowed us to refine our understanding of which factors most significantly impact suffix-based morphological knowledge along the learning progression. This iterative process, central to the construct mapping approach, ensures that assessments remain responsive to empirical evidence while maintaining theoretical coherence. The success of this approach suggests that construct mapping could address similar challenges in other areas of literacy assessment by articulating distinct waypoints in the development of literacy skills and systematically varying item features to target these waypoints. For ROAR Morphology specifically, the alignment between the hypothesized construct map and the empirical item difficulty ordering, combined with the unique contribution to literacy achievement, provides a strong validity argument for the assessment’s use in characterizing students’ suffix-based morphological knowledge development during the critical grades 2–5 window.
Educational Applications
ROAR Morphology’s empirically validated waypoints translate directly into practical benchmarks for educational settings, providing educators with meaningful information about where students fall along the suffix-based morphological knowledge learning progression to guide word study and morphological instruction in comprehensive literacy classrooms. The substantial within-grade variation in suffix-based morphological knowledge levels (see Table 6) suggests that this knowledge reflects genuine linguistic differences rather than simply chronological age, supporting the utility of waypoint-based differentiation within a single grade level.
It is important to note that ROAR Morphology provides information about one specific component of morphological awareness rather than a comprehensive assessment of morphological competence (Apel et al., 2022). Educators should interpret waypoint classifications as useful indicators of students’ suffix-based morphological knowledge development that may inform instructional planning and identify students who could benefit from further assessment, rather than as diagnostic determinations of morphological difficulty.
Exploratory Findings: Transparency
Because prior research has highlighted the role of morphological transparency in processing difficulty (Apel et al., 2023), we conducted post-hoc analyses examining whether transparency predicted item difficulty in our sentence-based assessment. Transparency was not a primary focus of item development and was not systematically manipulated; its distribution reflected naturally occurring patterns of English morphology. Results indicated that transparency did not significantly predict item difficulty. Several factors likely limited our ability to detect transparency effects: items were not systematically varied for transparency, the distribution across transparency categories was uneven, and the sentence-based cloze format may have reduced reliance on transparency cues by providing rich semantic and syntactic context. Notably, our task format is similar to Task 4 in Apel et al. (2023), who did find transparency effects in a comparable format, suggesting that our null finding may reflect the specific constraints of our post-hoc approach rather than a true absence of transparency effects in sentence-based morphological tasks. Future research should systematically examine transparency effects in written sentence contexts, varying transparency as a primary design feature to provide more definitive evidence.
Limitations and Future Directions
Several important limitations should be considered when interpreting results and planning instructional applications. ROAR Morphology measures suffix-based morphological knowledge in a specific context: students’ ability to select appropriately suffixed word forms to complete meaningful sentences in a written format. The assessment focuses on the receptive/recognition aspects of suffix-based morphological knowledge demonstrated through a multiple-choice format, primarily tapping automatized suffix-based morphological knowledge as it operates during reading rather than strategic morphological processing or explicit morphological awareness (Anglin, 1993; Carlisle, 2010; Nagy et al., 2014). As such, ROAR Morphology does not measure all components of morphological awareness (e.g., morphological decomposition, semantic morphological problem-solving, morphological awareness in non-contextualized formats, or explicit awareness of morphological relationships and patterns). A student who performs well on ROAR Morphology might struggle with other components of morphological awareness, and a student who struggles on ROAR Morphology might demonstrate strengths in other morphological components (Apel et al., 2022). Comprehensive evaluation of students’ morphological awareness may therefore require multiple measures targeting different components of the construct.
Additionally, the assessment focuses exclusively on suffix-based morphological knowledge, omitting prefixes, compound words, bound bases, and more complex morphological structures. This design decision reflected our goal of creating a brief, developmentally appropriate assessment for grades 2–5 that captures the learning progression from foundational inflectional morphology with common suffixes to more complex derivational morphology with less common suffixes. However, it limits generalizability to the full scope of morphological awareness development. Future research should examine how suffix-based morphological knowledge relates to other morphological structures and whether similar developmental progressions characterize prefix knowledge, compound word knowledge, and more complex morphological structures.
Further, as a brief (<10 minute) assessment administered in a single session, ROAR Morphology provides a snapshot of students’ developing suffix-based morphological knowledge rather than a comprehensive, in-depth assessment of morphological knowledge. The assessment was deliberately designed with items concentrated at the lower end of the ability spectrum to provide fine-grained measurement of students at developing levels of suffix-based morphological knowledge, which may result in ceiling effects for students with advanced suffix-based morphological knowledge. Future iterations of the assessment will address this limitation by including additional items at higher difficulty levels to provide enhanced measurement precision across the full learning progression.
The sample also presents limitations, as the assessment was developed and validated with a convenience sample from Northern California that differs from national demographics. Specifically, our sample overrepresented English Learners (14.5% vs. 10.6% nationally) and Asian students (31.8% vs. 5.4% nationally), and underrepresented Black students (0.4% vs. 14.9% nationally) and White students (26.9% vs. 44.6% nationally). While the overrepresentation of English Learners in our sample is noteworthy given prior research suggesting morphological awareness may develop differently for multilingual learners (Goodwin & Ahn, 2013; Mendes & Kirby, 2024), it also reflects the linguistic diversity of the Northern California context in which the assessment was developed. Generalizability to other populations should be examined in future research, particularly with more demographically representative samples and with students from different linguistic backgrounds. Additionally, because the sample was one of convenience, information on disability status was not systematically collected. Consistent with population prevalence estimates for dyslexia (Wagner et al., 2020), students with dyslexia were likely represented in the sample; however, we were unable to verify their proportion or whether students with identified disabilities were over- or underrepresented. Future research should examine ROAR Morphology performance specifically in students with dyslexia, developmental language disorder, and other reading-related challenges, as morphological knowledge profiles may differ meaningfully across these populations.
Finally, the cross-sectional design of this study leaves questions about developmental trajectories for longitudinal investigation. While the substantial within-grade variation and cross-grade overlap in performance levels support our interpretation that the waypoints capture genuine developmental differences in suffix-based morphological knowledge rather than grade-related maturation, longitudinal research is needed to confirm the developmental sequence and examine how waypoint progression relates to broader literacy development over time.
These limitations point to several promising directions for future research. Validation studies with more demographically representative samples and with specific populations—including students with dyslexia and multilingual learners—would strengthen confidence in the assessment’s utility across diverse educational settings. Intervention studies examining whether instruction aligned with students’ waypoint classifications improves suffix-based morphological knowledge and broader literacy outcomes would provide important evidence for the assessment’s educational utility. Research examining how suffix-based morphological knowledge, as measured by ROAR Morphology, relates to other components of morphological awareness would advance our understanding of the construct’s multidimensional nature and help clarify the relative contributions of different morphological components to literacy development (Apel et al., 2022). Additionally, research examining relationships between suffix-based morphological knowledge and other language skills—including phonological awareness, working memory, and syntactic awareness—using person regression models (Van den Noortgate & Paek, 2004) would inform more comprehensive models of literacy development.
Conclusion
ROAR Morphology fills a critical gap in classroom-based assessment of suffix-based morphological knowledge through a theoretically grounded, empirically validated measure. The significant and unique contribution to literacy achievement beyond word recognition and sentence reading (ΔR2 = 7.2%, p < .001) underscores the importance of including suffix-based morphological assessment in comprehensive literacy evaluations. With administration under 10 minutes, ROAR Morphology addresses practical barriers that have limited morphological assessment in classroom settings. The empirically derived developmental waypoints, representing a learning progression from foundational inflectional morphology with common suffixes to more complex derivational morphology with less common suffixes, provide educators with meaningful information about where students fall along the suffix-based morphological knowledge continuum during the critical grades 2–5 window. This focused scope enables efficient whole-class administration and automatic scoring, with particular precision at developing levels where instructional support may be most needed. As academic literacy demands increase during the elementary years, ROAR Morphology offers educators and researchers a validated, efficient tool for characterizing suffix-based morphological knowledge development at a critical point in the literacy acquisition trajectory.
Footnotes
Acknowledgments
We thank the teachers, administrators, and students in the participating school districts for their collaboration in this research.
Ethical Considerations
This research was approved by Stanford University’s research compliance office.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by AERDF.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
Data supporting the conclusions of this article are available upon reasonable request to the corresponding author.
