What Gets Measured? A Systematic Review of Assessment Practices in Spelling Intervention Research

Abstract

Spelling is a foundational literacy skill that supports both word reading and written expression. For students with or at risk of a learning disability (LD), difficulties in spelling often constrain the fluency and complexity of writing, making effective interventions essential. Yet, the conclusions drawn about intervention efficacy depend heavily on how outcomes are measured. This review synthesizes outcome measurement practices across 59 spelling intervention studies conducted over the past five decades. All outcome measures (n = 228) were coded by type (researcher-developed vs. norm-referenced) and by linguistic level (sublexical, lexical, sentence, discourse) using the Interactive Dynamic Literacy (IDL) framework. Descriptive analyses revealed that nearly three quarters of outcomes were lexical, most often researcher-developed lexical-level spelling probes, with comparatively few outcomes at the sentence or discourse levels. Standardized assessments were similarly concentrated at the word level, with the Wide Range Achievement Test–Spelling subtest and Test of Written Spelling most commonly used. Finally, the pairing of proximal and standardized outcomes was inconsistent, particularly among group designs. Taken together, findings highlight a measurement bottleneck: spelling interventions are evaluated primarily through lexical-level accuracy, offering limited insight into whether gains transfer to higher-level writing processes for students with or at risk for LD.

Keywords

spelling, learning disabilities, outcome measurement

Despite the availability of spell-check and predictive text, spelling remains a critical academic skill that underpins literacy development. Accurate and automatic spelling reduces transcriptional demands, which allows students to devote more cognitive resources to planning, idea generation, and organization in writing (Berninger et al., 2002; Graham et al., 2016). Well-specified word forms also contribute to efficient word recognition, linking spelling to broader literacy outcomes (Ehri, 2000). For students with or at risk for learning disabilities (LD), difficulties in spelling often constrain the length and complexity of their written products, which underscores the importance of effective spelling instruction and assessment (Graham, 1997). Recent meta-analytic evidence indicates that spelling interventions produce meaningful gains, with moderate effects on both spelling (g = 0.33) and word reading (g = 0.28) for students with or at risk of LD (Chandler et al., 2025). However, questions remain about how spelling performance is assessed. The present review addresses this issue by examining how outcome measures in spelling intervention research capture different levels of language and the extent to which current assessment practices reflect models of literacy development.

Theoretical Foundations to Guide Spelling Assessment

Spelling is best understood not as a single skill but as an integration of phonological, orthographic, and morphological knowledge (Apel & Masterson, 2011). From this perspective, assessment should move beyond global accuracy scores (a simple sum of the number or percentage of words spelled correctly) to capture the linguistic strategies students use when spelling. A global score does not reveal why an error occurred: whether the student can represent phonemes with graphemes, apply orthographic conventions, or use morphological relationships to generate correct spellings. Theory-aligned assessment is important, not only for documenting growth, but also for guiding instruction: educators need to know which linguistic components are changing in order to target instruction effectively. This approach aligns with broader models of literacy development.

The interactive dynamic literacy (IDL) model (Kim, 2020), for example, organizes literacy according to linguistic grain size—sublexical, lexical, sentence, and discourse—and posits hierarchical, interactive, and dynamic relations among levels. Within this model, spelling is situated at the lexical level, constrained by sublexical processes (e.g., phoneme–grapheme mapping, handwriting fluency) and expected to exert upward influence on sentence fluency, discourse-level writing, and reading comprehension. Critically, the IDL underscores that observation of these relations depends on how outcomes are measured—different task formats and scoring dimensions can differentially tap component skills. To test this predicted cascade, intervention studies require outcome measures that are both theory-guided and transfer-sensitive (sentence and discourse-level writing). In this way, theory-driven assessment becomes the bridge between intervention design and instructional decision-making. It tells us not only whether interventions improve spelling of taught sublexical skills or words, but whether those improvements generalize into broader literacy outcomes.

The Challenge of Skill Transfer for Students With LD

For students with LD, the issue of generalization and transfer is especially relevant. Decades of research in special education have documented that this group of learners often demonstrates improvement on skills taught in isolation but struggles to apply those skills flexibly across tasks, contexts, or levels of language (Gersten et al., 2000; Swanson, 2014). In spelling, this means that gains observed on proximal tasks, such as researcher-developed word lists, may not translate into improved sentence construction or discourse-level writing. Insights from learning sciences similarly emphasize that transfer requires instruction and assessment conditions that mirror the complexity of authentic tasks (Bransford & Schwartz, 1999). In other words, if the goal is for students to apply spelling knowledge in sentence construction, content-area writing, or extended composition, then our outcome measures must likewise move beyond isolated word lists to capture performance within connected, meaningful writing tasks. If outcome measurement in intervention studies remains narrowly focused on lexical-level accuracy, we risk overstating the practical impact of interventions for learners who need the most support. For these students, theory-guided and transfer-sensitive assessments are essential to determine not just whether spelling instruction works, but whether it meaningfully improves their capacity to produce fluent, accurate, and complex writing.

Spelling Outcome Measurement

Despite nearly five decades of spelling intervention research, we know relatively little about whether spelling interventions produce gains that generalize beyond the tasks on which students are directly taught and tested. Most studies report outcomes on proximal lexical-level measures (e.g., Bazis et al., 2022; Darch et al., 2000; Graham & Freeman, 1986), which are sensitive to instruction but cannot establish whether improvements transfer to broader literacy tasks such as sentence construction or discourse-level writing. Standardized assessments, when used, are similarly concentrated at the word level and rarely capture writing performance (e.g., Ehri et al., 2009; D. C. Simmons et al., 2007). Moreover, it remains unclear how often researchers combine proximal and standardized outcomes to evaluate both sensitivity and generalizability. This narrow approach to outcome measurement creates a critical blind spot in spelling intervention research. Positive intervention effects may reflect student performance on tightly aligned lexical-level tasks, but do these interventions support broader skill development and improvements in the authentic writing abilities that matter most for students with LD?

Purpose of the Present Review

The purpose of this review is to examine how outcome measures have been used in spelling intervention research over the past five decades. Specifically, we ask: What are the characteristics of outcome measures used in spelling intervention studies? To address this question, we consider (a) the overall landscape of assessment practices; (b) the standardized, norm-referenced assessments most frequently used across this body of research; and (c) the extent to which studies report both proximal and standardized measures to evaluate intervention effects. We situate this review within Apel and Masterson’s (2011) call for theory-guided spelling assessment and Kim’s (2020) IDL model, recognizing that assessments must reflect the multidimensional nature of spelling and capture how improvements in spelling transfer across multiple levels of language.

In this manuscript, we use the terms sublexical and lexical to refer to sub-word-level and word-level processes, respectively, consistent with the terminology of the IDL model. When discussing outcome measures, these terms describe the linguistic grain size targeted by an assessment rather than the format or complexity of the task.

Method

The present review reports a secondary descriptive analysis of outcome measurement practices drawn from a previously published meta-analysis of spelling interventions for students with or at risk for LD (Chandler et al., 2025). That meta-analysis synthesized intervention effects across spelling, reading, and writing outcomes. In contrast, the current manuscript focuses specifically on how outcomes were measured, including the types of assessments used and the linguistic levels they targeted. No new studies were identified for the present analysis, and no additional data extraction beyond outcome measurement characteristics was conducted.

Literature Search and Study Identification Procedures

The dataset for this review was derived from a comprehensive systematic search conducted for the original meta-analysis. Studies were identified through electronic database searches of PsycINFO, ERIC, Academic Search Complete, and Education Source for research (including dissertations and theses) published in English between 1975 and December 2022. This date range was selected to capture all spelling intervention research conducted since the passage of the Education for All Handicapped Children Act of 1975 (now the Individuals with Disabilities Education Act [IDEA]).

Search terms were developed with assistance from a university-based librarian. Four electronic databases (i.e., PsycINFO, ERIC, Academic Search Complete, Education Source) were used with the following search terms: spelling OR orthograph* AND “learning disab*” OR “learning disorder*” OR “learning diff*” OR “reading disability*” OR “reading disorder*” OR “reading diff*” OR dyslexi* OR (read* n1 strugg*) OR (read* n1 slow) OR (read* n1 delay*) OR “writing disabil*” OR “writing disorder*” OR “writing diff*” OR dysgraphia OR “special needs” OR “students with disabilities” OR “mild handicap*” OR “poor spellers” AND intervention OR strategy OR strategies OR treatment OR approach OR supplemental OR “pull out” OR “small group*” OR remedia* OR differentiat* OR program OR curricul* OR lesson OR teaching method OR instruction OR training. This initial search, conducted in January of 2023, returned 3,916 results.
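As an illustration only, the search string above can be read as three concept blocks (skill, population, instruction) that are OR-ed internally and AND-ed together. That grouping is an assumption about how the string was entered, since the reported string does not show parentheses around each block. The sketch below uses a small subset of the listed terms, and the helper names `or_block` and `build_query` are hypothetical.

```python
def or_block(terms):
    """Join the terms of one concept block with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(*blocks):
    """Combine concept blocks with AND (assumed grouping, not confirmed by the source)."""
    return " AND ".join(or_block(block) for block in blocks)


# Subsets of the reported terms, chosen only to keep the example short.
skill_terms = ["spelling", "orthograph*"]
population_terms = ['"learning disab*"', "dyslexi*", '"poor spellers"']
instruction_terms = ["intervention", "remedia*", "instruction"]

query = build_query(skill_terms, population_terms, instruction_terms)
print(query)
```

Writing the blocks explicitly makes the intended operator precedence visible, which matters because databases differ in how they resolve unparenthesized AND/OR sequences.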

Screening and Eligibility Criteria

After removal of duplicate records (n = 926), 3,012 abstracts were screened using the Covidence systematic review platform. Abstract screening was completed independently by the first author and a doctoral student, with 95% agreement and discrepancies resolved through discussion. Studies were eligible for inclusion if they met the following criteria:

At least 50% of the participants were K-12 students (i.e., ages 5–18) with or at risk of LD. Students with or at risk of LD had to be identified either through researcher-administered screening procedures or school-based identification procedures (e.g., identified as a student with a disability under IDEA, having an Individualized Education Program [IEP]).

The study reported at least one primary outcome related to spelling or reading (e.g., word reading, oral reading fluency, orthographic learning).

The intervention had to primarily target spelling proficiency, with at least 50% of the activities dedicated to spelling instruction or practice in at least one treatment condition. This criterion was assessed based on dosage, as reported by the authors in terms of time allocated to spelling tasks.

The intervention was delivered in English.

The study employed a randomized controlled trial (RCT), quasi-experimental design (QED), or single case design (SCD). We defined an RCT as any study in which participants were randomly assigned to one of at least two groups, with at least one receiving an intervention and one serving as a contrasting comparison condition (i.e., control). A QED was defined as any study comparing outcomes between at least one group of participants who received an intervention and at least one group serving as a contrasting control condition; however, group membership did not need to be determined through random assignment. For an SCD to be included, it had to have at least three opportunities for a demonstration of intervention effect at a minimum of three different points in time between adjacent conditions. SCDs that were accepted included withdrawal/reversal designs (i.e., ABAB) and multiple baseline or probe designs. SCDs with fewer than three demonstrations of effect (e.g., AB, ABA designs) were not included. Alternating treatment, adapted alternating treatment, changing criterion, and multielement SCDs were excluded unless there was a true baseline prior to intervention and a control condition throughout the experiment. Without such features, these designs cannot yet be used to calculate SCD effect sizes.

The study included sufficient data to calculate effect sizes (e.g., means and standard deviations, F, or t values).

Backward and forward reference searches, ancestral reviews of prior syntheses, and targeted hand searches of prominent special education and literacy journals were also conducted as part of the original review process.

Full-Text Review

We retrieved full-text articles for all studies deemed eligible following abstract screening (n = 229). Five studies could not be retrieved despite attempts to contact authors, resulting in 224 studies advancing to full-text review; all unretrievable studies were dissertations. Prior to full-text review, author team members completed a 2-hour training focused on inclusion criteria and sample studies. Each article was then independently reviewed by two team members.

To promote consistency and reliability, the team met after reviewing the first 20 articles and again after the next 100 articles to resolve discrepancies and refine inclusion criteria language. These refinements clarified what constituted an explicit spelling intervention and distinctions among intervention types, without altering substantive eligibility criteria. Following full-text review, disagreements were resolved through discussion until consensus was reached. Interrater reliability at this stage was high (93%).

Of the studies reviewed, 165 were excluded for failing to meet one or more eligibility criteria, including ineligible study design (n = 66), fewer than 50% of participants identified with or at risk for LD (n = 37), intervention delivered in a language other than English (n = 22), intervention not primarily focused on spelling (n = 20), insufficient data for effect size calculation (n = 10), wrong publication type (n = 6), publication in a non-English language (n = 2), or absence of a relevant spelling or reading outcome (n = 2). A total of 59 studies met all inclusion criteria and were retained for analysis.
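The study-flow arithmetic reported above can be checked with a few lines of code. This is a minimal sketch that uses only the counts stated in the text; the dictionary keys paraphrase the exclusion reasons and the variable names are illustrative.

```python
# Counts taken from the full-text review description above.
full_text_sought = 229
not_retrieved = 5

# Exclusion reasons at full-text review (labels paraphrase the text).
exclusions = {
    "ineligible study design": 66,
    "fewer than 50% of participants with or at risk for LD": 37,
    "intervention not delivered in English": 22,
    "intervention not primarily focused on spelling": 20,
    "insufficient data for effect size calculation": 10,
    "wrong publication type": 6,
    "published in a non-English language": 2,
    "no relevant spelling or reading outcome": 2,
}

full_text_reviewed = full_text_sought - not_retrieved
total_excluded = sum(exclusions.values())
included = full_text_reviewed - total_excluded

print(full_text_reviewed, total_excluded, included)  # 224 165 59
```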

Outcome Measure Coding for the Present Review

All outcome measures reported in the 59 included studies were coded using a structured codebook developed for the original meta-analysis. For the purposes of the present review, coding focused specifically on measurement characteristics, including type of measure and linguistic level assessed. First, outcomes were categorized by type of measure as either researcher-developed (proximal) assessments or norm-referenced (standardized) assessments with published reliability and validity evidence. When a study included both types, outcomes were coded accordingly. Second, outcomes were coded by linguistic level, guided by the IDL model (Kim, 2020). Sublexical outcomes included measures of phonological awareness, sound–spelling correspondences, orthographic knowledge, handwriting, or morphological awareness. Lexical outcomes captured word-level spelling, word reading, pseudoword decoding, and vocabulary. Sentence-level outcomes required application of spelling within connected text (e.g., sentence dictation or sentence writing), and discourse-level outcomes included paragraph- or passage-level writing, oral reading fluency, and reading comprehension.
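The two coding dimensions described above (type of measure and linguistic level) can be sketched as a small lookup structure. This is an illustration under stated assumptions rather than the codebook itself: the construct labels paraphrase the examples in the paragraph, and the names `Level`, `Outcome`, `CONSTRUCT_TO_LEVEL`, and `code_outcome`, as well as the example study, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    SUBLEXICAL = "sublexical"
    LEXICAL = "lexical"
    SENTENCE = "sentence"
    DISCOURSE = "discourse"


# Construct-to-level mapping paraphrased from the coding description above.
CONSTRUCT_TO_LEVEL = {
    "phonological awareness": Level.SUBLEXICAL,
    "sound-spelling correspondences": Level.SUBLEXICAL,
    "orthographic knowledge": Level.SUBLEXICAL,
    "handwriting": Level.SUBLEXICAL,
    "morphological awareness": Level.SUBLEXICAL,
    "word-level spelling": Level.LEXICAL,
    "word reading": Level.LEXICAL,
    "pseudoword decoding": Level.LEXICAL,
    "vocabulary": Level.LEXICAL,
    "sentence dictation": Level.SENTENCE,
    "sentence writing": Level.SENTENCE,
    "paragraph or passage writing": Level.DISCOURSE,
    "oral reading fluency": Level.DISCOURSE,
    "reading comprehension": Level.DISCOURSE,
}


@dataclass(frozen=True)
class Outcome:
    study: str
    construct: str
    norm_referenced: bool  # False means researcher-developed (proximal)


def code_outcome(outcome):
    """Return (type of measure, linguistic level) for one outcome measure."""
    measure_type = "norm-referenced" if outcome.norm_referenced else "researcher-developed"
    level = CONSTRUCT_TO_LEVEL[outcome.construct]  # KeyError flags an uncoded construct
    return measure_type, level.value


# Hypothetical outcome, not drawn from any included study.
example = Outcome("Hypothetical Study A", "sentence dictation", norm_referenced=True)
print(code_outcome(example))  # ('norm-referenced', 'sentence')
```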

Coding was completed by trained doctoral students and the first author. All studies were double coded, and discrepancies were resolved through discussion until consensus was reached. Interrater reliability for measurement-related variables was high, with mean agreement of 97%, consistent with reliability reported in the original meta-analysis.
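The text reports agreement as a percentage but does not state the formula. One common choice, assumed here purely for illustration, is point-by-point percent agreement: the number of items both coders coded identically divided by the total number of items. The codes below are hypothetical and the function name `percent_agreement` is illustrative.

```python
def percent_agreement(coder_a, coder_b):
    """Point-by-point agreement: matching codes / total codes, as a percentage."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100 * matches / len(coder_a)


# Hypothetical linguistic-level codes for ten outcome measures.
coder_a = ["lexical", "lexical", "sublexical", "lexical", "discourse",
           "lexical", "sentence", "lexical", "sublexical", "lexical"]
coder_b = ["lexical", "lexical", "sublexical", "lexical", "discourse",
           "lexical", "lexical", "lexical", "sublexical", "lexical"]

print(percent_agreement(coder_a, coder_b))  # 90.0
```

Percent agreement does not correct for chance agreement; a chance-corrected index such as Cohen's kappa would be an alternative, although the source does not indicate that one was used.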

Data Analytic Approach

To explore the landscape of assessment practices in spelling intervention research, we summarize the distribution of outcome measures by type of measure (i.e., researcher-developed, norm-referenced, or both) and linguistic level (i.e., sublexical, lexical, sentence, discourse). We also examine patterns by study design—group versus SCD studies—given their differing traditions of outcome measurement. In addition, we catalog the standardized assessments most frequently used across studies. Finally, we consider the alignment of outcomes with theoretical expectations from the IDL model, with particular attention to whether outcome batteries captured potential transfer from lexical-level spelling to sentence- and discourse-level written expression.

Results

Across 59 studies, a total of 228 outcome measures were coded. Table 1 summarizes the measure characteristics by study, and Table 2 summarizes the distribution by type of measure and linguistic level. The vast majority of outcomes (n = 164; 71.9%) were at the lexical level, such as word-level spelling accuracy, word identification, vocabulary, and pseudoword decoding tasks. Most of these measures were researcher-developed (n = 112; 68.3%), and fewer were norm-referenced standardized tools (n = 52; 31.7%). Outcomes at the sublexical level (e.g., phonological awareness, orthographic knowledge, handwriting, morphological awareness) were less common (n = 38; 16.7%). Within this small set, researcher-developed measures (n = 25; 65.8%) outnumbered standardized measures (n = 13; 34.2%).

Table 1.

Spelling Outcome Measure Characteristics.

Study	Sublexical					Lexical							Sentence	Discourse
Study	Phonological awareness	Morphological awareness	Orthographic knowledge	Handwriting	Sound spelling	Decoding	Word identification	Oral reading fluency	Vocabulary	Orthographic knowledge	Handwriting	Spelling	Sentence-level Writing	Reading comprehension	Discourse-level writing
Group Design
Allen & Lembke (2022)					◽							•	•		◽
Bazis et al. (2022)				◽							◽	⦿
Berninger et al. (1995)										◽	◽
Berninger et al. (2013)					•	•						•
Berninger et al. (1998)												⦿
Berninger et al. (2000)	◽											•
Berninger et al. (2002)				◽
Berninger et al. (2008)						•						•
Bryant et al. (1981)												◽
Cassar & Jang (2010)	•						•					•
Clark Draper (1990)												•
Colenbrander et al. (2022)						⦿	⦿		◽			◽		•
Darch et al. (2006)												⦿
Darch et al. (2000)												◽
Darch & Simpson (1990)												⦿
Davis (2000)						◽						◽
Ehri et al. (2009)						◽	•					◽		•
Englert et al. (1985)							◽					◽
Fuchs et al. (2006)					◽		•	•				◽
Fulk (1996)												◽
Georgiou et al. (2021)		◽				◽	•
Gettinger et al. (1982)												◽
Gordon (1992)					◽							◽
Graham & Freeman (1986)												◽
Graham et al. (2018)				◽					◽		•	⦿	•		◽
Graham et al. (2002)					◽	•	•					•	•	◽
Kirk & Gillon (2009)						•	⦿					⦿
Lee & Scanlon (2015)	◽					◽	◽					◽
MacArthur et al. (1990)												◽
Matlock (1998)						◽	◽		◽
O’Connor & Jenkins (1995)	◽					◽	◽					◽
D. C. Simmons et al. (2007)			◽			•						◽
K. D. Simmons (2007)												⦿
Spencer et al. (1989)												•
Telzer (1993)												◽
Trainin et al. (2014)	•							◽
Waddel (1998)												◽
Wanzek et al. (2017)				◽	◽						◽	⦿			◽
Wolter & Dilworth (2014)						•	•					⦿		•
Single-Case Design
Aguirre & Rehfeldt (2015)												◽
Alber-Morgan et al. (2016)							◽					◽
Aleman et al. (1990)												◽
Bazis et al. (2022)							◽					◽
Brown (1995)												◽
Edwards et al. (1995)					◽
Frank et al. (1987)												◽
Gettinger (1993)												◽
Kinney et al. (2013)												◽
Hughes et al. (2002)												◽
Joseph (1999)							◽					◽
Keesey et al. (2015)	◽											◽
Kubina et al. (2004)												◽
McCallum et al. (2013)					◽							◽
Murphy et al. (1990)												◽
Ott (2019)							◽					◽
Ross & Stevens (2003)												◽
Stevens & Schuster (1987)												◽
Taylor & Alber (2003)												◽
Telecsan et al. (1999)												◽

Note. • = Norm-referenced measure; ◽ = Researcher-created measure; ⦿ = Both types of measures administered.

Table 2.

Distribution of Spelling Outcome Measures Across Type and Linguistic Levels.

Level of language	Researcher-developed	Norm-referenced	Total #	% of all measures
Sublexical	25 (65.8%)	13 (34.2%)	38	16.7%
Lexical	112 (68.3%)	52 (31.7%)	164	71.9%
Sentence	1 (16.7%)	5 (83.3%)	6	2.6%
Discourse	14 (70.0%)	6 (30.0%)	20	8.8%
Overall	152 (66.7%)	76 (33.3%)	228	100%

Note. Percentages in parentheses are row percentages: the proportion of measures at each level of language that were researcher-developed or norm-referenced. “% of all measures” reflects the proportion of the total 228 coded outcome measures across all 59 studies.
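The percentages in Table 2 follow directly from the cell counts. The sketch below recomputes them from the counts in the table; the variable names and the `pct` helper are illustrative, and rounding to one decimal place is assumed to match the table.

```python
# (researcher-developed, norm-referenced) counts per level, from Table 2.
counts = {
    "Sublexical": (25, 13),
    "Lexical": (112, 52),
    "Sentence": (1, 5),
    "Discourse": (14, 6),
}

grand_total = sum(rd + nr for rd, nr in counts.values())


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


rows = {}
for level, (rd, nr) in counts.items():
    level_total = rd + nr
    rows[level] = {
        "researcher_pct": pct(rd, level_total),       # row percentage
        "norm_pct": pct(nr, level_total),             # row percentage
        "total": level_total,
        "share_of_all": pct(level_total, grand_total),
    }

for level, row in rows.items():
    print(level, row)
print("Grand total:", grand_total)  # Grand total: 228
```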

Assessment of sentence-level outcomes was particularly limited. Only six measures were identified (2.6% of all outcomes), with five norm-referenced tasks (83.3%) and one researcher-developed measure (16.7%). Similarly, discourse-level outcomes were scarce (n = 20; 8.8%). These included paragraph or passage writing tasks, oral reading fluency, and reading comprehension measures. Most of these measures were researcher-developed (n = 14; 70%), with six standardized (30%).

Patterns differed somewhat by study design. Group design studies (n = 39) more frequently included standardized assessments alongside proximal measures, particularly at the lexical level. In contrast, SCD studies (n = 20) relied almost exclusively on proximal outcomes, most often researcher-developed spelling probes administered repeatedly to capture functional relations over time. Although this approach is consistent with the methodological traditions of SCD research, it further reinforces the overall pattern: outcomes remain concentrated at the word level, and few studies—regardless of design—assess sentence- or discourse-level writing.

In addition to the distribution of measures by linguistic level, we examined the types of proximal outcome measures reported. Across studies, proximal measures were most often researcher-developed word-level spelling probes, typically scored as percent correct on dictated word lists, often distinguishing taught from untaught words. Sublexical tasks, such as phoneme–grapheme mapping or orthographic choice, were less common and only a handful of studies incorporated sentence dictation or sentence-writing tasks. A small subset of studies included discourse-level proximal outcomes, such as expository writing samples, though these were rare. Overall, proximal measures were dominated by lexical-level spelling probes, reinforcing the pattern that most outcome measurement remains narrowly focused on lexical-level accuracy. In contrast, standardized assessments were more varied by instrument name but similarly concentrated at the lexical level, with only limited coverage of sentence- and discourse-level writing.

Standardized Assessments Used in Spelling Intervention Research

Across the 59 studies, 16 unique norm-referenced assessments were identified, yielding 76 outcome measures. As shown in Table 2, the use of standardized assessments was heavily concentrated at the lexical level (n = 52; 68.4%), with far fewer measures at the sublexical (n = 13; 17.1%), sentence (n = 5; 6.6%), or discourse levels (n = 6; 7.9%).

As represented in Table 3, standardized assessments were distributed across a range of subtests rather than concentrated in a single measure. The most commonly used standardized spelling assessments were the Wide Range Achievement Test–Spelling subtest (WRAT Spelling), the Wechsler Individual Achievement Test (WIAT) Spelling, and the Test of Written Spelling (TWS). Word reading was also commonly assessed using standardized, norm-referenced measures. The most frequent were the Woodcock Reading Mastery Tests (WRMT) Word Identification and Word Attack subtests, and the Test of Word Reading Efficiency (TOWRE). Sublexical skills were most often measured using the Comprehensive Test of Phonological Processing (CTOPP) and Dynamic Indicators of Basic Early Literacy Skills (DIBELS).

Table 3.

Most Common Standardized Assessments in Spelling Intervention Research.

Measure	n	Linguistic level
WRMT
Word Identification	7	Lexical
Word Attack	5	Lexical
Letter Identification	1	Sublexical
Passage Comprehension	1	Discourse
WRAT
Spelling	8	Lexical
Word Reading	4	Lexical
WJ
Writing Fluency	4	Sentence
Spelling	2	Lexical
Spelling Sounds	2	Sublexical
Word Attack	1	Lexical
Written Expression	1	Discourse
WIAT
Spelling	5	Lexical
Written Expression	1	Discourse
Alphabet Writing Fluency	1	Sublexical
Test of Written Spelling	6	Lexical
CTOPP	5	Sublexical
TOWRE	5	Lexical
PAL	5	Lexical
DIBELS	3	Sublexical

Note. WRMT = Woodcock Reading Mastery Tests; WRAT = Wide Range Achievement Test; WJ = Woodcock-Johnson; WIAT = Wechsler Individual Achievement Test; CTOPP = Comprehensive Test of Phonological Processing; TOWRE = Test of Word Reading Efficiency; PAL = Phonological Awareness for Literacy; DIBELS = Dynamic Indicators of Basic Early Literacy Skills.

Far fewer standardized measures targeted broader writing outcomes. Only the WIAT Written Expression subtest and the Woodcock-Johnson (WJ) Writing cluster were identified as discourse-level writing measures, while sentence-level standardized tasks remained rare. A wide range of other assessments, such as the Gray Oral Reading Test (GORT), the Kaufman Test of Educational Achievement (KTEA), and several vocabulary measures, appeared only once across studies, which highlights the limited and uneven use of standardized tools for assessing higher-level literacy outcomes.

The Use of Proximal and Standardized Outcome Measures

We next examined whether studies included both proximal (researcher-developed) and standardized (norm-referenced) outcome measures. Across the 59 studies, most relied exclusively on proximal assessments (k = 33; 55.9%). A smaller subset (k = 22; 37.3%) incorporated both proximal and standardized outcomes, allowing researchers to capture sensitivity to instruction while also evaluating generalizability. Only a handful of studies (k = 4; 6.8%) relied solely on standardized assessments without including any proximal tasks. Patterns differed by study design. Among group design studies (k = 39), 56.4% (k = 22) included both proximal and standardized measures, 13 relied exclusively on proximal measures, and only 4 relied solely on standardized measures.
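The pairing categories used in this section can be expressed as a simple rule applied to each study's set of coded measure types. The sketch below is illustrative: the function name `pairing` and the three example studies are hypothetical, while the group design counts (22, 13, and 4 of 39) are those reported above.

```python
from collections import Counter


def pairing(measure_types):
    """Classify a study by the types of outcome measures it reported."""
    kinds = set(measure_types)
    has_proximal = "researcher-developed" in kinds
    has_standardized = "norm-referenced" in kinds
    if has_proximal and has_standardized:
        return "both"
    if has_proximal:
        return "proximal only"
    if has_standardized:
        return "standardized only"
    raise ValueError("study has no coded outcome measures")


# Hypothetical studies, used only to show the classification rule.
studies = {
    "Hypothetical Study A": ["researcher-developed", "norm-referenced"],
    "Hypothetical Study B": ["researcher-developed", "researcher-developed"],
    "Hypothetical Study C": ["norm-referenced"],
}
tally = Counter(pairing(types) for types in studies.values())
print(tally)

# Group design counts as reported in the text.
group_counts = {"both": 22, "proximal only": 13, "standardized only": 4}
k_group = sum(group_counts.values())
share_both = round(100 * group_counts["both"] / k_group, 1)
print(k_group, share_both)  # 39 56.4
```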

Discussion

Across 59 studies and 228 outcomes, assessments were clustered overwhelmingly at the lexical level (71.9%), with very few sentence- or discourse-level outcomes. Even within the lexical band, most measures were researcher-developed, whereas standardized tools were used less often. This review highlights a persistent imbalance in how spelling intervention outcomes have been assessed. Across nearly five decades of research, outcome measurement has been overwhelmingly concentrated at the word level, with relatively few studies examining sentence- or discourse-level writing. This narrow focus limits our understanding of whether interventions support the broader goal of writing—and overall literacy—development. From a theoretical perspective, the IDL model predicts that growth in spelling should contribute to higher-order literacy outcomes, including sentence fluency and discourse-level composition. Yet, without assessments that capture these levels, we cannot determine whether interventions achieve transfer beyond explicitly taught sublexical- or lexical-level skills. For students with or at risk for LD—who often struggle with generalization—the lack of higher-order outcomes leaves a critical gap: interventions may show success on tightly aligned proximal tasks but fail to demonstrate improvements in authentic writing performance.

The Landscape of Outcome Measures in Spelling Intervention Research

Our descriptive synthesis showed that nearly three quarters of outcome measures were lexical, with far fewer assessing sublexical-, sentence-, or discourse-level skills. Proximal measures were especially dominated by lexical-level spelling probes, such as dictated word lists scored for percent correct. While these outcomes are sensitive to instructional effects, they provide only a partial picture of spelling development. From the perspective of the IDL model, such measures capture growth at the lexical level but fail to assess whether improvements extend or cascade upward into sentence- and discourse-level writing. Similarly, from Apel and Masterson’s (2011) perspective, global lexical-level accuracy scores obscure the specific phonological, orthographic, and morphological strategies students use when spelling.

This pattern echoes broader findings in literacy intervention research. Reading intervention studies frequently demonstrate gains on proximal decoding tasks without corresponding improvements in comprehension (e.g., Wanzek & Vaughn, 2007; Wanzek et al., 2010), which highlights the difficulty of demonstrating transfer to higher-order outcomes. Research on writing intervention shows a similar imbalance: transcription-focused interventions often improve spelling or handwriting but rarely assess, or find weaker effects on, discourse-level composition quality (Graham et al., 2018). For students with LD, this gap is particularly consequential, as difficulties in spelling have been shown to constrain the length, complexity, and quality of written text (Berninger et al., 2002; Graham, 1997). The dominance of lexical-level outcomes in spelling intervention research therefore limits our ability to determine whether interventions contribute to the higher-order writing skills that are most critical for academic success.

What the Use of Standardized Spelling Assessment Reveals

Our review also showed that standardized assessments, when used, were almost exclusively concentrated at the lexical level, with WRAT Spelling, TWS, and WIAT Spelling emerging as the most common standardized spelling assessments. This reliance reflects a narrow tradition within the field—one that prioritizes global accuracy scores while overlooking more fine-grained dimensions of spelling performance and the underlying mechanisms of learning.

Although these measures are often treated as interchangeable, they differ meaningfully in their word lists, linguistic content, and psychometric properties. For instance, the proportion of polymorphemic words included varies considerably (57% in WRAT Spelling, 36% in TWS, and 63% in WIAT Spelling). Such differences introduce systematic variability in how intervention effects are quantified and compared across studies, complicating efforts to synthesize findings and draw generalizable conclusions. Moreover, because scoring is typically limited to correct/incorrect, or whole-word accuracy, these assessments conflate different error types (e.g., phonological vs. orthographic vs. morphological), making it difficult to isolate which aspects of spelling knowledge were strengthened by the intervention.

More importantly, none of these measures directly assesses the theoretical processes central to spelling development (phonological, orthographic, and morphological knowledge) or the transfer of spelling to authentic, higher-level writing tasks. Thus, while standardized measures are often treated as distal outcomes, they may function more as loosely related proxy indicators than as true measures of whether interventions improve students’ capacity to apply and transfer spelling knowledge to higher-order tasks.

Pairing Proximal and Standardized Outcomes

A final pattern in this review concerned how often studies combined proximal and standardized measures. Nearly two-thirds of studies relied exclusively on proximal, researcher-developed outcomes, while just over one-third included both, and only a handful used standardized measures alone. This distinction matters for how intervention effects are interpreted. Proximal outcomes, such as researcher-developed word probes, are highly sensitive to instruction and are well-suited for detecting immediate effects of targeted practice. However, when used in isolation, they cannot address whether students can apply spelling knowledge more broadly. Standardized assessments, by contrast, are often positioned as distal outcomes, yet our review shows that they are inconsistently used and limited largely to global accuracy at the word level.

A key consideration in interpreting these patterns is study design. In SCD, repeated measurement of a proximal dependent variable is not a weakness but a methodological feature: frequent, instruction-aligned probes are central to demonstrating experimental control through changes in level and trend that replicate across phases (Horner et al., 2005). These outcomes maximize sensitivity to intervention effects, but they rarely extend to sentence- or discourse-level transfer.

By contrast, group designs provide a stronger opportunity to balance proximal sensitivity with distal generalizability. Yet, in our sample of 39 group design studies, while more than half (n = 22) included both proximal and standardized outcomes, 13 relied exclusively on proximal measures. This pattern highlights a missed opportunity: group studies, in particular, should be structured to capture both the skills directly taught and the extent to which those skills generalize to standardized and higher-order writing outcomes.
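
As a quick check on the proportions behind this pattern, the counts reported above can be tallied in a few lines of Python. This is only an illustrative sketch: the variable names are ours, and the `other` category is an assumption that simply collects the group design studies not covered by the two reported counts (39 - 22 - 13), which this passage does not characterize further.

```python
# Counts reported in the text for the group design studies.
TOTAL_GROUP_STUDIES = 39
counts = {
    "proximal_and_standardized": 22,
    "proximal_only": 13,
}

# Assumption: studies not covered by the two reported counts are pooled
# into a single "other" category; the passage does not describe them.
counts["other"] = TOTAL_GROUP_STUDIES - sum(counts.values())


def shares(category_counts, total):
    """Return each category's share of the total as a percentage (1 decimal)."""
    return {name: round(100 * n / total, 1) for name, n in category_counts.items()}


percentages = shares(counts, TOTAL_GROUP_STUDIES)

for name, pct in percentages.items():
    print(f"{name}: {counts[name]} of {TOTAL_GROUP_STUDIES} ({pct}%)")
```

Under these assumptions, the two reported categories account for 35 of the 39 group design studies, which is consistent with describing the paired-measure studies as "more than half" and the proximal-only studies as one-third of the group design sample.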

Taken together, these findings suggest that intervention research benefits from a design-aligned approach to measurement. Although SCD studies rely on repeated proximal probes, they could be further strengthened by occasional inclusion of sentence- or discourse-level outcomes to measure transfer. Group design studies, meanwhile, should routinely pair proximal and standardized assessments to ensure that intervention effects are tested for both sensitivity and generalization across multiple linguistic levels.

Implications for Intervention Research and Practice

Overall, the findings from this review underscore that outcome measurement in spelling intervention research has been narrowly defined, with the vast majority of outcomes focused at the word level and relatively few capturing sentence- or discourse-level writing. Standardized assessments, when used, were similarly concentrated on global lexical-level accuracy, and one-third of group design studies (13 of 39) relied solely on proximal measures without including corresponding standardized assessments. This creates a fundamental challenge for evaluating intervention effects. Instructionally sensitive outcomes are effective for detecting immediate learning, but relying on them alone leaves limited evidence regarding whether spelling gains generalize to authentic writing tasks.

From a theoretical standpoint, this imbalance in measurement limits the field’s ability to test core predictions about literacy development. The IDL model highlights how growth at the lexical level should influence sentence and discourse performance (Kim, 2020), while Apel and Masterson’s (2011) linguistic perspective stresses the importance of assessing the specific strategies students use when spelling. Yet, most intervention studies provide little information about whether improvements in spelling contribute to higher-order writing outcomes or whether interventions alter the phonological, orthographic, and morphological processes that underlie spelling growth.

For students with or at risk of LD, this gap has practical consequences. This group of learners often struggles with transfer of skills learned in isolation, making it critical to determine not only whether an intervention improves spelling of taught words but also to what extent those gains extend to written expression. A comprehensive approach to outcome measurement (i.e., one that captures both proximal sensitivity and distal generalization) is therefore essential. Such measurement is necessary not only for evaluating the effectiveness of interventions in research, but also for guiding instructional decisions in practice. Without it, the field risks both overstating intervention effects and underestimating the role of spelling in writing and overall literacy development.

Limitations

Several limitations of this review warrant consideration. First, our analyses were descriptive and relied on the information reported in primary studies. Proximal measures were often described in general terms (e.g., “researcher-developed spelling probe”), which limited our ability to fully characterize their linguistic focus and psychometric quality. Second, some standardized assessments appeared only once or twice across studies, making it difficult to draw strong conclusions about their frequency of use or comparability. Finally, because this review drew on outcome coding from a prior meta-analysis, it reflects the scope and detail of that database rather than a newly conducted comprehensive search.

Conclusion

Despite decades of intervention research, outcome measurement in spelling intervention studies remains narrowly concentrated at the lexical level, with limited attention to sentence- and discourse-level writing. Standardized assessments most often reflect global word accuracy, and too little attention is given to including both proximal and standardized measures within a single study. In SCD studies, standardized measures are rarely included, whereas in group design studies the pairing of proximal and standardized measures is inconsistent. This measurement pattern leaves a critical blind spot: we know that interventions can improve spelling and word reading of taught subskills and words, but we know far less about whether these gains generalize to authentic writing tasks. Situating these findings within Apel and Masterson’s (2011) call for theory-guided assessment and Kim’s (2020) IDL model, we conclude that spelling intervention research must more purposefully design assessment batteries that capture both the mechanisms of spelling development and the transfer of spelling gains into higher-order literacy outcomes. Only then can the field fully evaluate whether spelling interventions equip students with the skills necessary for fluent and accurate written expression.

Footnotes

ORCID iDs

Brennan W. Chandler

Jessica R. Toste

Christina Novelli

Emily B. Hardeman

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

*Aguirre, A. A., & Rehfeldt, R. A. (2015). An evaluation of instruction in visual imagining on the written spelling performance of adolescents with learning disabilities. The Analysis of Verbal Behavior, 31, 118–125. https://doi.org/10.1007/s40616-015-0028-0

*Alber-Morgan, S. R., Joseph, L. M., Kanotz, Rouse, C. A., & Sawyer, M. R. (2016). The effects of word box instruction on acquisition, generalization, and maintenance of decoding and spelling skills for first graders. Education and Treatment of Children, 39(1), 21–43. https://doi.org/10.1353/etc.2016.0002

*Aleman, McLaughlin, T. F., & Bialozor, R. C. (1990). Comparison of auditory/visual and visual/motor practice on the spelling accuracy of learning disabled children. Reading Improvement, 27(4), 261–268.

*Allen, A. A., & Lembke, E. S. (2022). The effect of a morphological awareness intervention on early writing outcomes. Learning Disability Quarterly, 45(2), 72–84. https://doi.org/10.1177/0731948720912414

Apel, & Masterson, J. J. (2011). The spelling sensitivity score: Noting developmental changes in spelling knowledge. Assessment for Effective Intervention, 36(1), 35–45. https://doi.org/10.1177/1534508410380039

*Bazis, P. S., Hebert, Wambold, & Lang (2022). Integration of reading and writing instruction to increase foundational literacy skills: Effects of the “Write Sounds” intervention on handwriting, decoding, and spelling outcomes. Learning Disabilities: A Contemporary Journal, 20(2), 151–174. https://eric.ed.gov/?id=EJ1359625

*Berninger, V. W., Abbott, R. D., Whitaker, Sylvester, & Nolen, S. B. (1995). Integrating low- and high-level skills in instructional protocols for writing disabilities. Learning Disability Quarterly, 18, 293–309. https://doi.org/10.2307/1511235

*Berninger, V. W., Lee, Y. L., Abbott, R. D., & Breznitz (2013). Teaching children with dyslexia to spell in a reading-writers’ workshop. Annals of Dyslexia, 63, 1–24. https://doi.org/10.1007/s11881-011-0054-0

Berninger, V. W., Vaughan, Abbott, R. D., Begay, Coleman, K. B., Curtin, Hawkins, J. M., & Graham (2002). Teaching spelling and composition alone and together: Implications for the simple view of writing. Journal of Educational Psychology, 94(2), 291–304. https://doi.org/10.1037/0022-0663.94.2.291

*Berninger, V. W., Vaughan, Abbott, R. D., Brooks, Abbott, S. P., Rogan, Reed, & Graham (1998). Early intervention for spelling problems: Teaching functional spelling units of varying size with a multiple-connections framework. Journal of Educational Psychology, 90(4), 587–605. https://doi.org/10.1037/0022-0663.90.4.587

*Berninger, V. W., Vaughan, Abbott, R. D., Brooks, Begayis, Curtin, & Graham (2000). Language-based spelling instruction: Teaching children to make multiple connections between spoken and written words. Learning Disability Quarterly, 23(2), 117–135. https://doi.org/10.2307/1511141

*Berninger, V. W., Winn, W. D., Stock, Abbott, R. D., Eschen, Lin, S.-J., Garcia, Anderson-Youngstrom, Murphy, Levit, Trivedi, Jones, Amtmann, & Nagy (2008). Tier 3 specialized writing instruction for students with dyslexia. Reading and Writing, 21, 95–129. https://doi.org/10.1007/s11145-007-9066-x

Bransford, J. D., & Schwartz, D. L. (1999). Rethinking transfer: A simple proposal with multiple implications. Review of Research in Education, 24(1), 61–100. https://doi.org/10.3102/0091732X024001061

*Brown, B. O. (1995). The effects of a spelling word study strategy on the spelling performance of high school students with learning disabilities [Unpublished doctoral dissertation]. University of Florida.

*Bryant, Drabin, I. R., & Gettinger (1981). Effects of varying unit size on spelling achievement in learning disabled children. Journal of Learning Disabilities, 14(4), 200–203. https://doi.org/10.1177/002221948101400405

*Cassar, A. G., & Jang, E. E. (2010). Investigating the effects of a game-based approach in teaching word recognition and spelling to students with reading disabilities and attention deficits. Australian Journal of Learning Difficulties, 15(2), 193–211. https://doi.org/10.1080/19404151003796516

Chandler, B. W., Toste, J. R., Novelli, Rodgers, & Hardeman (2025). A meta-analytic review of spelling interventions for students with or at-risk for learning disabilities. Journal of Learning Disabilities. Advance online publication. https://doi.org/10.1177/00222194251364836

*Clark Draper, I. L. (1990). A study of the efficacy of a student-controlled errorless learning device in spelling remediation with urban elementary learning-disabled students [Unpublished doctoral dissertation]. Wayne State University.

*Colenbrander, Parsons, Bowers, J. S., & Davis, C. J. (2022). Assessing the effectiveness of structured word inquiry for students in grades 3 and 5 with reading and spelling difficulties: A randomized controlled trial. Reading Research Quarterly, 57(1), 307–352. https://doi.org/10.1002/rrq.399

Darch, Eaves, R. C., Crowe, D. A., Simmons, & Conniff (2006). Teaching spelling to students with learning disabilities: A comparison of rule-based strategies versus traditional instruction. Journal of Direct Instruction, 6(1), 1–16. https://eric.ed.gov/?id=EJ755191

*Darch, Kim, Johnson, & James (2000). The strategic spelling skills of students with learning disabilities: The results of two studies. Journal of Instructional Psychology, 27(1), 15–26. https://www.proquest.com/openview/450e3be32bdeb4c7fc4494cb4450b83b/1?cbl=2029838&pq-origsite=gscholar

*Darch, & Simpson, R. G. (1990). Effectiveness of visual imagery versus rule-based strategies in teaching spelling to learning disabled students. Research in Rural Education, 7(1), 61–70. https://eric.ed.gov/?id=EJ418893

*Davis, L. H. (2000). The effects of rime-based analogy training on word reading and spelling of first-grade children with good and poor phonological awareness [Unpublished doctoral dissertation]. Northwestern University.

*Edwards, B. J., Blackhurst, A. E., & Koorland, M. A. (1995). Computer-assisted constant time delay prompting to teach abbreviation spelling to adolescents with mild learning disabilities. Journal of Special Education Technology, 12(4), 301–311. https://doi.org/10.1177/016264349501200402

Ehri, L. C. (2000). Learning to read and learning to spell: Two sides of a coin. Topics in Language Disorders, 20(3), 19–36. https://doi.org/10.1097/00011363-200020030-00005

Ehri, L. C., Satlow, & Gaskins (2009). Grapho-phonemic enrichment strengthens keyword analogy instruction for struggling young readers. Reading & Writing Quarterly, 25(2–3), 162–191. https://doi.org/10.1080/10573560802683549

*Englert, C. S., Hiebert, E. H., & Stewart, S. R. (1985). Spelling unfamiliar words by an analogy strategy. The Journal of Special Education, 19(3), 291–306. https://doi.org/10.1177/002246698501900306

*Frank, A. R., Wacker, D. P., Keith, T. Z., & Sagen, T. K. (1987). Effectiveness of a spelling study package for learning disabled students. Learning Disabilities Research, 2(2), 110–118. https://doi.org/10.1177/093889828700202a07

*Fuchs, L. S., Fuchs, Hamlet, C. L., Powell, S. R., Capizzi, A. M., & Seethaler, P. M. (2006). The effects of computer-assisted instruction on number combination skill in at-risk first graders. Journal of Learning Disabilities, 39(5), 467–475. https://doi.org/10.1177/00222194060390050701

*Fulk, B. M. (1996). The effects of combined strategy and attribution training on LD adolescents’ spelling performance. Exceptionality, 6(1), 13–27. https://doi.org/10.1207/s15327035ex0601_2

*Georgiou, G. K., Savage, Dunn, Bowers, & Parrila (2021). Examining the effects of Structured Word Inquiry on the reading and spelling skills of persistently poor grade 3 readers. Journal of Research in Reading, 44(1), 131–153. https://doi.org/10.1111/1467-9817.12325

Gersten, Chard, & Baker (2000). Factors enhancing sustained use of research-based instructional practices. Journal of Learning Disabilities, 33(5), 445–456. https://doi.org/10.1177/002221940003300505

*Gettinger (1993). Effects of invented spelling and direct instruction on spelling performance of second-grade boys. Journal of Applied Behavior Analysis, 26(3), 281–291. https://doi.org/10.1901/jaba.1993.26-281

*Gettinger, Bryant, N. D., & Wayne, H. R. (1982). Designing spelling instruction for learning-disabled children: An emphasis on unit size, distributed practice, and training for transfer. The Journal of Special Education, 16(4), 439–448. https://doi.org/10.1177/002246698201600407

*Gordon, J. S. (1992). The effects of phonemic training on the spelling performance of elementary students with learning disabilities [Unpublished doctoral dissertation]. University of Miami.

Graham (1997). Executive control in the revising of students with learning and writing difficulties. Journal of Educational Psychology, 89(2), 223–234. https://doi.org/10.1037/0022-0663.89.2.223

Graham, Collins, A. A., & Rigby-Wills (2016). Writing characteristics of students with learning disabilities and typically achieving peers: A meta-analysis. Exceptional Children, 83(2), 199–218. https://doi.org/10.1177/0014402916664070

*Graham, & Freeman (1986). Strategy training and teacher- vs. student-controlled study conditions: Effects on LD students’ spelling performance. Learning Disability Quarterly, 9(1), 15–22. https://doi.org/10.2307/1510397

*Graham, Harris, K. R., & Adkins (2018). The impact of supplemental handwriting and spelling instruction with first grade students who do not acquire transcription skills as rapidly as peers: A randomized control trial. Reading and Writing, 31(6), 1273–1294. https://doi.org/10.1007/s11145-018-9822-0

*Graham, Harris, K. R., & Chorzempa, B. F. (2002). Contribution of spelling instruction to the spelling, writing, and reading of poor spellers. Journal of Educational Psychology, 94(4), 669–686. https://doi.org/10.1037/0022-0663.94.4.669

Horner, R. H., Carr, E. G., Halle, McGee, Odom, & Wolery (2005). The use of single-subject research to identify evidence-based practice in special education. Exceptional Children, 71(2), 165–179. https://doi.org/10.1177/001440290507100203

*Hughes, T. A., Fredrick, L. D., & Keel, M. C. (2002). Learning to effectively implement constant time delay procedures to teach spelling. Learning Disability Quarterly, 25, 209–220. https://doi.org/10.2307/1511303

*Joseph, L. M. (1999). Word boxes help children with learning disabilities identify and spell words. The Reading Teacher, 52(4), 348–356. https://eric.ed.gov/?q=clay&ff1=pubGuides+-+Classroom+-+Teacher&ff2=souReading+Teacher&id=EJ575090

*Keesey, Konrad, & Joseph, L. M. (2015). Word boxes improve phonemic awareness, letter–sound correspondences, and spelling skills of at-risk kindergartners. Remedial and Special Education, 36(3), 167–180. https://doi.org/10.1177/0741932514543927

Kim, Y. S. G. (2020). Interactive dynamic literacy model: An integrative theoretical framework for reading-writing relations. In Reading-writing connections: Towards integrative literacy science (pp. 11–34). Springer International Publishing. https://doi.org/10.1007/978-3-030-38811-9_2

*Kinney, Hochstetler, McLaughlin, T. F., Derby, K. M., & Kinney (2013). The effects of cover, copy, compare to teach spelling to middle school students with learning disabilities and OHI. Educational Research Quarterly, 36(4), 25–48. https://eric.ed.gov/?id=EJ1061938

*Kirk, & Gillon, G. T. (2009). Integrated morphological awareness intervention as a tool for improving literacy. Language, Speech, and Hearing Services in Schools, 40(3), 341–351. https://doi.org/10.1044/0161-1461(2008/08-0009)

*Kubina, R. M., Young, & Killeen (2004). Examining an effect of fluency: Application of letter sound writing and oral word segmentation to spelling words. Learning Disabilities: A Multidisciplinary Journal, 13(1), 17–23. https://eric.ed.gov/?id=EJ808001

*Lee, S. H., & Scanlon, D. M. (2015). The effects of the interactive strategies approach on at-risk kindergartners’ spelling. Reading and Writing, 28, 313–346. https://doi.org/10.1007/s11145-014-9526-z

*MacArthur, C. A., Haynes, J. A., Malouf, D. B., Harris, & Owings (1990). Computer assisted instruction with learning disabled students: Achievement, engagement, and other factors that influence achievement. Journal of Educational Computing Research, 6(3), 311–328. https://doi.org/10.2190/CP8T-03UR-UP3E-1DFK

*Matlock, K. J. (1998). Comparing reading instructional methods for mildly disabled students [Unpublished doctoral dissertation]. University of South Carolina.

*McCallum, Schmitt, A. J., Evans, S. N., Schaffner, K. F., & Long, K. H. (2013). An application of the taped spelling intervention to improve spelling skills. Journal of Evidence-based Practices for Schools, 14(1), 51–80. https://books.google.com/books?hl=en&lr=&id=DUOACwAAQBAJ&oi=fnd&pg=PA51&ots=9b7nEwuAqS&sig=wyztTPxbg7WUvZ0IylozDBM_VNE#v=onepage&q&f=false

*Murphy, J. F., Hern, C. L., Williams, R. L., & McLaughlin, T. F. (1990). The effects of the copy, cover, compare approach in increasing spelling accuracy with learning disabled students. Contemporary Educational Psychology, 15(4), 378–386. https://doi.org/10.1016/0361-476X(90)90031-H

*O’Connor, R. E., & Jenkins, J. R. (1995). Improving the generalization of sound/symbol knowledge: Teaching spelling to kindergarten children with disabilities. The Journal of Special Education, 29(3), 255–275. https://doi.org/10.1177/002246699502900301

*Ott, J. C. (2019). The effects of time delay procedures on the acquisition, maintenance, and generalization of spelling sight words for elementary students with high-incidence disabilities [Unpublished doctoral dissertation]. The Ohio State University.

*Ross, A. H., & Stevens, K. B. (2003). Teaching spelling of social studies content vocabulary prior to using the vocabulary in inclusive learning environments: An examination of constant time delay, observational learning, and instructive feedback. Journal of Behavioral Education, 12(4), 287–309. https://doi.org/10.1023/A:1026002710327

Simmons, D. C., Kameenui, E. J., Harn, Coyne, M. D., Stoolmiller, Santoro, L. E., Smith, S. B., Beck, C. T., & Kaufman, N. K. (2007). Attributes of effective and efficient kindergarten reading intervention: An examination of instructional time and design specificity. Journal of Learning Disabilities, 40(4), 331–347. https://doi.org/10.1177/00222194070400040401

Simmons, K. D. (2007). Improving the spelling skills of elementary students with mild learning and behavior problems: A comparison between an explicit rule-based method and traditional method [Unpublished doctoral dissertation]. Auburn University.

*Spencer, Snart, & Das, J. P. (1989). A process-based approach to the remediation of spelling in students with reading disabilities. The Alberta Journal of Educational Research, 35(4), 269–282. https://psycnet.apa.org/record/1990-15842-001

*Stevens, K. B., & Schuster, J. W. (1987). Effects of a constant time delay procedure on the written spelling performance of a learning disabled student. Learning Disability Quarterly, 10(1), 9–16. https://doi.org/10.2307/1510750

Swanson, H. L. (2014). Meta-analysis of research on children with learning disabilities. In H. L. Swanson, K. R. Harris, & Graham (Eds.), Handbook of learning disabilities (2nd ed., pp. 627–642). The Guilford Press.

*Taylor, L. K., & Alber, S. R. (2003). The effects of classwide peer tutoring on the spelling achievement of first graders with learning disabilities. The Behavior Analyst Today, 4(2), 183–200. https://doi.org/10.1037/h0100113

*Telecsan, B. L., Slaton, D. B., & Stevens, K. B. (1999). Peer tutoring: Teaching students with learning disabilities to deliver time delay instruction. Journal of Behavioral Education, 9(2), 133–154. https://doi.org/10.1023/A:1022841001198

*Telzer, E. G. (1993). The effects of modeled strategies and attributions on students’ self-regulated learning and spelling achievement [Unpublished doctoral dissertation]. City University of New York.

*Trainin, Wilson, K. M., Murphy-Yagil, & Rankin-Erickson, J. L. (2014). Taking a different route: Contribution of articulation and metacognition to intervention with at-risk third-grade readers. Journal of Education for Students Placed at Risk, 19(3–4), 183–195. https://doi.org/10.1080/10824669.2014.972103

*Waddel, J. D. (1998). The effect of treatment structure and locus of control on spelling achievement [Unpublished doctoral dissertation]. The University of Cincinnati.

*Wanzek, Gatlin, Al Otaiba, & Kim, Y.-S. G. (2017). The impact of transcription writing interventions for first-grade students. Reading and Writing Quarterly, 33(5), 484–499. https://doi.org/10.1080/10573569.2016.1250142

Wanzek, & Vaughn (2007). Research-based implications from extensive early reading interventions. School Psychology Review, 36(4), 541–561. https://doi.org/10.1080/02796015.2007.12087917

Wanzek, Wexler, Vaughn, & Ciullo (2010). Reading interventions for struggling readers in the upper elementary grades: A synthesis of 20 years of research. Reading and Writing, 23(8), 889–912. https://doi.org/10.1007/s11145-009-9179-5

*Wolter, J. A., & Dilworth (2014). The effects of a multilinguistic morphological awareness approach for improving language and literacy. Journal of Learning Disabilities, 47(1), 76–85. https://doi.org/10.1177/0022219413509972