Missed Connections: Fidelity and Tier Alignment in Single-Case Research Design Studies of Tier 2 Literacy Interventions

Abstract

Students with and without disabilities increasingly learn to read within multitiered systems of support that provide universal (Tier 1) instruction and targeted (Tier 2) intervention. Researchers have emphasized the importance of evaluating Tier 2 interventions in relation to the quality and coherence of Tier 1 instruction. Although large-scale studies increasingly report implementation fidelity and alignment across tiers, it is unclear to what extent single-case research design (SCRD) studies include such information. The SCRD encompasses a variety of within-subject experiments that are common in special education and have potential benefits for students with or at-risk for learning disabilities. In this systematic review, we examined how SCRD studies of Tier 2 literacy interventions (k = 9) reported instructional fidelity and alignment across tiers. While most studies documented fidelity for the implemented intervention, none described the nature or quality of Tier 1 instruction. Findings indicate that SCRD research rarely situates Tier 2 outcomes within the context of Tier 1, limiting interpretability and integration with the larger evidence base on reading interventions.

Keywords

multitiered systems of support, response to intervention, single-case design research, systematic review

Since their introduction, tiered instructional frameworks—referred to as Response to Intervention (RTI) and most recently as Multitiered Systems of Support (MTSS)—have become a defining feature of elementary reading instruction in the United States (Berkeley et al., 2020; See Note 1). Although manifestations of MTSS vary across states, components such as universal screening, progress monitoring, delivery of standard instruction in Tier 1, and supplemental intervention at successive tiers (e.g., Tier 2) represent commonalities across systems (Hill et al., 2012; Miciak & Fletcher, 2020). The MTSS are a departure from flawed methods of allocating support for students (D. Fuchs & Fuchs, 2006; D. Fuchs et al., 2004) that have nonetheless (a) blurred the distinction between students with reading difficulty and those with learning disability (LD) and (b) complicated how instructional effectiveness is ascertained (Berkeley et al., 2020; Denton, 2012; King, Wang, Datchuk & Rodgers, 2023; Peng et al., 2020). The ubiquity of MTSS has been accompanied by challenges in identifying who benefits from intensive instruction and guidelines for implementing instructional tiers (Hintze et al., 2018; Miciak & Fletcher, 2020).

Because MTSS integrate assessment and intervention into a single system, research on instructional effectiveness depends on understanding how elementary literacy instruction at Tier 1 shapes outcomes at Tier 2 (Hill et al., 2012). The use of MTSS frameworks carries two implications for evaluations of instructional effectiveness. First, the extent to which students require intensive remediation is best determined after examining their performance in the context of effective Tier 1 instruction—in other words, their RTI (D. Fuchs et al., 2010). Without this context, assessment scores may be equally indicative of ineffective instruction or the need for more intensive intervention. Second, the effectiveness of supplemental Tier 2 interventions is a function of their impact on students who have received generally effective core instruction without benefit. That is, interventions may only meaningfully be characterized as effective “Tier 2” when they improve outcomes for students who did not benefit from appropriate Tier 1 instruction. Consequently, researchers who seek to evaluate Tier 2 instruction are obligated to provide insight into the quality of instruction provided at Tier 1 (Hill et al., 2012). The quality of Tier 2 intervention research carries particular significance for students with or at-risk for LD, as Tier 2 represents a source of support and a gatekeeping mechanism for LD identification (Miciak & Fletcher, 2020). Inasmuch as students who do not respond adequately to Tier 2 instruction are more likely to receive LD diagnoses and qualify for more intensive services, understanding what constitutes high-quality Tier 2 instruction is therefore critical for distinguishing between disability and inadequate instruction (Denton, 2012).

Our understanding of the typical relationship between Tier 1 instruction and Tier 2 intervention at the elementary level is primarily derived from large-scale group-design research and observational studies (Al Otaiba et al., 2025; Hill et al., 2012). However, the extent to which single-case research design (SCRD), a within-subject experimental approach used in much of special education with unrealized potential for students with or at-risk for LD (King, Wang, Nylen & Enders, 2023; Peltier et al., 2021), addresses this issue remains unclear. In this article, we describe and define the implications of two factors related to the context in which Tier 2 instruction is implemented: implementation fidelity and alignment across tiers. We then describe SCRD and consider its potential to capture contextual information about Tier 1 and Tier 2 instruction. Finally, we systematically assess how SCRD studies of Tier 2 literacy interventions report information related to fidelity and instructional alignment across tiers.

Connecting Instruction at Tier 1 and Tier 2: Implementation Fidelity and Alignment

Hill and colleagues (2012) identified two dimensions of instruction related to the implementation context of Tier 2 elementary literacy intervention. The first, implementation fidelity, traditionally refers to both measuring and maintaining the correspondence of instruction with a pre-established plan or protocol (O’Donnell, 2008; Sanetti & Luh, 2019). Recent analyses of reading intervention studies define implementation fidelity as a multidimensional construct encompassing adherence (i.e., instruction delivered as intended), differentiation (i.e., instruction is distinct from other practices), exposure (i.e., appropriate instructional duration), quality, and responsiveness (i.e., student engagement), reflecting both structural and process features of implementation (e.g., Gresham et al., 2009; van Dijk et al., 2023).

Multiple types of implementation fidelity data are needed to determine whether insufficient student progress results from inadequate intervention intensity or poor implementation (Sanetti & Collier-Meek, 2019). However, few reading intervention studies report fidelity of the primary reading intervention (Capin et al., 2018); when reported, data are generally presented as a single quantified index (e.g., percentage of instructional steps completed; van Dijk et al., 2023). Scholars have suggested that observations at Tier 1 should incorporate standardized observation tools, such as the Instructional Content Emphasis–Revised (ICE-R; Edmonds & Briggs, 2003), to provide insight into the quality and characteristics of instruction (Al Otaiba et al., 2025). Given the relationship between fidelity and student outcomes (e.g., Benner et al., 2011; Capin et al., 2018), one of the primary purposes of measuring implementation fidelity is to inform changes to instruction (Kretlow & Bartholomew, 2010). Accordingly, authors should also document training and other supports (e.g., coaching, feedback) designed to improve and maintain fidelity within each tier.

Highlighting instructional features through the ICE-R and similar tools relates to the second dimension of instructional content described by Hill and colleagues (2012): alignment. Alignment refers to whether similar instructional procedures, outcomes, and philosophies were employed across tiers. Intentional alignment of instruction across Tiers 1 and 2 has been shown to yield greater effects on student outcomes (e.g., Stevens et al., 2020, 2024). This alignment is particularly important because Tier 2 interventions are designed to provide targeted support for students who do not make expected progress with core instruction in the elementary grades (Wanzek et al., 2016). In contrast, aligning Tier 3 interventions with core instruction may be less feasible due to the highly individualized nature of intensive literacy interventions. For example, Al Otaiba and colleagues (2025) observed far more code-focused instruction in Tier 3 than in Tier 1 classrooms serving students in Grades 1–5. Thus, improved outcomes in subsequent tiers and beyond may be partially explained by low implementation fidelity in Tier 1 or by high variation in instructional content across tiers.

Systematic reviews suggest authors of Tier 2 intervention studies rarely report Tier 1 fidelity data or provide information necessary to determine alignment across tiers. Hill and colleagues (2012) examined how frequently group-design reading intervention studies (k = 22) that evaluated the efficacy of elementary Tier 2 instruction reported fidelity and alignment of instruction provided across tiers. In 75% of eligible studies, authors reported fidelity data for Tier 2 interventions, but Tier 1 fidelity data were reported in just 36% of studies. Approximately 32% of studies included information that could be used to assess alignment of instruction across tiers, and these data were more commonly reported when researchers manipulated Tier 1 instruction (e.g., Loftus et al., 2010). Overall, results indicated a marked increase in reporting of Tier 2 fidelity data as compared to prior reviews (i.e., Gresham et al., 2000; Swanson et al., 2013) but limited consideration of Tier 1 instruction.

Hill and colleagues (2012) established the limited extent to which earlier Tier 2 research (O’Connor et al., 2005; Scanlon et al., 2008) reported information related to alignment and fidelity. Although Hill and colleagues’ review has yet to be replicated, more recent group-design studies have explicitly emphasized the importance of both fidelity and alignment of Tier 1 instruction as key components of validating supplemental intervention delivered at Tier 2 (Young et al., in press). For example, Stevens and colleagues (2020) compared the effects of aligned Tier 1 and Tier 2 instruction with nonaligned instruction across tiers as well as a business-as-usual (BAU) Tier 2 reading intervention for readers who did not pass the state reading assessment at the end of their third-grade year. In the aligned Tier 1–Tier 2 condition, social studies teachers received initial training and ongoing coaching to implement the same instructional practices used in the researcher-provided Tier 2 intervention. In addition, the research team monitored adherence to essential instructional components and quality of implementation across tiers in all conditions. Findings indicated that students in the aligned condition significantly outperformed students who received nonaligned intervention or BAU instruction on measures of reading comprehension, content knowledge, and vocabulary. Based on these results, Stevens and colleagues noted,

It may be important for future research to include information about Tier 1 so that we can better understand the extent to which this may contextualize findings . . . it may be useful to . . . [view] Tiers 1 and 2 as connected and intentionally planning instructional delivery across the school day . . . rather than as separate pieces of the puzzle. (p. 446)

Emphasis on the importance of fidelity and alignment is also evident in recent studies where researchers did not manipulate Tier 1 instruction (e.g., Al Otaiba et al., 2014; Wanzek et al., 2017). In both studies, researchers implemented a Tier 2 reading intervention without modifying typical Tier 1 instruction. Yet they also observed and reported the content emphasis of Tier 1, allowing assessment of alignment. In addition, both research teams monitored implementation fidelity of the researcher-implemented Tier 2 intervention as well as school-implemented Tier 1 instruction. Together, these studies illustrate how researchers conducting group-design studies can evaluate a Tier 2 intervention as part of a full instructional package rather than as an isolated instructional component (Stevens et al., 2020). However, the contribution of SCRD, which represents most of the experiments in special education and increasingly appears in work involving students with or at-risk for LD, has yet to be determined (Hott & Flores, 2023; King, Wang, Nylen & Enders, 2023).

Single-Case Design Research, LDs, and Contextualizing Tier 2

The SCRD refers to a family of within-subject experimental approaches that use repeated measurement; manipulations in intervention timing; and replication of effects across individuals, behaviors, or groups to demonstrate causal relations between instruction and outcomes (King, Wang, Nylen & Enders, 2023). Participants in SCRD generally serve as their own control, meaning that an intervention effect is determined by comparing a student’s performance during baseline (i.e., nontreatment) and intervention phases (Ledford & Gast, 2024). Although SCRD represent more than half of experiments in special education, fewer than 15% of experiments published in journals focusing on LD use SCRD (King, Wang, Nylen & Enders, 2023). The rarity of SCRD in LD journals may stem from the historic tendency among researchers to interpret the design’s frequent use among small, heterogeneous populations as evidence that it is primarily suitable for examining intensive intervention, such as highly individualized Tier 3 interventions for students with LD and other disabilities (Hurtado-Parrado & López-López, 2015; King et al., 2024). Yet SCRD offers LD researchers opportunities to pilot new interventions, isolate components of effective instruction, attend to the needs of specific students, and conduct in-depth observations of the learning context across tiers (Hott, Flores, et al., 2023; Peltier et al., 2021).

The small scale and flexibility of SCRD can feasibly accommodate detailed observations of instruction and other variables relevant to participant performance in Tier 2 intervention (Rila et al., 2025). These observations, in addition to facilitating the contextualization of Tier 2 reading interventions within MTSS frameworks, would be consistent with emerging standards of SCRD quality in LD research. Hott, Flores, et al. (2023) and Hott, Heiniger, et al. (2023) established the importance of providing replicable descriptions of conditions, as well as participant characteristics (e.g., previous instruction). Additional detail regarding typical classroom instruction for children with or at-risk for LD would therefore align with guidelines for SCRD research as well as calls for transparency regarding BAU instruction more generally (Gersten et al., 2005).

The intimacy and flexibility of SCRD uniquely position researchers to capture alignment, as—in contrast to the large samples and limited number of observations characteristic of most group designs—SCRD require repeated observations of a single student in their routine instructional environment and the intervention context (King, Wang, Nylen & Enders, 2023). Despite this capacity, SCRD has historically emphasized control and replication over contextual description, leaving unresolved questions about how to balance rigor with relevance (Ledford et al., 2023). Nonetheless, recent studies have demonstrated the potential of SCRD to capture instructional context.

King, Lemons, et al.’s (2022) and King, Rodgers, and Lemons’s (2022) studies concerning reading instruction for children with Down syndrome (although outside the LD context) illustrate how SCRD examining intensive reading interventions can provide detailed descriptions of Tier 1 instruction. Across studies, authors used teacher interviews and observations based on the ICE-R to document the amount, format, and content of students’ ongoing school-based reading instruction. They noted that participants typically received instruction derived from commercial curricula for at least 1 hr each day across special and general education settings. These studies provided further context for outcomes, in which students learned content aligned with the intervention but generally did not improve on curriculum-based measures of reading.

Detailed observations of participants’ typical context are unusual, however, as descriptions of conditions within SCRD typically pertain to sessions directly under the control of researchers (e.g., Hott, Heiniger, et al., 2023). As a result, the brief observation probes conducted during baseline and intervention conditions threaten to provide a false sense of participants’ behavior and potentially omit significant contextual information (Lambert et al., 2025). Rila and colleagues (2025) found that SCRD studies do not capitalize on the intimacy of the format, and many authors have called to broaden the forms of data collected and questions answered by SCRD (e.g., Hitchcock et al., 2010; Onghena et al., 2019). The extent to which this larger conversation regarding SCRD methods pertains to the issue of connecting Tier 1 and Tier 2 instruction remains uncertain, however, given that previous reviews of the literature have excluded SCRD (e.g., Hill et al., 2012).

Need for the Current Study

Researchers have increasingly recognized the importance of examining Tier 1 instruction, both in terms of its fidelity and its alignment with subsequent intervention, when evaluating the effectiveness of Tier 2. More than a decade after Hill and colleagues (2012) articulated the need to assess the fidelity and alignment of instructional tiers in elementary Tier 2 intervention research, it remains unclear to what extent these elements are examined or supported in the broader literature, as no systematic review has revisited this question since the original search. Recent group-design research nonetheless suggests alignment across instructional tiers improves student outcomes (e.g., Coyne et al., 2022; Fien et al., 2015; Smith et al., 2016) or directly compares aligned and nonaligned interventions (e.g., Foorman et al., 2018; Stevens et al., 2020), underscoring alignment as a promising potential direction for future scholarship regarding MTSS. However, an emphasis on SCRD is warranted because of (a) their omission from previous examinations of this issue (Hill et al., 2012), (b) their prominence in special education and growing use among students with or at-risk for LD (Hott, Flores, et al., 2023; King, Wang, Datchuk & Rodgers, 2023; Peltier et al., 2021), and (c) their potential to elaborate on the instructional contexts represented in intervention research (Rila et al., 2025). The purpose of this study is to extend Hill and colleagues’ original review by assessing the extent to which implementation fidelity and alignment appear in SCRD research evaluating Tier 2 literacy interventions for struggling readers in an MTSS context. Guiding questions include (a) To what extent is intervention fidelity reported across Tiers 1 and 2? and (b) How do studies establish alignment across Tiers 1 and 2?

Method

We addressed the research questions through a multistage process. First, we conducted electronic database and ancestral searches of studies reporting on supplemental reading instruction either included in or designed for inclusion in an MTSS model. Second, we adapted codes from Hill and colleagues (2012) pertaining to intervention fidelity and the instructional alignment of Tiers 1 and 2. Finally, we applied all codes to identified studies over the course of a descriptive review.

Search Procedures

Studies were identified using a three-step process. Figure 1 provides a visual depiction of the search in accordance with Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Page et al., 2021). In EBSCO, we performed abstract, title, and keyword searches of PsycINFO, Education Resources Information Center (ERIC), and Education Source databases. Although we retained the emphasis of Hill and colleagues’ (2012) original search, terms were expanded in recognition of changes in the conception of MTSS and the breadth of literacy domains subject to intensive intervention. Specifically, the search referred to variations of MTSS (“response to intervention” or RTI or “multi-tiered system” or MTSS or “tiered support system” or “graduated intervention” or “Tier 2” or “tier two” or “Tier 1” or “Tier one” or “tiered systems” or “progress monitor*” or supplemental) and reading (read* or literacy or phon* or “reading fluency” or comprehen* or “language arts” or decod* or vocabulary or) in the title and keywords. We also conducted an abstract search with additional terms related to instruction (instruct* or teach* or taught or interven* or train*). We performed an additional search for MTSS and literacy terms in all fields except full-text in ProQuest Dissertations & Theses Global. Together, the searches produced 7,671 results, which were manually assessed for duplicates using the Zotero reference management system. The team then evaluated the remaining 4,646 records for inclusion. We also conducted an ancestral search, for which we manually reviewed references cited by each study included in the corpus.
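For readers who wish to rerun or audit a search of this kind, the Boolean strings can be assembled programmatically. The following Python sketch is illustrative only and is not the procedure the authors report using. The helper `or_block` and all variable names are hypothetical, the term lists are copied from the description above (the trailing "or" in the reading block is omitted because no term follows it), and joining the MTSS and reading blocks with AND is an assumption. The final calculation assumes that the difference between the two reported record counts reflects duplicate removal alone.

```python
# Hypothetical helper: join search terms into one parenthesized OR block.
def or_block(terms):
    return "(" + " OR ".join(terms) + ")"

# Term lists copied from the search description; inner quotes mark exact phrases.
mtss_terms = [
    '"response to intervention"', "RTI", '"multi-tiered system"', "MTSS",
    '"tiered support system"', '"graduated intervention"', '"Tier 2"',
    '"tier two"', '"Tier 1"', '"Tier one"', '"tiered systems"',
    '"progress monitor*"', "supplemental",
]
reading_terms = [
    "read*", "literacy", "phon*", '"reading fluency"', "comprehen*",
    '"language arts"', "decod*", "vocabulary",
]
instruction_terms = ["instruct*", "teach*", "taught", "interven*", "train*"]

# Title/keyword search: MTSS block combined with reading block (AND is assumed).
title_keyword_query = or_block(mtss_terms) + " AND " + or_block(reading_terms)

# Abstract search: instruction-related terms.
abstract_query = or_block(instruction_terms)

# Records dropped between retrieval and screening, from the reported counts.
retrieved, screened = 7671, 4646
duplicates_removed = retrieved - screened

print(title_keyword_query)
print(abstract_query)
print(duplicates_removed)  # 3025
```

Keeping the term lists as data rather than as a single hand-typed string makes it easier to spot a stray operator or a missing term before the search is run.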

Figure 1.

Visual depiction of literature search.

Inclusion Criteria

We identified records for full-text review following a review of abstracts, titles, and keywords. We established inclusion criteria to ensure alignment with the research questions and avoid undue assessment of articles that may have provided brief descriptions of intervention procedures (e.g., subsample of an original study).

Abstract Review

Records retrieved from the electronic search were reviewed by title, keyword, and abstract using six inclusion criteria. First, we included records describing peer-reviewed articles, dissertations, and research reports that were published between 2012 (i.e., the conclusion of Hill and colleagues’ original search) and October 2024 (i.e., the date of the search); all other documents were excluded. Studies published prior to Hill and colleagues’ (2012) review were excluded, as their review was the first to systematically operationalize and evaluate alignment between MTSS tiers in intervention studies. Applying alignment criteria retroactively to studies published before Hill and colleagues established the construct would yield findings of limited interpretive value. Studies also needed to include elementary-age students (determined by reported participants enrolled in Grades K–5) or, in the absence of grade information, participants ages 5 to 12 years. Studies with students within a single grade or year outside the established ranges were included, provided the sample also included otherwise eligible students. This preserved alignment with the original review; in addition, the content emphasis of typical secondary instruction, combined with the intensive supports necessary for students in secondary grades who experience consistent challenges in reading (Brozo, 2015), would reasonably reduce the salience of alignment in such studies.

Studies were also excluded if a nonexperimental design was used, including studies that were qualitative, descriptive/correlational, or within-subject group designs. This criterion was consistent with Hill and colleagues’ (2012) earlier review as well as the potentially diminished interest in reporting specific features (e.g., fidelity) in nonexperimental studies (King, Wang, Nylen & Enders, 2023). Studies were reviewed at this level for the presence of a literacy instructional intervention. Finally, studies needed to be published in English to be included for full-text review. Records for which there was any disagreement across any criterion were also included. Following the abstract review, we identified 823 studies for further evaluation.

Full-Text Review

After completing the abstract review, we screened all remaining studies at the full-text level using five additional inclusion criteria. First, like the inclusion criteria set by Hill and colleagues (2012), participants in the study needed to receive supplemental literacy intervention pertaining to any aspect of reading or writing appropriate for inclusion in an MTSS or RTI model. This could be achieved by (a) the authors mentioning an explicit link made between the reading intervention and an MTSS framework maintained by the school or (b) the authors actively manipulating or influencing, by training or some other method, both Tier 1 and Tier 2 instruction. Descriptions of an intervention as suitable for Tier 2 without explicit reference to MTSS, RTI, or tiered instruction maintained by the school (e.g., Faggella-Luby & Wardwell, 2011; Furey et al., 2017) were insufficient for inclusion. Second, the intervention needed to be characterized as Tier 2 instruction given in a small-group or one-on-one setting. Interventions explicitly identified as Tier 3 were excluded to preserve consistency with Hill and colleagues (2012) and because the intensive nature of Tier 3 interventions might preclude considerations of alignment (Al Otaiba et al., 2016). Third, studies were reviewed to ensure that original data were used in the study. This meant that independent studies using extant data, or data previously published elsewhere, were excluded to prevent double-counting of findings and to avoid disadvantaging studies that may have reported procedural details across multiple manuscripts. When both a dissertation and a peer-reviewed article were available for the same study, only the latter was retained. Published articles are generally accepted as the authoritative source (e.g., National Information Standards Organization, 2008), and meaningful discrepancies between SCRD dissertations and peer-reviewed articles appear to be uncommon (Travers et al., 2026). Retaining the published article also avoided discrepant reporting across included studies, as dissertations falling outside the date range of the search would not have been available as an alternative source. Fourth, given that aspects of the MTSS may vary internationally, studies needed to take place in a U.S. school setting and primarily pertain to literacy instruction in English. Finally, only single-case designs examining the efficacy of Tier 2 instruction were retained. Group designs were excluded.

Upon completion of full-text screening of studies identified through the electronic search, nine studies were determined to be eligible for this review. Of these, two studies appeared as both a thesis or dissertation and a published article within the date range of the search (Boudreaux-Johnson, 2015; J. L. Kuhn, 2017); only the published article was retained. The ancestral search did not result in any additional articles.

Coding Procedures

We coded for indicators of alignment and fidelity across Tier 1 and Tier 2. We also coded whether Tier 2 instruction involved reading exclusively or also emphasized writing (e.g., written spelling, written summaries of passages), the involvement of researchers in the direct implementation of the Tier 2 intervention, and the situation of study procedures within a school-based MTSS or RTI framework. Operational descriptions of all codes appear in Table 1.

Table 1.

Coding Categories and Descriptions.

Indicator	Subcategories	Description
Alignment	Researcher Implemented Tier 1
	Content	Indicated whether authors described specific instructional content (e.g., fluency, vocabulary) or specific curriculum.
	Practices	Indicated whether authors described use of specific instructional techniques (e.g., direct instruction, think aloud).
	Arrangements	Authors provided an indication of student arrangements (e.g., whole-group, small-group, peer).
	Duration of Instruction/Components	Authors described duration of instructional period and the duration of individual components (e.g., arrangements, content sessions).
	School Implemented Tier 1
	Observation	Description of typical Tier 1 instruction supplemented with an interview, survey, standardized observation tool (e.g., ICE-R; Edmonds & Briggs, 2003), or nonstandard observation.
Fidelity	Researcher/School-Implemented Tier 1
	Implementer Training	Article describes training provided to interventionists or teachers.
	Fidelity Definition Provided	Conception of fidelity within the context of the study is defined, either within the narrative or through description of a specific tool. Description must extend beyond reference to a lesson plan or unspecified steps in a procedure.
	Coaching	Indicated whether authors describe any ongoing support or professional development received by interventionists or teachers over the course of the study (e.g., refreshment sessions).
	Monitoring	Authors report monitoring interventionist or teacher performance through either external observation or solicitation of teacher self-reports.
	Assessment Tool	Authors indicate fidelity was documented using a standardized or researcher-created checklist.
	Feedback	Indicated whether interventionists or teachers received feedback regarding their in-situ performance beyond guidance provided during the initial training period.
	Score	Authors provided a quantitative index of fidelity, with or without information on the method by which the score was derived.
	Frequency	Indicated whether authors described how often fidelity was obtained (e.g., number, percentage of sessions).
	Researcher Implemented Tier 2
	School Tier 2 Interventions	Authors provided a description of the school-implemented Tier 2 intervention program.

Note. ICE-R = Instructional Content Emphasis–Revised. Alignment indicators across instructional tiers share coding subcategories unless otherwise indicated.

Alignment codes pertained to reports regarding instructional practices, content, duration, and arrangements that could be used to compare instruction across tiers. We recognized articles for providing this information even in instances where descriptions were not comprehensive. We applied alignment codes to Tier 1 instruction provided by schools (i.e., school-implemented Tier 1) and in studies where the research team indicated their involvement in Tier 1 (i.e., researcher-implemented Tier 1; e.g., direct implementation, professional development for instructors) separately. For the former, we further indicated whether descriptions of Tier 1 instruction were supplemented with interviews, surveys, or other observations.

Fidelity codes identified specific article features related to the documentation of the quality or consistency of instruction with an established protocol (e.g., frequency of observation) for researcher- and school-implemented variants of Tier 1 and researcher-implemented Tier 2 interventions. Specific codes pertained to reports regarding the precise definition of fidelity as well as training procedures, coaching, monitoring, and the tools used to assess fidelity. We further indicated whether teachers received feedback, the frequency of observations, and procedures used to obtain fidelity scores. We also noted whether authors reported the use of Tier 2 interventions by participating schools.

Interrater Agreement

Interrater agreement (IRR) was collected across multiple levels of the research process, including screening of record titles, abstracts, and keywords; full-text review; and article coding. Teams led by two doctoral-level special education faculty and a postdoctoral scholar with experience in systematic reviews conducted each stage of the project. Training, led by the first three authors, generally consisted of reviewing selection criteria or codes with the research team, guided practice with articles from the larger sample, and routine evaluations of agreement, with feedback, provided over the course of the project. Agreement during training was determined using phase-specific calculations. For initial screening, coders received an initial 1-hr training delivered by the postdoctoral scholar, which included coding prescored examples. Coders were then required to maintain >90% agreement with the postdoctoral scholar on 10 records before receiving records to code independently. Independent coding tasks were assigned in discrete sets (e.g., 20 records) to allow for routine assessment of agreement by the postdoctoral scholar and prevent observer drift. For full-text screening and coding, reviewers assessed the coding scheme and jointly reviewed articles outside the scope of the review until reaching 90% agreement on three consecutive articles.

Abstract Review

All initially retrieved records were double-screened by the postdoctoral scholar and three graduate researchers. We defined agreement as two screeners agreeing on the inclusion status of the same record and calculated an IRR coefficient by dividing the number of agreements by the total number of records screened. The average IRR was 92%. Any studies coded as a disagreement between screeners were reviewed again at the full-text level.

Full-Text Review

All full texts were double screened by two authors, and any disagreements were resolved by a third. Agreement was defined as two authors agreeing on the inclusion status of an article. The average IRR, calculated by dividing the number of agreements by the total number of studies screened, was 93%.

Coding

A team consisting of four authors double-coded 66.67% of the included studies. We resolved disagreements through consensus between the two primary coders and consulted a third independent coder if necessary. We calculated IRR by dividing the total number of agreements by the total number of rated items for each study. The average IRR was 96%.
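The agreement calculations described above (agreements divided by the total number of records or rated items, then averaged across studies for coding) can be illustrated with a minimal sketch. The function names and the screening decisions below are hypothetical and serve only to illustrate the arithmetic; they are not the review's actual data or software.

```python
def percent_agreement(ratings_a, ratings_b):
    """Percentage of items on which two raters gave the same rating."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("Both raters must rate the same items.")
    if not ratings_a:
        raise ValueError("At least one rated item is required.")
    agreements = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100 * agreements / len(ratings_a)


def mean_agreement(per_study_ratings):
    """Average of per-study agreement percentages, as described for coding."""
    scores = [percent_agreement(a, b) for a, b in per_study_ratings]
    return sum(scores) / len(scores)


# Hypothetical inclusion decisions for ten records (True = include).
screener_1 = [True, False, False, True, False, False, True, False, False, False]
screener_2 = [True, False, False, False, False, False, True, False, False, False]

# The screeners disagree on one of ten records.
print(percent_agreement(screener_1, screener_2))  # 90.0
```

Simple percent agreement of this kind does not correct for chance agreement; it is shown here only because it matches the calculation reported in the text.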

Results

The review yielded nine SCRD studies (see Table 2). Of these, 33.33% (k = 3) were available only as dissertations. Eighty-nine percent of studies (k = 8) implemented interventions at Tier 2 exclusively, while Boulos (2016) reported implementing interventions at both Tiers 1 and 2. Studies included 57 total participants, with an average sample size of six participants (R = 3–11; SD = 2.8). Across studies, 78.95% (k = 45) of participants were boys. On average, students were in the second grade at the time of the study (R = 1–6; SD = 1.69). All participants were identified as being at risk for reading disabilities based on benchmark screenings, state test scores, or teacher observations; none were formally identified with LD. Additional disability classifications represented among participants included emotional and behavioral disorders (15.79%, k = 9), speech-language impairment (3.51%, k = 2), and other health impairments (i.e., attention-deficit disorder; 5.26%, k = 3).

Table 2.

Alignment and Fidelity of Studies.

Study	Tier 1 Alignment					School Tier 1 Fidelity							Tier 2 Fidelity
Study	CN	PC	AG	DR	OB	DF	CH	MN	TL	FB	SC	FQ	TN	DF	CH	MN	TL	F	SC	FQ	T2
Exclusively Tier 2
Boudreaux-Johnson et al. (2017)	N	N	N	N/N	N	N	N	N	N	N	N	N	N	Y	N	EO	RC	N	EC	Y	Y
Fuoco (2020)	N	N	N	N/N	N	N	N	N	N	N	N	N	Y	Y	Y	EO	N	Y	EC	Y	Y
Gettinger et al. (2021)	N	N	N	N/N	N	N	N	N	N	N	N	N	Y	Y	Y	EO	N	Y	EW	Y	N
Gettinger et al. (2024)	N	N	N	N/N	N	N	N	N	N	N	N	N	Y	N	N	N	N	N	N	N	Y
Kuhn & Albers (2022)	N	N	N	N/N	N	N	N	N	N	N	N	N	Y	Y	Y	EO, SR	ST	Y	EW	Y	N
O'Keeffe et al. (2013)	N	N	N	N/N	N	N	N	N	N	N	N	N	Y	Y	N	EO	N	N	EC	Y	Y
Spencer et al. (2024)	N	N	N	N/N	N	N	N	N	N	N	N	N	Y	N	N	EO	N	N	EC	Y	N
Thornton (2012)	N	N	N	N/N	N	N	N	N	N	N	N	N	Y	Y	N	EO	N	N	EW	Y	Y
Tier 1 and Tier 2
Boulos (2016) ^a	N	N	N	N/N	N	N	N	N	N	N	N	N	Y	Y	N	EO	ST	Y	EW	Y	N

Note. Dissertation studies listed in italics. CN = content; PC = instructional practices; AG = instructional arrangement; DR = duration of instruction/component duration; OB = observation; DF = fidelity definition; CH = coaching; MN = monitoring; TL = origin of assessment tool reported; FB/F = feedback; SC = score; FQ = assessment frequency; TN = training; T2 = description of school T2 project; Y = reported; N = not reported; TCD = total and component duration; TO = total duration only; ST = standardized tool; NS = nonstandard observation; SR = self report; EO = external monitoring; RC = researcher created; OU = origin unknown; EC = external score, with calculation; EW = external score, without calculation.

^a Description of researcher-implemented Tier 1 program appears in text.

Except for Boulos (2016), all studies were explicitly situated within an MTSS framework. Researchers were directly involved in the implementation of the intervention in 33.33% of studies (k = 3; Boulos, 2016; J. Kuhn & Albers, 2022; Spencer et al., 2024). Multiple-baseline designs (66.67%, k = 6) or nonconcurrent multiple-baseline designs (11.11%, k = 1), the most commonly used SCRD configuration in special education (King et al., 2024), appeared in the majority of studies. Alternating treatments designs (Boudreaux-Johnson et al., 2017; 11.11%, k = 1) and the related repeated acquisition design (Spencer et al., 2024; 11.11%, k = 1) appeared less frequently.

Alignment

School-Implemented Tier 1

No studies reported alignment across any of the categories assessed for school-implemented Tier 1 instruction.

Researcher-Implemented Tier 1

Boulos (2016) described a class-wide curriculum, specifically Reading Mastery, as part of the study. Descriptions of specific grouping arrangements or the duration of components were not provided. However, the specific techniques (e.g., rhyming exercises), total duration, and content of instruction were described in sufficient detail to permit assessment of alignment between instruction across tiers.

Fidelity

School-Implemented Tier 1

As with alignment, no studies reported fidelity data for school-implemented Tier 1 instruction. The absence of fidelity data undermines confidence that Tier 1 procedures were implemented as intended.

Researcher-Implemented Tier 1

Boulos (2016) reported multiple components related to both the maintenance and measurement of Tier 1 fidelity. The training of the teacher was described. A definition of fidelity was also provided alongside an implementation checklist. Finally, the calculation of fidelity, as well as the schedule by which the researcher conducted observations, was provided.

Researcher-Implemented Tier 2

Eight studies (88.89%) described training interventionists prior to implementation. Training varied from unspecified sessions prior to implementation (e.g., Fuoco, 2020) to several hours of training spanning multiple days (e.g., O’Keeffe et al., 2013). In terms of maintaining fidelity, authors reported using coaching (e.g., individual supports provided as needed; J. Kuhn & Albers, 2022) in 33.33% of studies (k = 3). Feedback was included in 44.44% of studies (k = 4). Boulos (2016) shared the results of fidelity observations with interventionists but provided no other form of support (i.e., coaching).

Six studies (66.67%) provided an explicit definition of fidelity or otherwise provided access to their observation tool. Authors reported using observers to externally monitor fidelity in eight studies (88.89%), with J. Kuhn and Albers (2022) also collecting teacher self-reports regarding the intervention condition. Fidelity observation tools ranged from researcher-created (e.g., Boudreaux-Johnson et al., 2017) to standardized tools packaged with their reading programs (e.g., Reading Mastery; Boulos, 2016). Authors reported fidelity scores in 88.89% of studies (k = 8), with external monitoring as the main method to assess fidelity. Assessment frequency was reported in most of the studies (88.89%; k = 8). Although 88.89% of studies (k = 8) reported scores, only half of these described calculation procedures. Five studies (55.56%) described conventional Tier 2 programs provided alongside the intervention.
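As an illustration of how several of the counts and percentages in this section follow from Table 2, the sketch below tallies a subset of the Tier 2 fidelity columns (TN, CH, F, SC, T2) as transcribed from the table. The variable and function names are hypothetical, and the snippet is a reader's aid under those assumptions rather than the procedure the authors used.

```python
# Selected Tier 2 fidelity codes transcribed from Table 2.
# "N" means not reported; any other code means the feature was reported.
COLUMNS = ("TN", "CH", "F", "SC", "T2")
TABLE_2 = {
    "Boudreaux-Johnson et al. (2017)": ("N", "N", "N", "EC", "Y"),
    "Fuoco (2020)": ("Y", "Y", "Y", "EC", "Y"),
    "Gettinger et al. (2021)": ("Y", "Y", "Y", "EW", "N"),
    "Gettinger et al. (2024)": ("Y", "N", "N", "N", "Y"),
    "Kuhn & Albers (2022)": ("Y", "Y", "Y", "EW", "N"),
    "O'Keeffe et al. (2013)": ("Y", "N", "N", "EC", "Y"),
    "Spencer et al. (2024)": ("Y", "N", "N", "EC", "N"),
    "Thornton (2012)": ("Y", "N", "N", "EW", "Y"),
    "Boulos (2016)": ("Y", "N", "Y", "EW", "N"),
}


def count_reported(table, column):
    """Number of studies whose code in `column` is anything other than 'N'."""
    index = COLUMNS.index(column)
    return sum(codes[index] != "N" for codes in table.values())


def percent(count, total):
    """Percentage rounded to two decimals, matching the reporting style in the text."""
    return round(100 * count / total, 2)


for column in COLUMNS:
    k = count_reported(TABLE_2, column)
    print(f"{column}: k = {k} ({percent(k, len(TABLE_2))}%)")
```

Run against the transcribed rows, the tally gives k = 8 for training, k = 3 for coaching, k = 4 for feedback, k = 8 for reported scores, and k = 5 for descriptions of school Tier 2 programs.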

Discussion

The present review examined how SCRD studies evaluating elementary Tier 2 literacy interventions report instructional fidelity and alignment across tiers. Across nine studies, all but one implemented researcher-provided Tier 2 interventions exclusively, with limited or no description of concurrent Tier 1 instruction. Researchers consistently reported fidelity of the intervention they administered, yet none provided information regarding how these interventions related to, or were aligned with, the instruction students typically received in Tier 1 settings. The absence of information regarding the Tier 1 context limits the interpretability of SCRD findings in relation to Tier 2 interventions and constrains the integration of this literature with research on students with or at-risk for LD more broadly. Findings provide clear guidance on how future SCRD work involving Tier 2 interventions may advance.

Although motivated by similar questions, differences in targeted studies and coding schemes complicate direct comparisons between the outcomes of the current review and those of Hill and colleagues (2012). Nonetheless, the principal result remains the same: Tier 2 intervention studies generally monitor Tier 2 intervention without providing additional insight into Tier 1 instruction. The frequent reporting of implementation fidelity for the primary intervention distinguishes these studies from the wider reading intervention literature, which often omits such information (e.g., Capin et al., 2018). The extent to which studies in this review reported fidelity exceeds the marked increase in reporting observed in SCRD across subject areas (e.g., inappropriate behavior, mathematics; Gage et al., 2020). However, the actual data encompassed by fidelity procedures vary considerably across SCRD studies (Gage et al., 2020) and reading studies more generally (van Dijk et al., 2023), which is consistent with our findings. Whereas most studies identified by Hill and colleagues (2012) provided some form of support to interventionists, only 44% of studies in the present review reported coaching or feedback beyond initial training. The discrepancy may be due to the scale of SCRD, where primary authors generally conduct all observations and can informally exert control over the quality of intervention.

Previous reviews suggest that reports of Tier 1 fidelity in research concerning more intensive intervention are relatively uncommon (Al Otaiba et al., 2025; Hill et al., 2012). Although such reports are more common in group-design studies (e.g., Stevens et al., 2020), the scale of many Tier 2 intervention studies may explain, if not excuse, limited reports of the wider instructional context (Hill et al., 2012). Given the relatively small number of participants across studies, the limited reports of fidelity and alignment of Tier 1 instruction observed in the current review arguably stem from the conventions of SCRD rather than their scale. As a related example, SCRD researchers have only recently begun to incorporate the perspectives of participants into their work (i.e., social validity; Snodgrass et al., 2018; Thoele & DeAngelo, 2023; Wellons et al., 2024). When collected, participant perspectives are often gathered via rating scales, a practice that is understandable among large samples but potentially reductive given that the median SCRD sample contains no more than four participants (King et al., 2024). Much like the rating scales used to gauge social validity, limited reporting of Tier 1 instruction in SCRD reflects conventions designed for efficiency and may not reveal enough about the conditions under which Tier 2 intervention is effective.

We hesitate to couch the discussion of alignment and fidelity in terms of quality, which can evoke arbitrary checklists that can seem disconnected from the objectives of research (e.g., Harris et al., 2019; Lanovaz & Rapp, 2016). Rather, we echo calls to acknowledge the utility of flexible, context-specific guidelines when conducting SCRD (Ledford et al., 2023). In the case of Tier 2 intervention research specifically, this requires greater attention to Tier 1 instruction. The SCRD studies that provide multiple safeguards related to internal validity, such as randomization and the inclusion of several data points within each experimental phase, no doubt have considerable value. These elements alone, however, do not address whether Tier 2 interventions function as intended within the ecology of MTSS. Changes in priorities of SCRD methods may be warranted as lines of research extend beyond whether a practice can produce changes in outcomes to how these practices function and interact with concurrent school supports (Ledford et al., 2023).

Limitations

This review has several notable limitations. First, none of the studies explicitly involved students with LD. The pattern of reading difficulties associated with participants is nonetheless consistent with recent research regarding students with or at-risk for LD, particularly following the advent of MTSS frameworks that have decreased formal LD diagnoses (Berkeley et al., 2020; King, Wang, Datchuk & Rodgers, 2023). Second, MTSS and reading terms were applied to titles and keywords, yet instruction-related search terms were limited to abstracts. Although this asymmetry may have limited the consistency of the search strategy, it also reduced the likelihood of including irrelevant records. Third, we limited the search to studies published after Hill and colleagues’ (2012) review, which did not include SCRD. While it is possible that a search of this earlier period may have revealed additional eligible studies, the few studies pertaining to RTI between 2004 and 2011 (i.e., the search range of the original review), combined with limited awareness of issues raised by Hill and colleagues, suggest that inclusion of records prior to 2012 would not have meaningfully changed our results. Fourth, we included only studies that clearly occurred within RTI or MTSS frameworks and excluded those in which the presence of such frameworks could not be verified. As studies that did not occur within an instructional framework were unlikely to mention Tier 1 instruction, these omissions avoided creating a skewed picture of the literature.

In addition, we excluded studies that may have occurred within qualifying MTSS frameworks that were not explicitly mentioned by the author. Certain states require the use of RTI or MTSS, and studies conducted within those states would presumably qualify for inclusion. As authors cannot presuppose readers’ exhaustive knowledge of the context beyond what is explicitly disclosed, these exclusions were justified. Given the variance in implementation of MTSS across states, we further recommend that authors explicitly note whether MTSS is employed and enumerate specific components encompassed by the MTSS framework, particularly if they are relevant to literacy.

Implications for Practice

Due to its scale and flexibility, SCRD may be readily integrated into the work of researcher-practitioners who incorporate systematic data collection and intervention implementation into their work (Blampied, 2013; Ninci, 2023). One consequence of more SCRD research involving students with or at-risk for LD and greater attention to Tier 1 instruction is that issues presumably amenable to Tier 2 intervention might be better resolved through changes to core instruction. As such, the collection of fidelity and alignment data is critical for implementing data-based decision-making in both research and practice. The process of observing and modifying Tier 1 instruction to improve student outcomes—rather than immediately prescribing a Tier 2 solution—represents a goal for future SCRD research with the potential to improve the professional judgment of special educators, instructional coaches, and others responsible for coordinating services for students with or at-risk for LD and related learning needs (Baker et al., 2010).

Assessing instructional fidelity and monitoring instruction to support students with significant learning needs present considerable challenges for many schools (e.g., Ruffini et al., 2016). While practitioners would likely benefit from efforts to disseminate flexible tools for monitoring Tier 1 reading instruction (e.g., Iowa Reading Research Center, 2025), there is a need for rigorous, standardized assessment in this area. These tools should also be coupled with guidance on how to improve instructional practice (Cuticelli et al., 2016). Both resource limitations and perceptions of fidelity monitoring can make attention to these contextual factors difficult to maintain (McKenna & Parenti, 2017). Minimizing the aversiveness of observations and increasing buy-in (e.g., Falletta-Cowden & Lewon, 2023) are more salient in a context where the use of specific literacy practices is increasingly compelled by state governments (Barnes & Peltier, 2022; Neuman et al., 2023). Addressing these gaps will depend on helping schools operationalize Tier 1 implementation fidelity in ways that can be measured, supported, and improved in cooperation with practitioners.

Directions for Future Research

We view these findings as less indicative of shortcomings with individual studies or the scope of this review than a series of missed opportunities to address critical questions rooted in long-standing research conventions. The relative absence of SCRD in LD research observed here and elsewhere is significant given the clear potential for this design to address fidelity, alignment, and other issues relevant to students with LD (Peltier et al., 2021). The prohibitive costs of group designs—and the shifting priorities of many funding agencies (Northern & Opp, 2026)—further underscore the role of SCRD and other more feasible designs in addressing the range of questions pertinent to LD. Authors are therefore encouraged to reconsider the role of SCRD and what it can accomplish within LD research. Hott, Flores, et al. (2023) and Hott, Heiniger, et al. (2023) have called for more SCRD and more rigorous methodological reporting in LD research. We extend this call to encompass techniques that capture a broader range of data relevant to Tier 2 interventions and issues central to students with or at-risk for LD (e.g., culturally responsive instruction; Austin et al., 2024). To reduce the likelihood of missing critical connections between instructional tiers within MTSS, future SCRD should incorporate qualitative interviews and systematic observations—approaches that would strengthen its contribution to special education research and provide insight for future intervention implementation in school settings.

In terms of Tier 2 research, SCRD studies establishing what works, and what does not, within well-documented MTSS contexts could influence larger studies attempting to identify intensive interventions that respond to students whose needs extend beyond access to conventionally effective instruction. A first step is to extend the logic of direct observation beyond conditions directly controlled by the researcher to the contextual factors that shape instructional change. This could include incorporating the results of structured observations (e.g., ICE-R) conducted prior to baseline as well as stakeholder reports (e.g., King, Rodgers, & Lemons, 2022). Embedding interviews or contextual observations throughout the research process would align with proposed frameworks for mixed-methods SCRD, providing a richer account of instructional context while maintaining causal precision (Onghena et al., 2019). As simply raising awareness has a poor track record of producing reform, such changes will likely require alterations to how SCRD is taught (Kubina et al., 2021, 2023). These changes coincide with a broader movement to reconceptualize SCRD outcomes (e.g., use of effect sizes; Maggin et al., 2019) and quality in relation to more ambitious research objectives (Lambert et al., 2025; Ledford et al., 2023). Expansive approaches to SCRD characterized by mixed methods will likely require concomitant changes in how studies are interpreted and aggregated (e.g., Flemming & Noyes, 2021; Nye et al., 2016).

Returning to Tier 2 intervention research specifically, we advise authors to recognize the complexity of implementation fidelity when designing and articulating observation procedures (van Dijk et al., 2023). Like Hill and colleagues (2012), we examined whether studies provided sufficient information for readers to determine how fidelity was defined and the extent of alignment between instructional tiers. This approach is suitable for tracking issues that, based on our findings, represent a nascent concern within SCRD involving children with or at-risk for LD. Although our findings do not permit causal inferences, studies demonstrating improved outcomes under aligned Tier 1–Tier 2 conditions (e.g., Stevens et al., 2020) underscore the potential value of this practice. Establishing stronger connections between the dimensions of fidelity observed at Tier 1 and outcomes associated with interventions at Tier 2, however, will require greater attention to how fidelity across tiers is conceptualized and reported in original studies and aggregated in secondary analyses (Bason et al., 2025; van Dijk et al., 2023). We encourage future research teams to plan for fidelity proactively and to broaden the scope of data that can provide insight into the adequacy of intervention delivery.

Footnotes

ORCID iDs

Seth King

Kimberly McFadden

Funding

The authors received the following financial support for the research, authorship, and/or publication of this article: This project was completed with support from the Iowa Department of Education.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Notes

References

Al Otaiba, Allor, Ortiz, Greulich, Wanzek, & Torgesen (2016). Tier 3 primary grade reading interventions: Can we distinguish necessary from sufficient? In S. R. Jimerson, M. K. Burns, & A. M. VanDerHeyden (Eds.), Handbook of response to intervention (pp. 389–404). Springer. https://doi.org/10.1007/978-1-4899-7568-3_23

Al Otaiba, Connor, C. M., Folsom, J. S., Wanzek, Greulich, Schatschneider, & Wagner, R. K. (2014). To wait in Tier 1 or intervene immediately: A randomized experiment examining first-grade response to intervention in reading. Exceptional Children, 81(1), 11–27. https://doi.org/10.1177/0014402914532234

Al Otaiba, Stewart, Van Dijk, Conner, Freudenthal, D. R., Rivas, Yovanoff, & Allor (2025). Comparing Tier 1 reading instruction with Tier 3 or special education intervention through an observational snapshot of school-implemented response to intervention across grades 1–5. Reading and Writing, 38(4), 1129–1151. https://doi.org/10.1007/s11145-024-10534-7

Austin, S. C., McIntosh, Izzard, & Daugherty (2024). A systematic review of single-case research examining culturally responsive behavior interventions in schools. Journal of Behavioral Education, 33(3), 639–666.

Baker, S. K., Fien, & Baker, D. L. (2010). Robust reading instruction in the early grades: Conceptual and practical issues in the integration and evaluation of Tier 1 and Tier 2 instructional supports. Focus on Exceptional Children, 42(9), 1–20. https://doi.org/10.17161/fec.v42i9.6693

Barnes, Z. T., & Peltier, T. K. (2022). Translating the science of reading screening into practice: Policies and their implications. Perspectives on Language and Literacy, 48(1), 42–48. https://www.onlinedigeditions.com/publication/?i=740141

Bason, Balkcom, West, Contesse, V. A., Burns, M. K., Dean, & Kern, L. E. (2025). Reviewing literature on Tier 1 implementation fidelity in foundational reading skills: Trends, measurement methods, and facilitators vs. barriers. Reading & Writing Quarterly, 42(1), 18–35. https://doi.org/10.1080/10573569.2025.2527416

Benner, G. J., Nelson, J. R., Stage, S. A., & Ralston, N. C. (2011). The influence of fidelity of implementation on the reading outcomes of middle school students experiencing reading difficulties. Remedial and Special Education, 32(1), 79–88. https://doi.org/10.1177/0741932510361265

Berkeley, Scanlon, Bailey, T. R., Sutton, J. C., & Sacco, D. M. (2020). A snapshot of RTI implementation a decade later: New picture, same story. Journal of Learning Disabilities, 53(5), 332–342. https://doi.org/10.1177/0022219420915867

Blampied, N. M. (2013). Single-case research designs and the scientist-practitioner ideal in applied psychology. In G. J. Madden, W. V. Dube, T. D. Hackenberg, G. P. Hanley, & K. A. Lattal (Eds.), APA handbook of behavior analysis, Vol. 1. Methods and principles (pp. 177–197). American Psychological Association. https://doi.org/10.1037/13937-008

Boudreaux-Johnson (2015). An evaluation of close reading for fourth grade students receiving Tier 2 responsiveness to intervention services [Doctoral dissertation, Louisiana State University]. LSU Digital Commons. https://digitalcommons.lsu.edu/gradschool_dissertations/3222

*Boudreaux-Johnson, Mooney, & Lastrapes, R. E. (2017). An evaluation of close reading with at-risk fourth-grade students in science content. Journal of At-Risk Issues, 20(1), 27–35. https://eric.ed.gov/?id=EJ1148245

*Boulos, J. M. (2016). Peer-assisted learning strategies for reading skills improvement by children with social, emotional, and behavioral disorders [Doctoral dissertation, Alliant International University]. ProQuest Dissertations and Theses Global.

Brozo, W. G. (2015). RTI and the adolescent reader: Responsive literacy instruction in secondary schools (middle and high school). Teachers College Press.

Burns, Jimerson, VanDerHeyden, & Deno (2016). Toward a unified response-to-intervention model: Multi-tiered systems of support. In Jimerson, Burns, & VanDerHeyden (Eds.), Handbook of response to intervention (pp. 719–732). Springer. https://doi.org/10.1007/978-1-4899-7568-3_41

Capin, Walker, M. A., Vaughn, & Wanzek (2018). Examining how treatment fidelity is supported, measured, and reported in K–3 reading intervention research. Educational Psychology Review, 30(3), 885–919. https://doi.org/10.1007/s10648-017-9429-z

Coyne, M. D., McCoach, D. B., Ware, S. M., Loftus-Rattan, S. M., Baker, D. L., Santoro, L. E., & Oldham, A. C. (2022). Supporting vocabulary development within a multitiered system of support: Evaluating the efficacy of supplementary kindergarten vocabulary intervention. Journal of Educational Psychology, 114(6), 1225. https://doi.org/10.1037/edu0000724

Cuticelli, Collier-Meek, & Coyne (2016). Increasing the quality of Tier 1 reading instruction: Using performance feedback to increase opportunities to respond during implementation of a core reading program. Psychology in the Schools, 53(1), 89–105. https://doi.org/10.1002/pits.21884

Denton, C. A. (2012). Response to intervention for reading difficulties in the primary grades: Some answers and lingering questions. Journal of Learning Disabilities, 45(3), 232–243. https://doi.org/10.1177/0022219412442155

Edmonds, & Briggs, K. L. (2003). The instructional content emphasis instrument: Observations of reading instruction. In Vaughn & K. L. Briggs (Eds.), Reading in the classroom: Systems for the observation of teaching and learning (pp. 31–52). Brookes.

Faggella-Luby, & Wardwell (2011). RTI in a middle school: Findings and practical implications of a Tier 2 reading comprehension study. Learning Disability Quarterly, 34(1), 35–49. https://doi.org/10.1177/073194871103400103

Falletta-Cowden, & Lewon (2023). The fundamental role of social validity in behavioral consultation in school settings. Psychology in the Schools, 60(6), 1918–1935. https://doi.org/10.1002/pits.22841

Fien, Smith, J. L. M., Smolkowski, Baker, S. K., Nelson, N. J., & Chaparro (2015). An examination of the efficacy of a multitiered intervention on early reading outcomes for first grade students at risk for reading difficulties. Journal of Learning Disabilities, 48(6), 602–621. https://doi.org/10.1177/0022219414521664

Flemming, & Noyes (2021). Qualitative evidence synthesis: Where are we at? International Journal of Qualitative Methods, 20, Article 1609406921993276. https://doi.org/10.1177/1609406921993276

Foorman, B. R., Herrera, & Dombek (2018). The relative impact of aligning Tier 2 intervention materials with classroom core reading materials in grades K–2. The Elementary School Journal, 118(3), 477–504. https://doi.org/10.1086/696021

Fuchs, D., & Fuchs, L. S. (2006). Introduction to response to intervention: What, why, and how valid is it? Reading Research Quarterly, 41(1), 93–99. https://www.jstor.org/stable/4151803

Fuchs, D., Fuchs, L. S., & Compton, D. L. (2004). Identifying reading disabilities by responsiveness-to-instruction: Specifying measures and criteria. Learning Disability Quarterly, 27(4), 216–227. https://doi.org/10.2307/1593674

Fuchs, D., Fuchs, L. S., & Stecker, P. M. (2010). The ‘blurring’ of special education in a new continuum of general education placements and services. Exceptional Children, 76(3), 301–323. https://doi.org/10.1177/001440291007600304

*Fuoco, K. S. (2020). Aligning a reading intervention across tiers for students with emotional and/or behavioral disorders in general education settings [Doctoral dissertation, University of Utah]. ProQuest Dissertations and Theses Global.

Furey, W. M., Marcotte, A. M., Wells, C. S., & Hintze, J. M. (2017). The effects of supplemental sentence-level instruction for fourth-grade students identified as struggling writers. Reading & Writing Quarterly, 33(6), 563–578. https://doi.org/10.1080/10573569.2017.1288591

Gage, MacSuga-Gage, & Detrich (2020). Fidelity of implementation in educational research and practice. The Wing Institute. https://www.winginstitute.org/systemsprogram-fidelity

Gersten, Fuchs, L. S., Compton, Coyne, Greenwood, & Innocenti, M. S. (2005). Quality indicators for group experimental and quasi-experimental research in special education. Exceptional Children, 71(2), 149–164. https://doi.org/10.1177/001440290507100202

*Gettinger, Kratochwill, T. R., Eubanks, Foy, & Levin, J. R. (2021). Academic and behavior combined support: Evaluation of an integrated supplemental intervention for early elementary students. Journal of School Psychology, 89, 1–19. https://doi.org/10.1016/j.jsp.2021.09.004

*Gettinger, Kratochwill, T. R., Levin, J. R., Eubanks, & Foy (2024). Academic and behavior combined support: A single-case practice-based replication study. Journal of School Psychology, 104, Article 101307. https://doi.org/10.1016/j.jsp.2024.101307

Gresham, F. M., MacMillan, D. L., Beebe-Frankenberger, M. E., & Bocian, K. M. (2000). Treatment integrity in learning disabilities intervention research: Do we really know how treatments are implemented? Learning Disabilities Research and Practice, 15(4), 198–205. https://doi.org/10.1207/SLDRP1504_4

Gresham, F. M., Sanetti, & Kratochwill (2009). Evolution of the treatment integrity concept: Current status and future directions. School Psychology Review, 38(4), 533–540. https://doi.org/10.1080/2372966X.2009.12527793

Harris, K. R., Stevenson, N. A., & Kauffman, J. M. (2019). Negative effects of minimum requirements for data points in multiple-baseline designs and multiple-probe designs in the What Works Clearinghouse standards handbook, version 4.0. https://cecdr.org/sites/default/files/2021-01/_DR_Position_Statement_5_data_points_WWC_SCD_final.pdf

Hill, D. R., King, S. A., Lemons, C. J., & Partanen, J. N. (2012). Fidelity of implementation and instructional alignment in response to intervention research. Learning Disabilities Research & Practice, 27(3), 116–124. https://doi.org/10.1111/j.1540-5826.2012.00357.x

Hintze, J. M., Wells, C. S., Marcotte, A. M., & Solomon, B. G. (2018). Decision-making accuracy of CBM progress-monitoring data. Journal of Psychoeducational Assessment, 36(1), 74–81. https://doi.org/10.1177/0734282917729263

Hitchcock, J. H., Nastasi, B. K., & Summerville (2010). Single-case designs and qualitative methods: Applying a mixed methods research perspective. Mid-Western Educational Researcher, 23(2), 49–58. https://scholarworks.bgsu.edu/mwer/vol23/iss2/8

Hott, B. L., & Flores, M. M. (2023). Single-case research design: Introduction to the special series. Learning Disability Quarterly, 46(1), 3–5. https://doi.org/10.1177/07319487211040493

Hott, B. L., Flores, M. M., Morano, Randolph, K. M., & Peltier (2023). Reviewing manuscripts reporting findings from single-case research design studies. Learning Disability Quarterly, 46(1), 46–58. https://doi.org/10.1177/07319487221089616

43. Hott, B. L., Heiniger, Justus, Randolph, K. M., Al Shabibi, Beasley, Frank, Mitchell, Tennell, & Wester (2023). Reporting quality of single-case research published in learning disabilities journals. Learning Disabilities Research & Practice, 38(3), 224–238. https://doi.org/10.1111/ldrp.12317

44. Hurtado-Parrado, & López-López (2015). Single-case research methods: History and suitability for a psychological science in need of alternatives. Integrative Psychological and Behavioral Science, 49(3), 323–349.

45. Iowa Reading Research Center. (2025). Measure FIRST (Fidelity of Implementation in Reading Skills Teaching). University of Iowa. https://irrc.education.uiowa.edu/resources/apps-and-tools/measure-first

46. King, Rodgers, & Lemons, C. J. (2022). The effect of supplemental reading instruction on fluency outcomes for children with Down syndrome: A closer look at curriculum-based measures. Exceptional Children, 88(4), 421–441. https://doi.org/10.1177/00144029221081006

47. King, Wang, Nylen, & Enders (2023). Prevalence of research design in special education: A survey of peer-reviewed journals. Remedial and Special Education, 44(6), 480–494. https://doi.org/10.1177/07419325231152453

48. King, Wang, Datchuk, S. M., & Rodgers, D. B. (2023). Meta-analyses of reading intervention studies including students with learning disabilities: A methodological review. Journal of Learning Disabilities, 56(3), 210–224. https://doi.org/10.1177/00222194221077688

49. King, S. A., Lemons, C. J., Davidson, K. A., Fulmer, & Mrachko, A. A. (2022). Reading instruction for children with Down syndrome: Extending research on behavioral phenotype aligned interventions. Exceptionality, 30(2), 92–108. https://doi.org/10.1080/09362835.2020.1749631

50. King, S. A., Nylen, Enders, Wang, & Opeoluwa (2024). Examining the impact of design-comparable effect size on the analysis of single-case design in special education. School Psychology, 39(6), 601–612. https://doi.org/10.1037/spq0000628

51. Kretlow, A. G., & Bartholomew, C. C. (2010). Using coaching to improve the fidelity of evidence-based practices: A review of studies. Teacher Education and Special Education, 33(4), 279–299. https://doi.org/10.1177/0888406410371643

52. Kubina, R. M., Jr., King, S. A., Halkowski, Quigley, & Kettering (2023). Slope identification and decision making: A comparison of linear and ratio graphs. Behavior Modification, 47(3), 615–643. https://doi.org/10.1177/01454455221130002

53. Kubina, R. M., Jr., Kostewicz, D. E., King, S. A., Brennan, K. M., Wertalik, Rizzo, & Markelz (2021). Standards of graph construction in special education research: A review of their use and relevance. Education and Treatment of Children, 44(4), 275–290. https://doi.org/10.1007/s43494-021-00053-3

54. *Kuhn, & Albers, C. A. (2022). Early literacy intervention for culturally and linguistically diverse students with varying English language proficiency levels. Journal of Applied School Psychology, 38(4), 283–315. https://doi.org/10.1080/15377903.2021.1953660

55. Kuhn, J. L. (2017). Early literacy intervention for culturally and linguistically diverse students across English language proficiency levels (Order No. 10622192) [Doctoral dissertation, University of Wisconsin–Madison]. ProQuest Dissertations & Theses Global. https://www.proquest.com/dissertations-theses/early-literacy-intervention-culturally/docview/1954046420/se-2

56. Lambert, Fettig, Ledford, et al. (2025). Improving the process and product of intensive intervention through formative triangulation (Version 1) [Preprint]. Research Square. https://doi.org/10.21203/rs.3.rs-7005035/v1

57. Lanovaz, M. J., & Rapp, J. T. (2016). Using single-case experiments to support evidence-based decisions: How much is enough? Behavior Modification, 40(3), 377–395. https://doi.org/10.1177/0145445515613584

58. Ledford, J. R., & Gast, D. L. (Eds.). (2024). Single case research methodology: Applications in special education and behavioral sciences (4th ed.). Routledge.

59. Ledford, J. R., Lambert, J. M., Pustejovsky, J. E., Zimmerman, K. N., Hollins, & Barton, E. E. (2023). Single-case-design research in special education: Next-generation guidelines and considerations. Exceptional Children, 89(4), 379–396. https://doi.org/10.1177/00144029221137656

60. Loftus, S. M., Coyne, M. D., McCoach, D. B., Zipoli, & Pullen, P. C. (2010). Effects of supplemental vocabulary intervention on the word knowledge of kindergarten students at risk for language and literacy difficulties. Learning Disabilities Research & Practice, 25(3), 124–136. https://doi.org/10.1111/j.1540-5826.2010.00310.x

61. Maggin, D. M., Cook, B. G., & Cook (2019). Making sense of single-case design effect sizes. Learning Disabilities Research & Practice, 34(3), 124–132. https://doi.org/10.1111/ldrp.12204

62. McKenna, & Parenti (2017). Fidelity assessment to improve teacher instruction and school decision making. Journal of Applied School Psychology, 33(4), 331–346. https://doi.org/10.1080/15377903.2017.1316334

63. Miciak, & Fletcher, J. M. (2020). The critical role of instructional response for identifying dyslexia and other learning disabilities. Journal of Learning Disabilities, 53(5), 343–353. https://doi.org/10.1177/0022219420906801

64. National Information Standards Organization. (2008). Journal article versions (JAV): Recommendations of the NISO/ALPSP JAV technical working group. https://www.niso.org/publications/niso-rp-8-2008-jav

65. Neuman, S. B., Quintero, & Reist (2023). Reading reform across America: A survey of state legislation. Albert Shanker Institute. https://www.shankerinstitute.org/read

66. Ninci (2023). Single-case data analysis: A practitioner guide for accurate and reliable decisions. Behavior Modification, 47(6), 1455–1481. https://doi.org/10.1177/0145445519867054

67. Northern, A. M., & Opp (2026). Reimagining the Institute of Education Sciences: A strategy for relevance and renewal. Institute of Education Sciences, U.S. Department of Education. https://ies.ed.gov/ies/2026/02/reimagining-ies

68. Nye, Melendez-Torres, G. J., & Bonell (2016). Origins, methods and advances in qualitative meta-synthesis. Review of Education, 4(1), 57–79. https://doi.org/10.1002/rev3.3065

69. O’Donnell (2008). Defining, conceptualizing, and measuring fidelity of implementation and its relationship to outcomes in K–12 curriculum intervention research. Review of Educational Research, 78(1), 33–84. https://doi.org/10.3102/0034654307313793

70. O’Connor, R. E., Fulmer, Harty, K. R., & Bell, K. M. (2005). Layers of reading intervention in kindergarten through third grade: Changes in teaching and student outcomes. Journal of Learning Disabilities, 38(5), 440–455. https://doi.org/10.1177/00222194050380050701

71. *O’Keeffe, B. V., Slocum, T. A., & Magnusson (2013). The effects of a fluency training package on paraprofessionals’ presentation of a reading intervention. The Journal of Special Education, 47(1), 14–27. https://doi.org/10.1177/0022466911404072

72. Onghena, Maes, & Heyvaert (2019). Mixed methods single case research: State of the art and future directions. Journal of Mixed Methods Research, 13(4), 461–480. https://doi.org/10.1177/1558689818789530

73. Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, Hoffmann, T. C., Mulrow, C. D., Shamseer, Tetzlaff, J. M., Akl, E. A., Brennan, S. E., Chou, Glanville, Grimshaw, J. M., Hróbjartsson, Lalu, M. M., Loder, E. W., Mayo-Wilson, McDonald, . . . Moher (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ, 372, Article 71. https://doi.org/10.1136/bmj.n71

74. Peltier, Morano, Shin, Stevenson, & McKenna, J. W. (2021). A decade review of single-case graph construction in the field of learning disabilities. Learning Disabilities Research & Practice, 36(2), 121–135. https://doi.org/10.1111/ldrp.12245

75. Peng, Fuchs, L. S., Cho, Elleman, A. M., Kearns, D. M., Patton, & Compton, D. L. (2020). Is “response/no response” too simple a notion for RTI frameworks? Exploring multiple response types with latent profile analysis. Journal of Learning Disabilities, 53(6), 454–468. https://doi.org/10.1177/0022219420931818

76. Rila, King, S. A., Bruhn, A. L., & Estrapala (2025). Demographic reporting of students and implementation teams in school-based single-case research for students with emotional and behavioral disorders: A systematic review. Journal of Positive Behavior Interventions, 27(4), 271–285. https://doi.org/10.1177/10983007251335369

77. Ruffini, S. J., Lindsay, McInerney, Waite, & Miskell (2016). Measuring the implementation fidelity of the response to intervention framework in Milwaukee Public Schools (REL 2017–192). Institute of Education Sciences, U.S. Department of Education.

78. Sanetti, L. M. H., & Collier-Meek, M. A. (2019). Supporting successful interventions in schools: Tools to plan, evaluate, and sustain effective implementation. Guilford Press.

79. Sanetti, L. M. H., & Luh, H. J. (2019). Fidelity of implementation in the field of learning disabilities. Learning Disability Quarterly, 42(4), 204–216. https://doi.org/10.1177/0731948719851514

80. Scanlon, D. M., Gelzheiser, L. M., Vellutino, F. R., Schatschneider, & Sweeney, J. M. (2008). Reducing the incidence of early reading difficulties: Professional development for classroom teachers versus direct interventions for children. Learning and Individual Differences, 18(3), 346–359. https://doi.org/10.1016/j.lindif.2008.05.002

81. Smith, J. L. M., Nelson, N. J., Fien, Smolkowski, Kosty, & Baker, S. K. (2016). Examining the efficacy of a multitiered intervention for at-risk readers in grade 1. The Elementary School Journal, 116(4), 549–573. https://doi.org/10.1086/686249

82. Snodgrass, M. R., Chung, M. Y., Meadan, & Halle, J. W. (2018). Social validity in single-case research: A systematic literature review of prevalence and application. Research in Developmental Disabilities, 74, 160–173. https://doi.org/10.1016/j.ridd.2018.01.007

83. *Spencer, T. D., Kirby, M. S., & Petersen, D. B. (2024). Vocabulary instruction embedded in narrative intervention: A repeated acquisition design study with first graders at risk of language-based reading difficulty. American Journal of Speech-Language Pathology, 33(1), 135–152. https://doi.org/10.1044/2023_AJSLP-23-00004

84. Stevens, E. A., Stewart, Vaughn, Lee, Y. R., Scammacca, & Swanson (2024). The effects of a Tier 2 reading comprehension intervention aligned to Tier 1 instruction for fourth graders with inattention and reading difficulties. Journal of School Psychology, 105, 101320. https://doi.org/10.1016/j.jsp.2024.101320

85. Stevens, E. A., Vaughn, Swanson, & Scammacca (2020). Examining the effects of a Tier 2 reading comprehension intervention aligned to Tier 1 instruction for fourth-grade struggling readers. Exceptional Children, 86(4), 430–448. https://doi.org/10.1177/0014402919893710

86. Swanson, Wanzek, Haring, Ciullo, & McCulley (2013). Intervention fidelity in special and general education research journals. The Journal of Special Education, 47(1), 3–13. https://doi.org/10.1177/0022466911419516

87. Thoele, J. M., & DeAngelo (2023). An examination of social validity for students with emotional behavioral disorders: Has progress been made? Education and Treatment of Children, 46(4), 279–302. https://doi.org/10.1007/s43494-023-00109-6

88. *Thornton (2012). The neurological impress method as a reading intervention for students with emotional behavior disabilities [Doctoral dissertation, Northern Arizona University]. ProQuest Dissertations and Theses Global.

89. Travers, J. C., Tincani, Dowdy, Forbes, & Johnson, J. V. (2026). A registered report on selective reporting in single-case experimental research. Remedial and Special Education, 47(1), 36–52. https://doi.org/10.1177/07419325251350697

90. van Dijk, Lane, H. B., & Gage, N. A. (2023). How do intervention studies measure the relation between implementation fidelity and students’ reading outcomes? A systematic review. The Elementary School Journal, 124(1), 56–84. https://doi.org/10.1086/725672

91. Wanzek, Petscher, Otaiba, S. A., Rivas, B. K., Jones, F. G., Kent, S. C., Schatschneider, & Mehta (2017). Effects of a year long supplemental reading intervention for students with reading difficulties in fourth grade. Journal of Educational Psychology, 109(8), 1103–1119. https://doi.org/10.1037/edu0000184

92. Wanzek, Vaughn, Scammacca, Gatlin, Walker, M. A., & Capin (2016). Meta-analyses of the effects of Tier 2 type reading interventions in grades K-3. Educational Psychology Review, 28(3), 551–576. https://doi.org/10.1007/s10648-015-9321-7

93. Wellons, Q. D., Roach, A. T., & Sanchez-Alvarez (2024). Is social validity an afterthought in single-case design studies in school psychology research? Contemporary School Psychology, 28(4), 454–468. https://doi.org/10.1007/s40688-023-00460-w

94. Young, M. K., King, S. A., Datchuk, S. M., McFadden, & Yoon (in press). Fidelity of implementation and instructional alignment in Tier 2 interventions: An updated review of experimental studies. Learning Disabilities Research & Practice.