Abstract
This article synthesizes observation studies investigating reading instruction for students with learning disabilities (LD) in Grades K–12. A systematic search of the literature between 1980 and 2014 resulted in the identification of 25 studies. In addition to replicating and extending E. A. Swanson’s synthesis, this review analyzed the research questions of studies from 1980 to 2014 for trends and gaps in the research. Findings related to both E. A. Swanson’s replicated questions and several new research questions revealed that (a) only four observation studies met inclusion criteria between 2006 and 2014, (b) greater detail in observation data related to five critical components of reading was reported in studies since 2005, (c) the most frequently used grouping structure was whole-group instruction, and (d) the research questions and purposes of observation studies tend to be related to examining prevailing practices following legislative reform.
Observation studies have been used for decades to examine behaviors of interest and uncover embedded problems in natural settings across a myriad of disciplines ranging from child socialization (e.g., Lytton, 1971) to nursing (e.g., Atwal & Caldwell, 2005). In special education, observation studies have generated evidence regarding the quality of and extent to which empirically validated practices, legislation, and reform align with prevailing practice. Legislative change and policy reform have purported to influence the quality of education for students with disabilities (SWDs) since the establishment of specialized education through the enactment of the Education for All Handicapped Children Act of 1975 and the Individuals With Disabilities Education Improvement Act (IDEIA; 2004).
Over time, increasing numbers of SWDs were provided access to appropriate educational support, including greater access to the general education classroom. The U.S. Department of Education reported that only half of SWDs spent 80% or more of their day in general education classrooms in 2002 (U.S. Department of Education, Office of Special Education Programs, 2011). By 2011, 60% of SWDs spent 80% of their day in the general education classroom (U.S. Department of Education, Office of Special Education Programs, 2011). The increase in SWDs receiving the majority of their education in general education classrooms quelled concerns about segregated special education practices; however, it simultaneously raised questions about the extent to which this population receives appropriate services in general education classrooms. Researchers raised questions about the academic outcomes of students with learning disabilities (LD) educated in general versus special education settings (e.g., resource rooms). Observation studies provide a method for addressing policy to practice questions, determining the quality of instruction SWDs receive across contexts.
Observation study data describe prevailing practices and the extent to which they align with the growing research base on effective instructional practices. Several consensus reports contribute to this robust research base, providing evidence on the nature of effective reading instruction (e.g., National Reading Panel, 2000; Rand Reading Study Group, 2002; Snow, Burns, & Griffin, 1998). For example, direct instruction, strategy instruction, or a combination of the two are associated with the highest effect sizes in reading comprehension for students with LD (Edmonds et al., 2009; H. L. Swanson, 2001; H. L. Swanson & Hoskyn, 1998). Explicit, systematic instruction has also been recommended due to its association with improved reading outcomes (e.g., H. L. Swanson, 1999).
More is known about the components of effective reading instruction than ever before, but less is known about the extent to which these practices are implemented in classrooms. Unfortunately, the evidence afforded by reading outcome measures, such as the National Assessment of Educational Progress (NAEP), indicates there is room for improvement. For example, between 2011 and 2013, NAEP reading scores for Grade 4 students without disabilities increased from 225 to 227, whereas the scores for SWDs declined from 186 to 184 (National Center for Educational Statistics, NAEP, 2013).
Observation studies have reported the quality of and extent to which prevailing practices align with empirically validated practices (e.g., McKenna, Shin, & Ciullo, 2015; E. A. Swanson, 2008; Vaughn, Levy, Coleman, & Bos, 2002). Vaughn et al. (2002) synthesized 16 observation studies in reading for students with LD and emotional/behavioral disorders (EBD) between 1975 and 2000. Results indicated that although a substantial amount of time was designated for reading instruction, students spent a limited amount of time actually reading (i.e., 6–10 min per day of silent reading and 3–13 min per day of oral reading). Teachers’ primary method for comprehension instruction was reading aloud to students and asking questions. Students with LD and EBD spent more than half of the allocated reading time completing independent seatwork in general and special education settings. The authors found the overall quality of reading instruction was low.
E. A. Swanson (2008) synthesized reading instruction observation studies for students with LD (i.e., elementary, middle, and high school students) between 1980 and 2005. Similar to the Vaughn et al. (2002) findings, the results indicated reading instruction included little to no explicit instruction in phonics or comprehension strategies (E. A. Swanson, 2008). Students with LD spent very little time reading, and whole-class instruction was the most frequently used grouping procedure across settings (i.e., inclusion, resource, and general education), regardless of group size.
McKenna et al. (2015) conducted a synthesis of math and reading observation research for students with LD (i.e., elementary, middle, and high school students) from 2000 to 2013. Despite expert panel recommendations and policy reforms to implement multi-tiered systems of support (e.g., response to intervention [RTI]), results showed cognitive strategy instruction, differentiation, and opportunities for independent practice occurred infrequently. For example, only one study (Orosco & O’Connor, 2014) out of 11 reported explicit instruction as a consistent component of instruction. There was evidence of effective instructional practices in the areas of vocabulary (e.g., explicit and direct instruction) and fluency (e.g., repeated readings). Consistent with Vaughn et al. (2002) and E. A. Swanson (2008), comprehension strategy instruction was infrequently observed across settings for students. McKenna et al. (2015) sought to extend E. A. Swanson’s (2008) synthesis to report on the current state of observational research by adding observation studies of mathematics instruction for students with LD. Unlike E. A. Swanson’s (2008) synthesis and the synthesis presented in this article, McKenna et al. (2015) did not require a formal observation tool as an inclusion criterion, describe training procedures in studies, report on student academic achievement data, or, as this article uniquely does, examine the evolution of research questions and purposes over time in reading observation studies for students with LD.
McKenna et al.’s (2015) synthesis, though published recently, applied inclusion criteria that did not align with E. A. Swanson’s (2008) criteria (i.e., a formal observation tool was not required, instruction could be in reading or math, and unpublished dissertations, which help address publication bias, were not included). The purpose of this synthesis was to replicate and extend E. A. Swanson’s (2008) synthesis of observation studies identified through 2005 to enhance understanding of reading instruction for students with LD and aggregate the findings across studies. In contrast to E. A. Swanson’s (2008) or McKenna et al.’s (2015) syntheses, this systematic review seeks to examine the trends in the purposes and research questions set forth by observation study teams. This synthesis also seeks to describe the degree to which the number and quality of observation studies conducted have been utilized to reflect on the impact of reform and legislative efforts. The following research questions replicate E. A. Swanson’s (2008) synthesis:
This synthesis extends Swanson’s work and uniquely contributes to the literature by highlighting trends in the purposes and research questions of observation studies from 1980 to 2014. The following additional questions are addressed in this synthesis:
Method
Cooper’s (2010) procedures were implemented to identify the corpus of studies for this synthesis. There were six criteria used to select studies. The selected studies (a) were published in English in a peer-reviewed journal or unpublished dissertation between 1980 and 2014, (b) were conducted in Grades K–12, (c) were observation studies conducted during reading instruction in either general education or special education resource room settings, (d) included disaggregated data pertaining to reading instruction if observation data included more than reading instruction, (e) used formal observation tools to observe reading instruction, and (f) included at least one participant identified with a learning disability. Studies were excluded if observations were conducted within a broader study to determine the effectiveness of interventions.
Literature Search
The comprehensive literature search was conducted using a number of steps. First, an electronic search was conducted of the ERIC, PsycINFO, and Education Source databases between 1980 and 2014. This range was selected as a way to replicate and extend E. A. Swanson’s (2008) observation synthesis. Key search terms or roots were used in a number of combinations to identify the greatest number of articles possible (i.e., observation, read*, disability*, disorder, difficult*, dyslexia, learning problems, remedial, at risk, struggling, brain dysfunction, resource room programs, resource teachers, special education, and special needs). In addition, the ProQuest Dissertation and Theses database was searched using similar combinations of the key words listed above. Second, a 2013 and 2014 hand search was conducted in six major journals commonly reporting research for students with LD (i.e., Exceptional Children, Journal of Learning Disabilities, The Journal of Special Education, Learning Disability Quarterly, Scientific Studies of Reading, and Annals of Dyslexia). The hand search ceased after 2 years because no additional studies were identified. Third, reference lists of identified observation studies that fit the criteria were searched (i.e., “footnote chasing”; White, 1994).
The initial search yielded 1,935 abstracts. The study details of 116 articles were examined after duplicates were eliminated and abstracts were reviewed. A total of 19 studies from the initial electronic search met criteria, and two additional studies were located during reference chasing. In addition to the three unpublished dissertations that met criteria for inclusion in E. A. Swanson’s (2008) synthesis, one additional unpublished dissertation was included (Ko, 2012). This additional dissertation resulted after reviewing the abstracts of 71 dissertations and the full text of eight. Therefore, a total of 25 studies are included in this synthesis (i.e., four studies extend the corpus of E. A. Swanson’s, 2008, review).
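The reported counts can be tallied as a quick consistency check. The sketch below is illustrative only: the dictionary labels are hypothetical, and the grouping of counts reflects one reading of the paragraph above.

```python
# Illustrative tally of the study counts reported for the literature search.
# The labels are hypothetical; the numbers are taken from the paragraph above.
corpus_sources = {
    "electronic_search": 19,           # studies from the database search that met criteria
    "reference_chasing": 2,            # studies located through reference lists
    "dissertations_swanson_2008": 3,   # unpublished dissertations in the earlier synthesis
    "dissertations_new": 1,            # Ko (2012)
}

total_studies = sum(corpus_sources.values())
print(total_studies)  # 25
```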
Coding Procedures
Coding procedures were employed to organize and extract relevant information from each of the studies that met criteria (N = 25). The lead author created a code sheet for this synthesis based on components specified in the What Works Clearinghouse Design and Implementation Assessment Devices (Institute of Education Sciences [IES], 2011) guidelines, as well as the elements described in E. A. Swanson’s (2008) synthesis coding procedures. Data were collected on (a) student and teacher participants, (b) reported components of effective reading instruction, (c) instructional grouping practices, (d) observation measures, (e) observer training and reliability procedures, (f) measures of student academic achievement, and (g) research questions/study purposes. The code sheet consisted of a combination of forced-choice items (e.g., grade level) and open-ended items (e.g., observer training).
Interrater Reliability
The authors double coded all studies (N = 25) after training on the use of the code sheet and the establishment of interrater reliability. Each of the two coders independently coded a single article to establish interrater reliability, and responses were used to calculate the percent agreement. An interrater reliability of .96 was achieved. Next, the two raters independently coded each study. Discrepancies were resolved by consensus in meetings to discuss the coding.
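Percent agreement of the kind described above can be computed as the share of code-sheet items on which two coders assigned the same code. The following is a minimal sketch under stated assumptions: the function name, the item codes, and the number of items are hypothetical and are not the data or procedure used in this synthesis.

```python
def percent_agreement(coder_a, coder_b):
    """Return the proportion of items on which two coders gave the same code."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must rate the same items.")
    if not coder_a:
        raise ValueError("At least one coded item is required.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


# Hypothetical codes for 25 code-sheet items from a single article (not study data).
coder_a = ["K-5"] * 10 + ["resource"] * 10 + ["whole group"] * 5
coder_b = ["K-5"] * 10 + ["resource"] * 10 + ["whole group"] * 4 + ["small group"]

agreement = percent_agreement(coder_a, coder_b)
print(f"{agreement:.2f}")  # 0.96 (24 of 25 items match)
```

Simple percent agreement does not correct for agreement expected by chance, which is one reason some researchers report it alongside other indices.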
Data Analysis
Analysis of the data aggregated via the code sheets was conducted to reveal findings from the corpus of studies and identify potential areas for future investigations and observation studies. In this synthesis, observational studies were included that described reading instruction for students with LD across settings and grade levels. It was not possible to conduct meta-analytic calculations for this synthesis because studies were excluded if observations were conducted to determine the effectiveness of interventions proposed by researchers.
A within and across study analysis (similar to Vaughn et al., 2002) was used to identify findings related to the corpus of research questions. Research questions and study purposes were analyzed across all studies from 1980 to 2014, which is a unique contribution of the present synthesis. Specifically, research questions and purposes were coded for key words and phrases and to identify convergent and divergent content in the questions and purposes. Data reported in the observation studies analyzed for this synthesis are categorized based on student academic achievement trends, training and interrater reliability, instructional practices and features across settings, overall quality of studies, and an analysis of research questions and study purposes over time.
Results
Four studies met the inclusion criteria for this article in addition to the 21 studies also identified for E. A. Swanson’s (2008) synthesis. Of the entire corpus of studies (N = 25), four studies included full school day observations, 13 were conducted in general education settings, 15 included resource room settings, and one included both resource room and inclusive settings for observations. Standardized measures of student achievement were included in seven of the studies, with one including a curriculum-based measure of oral reading fluency (Zigmond & Baker, 1994). Only one study from the extension (i.e., studies identified after 2005; E. A. Swanson & Vaughn, 2010) included student achievement data. Basic study information for the additional four studies extending E. A. Swanson’s (2008) synthesis is summarized in Table 1. Study observation procedures and findings are reported in Table 2 (see Tables 1 and 2 in E. A. Swanson, 2008, for basic study information and findings regarding the 21 studies in the previous synthesis).
Table 1. Basic Study Information.
Note. LD = learning disabilities; sp. ed. = special education; gen. ed. = general education; T = teacher; S = student; RISE = Reading Instruction in Special Education Observation Instrument; NR = not reported in the study; ICE-R = Instructional Content Emphasis–Revised.
a Doctoral dissertation.
Table 2. Observation Procedures and Results/Findings.
Note. sp. ed. = special education; WJ III = Woodcock-Johnson III; DIBELS = Dynamic Indicators of Basic Early Literacy Skills; LD = learning disabilities; T = teacher; S = student.
a Unpublished doctoral dissertation. b Study reports data from school day. Only reading data is reported.
Student Academic Achievement
In addition to the six studies from the 2008 synthesis, one study presented in this extension included a measure of academic achievement (E. A. Swanson & Vaughn, 2010). E. A. Swanson and Vaughn (2010) reported results on pre- to posttest differences on the Woodcock-Johnson III (WJ III; Woodcock, McGrew, & Mather, 2001) letter–word identification (LWI), word attack (WA), and passage comprehension (PC) subtests. Average student scores on LWI and PC were more than 1 standard deviation (SD) below the normative mean at pretest, and the scores remained at least 1 SD below the normative mean at posttest. E. A. Swanson and Vaughn also reported results from the Dynamic Indicators of Basic Early Literacy Skills Oral Reading Fluency (DIBELS ORF; Good, Simmons, Kame’enui, Kaminski, & Wallin, 2002), which was administered at 6 points over the course of the observation study semester. Students gained an average of 0.37 words per week on passages one grade level below their assigned grade.
Quality of Studies
Observer training
Because observation studies inform practice in a number of fields, reducing observer bias is important, and observer training is one way to do so. For example, Hartmann and Wood (1990) recommended the use of an observation manual, practice sessions with observers, retraining, and debriefing following the study. The majority (n = 10) of the studies that met inclusion criteria for E. A. Swanson’s (2008) synthesis reported on training, practice sessions, and retraining. Four studies implemented initial training and practice sessions, but none of the studies reported debriefing. Of the four additional studies identified in this article, each reported on the training processes implemented. Klingner, Urbach, Golos, Brownell, and Menon (2010) reported across-site training via videotape viewings and debriefing, as well as the development of a training protocol. “Anchor persons” were established to train observers to reliability before they were sent into the field. An additional observer was added for 18% of the lessons. E. A. Swanson, Solis, Ciullo, and McKenna (2012) conducted training with an overview of the study purpose and observation tool (Instructional Content Emphasis–Revised; ICE-R; Edmonds & Briggs, 2003) and practice sessions. Additional training was implemented in an attempt to prevent observer drift. E. A. Swanson and Vaughn (2010) provided 6 hr of training using the observation tool manual (i.e., ICE-R) combined with practice using scenarios and videos. Additional training was provided to avoid drift. Ko (2012) was the sole observer for her investigation. However, prior to implementing the researcher-developed observation tool, she piloted the observation instrument with a trained reviewer.
Interrater reliability
Establishing interrater reliability aids in determining whether the observation data collected reliably and accurately reflect the observed activities and behavior consistently across observers (Hintze, 2005). Of the five most popular ways to establish reliability (see Suen, Ary, & Ary, 1986), interrater percentage agreements were established in the identified observation studies using the gold standard method (Gwet, 2001) or a percent agreement index (Hintze, 2005) to establish the extent to which two or more observers agree. Four studies in E. A. Swanson’s (2008) synthesis used the gold standard method to establish interrater reliability, in which an expert established observation codes and observer scores were compared and calculated against the expert codes. Among the four studies in this extension, two studies used the gold standard method (E. A. Swanson et al., 2012; E. A. Swanson & Vaughn, 2010). E. A. Swanson et al.’s (2012) team began observations once agreement of 90% was established across all observers. E. A. Swanson and Vaughn (2010) established interrater agreement at two points: observers reached 100% agreement before data collection began and 90% agreement halfway through the observations. The other method for establishing interrater reliability is consistency between raters. Twelve studies used interrater percent agreement in E. A. Swanson’s (2008) synthesis, and two of the extension studies used interrater percent agreement (Klingner et al., 2010; Ko, 2012). Ko (2012) reached 88% interobserver agreement with an external reviewer while piloting the observation instrument to establish reliability and refine the tool before starting observations. Initial interrater scores for Klingner et al.’s (2010) observers were below the 75% threshold, so a training tape was developed. Observers were trained to 80% reliability before going into the field for observations.
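The gold standard method described above compares each observer's codes with codes set by an expert, rather than comparing observers with one another. The sketch below illustrates that comparison with a 90% criterion; the function names, observer labels, and coded intervals are hypothetical and are not drawn from any of the reviewed studies.

```python
def agreement_with_expert(expert_codes, observer_codes):
    """Proportion of coded intervals where the observer matches the expert."""
    if len(expert_codes) != len(observer_codes) or not expert_codes:
        raise ValueError("Expert and observer must code the same non-empty set of intervals.")
    matches = sum(e == o for e, o in zip(expert_codes, observer_codes))
    return matches / len(expert_codes)


def observers_meeting_criterion(expert_codes, codes_by_observer, threshold=0.90):
    """Map each observer to True if agreement with the expert meets the threshold."""
    return {
        name: agreement_with_expert(expert_codes, codes) >= threshold
        for name, codes in codes_by_observer.items()
    }


# Hypothetical expert codes for 10 observation intervals (not study data).
expert = ["comprehension", "phonics", "fluency", "vocabulary", "comprehension",
          "phonics", "comprehension", "fluency", "vocabulary", "comprehension"]

observers = {
    "observer_1": list(expert),                                     # 10 of 10 match
    "observer_2": expert[:8] + ["phonics", "comprehension"],        # 9 of 10 match
    "observer_3": expert[:7] + ["phonics", "phonics", "phonics"],   # 7 of 10 match
}

ready = observers_meeting_criterion(expert, observers)
print(ready)  # {'observer_1': True, 'observer_2': True, 'observer_3': False}
```

Under this illustration, an observer who falls below the criterion would receive additional training before entering the field, which parallels the retraining reported by Klingner et al. (2010).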
In contrast to the 33% of studies that did not report on criteria for establishing interrater reliability in E. A. Swanson’s (2008) synthesis, all studies in this extension (Klingner et al., 2010; Ko, 2012; E. A. Swanson et al., 2012; E. A. Swanson & Vaughn, 2010) reported on interrater reliability criteria as described above.
Number of observations
The number of observations reported in the investigations in the E. A. Swanson (2008) synthesis varied from one (three of which were full school day observations) to 33 (Gelzheiser & Meyers, 1991). The number of observations in studies conducted since 2005 (i.e., four studies in this extension) ranged from 30 to 149 (see Table 1).
Instructional Practices and Features Across Settings and Grade Levels
Grade levels and settings
The studies identified by E. A. Swanson (2008) included only two studies with high school students and three with seventh- and eighth-grade students. The rest of the observations were conducted in elementary grades. Similarly, only one of the studies located since 2005 was conducted in a secondary setting (i.e., ninth grade; Ko, 2012). The remaining three studies were conducted in the elementary grades (i.e., second to fifth). The four studies were conducted in Tier 2 or Tier 3 intensified instructional settings (E. A. Swanson et al., 2012), special education settings (Ko, 2012; E. A. Swanson & Vaughn, 2010), or a combination of resource rooms and inclusive settings (Klingner et al., 2010).
Components of reading instruction
Few studies reported disaggregated data regarding specific instruction in each of the five components of reading (i.e., phonics, phonemic awareness [PA], comprehension, fluency, and vocabulary), limiting the extent to which previous observation study syntheses (e.g., E. A. Swanson, 2008; Vaughn et al., 2002) examined the amount of time spent on each component. For example, only six of the 21 studies from E. A. Swanson’s (2008) synthesis reported word study or phonics instruction, four studies reported evidence of comprehension instruction, and three noted vocabulary and fluency instruction. In the extension studies, E. A. Swanson et al. (2012) reported the amount of time and type of instruction in each component. Of the five components, comprehension was most frequently observed (42% of the total 6,096 min of reading instruction observed), followed by word study/phonics (22%), vocabulary (11%), fluency (8%), and phonological awareness (2%). The majority of the comprehension exercises (53% of total comprehension minutes) involved students responding to questions from the teacher or on a worksheet after reading a passage. Comprehension strategy instruction comprised 26% of the time, and the remaining observed time was divided between activating prior knowledge and making predictions. The word study activities (22%) involved applying letter–sound correspondences in reading and writing activities. Vocabulary instruction consisted of learning or practicing definitions (27%), followed by morphology study (21%) and context clue strategy instruction (21%). Repeated reading of connected text was the most prevalent fluency activity (66%), followed by silent reading fluency (24%) and letter–sound naming fluency (3%).
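To give a sense of scale, the component percentages reported by E. A. Swanson et al. (2012) can be converted into approximate minutes of the 6,096 min observed. This is an illustrative calculation only: the variable names are hypothetical, and because the published percentages are rounded, the resulting minutes are approximations rather than values reported in the study.

```python
# Approximate minutes per reading component, derived from the rounded
# percentages reported above. Variable names are hypothetical.
TOTAL_MINUTES = 6096

component_percent = {
    "comprehension": 42,
    "word study/phonics": 22,
    "vocabulary": 11,
    "fluency": 8,
    "phonological awareness": 2,
}

# Rounded to whole minutes; these are estimates, not reported values.
approx_minutes = {
    component: round(TOTAL_MINUTES * pct / 100)
    for component, pct in component_percent.items()
}

# Share of observed time not attributed to the five components in the paragraph above.
unaccounted_percent = 100 - sum(component_percent.values())

for component, minutes in approx_minutes.items():
    print(f"{component}: about {minutes} min")
print(f"Not attributed to the five components: {unaccounted_percent}%")
```

The five components sum to 85% of the observed time; the paragraph above does not describe how the remaining 15% was spent.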
E. A. Swanson and Vaughn (2010) reported on reading instruction in five critical areas of reading by dividing each component into subcomponents (e.g., PA contained rhyming, blending, or segmenting tasks). Teachers taught PA for 2.8% of the total observation time. Although the quality of PA instruction varied, 40% was coded as “low average” or “weak.” Comprehension activities totaled 25.6% of the instruction observed, with teacher quality rated as “high average” (23%) or “excellent” (46.5%). Phonics/word study and fluency comprised 31.96% and 8.9% of the instructional time, respectively; 27.3% of word study was coded as “excellent” and 46% as “high average,” but 25.2% was “weak.” Observers reported vocabulary in 9.6% of the instructional time with an overall “high average” quality rating.
Klingner et al. (2010) examined how, and to what extent, special education teachers taught reading comprehension. Out of the 124 lessons observed, 82 (66%) addressed reading comprehension. However, in 40% of those lessons, minimal comprehension instruction was provided (e.g., teachers frequently asked students about what they were reading or reviewed vocabulary words). Their findings suggest teachers did not promote interactive discussions, rarely instructed their students about text structure, and did not promote metacognition, aside from cueing students to think about what they were reading. Teachers prompted students to use the following strategies in order from most to least frequent: predicting, making connections, and looking back or rereading. The following strategies were observed 7 times or fewer (out of 124 observations): summarizing, finding the main idea, retelling, visualizing, previewing questions, generating questions, and paraphrasing. The two types of instruction observed in reading comprehension were direct explanations and modeling/think alouds. Overall, Klingner and her colleagues (2010) reported that instruction was minimally connected to current research.
Similarly, Ko (2012) investigated reading comprehension instruction in special education classrooms. Her findings indicated that 88% (n = 21) of the 24 lessons incorporated comprehension strategies, practices, or activities. However, lessons did not include explicit instruction in comprehension strategies. The special education teachers most frequently implemented read alouds (i.e., a student or teacher read text aloud), questioning, independent seatwork, activating prior knowledge, and using graphic organizers. When observations were compared with the lesson plans, the teacher often fulfilled the objectives intended for the students (e.g., summarizing for the students and identifying main ideas). Ko (2012) also noted that the comprehension strategies, practices, or activities were used with narrative text in 17 of the 24 lessons, expository text in three of the lessons, and both types in two lessons. Two lessons represented activities such as grammar worksheets and reviewing a PowerPoint presentation.
Instructional grouping
Whole-group instruction was the most frequently reported structure in both general education and special education settings across the 21 studies identified by E. A. Swanson (2008). In this update, Klingner et al. (2010) did not report instructional grouping, but most observations occurred in resource room settings. E. A. Swanson et al. (2012) reported that the class sizes ranged from one to eight students (M = 4), but they did not report on instructional grouping within those classes. E. A. Swanson and Vaughn (2010) reported whole-group instruction as the most common structure (45.8%), then individualized (27.3%), independent (19.8%), small group (4.6%), and pairs (2.6%). A range of one to seven students was observed in the classes. Ko (2012) reported the following grouping structures from the observed lessons (n = 24): 66% whole-group instruction, 23% individual seatwork, and 11% small groups (two to three students). Two of the eight teachers used whole-group instruction exclusively across all three observations.
Analysis of Research Questions and Study Purposes Over Time
Research questions and purposes in the studies shifted focus over time (i.e., 1980–2014; see Figure 1 for a summary of themes in observation study research questions and purposes from 1980–2014; see Table 3 for overview of studies by decade). In the 1980s and early 1990s, research ranged from investigating the nature of reading activities for students with LD (e.g., Leinhardt, Zigmond, & Cooley, 1981) to comparing the educational experience of students with LD in general education, mainstream, and special education settings (Gelzheiser & Meyers, 1991; Haynes & Jenkins, 1986). It was not until the late 1990s that questions about grouping structures (O’Connor & Jenkins, 1996; Vaughn, Moody, & Schumm, 1998) and teacher perceptions and practices (Rieth et al., 2003; Schumm, Moody, & Vaughn, 2000) were addressed. An important finding is that no observational studies conducted between 2006 and 2009 met the inclusion criteria for this synthesis. From 2010 to 2014, the study purposes narrowed as two studies specifically investigated the nature of reading comprehension instruction for students with LD (Klingner et al., 2010; Ko, 2012) and two examined the components of reading instruction (E. A. Swanson et al., 2012; E. A. Swanson & Vaughn, 2010). E. A. Swanson and Vaughn (2010) investigated grouping practices and academic progress. E. A. Swanson et al. (2012) explored special education teacher perceptions of RTI and the extent to which teachers used evidence-based practices; E. A. Swanson et al. (2012) identified six themes referenced in the teacher focus groups and interviews (i.e., benefits of RTI, challenges of RTI, universal screening, progress monitoring, and collaboration among teachers). Regarding RTI, teachers consistently identified early intervention as a benefit and increased paperwork as the most frequently mentioned challenge. Similar to E. A. Swanson et al.’s (2012) study, Ko (2012) supplemented observations with teacher interviews regarding the factors teachers identified as influencing reading comprehension instructional practices, strategies, and activities implemented during the observations. Teachers interviewed in Ko’s (2012) study demonstrated a limited understanding of the components of reading instruction. In addition, teachers’ knowledge and use of Individualized Education Programs (IEPs) to inform reading instruction varied widely, despite the fact that all teachers taught students with reading goals in their IEPs.

Figure 1. Research question content over time and number of studies (1980–2014).
Table 3. Studies by Year (1980–2014).
a Doctoral dissertation. b Included in synthesis extension.
Discussion
This synthesis reports the extent to which empirically validated components of reading instruction have been implemented from 1980 to 2014, describing the nature of reading instruction for students with LD across contexts. This synthesis updated E. A. Swanson’s (2008) observation study synthesis by examining the components of effective reading instruction, trends in student achievement data, and training and interrater reliability procedures. In addition, this systematic review extended E. A. Swanson’s (2008) review by providing an examination of the trends in the purposes and research questions posited by investigators from 1980 to 2014. Key findings related to the targeted research questions were generated: (a) only four observation studies met inclusion criteria between 2006 and 2014, (b) greater specificity and comprehensiveness of observations were reported in studies since 2005 (i.e., studies examined the five components of reading and included descriptive data regarding student engagement and teacher quality), (c) there is a trend toward whole-group instruction, and (d) the research questions and purposes tended to reflect legislation or policy change.
From 2006 to 2014, only four studies met inclusion criteria for this synthesis, despite implementation of key legislation, such as the No Child Left Behind (NCLB) Act of 2001 and the Individuals With Disabilities Education Act (IDEA) reauthorization, as well as expert panel recommendations (e.g., National Reading Panel, 2000; Rand Reading Study Group, 2002) urging evidence-based practices. An examination of the complex realities in classrooms reveals the extent to which research is connected to practice. The lack of reading observation studies for students with LD during this time impedes our understanding of the extent to which expert panel recommendations were implemented. Considering the span of 6 years in which observation studies are not identified, an opportunity may have been missed to examine contextual factors that both support and inhibit widespread implementation of empirically supported practices.
Another key finding is the greater specificity of data in observation studies since 2005. For example, two studies provided a detailed set of data regarding the five components of reading (E. A. Swanson et al., 2012; E. A. Swanson & Vaughn, 2010). In addition, E. A. Swanson and Vaughn (2010) included measures of student engagement and teacher quality of instruction during each of the components. Klingner et al. (2010) and Ko (2012) conducted investigations focused on how special education teachers teach reading comprehension. The level of detail reported expands our understanding of the implementation and quality of evidence-based practices in classroom contexts. Data across the four studies can be aggregated more readily because key variables were operationalized; this was not possible using the studies identified for inclusion by E. A. Swanson (2008). Finally, in contrast to the studies identified for inclusion by E. A. Swanson (2008), all four studies reported interrater reliability. This signals greater consistency in the quality of data obtained (i.e., establishing reliability is one way to reduce observer bias).
Convergent with previous synthesis findings (E. A. Swanson, 2008; Vaughn et al., 2002), the trend toward whole-group instruction was consistent across studies. Many of the observations occurred in special education classrooms with a lower teacher–student ratio than is typically found in general education settings, even when instruction was delivered to the whole group. However, in Ko’s (2012) investigation, observed class sizes ranged from five to 14. In a group of 14 students, it is unlikely that individual students are performing at the same or similar level. It is possible that whole-group instruction offered fewer opportunities for differentiation of reading instruction than small group instruction would have. It is somewhat discouraging to see the trend of whole-group instruction continue, despite evidence that the needs of SWDs are better met with small group instruction (e.g., Elbaum, Vaughn, Hughes, & Moody, 1999). Small groups of three students produce gains similar to those of a one-on-one format, but small groups and individual instruction outperform groups of 10 (Vaughn et al., 2003). Small group instruction is particularly beneficial for SWDs because the reduced student–teacher ratio facilitates response opportunities, teacher modeling, and frequent feedback. Although findings are mixed regarding the ideal intervention group size (Vaughn, Wanzek, Murray, & Roberts, 2012), small group and one-on-one instruction are associated with improved outcomes in the literature for both elementary and secondary students (e.g., Vaughn et al., 2010; Wanzek & Vaughn, 2007).
To better understand the findings from observational data across 34 years, the research questions and study purposes were analyzed. Studies tended to examine practice that reflected changes in legislation. For example, following the Regular Education Initiative (REI) of 1986, observation studies were largely concerned with comparing reading instruction for students with LD in general education or special education settings. This trend continued until 1996 when cooperative learning and other grouping structures were examined, preceding a shift in the late 1990s and 2000s toward the investigation of prevalent instructional practices and teacher perceptions. Although recent studies (e.g., E. A. Swanson et al., 2012; E. A. Swanson & Vaughn, 2010) provide a more fine-grained analysis of instructional practices and the extent to which the five critical components of reading were observed, the dearth of observational data since 2005 precludes the opportunity to analyze data to influence policy and practice. Such data could be used to inform technical assistance, teacher training or coaching, and future legislative and policy decisions. Additional observation studies might illuminate features of classroom contexts in which evidence-based practices are widely implemented. Conversely, they might also reveal factors that inhibit implementation of empirically supported practices and promote the generation of study purposes and hypotheses for investigation.
Research questions and study purposes can be generated based on the trends and features of studies examined in this review. Considering pending educational reforms, such as the current reauthorization of IDEA, it might be useful to conduct more observation studies to extend the descriptive evidence base on the quality and content of reading instruction for students with LD. Expanding the scope of studies across grade levels and geographic regions could enhance our understanding of widespread practices. Additional data sources, such as interviews and focus groups, could increase our understanding of the degree to which the current state of instructional practice implementation reflects research-based recommendations (similar to Schumm et al., 2000; E. A. Swanson et al., 2012).
Limitations
There are a number of limitations related to the synthesis presented in this article. Some observed components of reading instruction were defined differently across studies. For example, Klingner et al. (2010) defined activities that promoted metacognition, such as reminding students to think before or during reading, as read alouds. In contrast, Ko (2012) operationalized read alouds as a student or teacher reading aloud, and coded them as comprehension activities. One might infer the teacher promoted metacognition through the read aloud activity, but the data Ko (2012) collected do not provide the same detail as Klingner et al. (2010). This variability limits the degree to which metacognition findings, and robust cross-study findings more generally, can be generated. Another limitation is that although the number of observations ranged from 24 to 34 for three studies and totaled 124 for one study (Klingner et al., 2010), these totals reflect only two to three observations per teacher. It is difficult to capture a sense of prevailing instructional practices by observing a teacher such a limited number of times. One study examined comprehension instruction for secondary students with LD but did not report data on the other components of reading (Ko, 2012). Consistent with E. A. Swanson’s (2008) findings and recommendations, more observation studies at the secondary level are warranted. This synthesis presents four studies over a span of 9 years, which is a limitation because this is not a sufficient sample to reflect overall prevailing practice. Furthermore, similar author teams located in the same geographical region conducted two of the four studies, limiting the generalizability of the aggregated findings (E. A. Swanson et al., 2012; E. A. Swanson & Vaughn, 2010).
Implications for Practice
The findings across observation studies presented in this synthesis suggest that there is a persistent disconnect between research and prevailing practice in reading instruction for students with LD. The source of this disconnect remains unknown. Additional studies with greater numbers of observations would extend the descriptive evidence base regarding application of evidence-based practices. Further observation studies that include interviews, focus groups, or teacher surveys might suggest future research questions to explore possible reasons for this disconnect between research and practice. For example, is the lack of comprehensive implementation of evidence-based practice related to teacher attitudes, school resources, insufficient teacher preparation, or professional development? Are research-based practices aimed at facilitating teacher understanding and competency sufficiently disseminated to increase implementation of evidence-based practices? Findings across the observation studies synthesized here indicate that teachers may lack training on evidence-based reading instruction and technical assistance with the implementation of high-quality instructional practices. Providing higher quality teacher preparation and continued professional development with opportunities for modeling, guided practice, and technical assistance might be one way to promote the integration of effective practices. Before taking action steps to reduce the gap, we must first try to better understand it via more observation studies. Once we develop a better understanding of current practice, researchers and practitioners alike can begin to take steps toward addressing the research-to-practice gap, improving instruction for students with LD.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
