Abstract
Vocabulary knowledge at school entry is a robust predictor of later reading achievement. Many children begin formal reading instruction at a significant disadvantage due to low levels of vocabulary. Until recently, relatively few research studies examined the efficacy of vocabulary interventions for children in the early primary grades (e.g., before fourth grade), and even fewer addressed vocabulary intervention for students at increased risk for reading failure. In more recent work, researchers have begun to explore ways in which to diminish the “meaningful differences” in language achievement noted among children as they enter formal schooling. This article provides a review of a particularly effective model of vocabulary intervention based on shared storybook reading and situates this model in a context of tiered intervention, an emerging model of instructional design in the field of special education. In addition, we describe a quasi–experimental posttest–only study that examines the feasibility and effectiveness of the model for first–grade students. Participants were 224 first–grade students of whom 98 were identified as at risk for reading disability based on low levels of vocabulary. Results of a multivariate analysis of variance revealed significant differences on measures of target vocabulary knowledge at the receptive and context level, suggesting that students at risk for reading failure benefit significantly from a second tier of vocabulary instruction. Implications for classroom practice as well as future research are provided.
This article provides a review of the research literature relative to the vocabulary gap among young students as they enter formal schooling and its relevance to reading achievement. Specifically, we review the research base regarding the use of shared storybook reading techniques with young students as they enter formal schooling and its effects on vocabulary development. Additionally, we provide a brief introduction to the design and use of tiered interventions within the general education classroom setting to facilitate additional instruction for at–risk students. We then present a research study examining the efficacy of a shared storybook intervention, situated within a tiered context, with particular attention to the benefits of the additional tier of instruction for students at risk for reading disability.
Importance of Vocabulary to Reading Achievement
Vocabulary knowledge plays a critical role in an individual's process of becoming a reader (Beck & McKeown, 2007; Coyne, Simmons, Kame'enui, & Stoolmiller, 2004; National Institute of Child Health and Human Development [NICHD], 2000). Perhaps the strongest role vocabulary plays in the reading process involves its relationship to reading comprehension. Vocabulary knowledge has long been acknowledged as a critical component of both learning to read and reading to learn. In the primary–school years when children are learning to read, there is evidence that a child's early abilities to decode are dependent, in part, on oral vocabulary (Bass, Williams, & Goldstein, 2007; NICHD, 2000). In fact, the reader will only gain benefit (in terms of understanding text) from applying letter–sound correspondences to the printed material if the resulting, decoded word is in the learner's oral vocabulary. “When the word is not in the learner's oral vocabulary, it will not be understood when it occurs in print” (NICHD, 2000, p. 4–3). Vocabulary also plays a role, however, when learners are reading to learn in the later primary grades. Experts agree that acceptable levels of comprehension occur when the reader knows at least 90 to 95 percent of the words in the text (Hirsch, 2003). In fact, many researchers hypothesize that the “fourth–grade slump,” a sharp decline in reading scores which tends to occur between the third and fourth grades, particularly for low–income students, is actually the result of significant deficits in these students’ vocabularies (Chall, Jacobs, & Baldwin, 1990). Usually, during the third to fourth grades, reading assessments move away from evaluating early reading skills (e.g., decoding, fluency) and, instead, measure reading comprehension to a much greater degree than in previous grades. 
Gaps in students’ vocabularies may prevent them from being able to comprehend the more academic texts they encounter in the later primary grades, resulting in poorer reading achievement evidenced by low test scores (Chall et al., 1990).
Certainly, as children develop their emergent literacy skills or are “learning to read,” direct and explicit instruction in the alphabetic principle, phonemic awareness, phonics, and fluency is a critical component for a solid literacy foundation. However, as children move into the “reading to learn” stage, typically between the third and fourth grades, they require “both fluent word recognition skills and an average or above–average vocabulary” to facilitate reading comprehension (Biemiller, 2006, p. 41). Further, although the presence of both fluency and vocabulary does not guarantee a high level of reading comprehension, “the absence of either word recognition or adequate vocabulary ensures a low level of reading comprehension” (Biemiller, 2006, p. 41). It follows then, as the alphabetic and phonemic principles are taught in kindergarten and through third grade to establish a solid foundation for fluency, so too should vocabulary development be an area of focused instruction to ensure that all children have at least average vocabularies by the end of third grade. However, vocabulary instruction rarely occurs in the early grades (NICHD, 2000) even though many children, especially those from impoverished backgrounds, enter their formal schooling with significant deficits in their vocabulary repertoires (Biemiller, 2001; Coyne et al., 2004; Hart & Risley, 2003a; Hirsch, 2003; National Research Council, 1998).
The Vocabulary Gap
Virtually all early vocabulary development (e.g., that which occurs before formal schooling) occurs through incidental learning by way of the child's early oral context which ideally includes a language–rich environment in the home in which the child engages in reciprocal verbal exchanges with others, listens to others speak, and listens to books read aloud (Hirsch, 2003; NICHD, 2000; National Research Council, 1998). The “direct instruction” of vocabulary is not necessary for this early learning, nor does it typically occur at this time. However, not every child has access to a positive and rich early oral environment, and a poor early language environment has measurable short– and long–term effects on the young child (Morgan & Meier, 2008; NICHD, 2000). Thus, children come to school with immense differences in their vocabulary knowledge (Beck & McKeown, 2007; Biemiller, 2006; Coyne et al., 2004; Hart & Risley, 2003a; Honig, Diamond, & Gutlohn, 2008), and these early vocabulary deficits persist into and, without intervention, expand throughout the school years.
Hart and Risley (2003b) found that the differences in the vocabularies among children from professional families, working–class families, and very low–income (welfare) families were staggering. At the age of 3 years, they found that extreme differences in children's early language experiences correlated to extreme differences in the quantity and quality of language interactions in the home. Hart and Risley (2003b) refer to this as an “early catastrophe” in which there is a “30 million word gap” by age 3 (p. 4). When these children begin preschool and kindergarten, they are significantly behind in vocabulary development compared to their higher–income peers (Biemiller, 1999; Coyne et al., 2004; NICHD, 2000; Schechter & Bye, 2006).
Not only are these differences significant when children enter school, they appear to persist as children move through the elementary– and secondary–school years (Beck, McKeown, & Kucan, 2002; Biemiller, 1999; Stanovich, 1986; Torgesen, 2002). Already, in first grade, a high–performing student knows approximately twice as many words as a low–performing one, “and as the students go through the grades, the differential gets magnified. By 12th grade, the high performer knows about four times as many words as the low performer” (Hirsch, 2003, p. 16). Furthermore, Hart and Risley (2003b) found in the longitudinal analysis of their data that vocabulary use at the age of 3 years was correlated with language scores at the age of 9 to 10 years on the Peabody Picture Vocabulary Test—Revised (PPVT–R; Dunn & Dunn, 1981), a measure of receptive language (r = .57), and on the Test of Language Development—2: Intermediate (Newcomer & Hammill, 1988), a measure of receptive and expressive language (r = .72). There was also a considerable correlation between vocabulary use at age 3 and reading comprehension at age 9 to 10 years as measured by the Comprehensive Test of Basic Skills (r = .56). These data suggest that low vocabulary achievement is correlated with low reading achievement. Moreover, these data strongly suggest that children from lower–income homes begin their school years with “meaningful differences,” a phrase coined by Hart and Risley (2003a) in reference to these extreme language differences, and that these early differences exert significant effects across children's cognitive, linguistic, and academic developmental trajectories. These research findings suggest a significant need to address vocabulary development before an insurmountable vocabulary gap arises between students who are and are not at risk for academic failure.
Vocabulary Instruction for Young Children
When the National Reading Panel (NRP; 2000) reviewed the extant reading research relative to vocabulary, they found that very few studies addressed vocabulary instruction for students below fourth grade (NICHD, 2000). They further surmised this was most likely due, at least in part, to the lack of vocabulary instruction provided to children in preschool through second grade. In their concluding remarks regarding future directions for vocabulary research, the NRP asserted that much more knowledge is needed regarding how to effectively teach vocabulary to children of different ages and different ability levels. Many researchers have attempted to explore this issue in the past few years.
Shared Storybook Reading
In contrast to older children who can engage in independent reading activities to expand their vocabularies, young children, especially those in second grade and lower, depend very much on oral language experiences to provide exposure to novel words and enhance vocabulary development (Coyne et al., 2004). Shared storybook reading is one of the richest oral language experiences in which young children engage as the vocabulary found in children's books is far richer than that which they encounter in conversational exchanges (Coyne, McCoach, & Kapp, 2007). Several approaches to storybook reading have been explored in the professional literature, and research results suggest that shared storybook reading can be an effective vocabulary intervention (Arnold, Lonigan, Whitehurst, & Epstein, 1994; Coyne et al., 2004; Hargrave & Senechal, 2000; Justice, Meier, & Walpole, 2005; Pullen & Justice, 2003; Wasik & Bond, 2001; Whitehurst et al., 1994).
Incidental Exposure
Several studies have suggested that simply listening to a storybook being read aloud has positive effects on young children's emergent literacy skills and that a measurable degree of incidental vocabulary learning can occur within this context. Nicholson and Whyte (1992) found that 8– to 10–year–old students learned vocabulary from incidental exposure via storybook readings. Senechal and Cornell (1993) found that, after a single storybook reading in which 10 target vocabulary words were identified, 4– and 5–year–old children were able to significantly improve their expressive vocabulary knowledge of the words, although the 5–year–old children demonstrated significantly higher maintenance levels of the knowledge 1 week later than did 4–year–old children. Robbins and Ehri (1994) also demonstrated that kindergarten students exposed to novel vocabulary by listening to a story read aloud made significant gains in vocabulary knowledge of the target words found in the story.
Direct Vocabulary Instruction
Although children can make some vocabulary gains by simply listening to a story, other research suggests that further gains can be facilitated through direct instruction of unknown words encountered in the text. For example, Penno, Wilkinson, and Moore (2002) randomly assigned young children (M = 6.6 years of age; N = 47) to two storybook conditions. In the treatment condition, students heard the target words during the storybook reading and were provided with explanations of target word meanings during the reading. In the comparison condition, children heard the target words during the storybook reading but received no explanations of target word meaning. Penno et al. (2002) found that, although all children gained some vocabulary knowledge during the shared storybook activity, vocabulary learning was enhanced in the treatment group in which teachers directly explained the meanings of the target words. Another study also suggested that children gained vocabulary knowledge in a shared storybook context when the teacher provided explanations of unknown words. Justice and colleagues (2005) randomly assigned 57 kindergarten students to a treatment or control condition. In the treatment condition, students were exposed to 60 total target words. All 60 words were encountered in a shared storybook context; 30 of the words were elaborated upon by the teacher (e.g., the teacher provided the word meaning and used it in a sentence during the storybook reading activity), and the other 30 were not elaborated. Results of the study revealed significantly higher target word learning for elaborated words compared to nonelaborated. These results suggest that direct instruction of vocabulary word meanings is more effective than incidental exposure without explanation or elaboration of word meanings.
Explicit Rich Instruction
Stahl and Fairbanks (1986) conducted a meta–analysis to analyze the various effective instructional practices in reading. Concerning effective vocabulary instruction, results from their data analysis suggested direct instruction of vocabulary words was most beneficial to students when it focused on both definitional and contextual explanations of words. Furthermore, the results of the NRP study strongly suggested that storybook reading activities provide students with opportunities for active engagement and a rich context in which to understand vocabulary meanings, both of which enhance vocabulary outcomes (Loftus, 2008; NICHD, 2000). Beck, Perfetti, and McKeown (1982) have investigated a model of vocabulary instruction that they refer to as “rich instruction.” Beck and colleagues suggest that young children learn vocabulary most effectively when they are able to explore the word in multiple contexts and receive information about the words and how they are used. Shared storybook reading is a particularly robust method by which to introduce vocabulary in an engaging context and facilitate rich dialogues about words and their usages. Several teams of researchers have provided evidence that rich instruction is a particularly effective method of vocabulary instruction for students in early and later primary grades (Baker, Simmons, & Kame'enui, 1998; Baumann & Kame'enui, 2003; Beauchat, Blamey, & Walpole, 2009; Beck et al., 1982; Beck & McKeown, 2007; Kindle, 2009).
For example, Beck and McKeown (2007) investigated the vocabulary acquisition of kindergarten and first–grade students in the context of a rich instruction model versus a comparison condition in which incidental exposure to vocabulary occurred through daily read–alouds. Beck and McKeown found that students in the treatment group had significantly greater vocabulary learning than children in the comparison condition. In another study, Coyne et al. (2007) investigated the effectiveness of rich instruction, embedded instruction, and incidental exposure, in a small–group intervention context, on kindergarten students' vocabulary learning. In the rich condition, students were provided with extended explicit instruction with both definitional and conceptual explanations of the target words. The embedded instruction condition consisted of teachers providing short definitions of target words as they were encountered in the text, and the incidental exposure condition consisted of storybook reading with no target word instruction. Students in the rich instruction condition had significantly greater word learning than children in either the incidental or embedded conditions.
Differential Effects of Storybook Reading on Vocabulary
Much of the earlier research evidence suggests that there is a differential response to vocabulary instruction among high– and low–vocabulary students. In the Nicholson and Whyte (1992) study in which the vocabulary knowledge of 8– to 10–year–old students increased in response to a storybook reading intervention, gains were significantly higher for high–ability students than they were for low–ability students. Similarly, in the Robbins and Ehri (1994) study, in which kindergarten students demonstrated vocabulary growth after a shared storybook intervention, gains were greater for children with higher initial vocabularies than for those with low entering vocabularies. It is important to note that in both of these studies, the storybook intervention relied only on exposure to the words (without elaboration) through the shared storybook activity to facilitate incidental learning of vocabulary. In light of this evidence, some researchers have begun to investigate which vocabulary instructional methods, above and beyond incidental exposure, are most effective for young students at risk for language difficulties and school failure. Results from this recent research suggest that interventions which include elaboration of word meanings and/or a rich, explicit approach to vocabulary instruction are effective for both high–achieving students and those at risk for language difficulties and reading failure.
For example, Coyne et al. (2004) conducted a study to examine the differential effects of vocabulary instruction for students with low initial vocabularies compared to those with high initial vocabularies. Kindergarten students (N = 96) were randomly assigned to one of three intervention groups: a storybook intervention, a phonologic and alphabetic intervention, and a sounds and letters intervention. Students in the storybook intervention received 108 half–hour lessons based on 40 different children's storybooks. Interventionists provided explicit vocabulary instruction of three target words in the context of the storybook reading activity for each story. Results of the analysis suggested that, after receiving explicit instruction in the target words, at–risk (AR) students gained as much vocabulary knowledge as their not at–risk (NAR) counterparts, thus halting the expansion of the vocabulary gap. In the Justice et al. (2005) study in which direct explanations of target word meanings were provided to students during the storybook reading, the AR students made greater gains in vocabulary knowledge than the NAR students, suggesting that the AR students can learn vocabulary word meanings at a rate similar to their peers under the right instructional conditions.
Implications of the Research
The research evidence resulting from this body of literature suggests that certain types of shared storybook reading activities appear to have significant effects on children's vocabularies but that those effects may be somewhat contingent on a child's entering vocabulary level in certain instructional contexts. As Coyne and colleagues (2007) note, traditional storybook reading activities that lack a direct explanation component tend to widen the vocabulary gap between students with low levels of vocabulary knowledge and those with high levels. Unquestionably, students enter school with vast disparities in vocabulary knowledge. Although some children enter kindergarten with a history of rich oral language experiences and large vocabularies, others come from language–impoverished backgrounds and have very low vocabularies. Furthermore, there is little doubt that low vocabulary knowledge in the early primary grades is associated with later reading failure (Coyne et al., 2004, 2007; NICHD, 2000; National Research Council, 1998; Scarborough, 1991) and that children with low vocabulary knowledge are at risk for language difficulties and reading failure. However, there is very little vocabulary instruction in the early primary grades (NICHD, 2000), and gaps between high– and low–vocabulary children tend to widen over the years (Biemiller, 2005; Coyne et al., 2004). Therefore, there is a great need for information regarding how to provide effective vocabulary instruction in the early primary grades to children at risk for reading failure.
The results of a growing number of studies suggest that the direct instruction of vocabulary word meanings during shared storybook reading activities provides more benefit to primary grade students than just incidental exposure of the words without explanation (Biemiller & Boote, 2006; Coyne et al., 2004; Elley, 1989; Justice et al., 2005). Furthermore, rich instruction includes this direct explanation of word meanings but extends instruction to include “explaining word meanings in student–friendly language, providing multiple examples and multiple contexts, and requiring students to process words deeply” (Beck & McKeown, 2007, p. 254), which research suggests is a solidly effective way in which to teach students vocabulary (Beck et al., 1982; McKeown, Beck, Omanson, & Pople, 1985; Stahl & Fairbanks, 1986). Although the explicit, rich instruction approach has primarily been investigated with students in third grade and higher, a few researchers have investigated its use with young students in kindergarten and first grade (Beck & McKeown, 2007; Coyne et al., 2004, 2007; Maynard, 2007).
Overall, the emerging research evidence suggests that students as young as kindergarten, including students with low vocabularies, can develop deep and meaningful understandings of complex vocabulary in the context of rich instruction (Coyne et al., 2007). However, the instructional methods employed should be characterized by direct teaching of vocabulary in which children are provided with both contextual and definitional explanation of words, opportunities for multiple exposures to target words in various contexts, and experiences which allow them to deeply process word meanings. This rich instruction model promotes excellent vocabulary learning outcomes for both high– and low–achieving students and is an effective way in which to teach vocabulary in the primary grades.
Tiered Instruction for Students at Risk for Academic Failure
There has been a recent shift in the field of special education toward a model of early identification and prevention of learning difficulties. Inherent to this preventive model is the use of scientifically validated instructional techniques in all classrooms to ensure that all students have access to high–quality instruction. This high–quality instruction is a foundational element of the response to intervention (RTI) model of special education which emerged after evidence suggested that many students identified as having special education needs were actually victims of poor, ineffective instruction (Division for Learning Disabilities, 2007). The term RTI holds different connotations for different stakeholders and is still under intense investigation as it applies to instruction delivered across classrooms. The following section provides a brief overview of RTI in the context of identification and instruction and the use of tiered interventions to differentiate instruction for different learners.
What is RTI?
Broadly, the RTI model can be applied to two important dimensions of education, the identification of students with learning disabilities and the prevention of academic failure among all students (Mellard, 2004). As a result of extensive concerns regarding learning disability identification procedures that relied primarily on evidence of a discrepancy between ability (e.g., IQ) and achievement (e.g., Gunderson & Siegel, 2001; Sternberg & Grigorenko, 2002), advocates of RTI suggest that students should be identified as having learning disabilities only if they do not “respond to” (i.e., learn from) scientifically based, effective class–wide instruction or successively more rigorous educational interventions designed to intensify instruction for students struggling to learn. In this way, the RTI model can be used in the process of identifying a child with a learning disability; that is, only students who do not respond to class–wide or intensified instruction should be comprehensively evaluated for special education services.
Furthermore, advocates of RTI assert that the usefulness of this educational model reaches beyond special education identification and extends into the realm of prevention of academic failure. When teachers employ an RTI model in the classroom, students have access to multiple levels of high–quality, differentiated instruction. For students struggling to learn from high–quality, class–wide instruction, teachers provide a more intense second, and even third, level of instruction to help these students learn the academic content. Students who in the past would have failed and/or been referred for special education evaluation, instead receive additional, differentiated instruction in the RTI model, increasing their chances of academic success (Overton, 2009).
Tiered Instruction
In most RTI models, a tiered intervention approach is implemented comprising three or more levels of instruction. Each level of instruction, beginning with instruction in the general education classroom, is referred to as a tier of instruction. Thus, in an RTI instructional model, all students receive class–wide Tier 1 instruction; students who do not respond to this Tier 1 class–wide instruction receive a second tier of intervention designed to intensify instruction through evidence–based, small–group supplemental instruction. A third tier of intervention may be provided to students who continue to struggle after Tier 2 instruction. Tier 3 instruction is usually the most intensive level of intervention, and students who do not respond at this level are typically referred for a comprehensive special education evaluation (Mercer & Pullen, 2009). This three–tier model is the most common RTI model implemented in schools, although others have proposed models with additional tiers of intervention. If this model is implemented effectively, that is, when evidence–based instruction is provided in successively intensive tiers of instruction, the assumption is that far fewer children will be identified as having learning disabilities and that academic failure rates will decrease considerably across all students (Fuchs & Fuchs, 2007; Mellard, 2004; Overton, 2009). Furthermore, theoretically, this model ensures that all students have access to scientifically validated instruction that meets their needs.
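The progression through tiers described above can be illustrated with a short decision rule. The following Python sketch is only an illustration of the common three–tier model, not a procedure drawn from any cited source; the function name `next_step` and the returned labels are hypothetical, and the rule that a responding student simply continues at the current tier is an assumption made for the example.

```python
def next_step(current_tier: int, responded: bool) -> str:
    """Return the next instructional step in a three-tier RTI model.

    Assumption (not stated in the article): a student who responds
    continues to receive instruction at the current tier.
    """
    if current_tier not in (1, 2, 3):
        raise ValueError("this sketch models Tiers 1 through 3 only")
    if responded:
        return f"continue Tier {current_tier} instruction"
    if current_tier < 3:
        # Non-responders move to the next, more intensive tier.
        return f"add Tier {current_tier + 1} intervention"
    # Non-response at Tier 3 typically leads to a referral.
    return "refer for comprehensive special education evaluation"
```

For example, `next_step(1, False)` yields the Tier 2 step, which corresponds to the small–group supplemental instruction described in the text.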
The investigation of the efficacy of tiered instruction in preventing academic failure and identifying children with learning disabilities is just getting underway in educational research. Promising preliminary results suggest that, when struggling students receive instruction in a tiered format, most of them experience accelerated learning and make encouraging learning gains (e.g., Gettinger & Stoiber, 2007; Simmons et al., 2008; Wanzek & Vaughn, 2008). The present study sought to expand upon the accumulating research evidence regarding explicit vocabulary instruction specifically provided in a tiered format for students at risk for failure. Specifically, the present study explored the response of AR first–grade students to a Tier 1 plus Tier 2 vocabulary intervention, embedded in a rich, explicit model of vocabulary instruction.
Method
Participants
Participants included 224 first–grade children from three elementary schools in a medium–sized school district. The schools were selected to represent a diverse distribution of populations with a moderate percentage of students in the lower socioeconomic status. The school district office was consulted and suggested schools that would be appropriate to recruit. Each school invited agreed to participate in the program. The socioeconomic levels of the schools are based on the percentage of students receiving free or reduced–price lunch. Table 1 provides the percentages of students who receive free or reduced–price lunch as well as the percentage of non–White students in each school. The 12 classroom teachers were each trained as elementary education teachers and held current state licensure.
School Demographic Data
The student sample included 98 students AR for language difficulties and reading failure and 126 designated as NAR as determined by the PPVT–Fourth Edition (PPVT–4; Dunn & Dunn, 2007). For the purposes of the present study, children were identified as being AR if they scored below the 39th percentile on the PPVT–4 (Dunn & Dunn, 2007), a measure of receptive language that has been used to predict risk status for language difficulties and reading failure. Children who scored at or above the 39th percentile on the PPVT–4 were placed in the NAR control group condition. The selection criteria were based on the levels of risk identified by Reading First on the PPVT that suggest that students below the 22nd percentile are at risk for reading failure, those students between the 22nd and 38th percentiles are at some risk for reading failure, and those at or above the 39th percentile are at low risk for reading failure. Student participant demographic data are provided in Table 2.
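The percentile cut points described above amount to a simple classification rule. The following is a minimal Python sketch of that rule, not the study's actual screening procedure; the function names and label strings are hypothetical, and the sketch assumes that a PPVT–4 percentile rank has already been obtained for each child.

```python
def reading_first_risk(percentile: float) -> str:
    """Three-level risk band using the Reading First cut points cited in the text."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    if percentile < 22:
        return "at risk"
    if percentile < 39:
        return "some risk"
    return "low risk"


def study_group(percentile: float) -> str:
    """Two-group assignment used in the study: AR below the 39th percentile, NAR otherwise."""
    return "NAR" if reading_first_risk(percentile) == "low risk" else "AR"
```

Under this rule, the study's AR group combines the two lower Reading First bands, so a child at the 30th percentile would be placed in the AR group and a child at the 39th percentile in the NAR group.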
Participant Demographics by Group Expressed in Percentages
Note: ESOL = English for Speakers of Other Languages.
Measures
PPVT–4
We selected a standardized test of receptive language to serve as a screening measure to identify students who may be at increased risk of later reading failure. Prior to intervention, we administered the PPVT–4 (Dunn & Dunn, 2007) to assess baseline level of receptive vocabulary and identify participants as either at risk or not at risk for reading failure. The PPVT–4 is a standardized, individually administered, norm–referenced measure of receptive vocabulary of the English language. The PPVT–4 was normed on a sample of over 3,000 individuals whose demographic data closely resemble those of the 2004 U.S. Census. Furthermore, the PPVT–4 demonstrates reliability coefficients indicating that it is a sound psychometric measure of receptive vocabulary (test–retest reliability = .93, split–half reliability = .94). Additionally, the authors correlated the PPVT–4 with other measures of vocabulary and obtained coefficients indicative of acceptable criterion validity (e.g., PPVT–4 with the Clinical Evaluation of Language Fundamentals–4 scale: receptive language r = .67 for younger children and .75 for older children).
The PPVT–4 was selected to measure initial levels of receptive vocabulary in light of previous research indicating that initial receptive vocabulary levels are correlated with response to vocabulary instruction (Coyne et al., 2007; Loftus, 2008; Penno et al., 2002). Furthermore, research evidence links scores on all editions of the PPVT to later reading achievement (Carvajal, Hayes, Miller, Wiebe, & Weaver, 1993; Ollendick, Finch, & Ginn, 1974; Smith, Smith, & Dobbs, 1991; Williams & Wang, 1997). Therefore, scores on the PPVT may be interpreted as suggesting a “risk status indicator” for students who demonstrate low vocabulary. These data were used to assign students to either an AR or NAR group.
Posttest Measure of Target Word Knowledge
We selected a researcher–developed measure to assess students’ acquisition of target words taught in the intervention. Although the PPVT provided a standardized measure of receptive vocabulary, this measure is not sensitive enough to detect change in a short–term intervention. Furthermore, the researcher–developed measure is a form of curriculum–based assessment. This assessment was designed so that teachers could determine whether or not the vocabulary instruction they provide is effective. Such a measure is appropriate in a tiered approach to intervention. We administered this posttest at two points in the study: first as an immediate posttest following the completion of the intervention, and again 4 weeks later to assess whether students maintained the vocabulary knowledge over time.
The posttest assessment consisted of three hierarchical measures of word knowledge. In accordance with multiple experts’ observations that word knowledge is a multilevel construct, the posttest assessed participant word knowledge at the receptive, contextual, and expressive levels of word knowledge. The posttest measure was modeled after the Coyne and Pullen (2007) measure in which participants are provided the opportunity to demonstrate word knowledge at multiple levels.
Measure of Expressive Level of Word Knowledge
We assessed the expressive level of word knowledge first, as it is hypothesized to be the deepest level of word knowledge (Coyne & Pullen, 2007). Furthermore, the assessment of receptive or contextual levels of word knowledge may have informed participants to some degree about the target word definition. Assessing the expressive level of knowledge first therefore ensured that participants received no instructional gain from other items on the posttest measure. Items on this portion of the posttest asked participants to define each target word. For example, participants were instructed, “Tell me what the word saunter means.” The assessor recorded the student's response on the protocol form. The student's response was then scored as a “0” for incorrect or a “1” for correct. The total expressive answers correct indicated the expressive level vocabulary score. A total of 8 points were possible on the expressive portion of the target vocabulary measure.
Measure of Contextual Level of Word Knowledge
Second, we assessed the contextual level of word knowledge. Although demonstrating the contextual level of word knowledge is not as difficult as demonstrating the expressive level, it is more difficult than demonstrating receptive–level word knowledge (Coyne & Pullen, 2007). Therefore, the contextual level of word knowledge was assessed after the expressive level but before the receptive level on the posttest. We asked participants to demonstrate contextual word knowledge by responding to contextualized questions about each of the four target words. For example, participants were asked, “When is a time that you would be quivering?” to elicit a response that demonstrated a contextual level of word knowledge. The assessor recorded the student's response verbatim. The student's response was scored as a “0” for incorrect or a “1” for correct. The total context level answers correct indicated the context level vocabulary score. A total of 8 points were possible on the context–level portion of the target vocabulary measure.
Measure of Receptive Level of Word Knowledge
Finally, we asked participants to demonstrate their receptive level of word knowledge, which is the most basic of these levels of word knowledge (Coyne & Pullen, 2007). To test the receptive level of knowledge, the research assistant showed the student a page of pictures, modeled after the PPVT, with a picture in each quadrant, one of which illustrated one of the target words, whereas the other three acted as distracters. Participants were instructed to indicate which picture illustrated the target word being assessed. For example, research assistants instructed participants to, “Point to veranda.” The student's response was then scored as a “0” for incorrect or a “1” for correct. The total answers correct indicated the receptive–level vocabulary score. A total of 8 points were possible on this portion of the target vocabulary measure.
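The scoring scheme described for the three levels (dichotomous item scores summed to a maximum of 8 points per level) can be sketched as follows. This is a minimal illustration, not the study's protocol form: the function name, the data layout, and the example responses are all hypothetical.

```python
from typing import Dict, List

# Levels listed in the order of administration described in the text.
LEVELS = ("expressive", "contextual", "receptive")
ITEMS_PER_LEVEL = 8  # maximum of 8 points per level, as stated in the text


def level_totals(item_scores: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum 0/1 item scores into one total per level of word knowledge."""
    totals = {}
    for level in LEVELS:
        scores = item_scores[level]
        if len(scores) != ITEMS_PER_LEVEL:
            raise ValueError(f"{level}: expected {ITEMS_PER_LEVEL} item scores")
        if any(score not in (0, 1) for score in scores):
            raise ValueError(f"{level}: item scores must be 0 or 1")
        totals[level] = sum(scores)
    return totals


# Hypothetical responses for one student (not data from the study).
example_student = {
    "expressive": [1, 0, 0, 1, 0, 0, 1, 0],
    "contextual": [1, 1, 0, 1, 0, 1, 1, 0],
    "receptive": [1, 1, 1, 1, 0, 1, 1, 1],
}
example_totals = level_totals(example_student)
# example_totals == {"expressive": 3, "contextual": 5, "receptive": 7}
```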
Design and Procedure
A partially randomized design was implemented in which three groups of children (i.e., AR treatment [ART], AR control [ARC], and not at risk [NAR]) were compared on three hierarchically ordered measures of vocabulary following a 2–week intervention administered to the AR intervention group. The posttest measures were hierarchical in nature in that they measured depth of vocabulary knowledge: receptive knowledge, context knowledge, and expressive knowledge. Following this initial posttest, children were again evaluated on the same three vocabulary measures to examine the sustainability of the intervention 4 weeks beyond the end of the intervention (i.e., delayed posttest). The partial randomization component consisted of randomly assigning the children identified as AR (N = 98) to either an intervention (i.e., ART; N = 49) or control (ARC; N = 49) condition. Students in the ART group received Tier 1 instruction from the general education teacher as well as supplemental Tier 2 instruction provided by a graduate student in education. Students in the ARC group received Tier 1 instruction only from the general education classroom teacher. Figure 1 illustrates the timeline of screening measures, assignment to group, implementation of the intervention, posttest, and delayed posttest administration. The mean standard scores and standard deviations on the PPVT–4 for the three groups (NAR, ART, and ARC) are provided in Table 3. The three hierarchically ordered dependent variables on which groups were compared included measures of depth of word knowledge at the receptive, contextual, and expressive levels.

Group assignment and instructional condition.
PPVT–4 Mean Scores and Standard Deviations by Group
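The partial randomization step described above (98 AR students split evenly into ART and ARC) could be carried out in several ways; the text does not specify the mechanism used. The following Python sketch shows one simple possibility, a seeded shuffle followed by an even split. The function name, the student identifiers, and the seed value are assumptions for illustration only.

```python
import random
from typing import Dict, Iterable, List, Optional


def assign_at_risk(student_ids: Iterable[int], seed: Optional[int] = None) -> Dict[str, List[int]]:
    """Randomly split AR students into two equal-sized groups (ART and ARC)."""
    rng = random.Random(seed)  # a local generator keeps the split reproducible
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"ART": sorted(shuffled[:half]), "ARC": sorted(shuffled[half:])}


# Hypothetical identifiers 1..98 stand in for the 98 AR students.
groups = assign_at_risk(range(1, 99), seed=2010)
```

With 98 identifiers, each group receives 49 students, matching the group sizes reported in the text. NAR students are not randomized; they all receive Tier 1 instruction only.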
Tiered Instruction
Tier 1 and Tier 2 lessons were designed around two storybooks appropriate for first–grade students. We selected four target vocabulary words in each book. We selected these words based on expert opinion for selecting words to teach. The words were selected to represent vocabulary that would likely be unfamiliar to the student yet important to the story (Coyne et al., 2004) and words likely to be used by mature language users—sophisticated synonyms for common concepts (Beck et al., 2002).
Tier 1 instruction exposed students to word–rich stories through class–wide storybook reading, provided direct vocabulary instruction for selected target words, utilized activities with kid–friendly definitions, provided students with multiple exposures to target words in varied and meaningful contexts and sentences (within and outside the story), engaged students in activities that allowed them to process words deeply, and provided students with multiple opportunities to actively interact with target words. Table 4 provides book titles and authors, target vocabulary words, and kid–friendly definitions.
Storybooks and Target Vocabulary for Tier 1 and Tier 2 Intervention
The purpose of Tier 2 lessons was to intensify the instruction for students who are at increased risk of reading disability. The intensified Tier 2 lessons included a small–group format of delivery (three to five students), a review of the kid–friendly definitions taught in Tier 1, additional opportunities for students to experience multiple exposures to target words in varied and meaningful contexts and sentences (within and outside the story), additional opportunities to engage students in activities that allowed them to process words deeply, and multiple opportunities to actively interact with target words. Individual turns were provided to each student during the small–group instruction. Like the students in the NAR group, students in the ARC group received Tier 1 instruction only. Figure 2 illustrates the group assignment and instruction each group received.

Group assignment and instruction timeline.
Tier 1 Procedures
All students (N = 224) received Tier 1, class–wide instruction delivered by a certified first–grade classroom teacher (N = 12). Each storybook was read aloud by the classroom teacher on days 1 and 3 of instruction (i.e., Monday and Wednesday) and postreading vocabulary activities were conducted in a large–group setting. The lessons lasted approximately 30 minutes. Examples from Tier 1 instruction are provided in Figure 3.

Example of Tier 1 activities.
Tier 2 Procedures
A research assistant provided Tier 2 intervention in small–group settings on the day immediately following Tier 1 instruction (i.e., Tuesday and Thursday). The purpose of the Tier 2 instruction was to provide more intensive supplemental support for students at risk for reading failure. In Tier 2 instruction, the instructor reviewed target word definitions and provided multiple opportunities to use and interact with the target words in meaningful ways both individually and chorally. The Tier 2 instruction was implemented in small groups of four to five students and lasted approximately 20 minutes on average. Examples of Tier 2 instruction are provided in Figure 4.

Example of Tier 2 materials.
Fidelity of Treatment and Interrater Reliability Data
To ensure procedural reliability, each teacher was observed a minimum of one time during the 2–week Tier 1 intervention; a total of 25 percent of the Tier 1 interventions were observed. For each observation, a fidelity checklist was completed that indicated whether each component of the lesson was implemented as trained and whether the script was followed. The teachers demonstrated 100 percent procedural reliability for the Tier 1 instruction. Each teacher utilized explicit instruction, had the students repeat the target words in unison, and used the definition of the target word back in the context of the story.
Research assistants (i.e., graduate students in special education) who administered the Tier 2 intervention were observed for fidelity of implementation. Just as in Tier 1 instruction, a fidelity checklist was completed that indicated whether each component of the lesson was implemented as trained and that the script was followed. In addition, the checklist included an item to ensure that each child was given an opportunity to respond individually. A minimum of 25 percent of Tier 2 instruction was observed. The checklist resulted in quantifiable data in the form of counts; the fidelity of implementation score across Tier 2 instruction was 97 percent.
We further established procedural reliability by conducting multiple scorings of the posttest vocabulary measures. To calculate interscorer agreement, 25 percent of the posttests were selected randomly to be double–scored. Two independent research assistants determined the score for each item of the posttest. Interscorer agreement was calculated by determining percent agreement on individual items. Agreement on the posttest measures was .97.
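Item–level percent agreement, as described above, is the number of items on which two scorers assign the same score divided by the number of items double–scored. A minimal Python sketch follows; the function name and the example scores are hypothetical and are not taken from the study's data.

```python
from typing import Sequence


def percent_agreement(scorer_a: Sequence[int], scorer_b: Sequence[int]) -> float:
    """Return the proportion of items on which two scorers gave the same score."""
    if len(scorer_a) != len(scorer_b):
        raise ValueError("both scorers must rate the same number of items")
    if not scorer_a:
        raise ValueError("at least one item is required")
    matches = sum(a == b for a, b in zip(scorer_a, scorer_b))
    return matches / len(scorer_a)


# Hypothetical 0/1 scores from two scorers on the same eight items.
scores_a = [1, 0, 1, 1, 0, 1, 1, 0]
scores_b = [1, 0, 1, 0, 0, 1, 1, 0]
agreement = percent_agreement(scores_a, scores_b)  # 7 of 8 items match: 0.875
```

Simple percent agreement does not correct for chance agreement; it is shown here only because it is the index named in the text.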
Results
An analytic strategy was adopted to accommodate the nature of the three interrelated measures of depth of word knowledge. This strategy included a multivariate analysis of variance (MANOVA), which evaluated whether groups differed on the combination of outcome variables; Roy–Bargman step–down analyses; and planned contrasts. Children in each of the three groups were compared on the combined dependent variables at each of the two occasions (i.e., posttest and delayed posttest) through MANOVA. Statistically significant MANOVA results were further evaluated through step–down analyses on the prioritized dependent variables in order to control for overlapping variance among the three measures of depth of word knowledge. Here, higher prioritized variables (e.g., receptive level of word knowledge) served as covariates for investigating group differences on subsequent variables (e.g., contextual level). In instances in which statistically significant group differences were revealed through step–down analyses, planned contrasts between the two AR groups (i.e., ART vs. ARC) and the ART versus NAR groups were evaluated. All analyses were conducted with SPSS version 14.0. All data were normally distributed. Reported means for the contextual level of word knowledge are adjusted means based on the modeled inclusion of receptive level of word knowledge scores serving as a covariate in order to control for overlapping variance among the dependent variables in these sequential tests. Means and standard deviations for the three dependent variables are presented separately for each of the three groups and two testing occasions in Table 5. The first set of analyses focused on the results obtained from the posttest data that were collected immediately following the 2–week AR group intervention. Thereafter, results are presented from the delayed posttest data that were obtained 4 weeks following the intervention period.
Posttest and Delayed Posttest Means and Standard Deviations
Delayed posttest occurred 4 weeks following intervention.
ART = At–risk treatment group; ARC = at–risk control group; NAR = not–at–risk group.
Posttest
Wilks’ criterion indicated a statistically significant difference between groups on the combined posttest reading variables, multivariate F(6, 404) = 10.33, p < .001. Roy–Bargman step–down analyses further revealed between–group differences on receptive level, F(2, 204) = 16.34, p < .001, and contextual level of word knowledge, F(2, 203) = 14.62, p < .001. Planned group contrasts for each of the significant step–down analyses revealed that the ART group performed better than the ARC group on both receptive level (M At–Risk Treatment = 6.09 vs. M At–Risk Control = 5.43; p = .05) and contextual level of word knowledge (M At–Risk Treatment = 4.66 vs. M At–Risk Control = 4.03; p < .05). Similarly, the NAR group performed better than the ART group on both receptive level (M Not At–Risk = 6.96 vs. M At–Risk Treatment = 6.09; p < .05) and contextual level of word knowledge (M Not At–Risk = 5.43 vs. M At–Risk Treatment = 4.66; p < .05). Although unadjusted between–group comparisons revealed statistically significant differences between groups at the expressive level of word knowledge, F(2, 204) = 25.45, p < .001, these differences were already represented in the step–down analysis by the higher prioritized reading variables, step–down F(2, 202) = .23, p = .79.
Delayed Posttest
Wilks’ criterion indicated a statistically significant difference between groups on the combined posttest reading variables, multivariate F(6, 354) = 12.27, p < .001. Roy–Bargman step–down analyses further revealed between–group differences on receptive level, F(2, 180) = 21.83, p < .001, and contextual level of word knowledge, F(2, 178) = 12.20, p < .001. Planned group contrasts for both of the significant step–down analyses failed to reveal differences between the ART group and the ARC group on either receptive level (M At–Risk Treatment = 5.85 vs. M At–Risk Control = 5.29; p = .11) or contextual level of word knowledge (M At–Risk Treatment = 4.19 vs. M At–Risk Control = 4.29; p = .76). By contrast, the NAR group performed better than the ART group on both receptive level (M Not At–Risk = 7.03 vs. M At–Risk Treatment = 5.85; p < .05) and contextual level of word knowledge (M Not At–Risk = 5.36 vs. M At–Risk Treatment = 4.19; p < .05). Although unadjusted between–group comparisons revealed statistically significant differences between groups at the expressive level of word knowledge, F(2, 179) = 35.32, p < .001, these differences were already represented in the step–down analysis by the higher prioritized reading variables, step–down F(2, 177) = 2.67, p = .07.
Per the American Psychological Association's position that good research includes reporting effect sizes, we investigated the differences in group mean scores between the ART group and ARC group to gain a clearer understanding of the treatment effects of the Tier 2 intervention on all levels of word knowledge for AR students. Using each group's posttest means and the pooled standard deviation, we calculated an estimated effect size (using Cohen's d) of Tier 2 treatment on each level of word knowledge at both immediate and delayed posttest. Effects were moderate at immediate posttest and small to moderate at delayed posttest. See Table 6 for effect sizes.
Cohen's d Effect Size Estimate at Immediate and Delayed Posttest
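The effect size calculation described above (difference in group means divided by the pooled standard deviation) can be sketched in Python as follows. The text does not state which pooling formula was used; this sketch assumes the common sample–size–weighted pooled standard deviation. The function names and the numeric inputs in the example are hypothetical and are not values from Table 5 or Table 6.

```python
import math


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation, weighting each variance by its degrees of freedom."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def cohens_d(mean_treat: float, sd_treat: float, n_treat: int,
             mean_ctrl: float, sd_ctrl: float, n_ctrl: int) -> float:
    """Cohen's d: treatment mean minus control mean, in pooled-SD units."""
    return (mean_treat - mean_ctrl) / pooled_sd(sd_treat, n_treat, sd_ctrl, n_ctrl)


# Hypothetical inputs: two groups of 49, means 6.0 and 5.0, both SDs 2.0.
d_example = cohens_d(6.0, 2.0, 49, 5.0, 2.0, 49)  # 1.0 / 2.0 = 0.5
```

By Cohen's conventional benchmarks, values near 0.2 are described as small, near 0.5 as moderate, and near 0.8 as large.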
Discussion
Many children enter school with deficits in vocabulary knowledge that lead to poor reading comprehension. Intervening early may help to alleviate the widening of the gap that is evident in children with low and high vocabularies. Evidence suggests that children benefit from shared storybook reading targeting vocabulary for mature language users. However, much of the research examining vocabulary outcomes in shared storybook reading has been conducted in adult–child dyads or small–group instruction. To date, research has not established convincingly whether large–group class–wide instruction is sufficient to produce gains in vocabulary growth, particularly for those students most at risk for reading failure. In the present study, we sought to examine a tiered instructional approach that provided effective class–wide instruction along with small–group supplemental support to those students most in need of vocabulary intervention.
Children at risk for reading failure as determined by the PPVT–4 were assigned randomly to either a treatment or comparison condition. Those students in the ART group were provided Tier 2 instruction in addition to class–wide instruction. AR students who received the additional Tier 2 instruction achieved significantly higher posttest scores on receptive and contextual levels of word knowledge when compared to their AR peers who received only Tier 1. This finding suggests that AR students who receive Tier 1 instruction alone may not learn target vocabulary words at sufficient depth and that students gain statistically significant educational benefit when they receive Tier 2 instruction. That is, children with low baseline vocabulary scores did not acquire vocabulary knowledge with Tier 1 instruction alone. The class–wide instruction may not have been robust enough for these children to learn the target words. This suggests that the value added from Tier 2 supplemental intervention is worth the instructional time based on the improved outcomes for students.
However, in the present study, benefits were short lived. Four weeks after the intervention, ART students were statistically indistinguishable from ARCs. Both groups of students demonstrated losses on receptive and contextual levels of word knowledge, although NAR students maintained vocabulary learning over time. This finding suggests that, although AR students initially benefit from Tier 2 instruction, the effects of brief interventions are not robust enough to maintain the educational benefit over time. Perhaps the duration of instruction for AR children must be lengthened to maintain levels of word knowledge. It is important to note that, to ensure integrity of the intervention and to control for variability across classrooms, we instructed teachers to refrain from using the target vocabulary words outside of the planned instruction. In instruction outside of the parameters of a controlled study, we would encourage teachers to continue to provide opportunities to use and interact with the target vocabulary words to ensure maintenance of learning. The brevity of the intervention and the lack of continued use of the words proved to be particularly detrimental to students with low vocabulary. It may also be of interest, however, that, at delayed posttest, effect sizes based on Cohen's d calculations suggest that small to moderate effects are still evident.
It is also interesting to note that, after accounting for the variance attributable to receptive and contextual levels of word knowledge, no additional variance between the ART students and ARCs was noted at the expressive level, the most complex level of word knowledge. Does this mean that instruction in the expressive level of word knowledge does not benefit students in this grade level? Most likely, students at this age do benefit from instruction regarding the expressive level of vocabulary word knowledge; however, it is likely that this instruction initially informs their receptive and contextual levels of word knowledge. Had the intervention been sustained over a longer period of time, it is possible that differences at the expressive level of word knowledge would have been noted. This issue speaks to a larger concern in the conduct of vocabulary research regarding the length of interventions for both AR and NAR students, as well as long–term programming of vocabulary instruction. Further investigation of frequency, intensity, and duration of vocabulary instruction and its long–term benefits to diverse groups of students must be explored to inform the future development of holistic vocabulary instructional programming.
Limitations and Implications for Future Research
The results of this study should be interpreted with regard to the following limitations. The sample in the present study came from three schools in a moderately sized city in Virginia and was implemented with first–grade students only. Although first–grade students were the population of interest for this particular study, the small sample from one grade level within three schools limits the generalizability of the findings to students of other ages and geographical regions.
The focus of the current study was solely on vocabulary knowledge. Ultimately it is the goal of vocabulary instruction to improve reading comprehension. We did not examine the effects of vocabulary gains on related reading skills such as comprehension. In future studies, the impact of vocabulary knowledge on reading comprehension should be investigated.
The assessment of word knowledge is also a continual challenge for researchers and educators interested in children's vocabulary growth. All levels of word knowledge were assessed as either correct or incorrect. Although this categorical scoring approach was appropriate for the receptive level of word knowledge, it may have limited the ability to detect small gains in children's vocabulary knowledge. For example, when stating a definition of a word (expressive level of word knowledge), students were scored as either correct or incorrect and could not receive partial credit. Thus, partial, albeit incomplete, vocabulary knowledge was not captured. In terms of word learning, many experts agree that a word is not immediately learned or not learned; depth of word knowledge develops over time (Beck & McKeown, 2007; Honig et al., 2008). If a student did not demonstrate a complete contextual or expressive level explanation of word knowledge, the student's response was marked as incorrect. We selected this approach of scoring to increase the reliability of the measure. Although interscorer reliability was indeed high, a scoring rubric (e.g., a score of 1 on a scale of 0–2) may have provided a more accurate estimate of a child's knowledge of a word. This more flexible scoring system may have preserved interscorer agreement and also captured incremental student learning, which went unmeasured in the present study. It is, in fact, quite possible that target word learning levels were underestimated in the present study as a result of the posttest measurements. In future studies, a rubric that credits students for partial understanding or emerging conceptualization of target words may provide more accurate measurements of incremental vocabulary growth.
The significant contribution of vocabulary to reading achievement underscores the importance of establishing the best methods of delivering instruction to young children. The present study examined a framework for providing tiered vocabulary instruction to students at risk for school failure. Results of the present study suggest that students at risk for reading failure who receive a second tier of vocabulary instruction gain significantly more vocabulary knowledge than AR students who receive only one tier of class–wide vocabulary instruction. However, results from delayed posttest also suggest that AR students lose knowledge over time regardless of whether they receive one or two tiers of instruction. Therefore, future studies must explore the frequency, intensity, and duration of vocabulary instruction necessary for AR students to maintain vocabulary learning at a rate similar to their NAR peers. Future investigations that seek to answer these questions will contribute significant information regarding how to stall and even close the vocabulary gap that contributes to the meaningful differences among young children.
