Relationship Between Reading Pretest and Social Studies Outcomes From PACT: Evidence for a Skill-by-Treatment Interaction

Abstract

Previous research on Promoting Adolescent Comprehension of Text (PACT) found significant student-level variability in outcomes. The current study examined a potential skill-by-treatment interaction with 1,376 eighth-grade students from 13 middle schools as part of a larger evaluation study. Treatment students scored higher than control students on a measure of social studies content knowledge among students who scored 1 SD below the mean, at the mean, and 1 SD above the mean on preintervention measures of reading comprehension and reading fluency. Using social studies content knowledge, pretest reading comprehension and fluency resulted in estimates of area under the curve of .77 (95% CI = [.76, .78]) and .56 (95% CI = [.55, .57]), respectively. Both sets of pretest data identified a threshold score that approximated the 30th percentile. Thus, there appeared to be a skill-by-treatment interaction for PACT based on preintervention reading skills.

Keywords

academic achievement, content area instruction, comprehension, reading, middle school(s)

College- and career-readiness standards require students to comprehend increasingly complex disciplinary texts in content courses such as history, science, and social studies (Heller & Greenleaf, 2007; Wei et al., 2016). However, adolescents are often only minimally prepared to comprehend challenging content text independently (Swanson et al., 2014), which is evidenced by the 29% of students in eighth grade who scored in the proficient range on the National Assessment of Educational Progress for reading (National Center for Education Statistics, 2022). Poor reading comprehension among adolescent students in the United States is a crisis that needs to be addressed (Salinger, 2011).

Students reading below a proficient level are highly likely to experience difficulty in content courses relying on sophisticated literacy skills (Salinger & Osher, 2018). The mismatch between student skills and task demands in social studies and other content courses can cause students to become increasingly disengaged from school and affect their academic self-efficacy (Beri & Stanikzai, 2018), both of which can negatively affect success in high school, post-secondary education, and the workplace (Daggett & Hasselbring, 2014). Although influencing adolescent students’ comprehension is challenging (Goldman et al., 2016), previous research investigating the effectiveness of interventions addressing the needs of adolescents and focusing specifically on reading skills has shown promising but mixed effects (see Edmunds et al., 2009; Herrera et al., 2016; Wanzek et al., 2013 for meta-analyses).

Promoting Adolescent Comprehension of Text

Vaughn et al. (2013) developed the Promoting Adolescent Comprehension of Text (PACT) framework to address the unique demands of content area texts. The research team aimed to supplement and support typical social studies content lessons by incorporating evidence-based reading comprehension and vocabulary practices within the instruction. Vaughn and colleagues (2013) studied PACT within three distinct American History units commonly included in eighth-grade standards (Colonial America, the Road to Revolution, and the Revolutionary War). Each unit cycle contains five components: (a) a comprehension canopy that introduces the unit, (b) an essential words routine to teach vocabulary, (c) knowledge acquisition through text-based instruction and critical reading, (d) team-based learning comprehension checks that are first completed individually and then with group-based activities, and (e) team-based learning knowledge application in which students engage in text-based discourse to complete an activity that requires them to articulate new perspectives, solve problems, and present conclusions. Through its five components, PACT emphasizes students’ critical thinking and analytical skills rather than focusing solely on the memorization of factual information.

Several studies have examined the impact of PACT on whole classes of social studies students. Vaughn et al. (2013) found that PACT led to significantly higher scores, as compared to a control group, on measures of social studies content (ES = 0.17), social studies comprehension (ES = 0.29), and even general reading comprehension (ES = 0.20); the group difference for content knowledge was maintained 4 weeks after the intervention ended (ES = 0.25). Vaughn and colleagues (2015, 2016; Roberts et al., 2023; Vaughn & Wanzek, 2024) also conducted large-scale randomized trials with eighth-grade general education American History classes, which led to significantly positive effects on social studies knowledge and social studies comprehension, with small effects on general reading comprehension as well. Previous research also found significant effects from PACT on social studies knowledge acquisition and comprehension among students with disabilities (ES = 0.26 and 0.34, respectively; Swanson et al., 2015), but Wanzek et al. (2016) found positive effects (ES = 0.26) for students with disabilities only on the social studies knowledge acquisition measure and not on social studies comprehension measures or for more general reading achievement. A randomized trial with 135 teachers and over 7,000 students at 48 schools again found significant effects for content acquisition (ES = 0.13) and social studies comprehension (ES = 0.35) that were maintained 9 weeks later, but the data suggested differential effects at the student level (Roberts et al., 2023).

Vaughn et al. (2019) examined the effects of PACT on 359 eighth-grade students who were identified with reading comprehension difficulties as compared to 331 students with reading comprehension difficulties who did not participate in PACT. Although the students generally scored better on measures of content acquisition, content comprehension, and reading comprehension, the effect was strongest for classrooms with smaller numbers of students with reading difficulties and became negligible to small for classrooms with a high percentage of students with reading comprehension difficulties.

In general, previous research on PACT has shown positive impacts on students’ social studies knowledge and comprehension when their teachers implement PACT strategies. However, the findings from the extant research point to the need for further research to better understand the conditions and mechanisms under which PACT can work and for whom.

Skill-by-Treatment Interaction

Educational researchers have long been interested in identifying students for whom specific interventions would have a higher likelihood for success, but most research efforts focused on assessing student aptitudes and were not successful in finding an aptitude-by-treatment interaction (Burns et al., 2016; Stuebing et al., 2009, 2015). A recent alternative to aptitude-by-treatment interaction is called skill-by-treatment interaction because it focuses on measuring skills that are malleable to instruction rather than stable psychological characteristics (Burns et al., 2010). The premise of skill-by-treatment interaction was based on previous research that found a low correlation between measures of cognitive ability and reading growth during reading interventions (r = −.11) but a significantly stronger correlation between reading growth and baseline measures of reading fluency (r = .37) and word attack skills (r = .36; Scholin & Burns, 2012).

The link between preintervention reading scores and intervention effectiveness is well documented (Kirby et al., 2012; Lervåg & Aukrust, 2010), and preintervention data have accurately identified reading interventions that had a higher likelihood for strong effects (Burns et al., 2022). Previous research studied the use of reading fluency data to predict effective fluency interventions (Parker & Burns, 2014; Szadokierski et al., 2017) and reading comprehension responses to predict reading comprehension interventions (McMaster et al., 2012). Course grades in history were also predicted by reading comprehension (r = .40) and reading fluency (r = .35) among students in fourth through ninth grades (Bigozzi et al., 2017). However, no previous skill-by-treatment interaction research examined the effect on content acquisition for content courses like social studies, and examining the relationship between preintervention reading variables like fluency and comprehension may help better understand for which students PACT would be most effective.

Purpose

PACT has been shown to enhance social studies content acquisition with middle-school students (Vaughn et al., 2013, 2015, 2017), but research with students with low reading skills has found inconsistent results (Swanson et al., 2015; Wanzek et al., 2016). Moreover, reading skills seem to moderate the effects of PACT, with lower outcomes among classrooms with high numbers of low readers (Vaughn et al., 2019). Previous research found that students’ skill levels in reading fluency and comprehension predicted intervention effects (McMaster et al., 2012; Szadokierski et al., 2017) and grades in history courses (Bigozzi et al., 2017). Therefore, the current study aimed to test for a skill-by-treatment interaction (Burns et al., 2010) by examining the extent to which preintervention reading scores predicted social studies outcomes associated with the PACT intervention. The following research questions guided the study: (a) What is the association between preintervention reading fluency and comprehension scores and content knowledge acquisition among middle-school students participating in PACT? and (b) To what extent do preintervention reading fluency and comprehension scores accurately identify students with proficient content knowledge acquisition among students participating in PACT?

Method

The current study involved students who participated in the first cohort of a larger, ongoing evaluation of the PACT intervention. The larger study is a randomized control trial (RCT) with four cohorts of middle schools, for a proposed total of approximately 90 schools.

Participants

Cohort 1 of the ongoing study included 13 middle schools in four Midwestern and Southeastern United States school districts. For randomization and analyses purposes, we blocked schools by district. Within each block, we randomly assigned schools to one of two conditions: PACT treatment or business-as-usual control (BAU). Seven schools were randomly assigned to the treatment condition, and six schools were in the BAU condition.

Participants included 1,376 eighth-grade students (52.8% male) enrolled in Cohort 1. There were 738 students in the treatment condition and 638 in the BAU condition. Based on data collected from the districts, 46.4% of the students were White, 12.7% were Black, 7.63% were Hispanic, 5.3% were Multiracial, 2.3% were Asian, 0.9% were other, and 24.7% were unreported. Most students (68.8%) were native English speakers (23.4% unreported) and did not receive English learner (96.3%) or special education services (91.1%).

A total of 14 teachers were in the treatment condition, and 12 were in the BAU condition. The teachers had an average of 13.9 years of experience (range, 0–27 years), and most (61.1%) had a master’s degree. Each district was supported by a coach who was hired and trained by the research team and was based locally near the schools. The three coaches (two districts shared a coach) were all veteran teachers.

Measures

Mirroring prior studies of PACT, we used one measure of reading comprehension at pretest, and one measure of social studies content knowledge at posttest. We also administered a measure of reading fluency at the pretest. The three measures used in the current study were a subset of the measures used in the overall RCT. All of the measures were conducted with paper and pencil in the social studies classrooms and were administered by the participating teachers. The participating school districts required that the teachers administer the assessments, rather than independent data collectors, and all teachers received a 2-hr training on how to conduct the assessments approximately 1 to 2 weeks before administering the pretests.

Reading Comprehension

Reading comprehension was measured with the fourth edition of the Gates–MacGinitie general reading comprehension subtest (GMRT-4; MacGinitie et al., 2006). The overall project used the GMRT-4 data for pre- and post-testing, but the current analyses focused on the pretest data. The GMRT-4 is a 35-min timed assessment of reading comprehension that consists of expository and narrative passages ranging in length from 3 to 15 sentences. Students read passages silently and answer three to six multiple-choice questions that increase in difficulty as students progress through the assessment. The internal consistency ranged from .91 to .93, and alternate form reliability ranged from .80 to .87 (MacGinitie et al., 2006).

Reading Fluency

We used the Test of Silent Contextual Reading Fluency: Second Edition (TOSCRF-2; Hammill et al., 2014) as a pretest measure of reading fluency. The TOSCRF-2 measures silent reading fluency for students between the ages of 7 and 25. Students are presented with progressively longer sentences written in all capital letters without punctuation or spaces between words. Students are then asked to place slashes between the words. Alternate-form and test–retest reliability generally exceeded .80 for most age groups (Hammill et al., 2014), and data from the TOSCRF-2 generally correlated well with other measures of reading (Wissinger et al., 2023). The TOSCRF-2 was administered in a group format and the assessment is timed for exactly 3 min. The data consisted of the age-based standard score for each student with a normative mean of 100 and a standard deviation of 15.

Social Studies Knowledge

We used the Assessment of Social Studies Knowledge (ASK; Vaughn et al., 2013) to measure social studies content knowledge acquisition. The ASK is a 42-item, four-option, untimed multiple-choice assessment of a student’s social studies content for the three units of the intervention. Items with known difficulty parameters were collected with permission from released items from state (Texas and Massachusetts) and Advanced Placement (College Board) social studies tests (Vaughn et al., 2013). Internal consistency of ASK data in previous research equaled α = .89 (Vaughn et al., 2015). The ASK measure was administered as a posttest immediately following the intervention, and as a delayed posttest 13 to 15 weeks after completing units. The data were the number of items correctly completed.

Intervention

Vaughn et al. (2013) initially developed and tested the PACT intervention to improve literacy strategies within social studies instruction. As part of eighth-grade social studies instruction, the intervention was implemented in the treatment classrooms with three United States History units (Colonial America, Road to Revolution, and Revolutionary War). Each PACT unit consists of a 10-day cycle, for a total of 30 instructional sessions that were intended to be 45 min each. Each unit contains five components designed to infuse evidence-based reading comprehension and vocabulary practices within typical classroom history instruction. We describe the five intervention components of PACT below.

Comprehension canopy

Each unit began with an introductory “comprehension canopy” on Day 1 that built motivation and presented an overarching issue or question to guide the purpose of reading and knowledge acquisition throughout the 10-day cycle. For example, students watched a short video, after which the teacher presented a series of questions to build background knowledge and connect new content to prior learning (e.g., “How did events and ideas lead to the signing of the Declaration of Independence?”).

Essential words routine and warm up

On Day 1, essential vocabulary words related to the unit were also taught. Students were introduced to four or five high-frequency keywords that were considered critically important to the content. The lesson for three or four other days then began with a warm-up, which was a brief review of the words through visual representations and turn-and-talk activities.

Text-based instruction and critical reading of text

Students read informational texts during focused sessions throughout each 10-day cycle; sessions were whole-class, small-group, paired, and/or individual silent reading. Teachers’ activities included providing a brief introduction, sharing a video clip, presenting a map to establish the context, or supplementing students’ understanding of text and content through lectures, PowerPoint presentations, and supporting visual and/or auditory materials. Periodically, teachers asked direct questions about content or questions that required inferencing. Throughout the activity, students recorded notes in a learning log that assisted with their organization of key information.

Team-based learning comprehension checks

In each cycle, the students completed two short comprehension checks (five multiple-choice questions) and one long comprehension check (10 multiple-choice questions and one open-ended writing question). For both types of checks, students first completed them individually without the use of text as a measure of individual accountability of content knowledge. Students then moved into a team-based learning (TBL) activity in which they worked in heterogeneous pairs or small groups and completed the same check with text materials and peer discourse.

TBL knowledge application

On Day 9, students engaged in text-based discourse to complete an activity that required them to articulate new perspectives, solve problems, and present conclusions. The teachers evaluated and provided immediate feedback and prompted student groups to extend their thinking and collaboration. The team recorded ideas in a graphic organizer that provided structure for the discourse and held students accountable for contributing text-based evidence to the discourse. Teams shared their final written product and received immediate feedback.

Procedure

Teachers received 10 to 12 hr of training, which focused on implementing the five intervention components and classroom procedures to facilitate student use of textual evidence. Teachers were all trained by coaches who were hired to support PACT implementation, and the training occurred at the district level approximately 1 to 3 weeks before beginning the first unit (Colonial America). Coaches also provided support for individual teachers (a) before each unit to establish goals and assess teacher understanding; (b) during each unit to model, co-teach, provide feedback, and answer questions; and (c) after each unit to debrief implementation and examine if the goals established for the unit were met. The teachers also received an additional half-day training on assessment administration.

After being trained, the teachers collected pretest data by group administering the GMRT-4 and TOSCRF-2 over two sessions. The teachers then implemented the PACT intervention during their Colonial America, Road to Revolution, and Revolutionary War units, which were either the first three units or three of the first five units covered in eighth-grade social studies. The aforementioned coaching occurred during each unit. The teachers administered the ASK assessment within 2 weeks of completing the final unit. Teachers administered this assessment again 13 to 15 weeks later as a delayed posttest.

Fidelity

Treatment fidelity was assessed by having each teacher audio record two randomly selected components from each unit. The teachers were given a handheld audio recorder and were instructed to record only the PACT component during the lesson. After each lesson was recorded, the audio file was uploaded to a secure website. Two members of the research team, who were PhD students in school psychology and special education, coded the audio recordings.

The trained coders listened to the audio recordings and used a checklist protocol that was established in previous PACT research (Roberts et al., 2023; Swanson et al., 2015, 2017). The checklist contained steps for each PACT component and allowed for the degree of fidelity to be established. Specifically, there were five steps or items for comprehension canopy, seven for essential words, two for warm-up, six for critical reading of text, seven for team-based learning knowledge comprehension check, and nine for team-based learning knowledge application check. Each item was coded as 0 (absent) or 1 (observed). Each component was then given an overall rating of 0 (no components observed), 1 (components observed but required elements not observed), 2 (completes a few [i.e., less than 50%] of the required elements), 3 (completes a majority [i.e., more than 50%] of the required elements), or 4 (completes all required elements).

The coders completed a 3-hr training session on the use of the fidelity checklists. After completing the training, the coders individually coded the same audio-recorded lessons, and the codes were compared to an expert’s master coding with a point-by-point comparison. Before beginning the fidelity coding, each coder had to reach 90% exact interrater agreement with the expert coding criterion. Then, 33% of codings were dual coded to continue to assess interrater reliability. PACT component ratings (0–4) between the two coders correlated at r = .97, p < .001, and resulted in a weighted κ of .82 (95% CI = [.68, .94], p < .001), which suggested adequate interrater agreement.
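The weighted kappa reported above can be illustrated with a minimal sketch. The study does not state which weighting scheme was used, so linear disagreement weights are an assumption here, and the two rating lists are hypothetical values on the 0–4 component scale rather than study data.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories=(0, 1, 2, 3, 4)):
    """Linear-weighted Cohen's kappa for two raters on an ordinal scale.

    Linear weights are an assumption; the source only reports a "weighted" kappa.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(rater_a)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    # Observed joint counts and each rater's marginal counts.
    joint = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)
    observed = 0.0
    expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at maximum distance
            observed += weight * joint[(i, j)] / n
            expected += weight * (marg_a[i] / n) * (marg_b[j] / n)
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - observed / expected


# Hypothetical component ratings from two coders (not study data).
coder_1 = [4, 4, 3, 3, 2, 4, 3, 4, 1, 0]
coder_2 = [4, 3, 3, 3, 2, 4, 4, 4, 1, 0]
print(round(weighted_kappa(coder_1, coder_2), 3))
```

Because the weights grow with the distance between ratings, a 3-versus-4 disagreement is penalized less than a 0-versus-4 disagreement, which suits an ordinal fidelity scale.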

As shown in Table 1, the mean fidelity score was 3.49 (SD = 0.50), with 87.8% of the component observations being at least a 3.00. There was high fidelity across the three units. The units were implemented in order, with Colonial America being first, Road to Revolution second, and Revolutionary War third. Mean fidelity scores increased with unit order, which could suggest that teachers’ implementation of the components improved over time and with repetition. The control teachers also submitted two audio recordings per unit, which were also scored with the PACT fidelity tool and resulted in a mean rating of 0.94 (SD = 0.82), which was significantly lower (t = 9.75, p < .001, g = 3.95) than the rating for the treatment teachers. This suggests that BAU instruction did not generally reflect the approach or components of PACT.

Table 1.

Summary of Fidelity for Each Intervention Component.

Unit	Number of audios coded	Mean score	% positive (3 or 4)
Colonial America	26	3.43	84.6%
Road to Revolution	20	3.55	90.0%
Revolutionary War	3	3.60	100%
Total	49	3.49	87.8%
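The Total row of Table 1 can be recovered from the unit rows. The short sketch below pools the unit means by the number of audios coded and rebuilds the percentage of positive ratings from whole counts; the variable names are illustrative only.

```python
# (number of audios coded, mean score, proportion rated 3 or 4) per unit, from Table 1
units = {
    "Colonial America": (26, 3.43, 0.846),
    "Road to Revolution": (20, 3.55, 0.900),
    "Revolutionary War": (3, 3.60, 1.000),
}

total_n = sum(n for n, _, _ in units.values())
# Weight each unit mean by its number of coded recordings.
pooled_mean = sum(n * mean for n, mean, _ in units.values()) / total_n
# Recover whole counts of positive ratings before pooling the percentage.
positive = sum(round(n * share) for n, _, share in units.values())
pooled_pct = 100 * positive / total_n

print(total_n, round(pooled_mean, 2), round(pooled_pct, 1))  # 49 3.49 87.8
```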

Analyses

Data were obtained as part of an ongoing randomized evaluation of PACT, but the data for the current research questions were examined with regressions. Before conducting analyses to address the research questions, we followed the recommendation of Aguinis et al. (2013) and removed outliers (0.0%–5.0%) that fell outside the expected range of ± 2.24 standard deviations. Furthermore, we removed 182 (13.2%) TOSCRF scores that equaled standard scores of 160 because those assessments appeared to have been administered erroneously by six teachers (23.1%), who allowed students to complete the entire assessment rather than stop after 3 min as required by standardized administration procedures.
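The ± 2.24 SD screen can be sketched as follows. This is a minimal illustration and not the authors' code: the function name and the scores are hypothetical, and it assumes the mean and standard deviation are computed once on the full set of observed scores for a variable.

```python
from statistics import mean, stdev


def screen_outliers(scores, z_cutoff=2.24):
    """Keep scores within z_cutoff sample standard deviations of the mean."""
    center = mean(scores)
    spread = stdev(scores)  # sample standard deviation (n - 1 denominator)
    return [s for s in scores if abs(s - center) <= z_cutoff * spread]


# Hypothetical standard scores with one extreme value (not study data).
scores = [90, 95, 100, 105, 110, 100, 98, 102, 97, 103, 200]
kept = screen_outliers(scores)
print(len(scores) - len(kept), "score(s) removed")
```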

Missing data

We examined missingness across our variables of interest and found that missing data ranged between 25% and 44%. As such, we used STATA17 (StataCorp, 2021) to conduct multivariate imputation by chained equations (MICE) to impute missing data on all predictor and outcome variables of interest. We imputed data for all variables using predictive mean matching with 10 nearest neighbors and imputed 40 data sets, consistent with the White et al. (2011) recommendations. We accounted for the nested nature of the data and imputed by treatment condition. We used the final pooled, multiply imputed data set to conduct all analyses in STATA17 (StataCorp, 2021) to answer our research questions. We also conducted a sensitivity analysis in which we ran the analyses using the non-multiply imputed data. Non-multiply imputed data were generally equivalent to multiply imputed data, with results following the same trends as the multiply imputed data for all outcomes.
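The core idea of predictive mean matching can be shown in a deliberately simplified sketch. It uses a single predictor, point estimates from ordinary least squares, and no chained cycling across variables, so it is not equivalent to the Stata procedure described above. The function names, the seed, and the example data are all hypothetical; only the 10 nearest neighbors and the 40 data sets come from the text.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def pmm_impute(x, y, k=10, rng=None):
    """Fill None entries of y with an observed value from one of the k donors
    whose predicted mean is closest to the predicted mean of the missing case."""
    if rng is None:
        rng = random.Random(0)
    donors = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    a, b = fit_line([d[0] for d in donors], [d[1] for d in donors])
    donor_preds = [(a + b * xi, yi) for xi, yi in donors]
    completed = []
    for xi, yi in zip(x, y):
        if yi is not None:
            completed.append(yi)  # observed values are never altered
            continue
        target = a + b * xi
        nearest = sorted(donor_preds, key=lambda d: abs(d[0] - target))[:k]
        completed.append(rng.choice(nearest)[1])
    return completed


def impute_m_datasets(x, y, m=40, k=10, seed=0):
    """Return m completed copies of y, each with its own random donor draws."""
    rng = random.Random(seed)
    return [pmm_impute(x, y, k=k, rng=rng) for _ in range(m)]
```

Because every imputed value is borrowed from an observed case, the completed data stay within the range of scores that actually occurred, which is one reason predictive mean matching is often preferred for bounded test scores.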

We present the results obtained from the multiply imputed data because this technique helps reduce bias, unlike listwise or pairwise deletion methods, and is a recommended approach for addressing missing data in behavioral research (Woods et al., 2021). Moreover, research in public health suggests that multiple imputation produces less biased estimates even with high missing percentages (Lee & Huber, 2021). We also conducted a full information maximum likelihood (FIML) analysis and replaced missing values with the mean to confirm the accuracy of the multiple imputation procedure. The results from all methods to address missingness produced similar results and established the robustness of the multiple imputation procedure.

Research Question 1—initial reading skill and PACT treatment

Students’ ASK Content scores at the post-test and delayed post-test were the outcomes of interest for the first research question. We conducted a series of multivariate multiple regression analyses to answer the question using robust standard errors. We entered students’ treatment conditions, pretest fluency scores, and pretest comprehension scores into the regression model and grand-mean centered all continuous variables. We also created two interaction terms: (a) treatment condition and pretest fluency, and (b) treatment condition and pretest comprehension. Finally, we controlled for district-fixed effects.
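One way to write the model described above is sketched below. The notation is ours and is an assumption rather than the authors' specification: T_i is 1 for PACT and 0 for BAU, F_i and C_i are grand-mean-centered pretest fluency and comprehension, and \delta_{d(i)} stands for the district term, whose exact coding is not detailed in the text.

```latex
\begin{align}
\mathrm{ASK}_i &= \beta_0 + \beta_1 T_i + \beta_2 F_i + \beta_3 C_i
  + \beta_4 (T_i \times F_i) + \beta_5 (T_i \times C_i)
  + \delta_{d(i)} + \varepsilon_i \\
\Delta(f, c) &= \mathrm{E}[\mathrm{ASK} \mid T = 1, f, c]
  - \mathrm{E}[\mathrm{ASK} \mid T = 0, f, c]
  = \beta_1 + \beta_4 f + \beta_5 c
\end{align}
```

Under this form, \beta_1 is the treatment-control difference for a student at the grand mean on both pretests, and the interaction coefficients \beta_4 and \beta_5 describe how that difference changes per point of pretest fluency and comprehension, respectively.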

Research Question 2—accurate identification of proficient content knowledge

To answer our second research question and gain a deeper understanding about for whom the PACT intervention was most successful, we conducted a receiver operating characteristics (ROC) analysis on the ASK Content posttest outcome. ROC curve analyses are frequently used in social sciences to identify the score that best balances false-positive and false-negative errors (VanDerHeyden et al., 2017). In other words, we explored whether the data suggested a potential pretest fluency and comprehension cut score threshold that would suggest poor performance on the outcome criterion (i.e., ASK Content).

First, we set the ASK Content posttest score cut point for the ROC analyses at 70% accuracy, or a raw score at or above 29. We then determined the optimal cut scores for the pretest fluency and comprehension assessments using Youden’s index J (see Baker et al., 2015; Schisterman et al., 2008) and included the full sample of treatment students to maximize both sensitivity (i.e., the proportion of students who were correctly identified as at risk of not meeting the standard) and specificity (i.e., the proportion of students who were correctly identified as not at risk of not meeting the standard). We reported an area under the curve estimate as a summary of how well the data identified students who scored at or above the outcome criterion. Finally, we used the pooled multiply imputed data set to run a regression analysis. Our dependent variable was students’ ASK Content posttest scores. Our independent variable was the pretest condition (i.e., at or above cut score on both pretests, at or above on comprehension only, at or above on fluency only, below cut score on both pretests), which was dummy-coded. We controlled for district-fixed effects.
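The cut-score search can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' procedure: a student is flagged as at risk when the pretest score falls below a candidate cut, every observed pretest score is tried as a candidate, and the pretest and ASK values are hypothetical. Only the outcome criterion of 29 comes from the text.

```python
def youden_cut(pretest, outcome, criterion=29):
    """Return (cut, J, sensitivity, specificity) for the cut maximizing Youden's J."""
    at_risk = [o < criterion for o in outcome]  # did not reach the ASK criterion
    n_pos = sum(at_risk)
    n_neg = len(at_risk) - n_pos
    best = None
    for cut in sorted(set(pretest)):
        flagged = [p < cut for p in pretest]  # predicted at risk
        tp = sum(f and r for f, r in zip(flagged, at_risk))
        tn = sum((not f) and (not r) for f, r in zip(flagged, at_risk))
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (cut, j, sens, spec)
    return best


def auc(pretest, outcome, criterion=29):
    """Probability that a student at or above the criterion has a higher pretest
    score than a student below it (ties count as one half)."""
    met = [p for p, o in zip(pretest, outcome) if o >= criterion]
    not_met = [p for p, o in zip(pretest, outcome) if o < criterion]
    wins = sum((a > b) + 0.5 * (a == b) for a in met for b in not_met)
    return wins / (len(met) * len(not_met))


# Hypothetical pretest and ASK posttest raw scores (not study data).
pretest = [10, 14, 18, 20, 22, 26, 30, 34]
ask = [15, 20, 31, 25, 27, 33, 35, 38]
print(youden_cut(pretest, ask))  # (26, 0.75, 1.0, 0.75)
print(auc(pretest, ask))         # 0.875
```

An area under the curve of .50 corresponds to chance-level identification, which helps put the abstract's estimates of .77 for comprehension and .56 for fluency in context.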

Results

Table 2 presents descriptive statistics for student pretest and posttest scores (all, intervention, and control). Correlations between initial skills revealed that students’ pretest fluency was positively and significantly correlated with their pretest reading comprehension skills (r = .21, p < .001). This suggests that, at the pretest, students with higher fluency tended to also have higher comprehension, although this correlation was relatively small.

Table 2.

Descriptive Data for Study Variables.

Measure	Treatment (n = 738), M (SE)	Control (n = 638), M (SE)	Total (n = 1,376), M (SE)
Pretest
Reading Comprehension	25.89 (0.42)	25.07 (0.50)	25.51 (0.32)
Reading Fluency	97.69 (0.77)	88.96 (1.06)	93.64 (0.66)
ASK Content Posttest	23.76 (0.39)	18.06 (0.38)	21.12 (0.29)
ASK Content Delayed Posttest	22.85 (0.42)	18.44 (0.35)	20.81 (0.30)

Notes. Reading comprehension was measured with the Gates–MacGinitie general reading comprehension subtest (GMRT-4; MacGinitie et al., 2006), reading fluency was measured with the Test of Silent Contextual Reading Fluency: Second Edition (TOSCRF-2; Hammill et al., 2014), and ASK = Assessment of social studies knowledge (Vaughn et al., 2013).

Association Between Initial Reading Skill and PACT Treatment

The first research question inquired about the relationship between initial reading skill and PACT treatment. The results of the regression analyses are presented in Table 3.

Table 3

Social Studies Knowledge Acquisition Scores (ASK) Regressed Onto Student Variables and Preintervention Scores.

	ASK content posttest		ASK content delayed posttest
Predictor	M (SE)	p-value	M (SE)	p-value
Intercept	16.48 (0.66)	<.001	16.27 (0.68)	<.001
District	0.78 (0.23)	.001	0.93 (0.25)	<.001
Treatment	4.88 (0.53)	<.001	3.94 (0.51)	<.001
Pretest Fluency (gmc)	0.08 (0.04)	.036	0.04 (0.03)	.282
Pretest Comprehension (gmc)	0.37 (0.04)	<.001	0.38 (0.03)	<.001
Pretest Fluency (gmc) × Treatment	-0.09 (0.05)	.050	-0.09 (0.04)	.033
Pretest Comprehension (gmc) × Treatment	0.13 (0.05)	.011	0.13 (0.05)	.011

Notes. gmc = grand mean centered, reading comprehension was measured with the Gates–MacGinitie general reading comprehension subtest (GMRT-4; MacGinitie et al., 2006) and reading fluency was measured with the Test of Silent Contextual Reading Fluency: Second Edition (TOSCRF-2; Hammill et al., 2014).

Pretest reading comprehension

There was a significant pretest comprehension by treatment interaction for the ASK Content posttest. Therefore, we probed the interaction at 1 SD above and below the mean of students’ pretest comprehension scores. As shown in Figure 1, there was a significant difference between treatment and control students’ scores on the ASK Content posttest at 1 SD below the mean (3.40, p < .001), at the mean (4.80, p < .001), and at 1 SD above the mean (6.20, p < .001). These p-values remained significant after applying the Benjamini–Hochberg (1995) correction for the false discovery rate. The differences in the posttest were also practically meaningful, with effect sizes of d = 0.34, 0.49, and 0.63, respectively.
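The Benjamini–Hochberg step-up rule can be sketched briefly. The function name, the false discovery rate of .05, and the p-values are illustrative assumptions; the text does not report which family of tests was corrected together.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the test is rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank whose p-value is at or below (rank / m) * q.
    max_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * q:
            max_rank = rank
    # Reject every hypothesis ranked at or below that position.
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[i] = True
    return rejected


# Hypothetical p-values (not study data).
print(benjamini_hochberg([0.02, 0.03, 0.035, 0.5]))  # [True, True, True, False]
```

In the example, the two smallest p-values exceed their own thresholds but are still rejected because the third meets its threshold, which is the step-up feature that distinguishes this procedure from a Bonferroni-style correction.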

Figure 1.

Residual Analysis of Social Studies Content Acquisition (ASK) by Comprehension (GMRT-4).

As shown in Table 4, the differences between treatment and control groups followed the same trend and were significant for the ASK Content delayed posttest 6 to 12 weeks later. At the delayed posttest, there were significant and practically meaningful differences between treatment and control students’ scores at 1 SD below the mean (2.47, p = .002, d = 0.25), at the mean (3.86, p < .001, d = 0.39), and at 1 SD above the mean (5.24, p < .001, d = 0.54). The p-values remained significant after applying the Benjamini–Hochberg (1995) correction for false discovery rate.

Table 4.

Effect Size Estimates (d) Comparing Content Acquisition (ASK) Scores Between the Treatment and Control Groups by Preintervention Reading Score Group.

Preintervention measure	ASK posttest			ASK delayed posttest
Preintervention measure	Below average	Average	Above average	Below average	Average	Above average
Reading Comprehension	0.34*	0.49*	0.63*	0.25*	0.39*	0.54*
Reading Fluency	0.65*	0.50*	0.35*	0.54*	0.40*	0.26*

Notes. Reading comprehension was measured with the Gates–MacGinitie general reading comprehension subtest (GMRT-4; MacGinitie et al., 2006) and reading fluency was measured with the Test of Silent Contextual Reading Fluency: Second Edition (TOSCRF-2; Hammill et al., 2014). ASK = Assessment of social studies knowledge (Vaughn et al., 2013). The below average group fell at least 1 SD below the mean of the preintervention measure, the average group fell between 1 SD below and 1 SD above the mean on the preintervention measure, and the above average group had a score that was at least 1 SD above the mean on the preintervention measure.

*p < .01.

Pretest reading fluency

The pretest fluency by treatment interaction approached significance at the ASK Content posttest and reached statistical significance at the delayed posttest. Thus, we also probed the interaction at the posttest to replicate the analyses completed with the reading comprehension pretest above. As shown in Figure 2, there was a significant difference between treatment and control students’ scores on the ASK Content posttest at 1 SD below the mean (6.38, p < .001), at the mean (4.92, p < .001), and at 1 SD above the mean (3.45, p < .001). These p-values remained significant after applying the Benjamini–Hochberg (1995) correction for the false discovery rate. The differences were practically meaningful, with effect sizes of d = 0.65, 0.50, and 0.35, respectively.

Figure 2.

Residual Analysis of Social Studies Content Acquisition (ASK) by Fluency (Test of Silent Contextual Reading Fluency Second Edition [TOSCRF-2]).

These differences between treatment and control followed the same trend and remained significant at the ASK Content delayed posttest 6 to 12 weeks later. At the delayed posttest, there were significant and practically meaningful differences between treatment and control students’ scores at 1 SD below the mean (5.30, p < .001, d = 0.54), at the mean (3.94, p < .001, d = 0.40), and at 1 SD above the mean (2.58, p = .011, d = 0.26). These p-values remained significant after applying the Benjamini–Hochberg (1995) correction for false discovery rate.
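The Benjamini–Hochberg step-up procedure referred to above can be sketched as follows. The function name is hypothetical. The grouping of the six fluency comparisons into one family and the 5% false discovery rate are assumptions, and p-values reported as "< .001" are entered at their upper bound of .001.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per p-value: True if rejected at false discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) whose sorted p-value is <= (k / m) * q.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_k = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Assumed family: three posttest and three delayed-posttest fluency comparisons.
# "< .001" is entered as .001, which is conservative for this procedure.
fluency_p_values = [0.001, 0.001, 0.001, 0.001, 0.001, 0.011]
print(benjamini_hochberg(fluency_p_values))
```

With these inputs every comparison stays significant, because even the largest p-value (.011) falls below its step-up bound of .05.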

Accurate Identification of Students With Proficient Content Knowledge Acquisition

With the ASK Content posttest cut point for the ROC analyses set at 70% accuracy (a raw score at or above 29), we found that 37.9% of our treatment sample fell at or above this criterion (M = 34.20, SD = 0.22), and 62.1% fell below it (M = 17.38, SD = 0.34). Pretest reading comprehension resulted in an area under the curve estimate of .77 (95% CI = [.76, .78]) and identified a threshold score of at or above 26 (sensitivity = 73.94%, specificity = 70.02%). Pretest reading fluency resulted in an area under the curve estimate of .56 (95% CI = [.55, .57]) and identified a threshold standard score of at or above 90 (sensitivity = 70.94%, specificity = 39.80%). These thresholds equated to approximately the 30th percentile on each measure.
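For readers less familiar with these statistics, the sketch below shows how the area under the curve, sensitivity, specificity, and a threshold can be computed from pretest scores and a binary criterion. It assumes the threshold is chosen by maximizing Youden's J (sensitivity + specificity - 1), which this section does not confirm. The function names and the ten-student data set are made up for illustration.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Classify score >= threshold as 'met criterion' and compare with the labels."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def area_under_curve(scores, labels):
    """Probability that a random positive case outscores a random negative case."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_threshold(scores, labels):
    """Observed score that maximizes sensitivity + specificity - 1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: sum(sensitivity_specificity(scores, labels, t)) - 1)


# Toy data (hypothetical): pretest scores and whether the student met the
# 70% criterion on the outcome measure (1 = met, 0 = did not meet).
toy_scores = [12, 18, 22, 25, 26, 28, 30, 33, 35, 40]
toy_labels = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

cut = youden_threshold(toy_scores, toy_labels)
print(cut, sensitivity_specificity(toy_scores, toy_labels, cut))
print(area_under_curve(toy_scores, toy_labels))
```

An area under the curve near .50 means the pretest ranks students no better than chance, which is why the fluency estimate of .56 indicates weak discrimination even though its threshold has reasonable sensitivity.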

We then used students’ pretest scores to dummy-code pretest conditions using the identified thresholds for each pretest measure. We found that 37.67% of our treatment sample performed at or above the threshold on both measures, 11.13% performed at or above the threshold on the comprehension measure only, 27.25% performed at or above the threshold on the fluency measure only, and 23.95% performed below the threshold on both pretest measures. The mean and standard deviation for each pretest condition are shown in Figure 3.
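The dummy-coding step can be sketched as below, using the two thresholds reported above (26 for comprehension, 90 for fluency). The condition labels, function names, and the choice of the "both" group as the reference category are assumptions made for illustration.

```python
COMPREHENSION_THRESHOLD = 26  # GMRT-4 threshold score reported in the text
FLUENCY_THRESHOLD = 90        # TOSCRF-2 threshold standard score reported in the text

CONDITIONS = ["both", "comprehension_only", "fluency_only", "neither"]


def pretest_condition(comprehension, fluency):
    """Assign a student to one of four conditions based on the two thresholds."""
    meets_comprehension = comprehension >= COMPREHENSION_THRESHOLD
    meets_fluency = fluency >= FLUENCY_THRESHOLD
    if meets_comprehension and meets_fluency:
        return "both"
    if meets_comprehension:
        return "comprehension_only"
    if meets_fluency:
        return "fluency_only"
    return "neither"


def dummy_codes(condition, reference="both"):
    """0/1 indicators for every condition except the reference category."""
    return {c: int(c == condition) for c in CONDITIONS if c != reference}


# Hypothetical student: comprehension below its threshold, fluency above.
condition = pretest_condition(comprehension=22, fluency=95)
print(condition, dummy_codes(condition))
```

With "both" as the reference, each regression coefficient on an indicator is the difference between that condition and students who met both thresholds.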

Figure 3.

Regression of Assessment of Social Studies Knowledge (ASK) Data Using Pretest Reading Conditions.

As seen in Figure 3, students who were at or above the threshold on both pretests differed significantly on the ASK Content posttest from students who were below the threshold on both pretests (9.64, p < .001) and from students who performed at or above the threshold on the fluency measure only (9.51, p < .001). Furthermore, both differences were practically meaningful (d = 0.94). There was not a statistically significant or practically meaningful difference between students who were at or above the threshold on both pretests and students who performed at or above the threshold on the comprehension measure only (1.23, p = .383, d = 0.18). Thus, if students perform at or above the 30th percentile on a pretest measure of reading comprehension, they are likely to benefit from the PACT intervention.

Discussion

The current study examined the association between preintervention reading fluency and comprehension scores and social studies content knowledge acquisition among middle-school students participating in PACT. We were interested in how well preintervention reading scores identified students who demonstrated proficient content knowledge on a posttest that addressed the three units that were taught. As summarized in Table 4, the PACT intervention led to significantly higher social studies content acquisition scores at posttest and delayed posttest for students at all preintervention reading levels, that is, those who scored below average, average, and above average on the preintervention fluency and comprehension measures. Thus, the intervention appeared to be effective for students with varying reading skills, which was consistent with previous research conducted with students with disabilities (ES = 0.26, Swanson et al., 2015; ES = 0.26, Wanzek et al., 2016).

The current data showed significant variability in social studies content acquisition based on preintervention reading skills, which was consistent with previous research that found differential effects at the student level (Roberts et al., 2023). There appeared to be a relationship between preintervention reading fluency and comprehension and content knowledge acquisition. Students generally completed 70% of the content acquisition measure correctly if they scored at or above the 30th percentile on preintervention measures of reading fluency and comprehension. Previous research also found relationships between preintervention reading proficiency and postintervention reading outcomes (Burns et al., 2022; Szadokierski et al., 2017), but the current investigation was the first study to examine the relationship with content area knowledge such as social studies.

Although both pretest fluency and comprehension scores were associated with social studies content acquisition, the comprehension score seemed to matter more, because students who scored low on the comprehension measure but not on the fluency measure had lower outcome scores (M = 19.25, SD = 8.70) than students who scored low on the fluency measure but not the comprehension measure (M = 27.61, SD = 9.14, d = 0.94). Previous research also found that initial comprehension skills predicted reading intervention outcomes (McMaster et al., 2012). Reading fluency was closely related to reading comprehension among adolescent students (Trapman et al., 2014), and students in middle school with difficulties in reading comprehension often also demonstrated difficulties with reading fluency (Clemens et al., 2017). The area under the curve estimate of .56 for reading fluency, which is close to the chance value of .50, suggested poor discrimination in the outcome score, even though fluency was a significant predictor. Moreover, the fluency score was much more accurate in predicting who earned a criterion score on the outcome than who did not, which suggests an area for future research.
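The effect size in the comparison above can be checked from the reported means and standard deviations. This minimal sketch assumes an equal-weight pooled standard deviation; the two groups actually differed in size, so the pooling used in the study may have been different. The function name is hypothetical.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference with an equal-weight pooled SD (an assumption)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Means and SDs as reported: met the comprehension threshold only vs. met the
# fluency threshold only.
d = cohens_d(mean_a=27.61, sd_a=9.14, mean_b=19.25, sd_b=8.70)
print(round(d, 2))
```

Under this assumption the result rounds to the reported d = 0.94.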

Implications for Practice

The current data may have some implications for practice. The current study and previous research found that PACT was an effective tool for increasing students’ acquisition of social studies content knowledge, even among students who scored below average on a measure of reading fluency or reading comprehension. Thus, practitioners could consider adopting the program to better help students who may experience difficulty in social studies courses, potentially due to reading difficulties. However, given that the 30th percentile on preintervention measures of reading fluency and comprehension predicted differentiated results, students with reading skills below that criterion may require additional support to be proficient in social studies content.

Limitations and Directions for Future Research

Although the current data may be of interest to practitioners and researchers, they should be considered within the context of the limitations inherent to the study design. First, we selected 70% accuracy as the criterion because that is generally considered passing in most grading schemes and seemed to be a reasonable goal to select. However, the criterion was selected arbitrarily, and a higher standard might produce different results. Future researchers could replicate the design with different standards for the criterion measure. Second, the assessments used for the study were related to but external to the school curriculum and were not related to school-based evaluations of student performance in any way. Thus, we cannot be assured that students always put forth their best effort on the measures.

The remaining limitations and directions for future research are related to the participating teachers. The third limitation was that the classroom teachers collected the data after being trained. We did not conduct any ongoing measure of assessment fidelity, and there were some noted errors in administration. Future researchers could either collect data themselves or conduct ongoing assessments of measurement fidelity. Relatedly, the rate of missing data could be considered a study limitation despite our use of multiple imputation to address the missing data. Fourth, we did not record the certification of the participating teachers, and it is not known how many may have been certified to teach English language arts, which could have affected how well the teachers incorporated literacy practices into their curriculum, but that is only a hypothesis for additional research. Fifth, the fidelity coding indicated high fidelity across the three units, but the units were not equally represented in the data because the teachers submitted fewer audio recordings for the final unit (Revolutionary War) than for the other two units. However, a total of 49 audio recordings were coded, which was consistent with the number coded in previous PACT research (e.g., n = 55, Swanson et al., 2015).

In addition to addressing the limitations of the current design, future researchers could build on the current study to examine other malleable skills that could potentially relate to success within the PACT program or other similar interventions designed to increase content knowledge and comprehension. Researchers could also consider potential intervention components to add to the PACT program to enhance its effect with students with low preintervention reading skills. The current study was the first in a potential line of inquiry that could replicate the current criteria, study additional or different criteria, or examine how to enhance PACT for students who do not meet the criteria on preintervention reading measures. For example, future researchers could intensify the comprehension canopy or critical reading of text components of PACT for students with low preintervention comprehension or fluency skills, respectively. Those are the two PACT components that conceptually most closely align with the foundational reading skills, but this is merely a hypothesis for future research.

Conclusion

Reading comprehension and content knowledge acquisition are highly related and both are concerns for adolescent learners. The current study again found PACT to be an effective framework to increase social studies content knowledge, but it was the first to examine variability in social studies content acquisition based on preintervention reading skills. There was a significant skill-by-treatment interaction given that students demonstrated higher social studies content proficiency if their preintervention reading skills were at or above approximately the 30th percentile. However, students with the lowest preintervention reading fluency skills had the largest effect size between treatment and control groups, which also suggested an interaction effect. Thus, additional research is needed to better understand for which students PACT is most effective and how to predict intervention outcomes. Given the number of middle-school students who demonstrate difficulty with reading comprehension and social studies content acquisition, and the positive initial results found here, the additional research seems warranted.

Footnotes

Associate Editor: Robin Ennis

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by grant R305R200002 from the National Center for Education Research in the Institute of Education Sciences. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Institute of Education Sciences or the Department of Education.

ORCID iDs

Matthew K. Burns

Monica E. Romero

References

Aguinis, Gottfredson, R. K., & Joo (2013). Best-practice recommendations for defining, identifying, and handling outliers. Organizational Research Methods, 16(2), 270–301. https://doi.org/10.1177/1094428112470848

Baker, D. L., Biancarosa, Park, B. J., Bousselot, Smith, J. L., Baker, S. K., Kame’enui, E. J., Alonzo, & Tindal (2015). Validity of CBM measures of oral reading fluency and reading comprehension on high-stakes reading assessments in Grades 7 and 8. Reading and Writing, 28, 57–104. https://doi.org/10.1007/s11145-014-9505-4

Benjamini & Hochberg (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B, 57, 289–300. https://doi.org/10.1111/j.2517-6161.1995.tb02031.x

Beri & Stanikzai, M. I. (2018). Self-efficacy beliefs, student engagement and learning in the classroom: A review paper. American International Journal of Research in Humanities, Arts and Social Sciences, 22(1), 213–222.

Bigozzi, Tarchi, Vagnoli, Valente, & Pinto (2017). Reading fluency as a predictor of school outcomes across grades 4–9. Frontiers in Psychology, 8, Article 200. https://doi.org/10.3389/fpsyg.2017.00200

Burns, M. K., Codding, R. S., Boice, C. H., & Lukito (2010). Meta-analysis of acquisition and fluency math interventions with instructional and frustration level skills: Evidence for a skill-by-treatment interaction. School Psychology Review, 39(1), 69–83. https://doi.org/10.1080/02796015.2010.12087791

Burns, M. K., Petersen-Brown, Haegele, Rodriguez, Schmitt, Cooper, Clayton, Hutcheson, Conner, Hosp, & VanDerHeyden, A. M. (2016). Meta-analysis of academic interventions derived from neuropsychological data. School Psychology Quarterly, 31(1), 28–42. https://doi.org/10.1037/spq0000117

Burns, M. K., Young, McCollom, E. M., Stevens, M. A., & Izumi, J. T. (2022). Predicting intervention effects with preintervention measures of decoding: Evidence for a skill-by-treatment interaction with kindergarten and first-grade students. Learning Disability Quarterly, 45(4), 320–330. https://doi.org/10.1177/07319487221113026

Clemens, N. H., Simmons, L. E., Wang, & Kwok, O. M. (2017). The prevalence of reading fluency and vocabulary difficulties among adolescents struggling with reading comprehension. Journal of Psychoeducational Assessment, 35(8), 785–798. https://doi.org/10.1177/0734282916662120

Daggett, W. R., & Hasselbring, T. S. (2014). What we know about adolescent reading. International Center for Leadership in Education: Rigor, Relevance, and Relationships for All Students. http://www.leadered.com/pdf/what_we_know_about_adolesent_reading_2014.pdf

Edmunds, M. S., Vaughn, Wexler, Reutebuch, Cable, Tackett, K. K., & Schnakenberg, J. W. (2009). A synthesis of reading interventions and effects on reading comprehension outcomes for older struggling readers. Review of Educational Research, 79(1), 262–300. https://doi.org/10.3102/0034654308325998

Goldman, S. R., Snow, & Vaughn (2016). Common themes in teaching reading for understanding: Lessons from three projects. Journal of Adolescent & Adult Literacy, 60(3), 255–264. https://doi.org/10.1002/jaal.586

Hammill, D. D., Wiederholt, J. L., & Allen, E. A. (2014). Test of silent contextual reading fluency (2nd ed.). PRO-ED.

Heller & Greenleaf, C. L. (2007). Literacy instruction in the content areas: Getting to the core of middle and high school improvement. Alliance for Excellent Education.

Herrera, Truckenmiller, A. J., & Foorman, B. R. (2016). Summary of 20 years of research on the effectiveness of adolescent literacy programs and practices (REL 2016-178). Regional Educational Laboratory Southeast, National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education. http://ies.ed.gov/ncee/edlabs

Kirby, J. R., Deacon, S. H., Bowers, P. N., Izenberg, Wade-Woolley, & Parrila (2012). Children’s morphological awareness and reading ability. Reading and Writing, 25, 389–410. https://doi.org/10.1007/s11145-010-9276-5

Lee, J. H., & Huber, J. C., Jr. (2021). Evaluation of multiple imputation with large proportions of missing data: How much is too much? Iranian Journal of Public Health, 50(7), 1372–1380. https://doi.org/10.18502/ijph.v50i7.6626

Lervåg & Aukrust, V. G. (2010). Vocabulary knowledge is a critical determinant of the difference in reading comprehension growth between first and second language learners. Journal of Child Psychology and Psychiatry, 51(5), 612–620. https://doi.org/10.1111/j.1469-7610.2009.02185.x

MacGinitie, W. H., MacGinitie, R. K., Maria, Dreyer, L. G., & Hughes, K. E. (2006). Gates-MacGinitie reading tests (4th ed.). Houghton Mifflin Harcourt.

McMaster, K. L., Van den Broek, Espin, C. A., White, M. J., Rapp, D. N., Kendeou, Bohn-Gettler, C. M., & Carlson (2012). Making the right connections: Differential effects of reading intervention for subgroups of comprehenders. Learning and Individual Differences, 22(1), 100–111. https://doi.org/10.1016/j.lindif.2011.11.017

National Center for Education Statistics. (2022). A nation’s report card.

Parker, D. C., & Burns, M. K. (2014). Using the instructional level as a criterion to target reading interventions. Reading & Writing Quarterly, 30(1), 79–94. https://doi.org/10.1080/10573569.2012.702047

Roberts, Vaughn, Wanzek, Furman, Martinez, & Sargent (2023). Promoting adolescents’ comprehension of text: A randomized control trial of its effectiveness. Journal of Educational Psychology, 115(5), 665–682. https://doi.org/10.1037/edu0000794

Salinger (2011). Addressing the “crisis” in adolescent literacy. American Institutes for Research.

Salinger & Osher (2018). Academic interventions—Use with care. In Osher, Moroney, & Williamson (Eds.), Creating safe, equitable, engaging schools: A comprehensive, evidence-based approach to supporting students (pp. 235–252). Harvard Education Press.

Schisterman, E. F., Faraggi, & Reiser (2008). Youden Index and the optimal threshold for markers with mass at zero. Statistics in Medicine, 27(2), 297–315. https://doi.org/10.1002/sim.2993

Scholin & Burns, M. K. (2012). Relationship between pre-intervention data and post-intervention reading fluency and growth: A meta-analysis of assessment data for individual students. Psychology in the Schools, 49, 385–398. https://doi.org/10.1002/pits.21599

StataCorp. (2021). Stata statistical software: Release 17. StataCorp LLC.

Stuebing, K. K., Barth, A. E., Mofese, P. J., Weiss, & Fletcher, J. M. (2009). IQ is not strongly related to response to reading instruction: A meta-analytic interpretation. Exceptional Children, 76, 31–51. https://doi.org/10.1177/001440290907600102

Stuebing, K. K., Barth, A. E., Trahan, L. T., Radhika, R. R., Miciak, & Fletcher, J. M. (2015). Are child cognitive characteristics strong predictors of responses to intervention? A meta-analysis. Review of Educational Research, 85, 395–429. https://doi.org/10.3102/0034654314555996

Swanson, Hairrell, Kent, Ciullo, Wanzek, J. A., & Vaughn (2014). A synthesis and meta-analysis of reading interventions using social studies content for students with learning disabilities. Journal of Learning Disabilities, 47(2), 178–195. https://doi.org/10.1177/0022219412451131

Swanson, Wanzek, Vaughn, Fall, A. M., Roberts, Hall, & Miller, V. L. (2017). Middle school reading comprehension and content learning intervention for below-average readers. Reading & Writing Quarterly, 33(1), 37–53. https://doi.org/10.1080/10573569.2015.1072068

Swanson, Wanzek, Vaughn, Roberts, & Fall, A. M. (2015). Improving reading comprehension and social studies knowledge among middle school students with disabilities. Exceptional Children, 81(4), 426–442. https://doi.org/10.1177/0014402914563704

Szadokierski, Burns, M. K., & McComas, J. J. (2017). Predicting intervention effectiveness from reading accuracy and rate measures through the instructional hierarchy: Evidence for a skill-by-treatment interaction. School Psychology Review, 46(2), 190–200. https://doi.org/10.17105/SPR-2017-0013.V46-2

Trapman, Van Gelderen, Van Steensel, Van Schooten, & Hulstijn (2014). Linguistic knowledge, fluency and meta-cognitive knowledge as components of reading comprehension in adolescent low achievers: Differences between monolinguals and bilinguals. Journal of Research in Reading, 37(S1), S3–S21. https://doi.org/10.1111/j.1467-9817.2012.01539.x

VanDerHeyden, A. M., Codding, R. S., & Martin (2017). Relative value of common screening measures in mathematics. School Psychology Review, 46(1), 65–87. https://doi.org/10.1080/02796015.2017.12087608

Vaughn, Fall, A. M., Roberts, Wanzek, Swanson, & Martinez, L. R. (2019). Class percentage of students with reading difficulties on content knowledge and comprehension. Journal of Learning Disabilities, 52(2), 120–134. https://doi.org/10.1177/0022219418775117

Vaughn, Martinez, L. R., Wanzek, Roberts, Swanson, & Fall, A. M. (2017). Improving content knowledge and comprehension for English language learners: Findings from a randomized control trial. Journal of Educational Psychology, 109(1), 22–34. https://doi.org/10.1037/edu0000069

Vaughn, Roberts, Swanson, E. A., Wanzek, Fall, A. M., & Stillman-Spisak, S. J. (2015). Improving middle-school students’ knowledge and comprehension in social studies: A replication. Educational Psychology Review, 27, 31–50. https://doi.org/10.1007/s10648-014-9274-2

Vaughn, Swanson, E. A., Roberts, Wanzek, Stillman-Spisak, S. J., Solis, & Simmons (2013). Improving reading comprehension and social studies knowledge in middle school. Reading Research Quarterly, 48(1), 77–93. https://doi.org/10.1002/rrq.039

Vaughn & Wanzek (2024). Promoting adolescents’ comprehension of text: Efficacy and effectiveness. Remedial and Special Education, 45(1), 58–67. https://doi.org/10.1177/07419325231190

Wanzek, Swanson, Vaughn, Roberts, & Fall, A. M. (2016). English learner and non-English learner students with disabilities: Content acquisition and comprehension. Exceptional Children, 82(4), 428–442. https://doi.org/10.1177/00144029156194

Wanzek, Vaughn, Scammacca, N. K., Metz, Murray, C. S., Roberts, & Danielson (2013). Extensive reading interventions for students with reading difficulties after grade 3. Review of Educational Research, 83(2), 163–195. https://doi.org/10.3102/0034654313477212

Wei, Cromwell, A. M., & McClarty, K. L. (2016). Career readiness: An analysis of text complexity for occupational reading materials. Journal of Educational Research, 109(3), 266–274. https://doi.org/10.1080/00220671.2014.945149

White, I. R., Royston, & Wood, A. M. (2011). Multiple imputation using chained equations: Issues and guidance for practice. Statistics in Medicine, 30(4), 377–399. https://doi.org/10.1002/sim.4067

Wissinger, D. R., Truckenmiller, A. J., Konek, A. E., & Ciullo (2023). The validity of two tests of silent reading fluency: A meta-analytic review. Reading & Writing Quarterly, 40(2), 118–134. https://doi.org/10.1080/10573569.2023.2175340

Woods, A. D., Davis-Kean, Halvorson, King, K. M., Logan, J. R., Bainter, Brown, Clay, J. M., Cruz, R. A., Elsherif, M. M., Mahmoud, Elsherif Gerasimova, Joyal-Desmarais, Moreau, Nissen, Schmidt, Uzdavines, Van Dusen, & Vasilev (2021). Missing data and multiple imputation decision tree. PsyArXiv. https://doi.org/10.31234/osf.io/mdw