Abstract
Previous research on Promoting Adolescent Comprehension of Text (PACT) found significant student-level variability in outcomes. The current study examined a potential skill-by-treatment interaction with 1,376 eighth-grade students from 13 middle schools as part of a larger evaluation study. Treatment students scored higher than control students on a measure of social studies content knowledge among students who scored 1 SD below the mean, at the mean, and 1 SD above the mean on preintervention measures of reading comprehension and reading fluency. Using social studies content knowledge, pretest reading comprehension and fluency resulted in estimates of area under the curve of .77 (95% CI = [.76, .78]) and .56 (95% CI = [.55, .57]), respectively. Both sets of pretest data identified a threshold score that approximated the 30th percentile. Thus, there appeared to be a skill-by-treatment interaction for PACT based on preintervention reading skills.
College- and career-readiness standards require students to comprehend increasingly complex disciplinary texts in content courses such as history, science, and social studies (Heller & Greenleaf, 2007; Wei et al., 2016). However, adolescents are often only minimally prepared to comprehend challenging content text independently (Swanson et al., 2014), which is evidenced by the 29% of students in eighth grade who scored in the proficient range on the National Assessment of Educational Progress for reading (National Center for Education Statistics, 2022). Poor reading comprehension among adolescent students in the United States is a crisis that needs to be addressed (Salinger, 2011).
Students reading below a proficient level are highly likely to experience difficulty in content courses relying on sophisticated literacy skills (Salinger & Osher, 2018). The mismatch between student skills and task demands in social studies and other content courses can cause students to become increasingly disengaged from school and affect their academic self-efficacy (Beri & Stanikzai, 2018), both of which can negatively affect success in high school, post-secondary education, and the workplace (Daggett & Hasselbring, 2014). Although influencing adolescent students’ comprehension is challenging (Goldman et al., 2016), previous research investigating the effectiveness of interventions addressing the needs of adolescents and focusing specifically on reading skills has shown promising but mixed effects (see Edmunds et al., 2009; Herrera et al., 2016; Wanzek et al., 2013 for meta-analyses).
Promoting Adolescent Comprehension of Text
Vaughn et al. (2013) developed the Promoting Adolescent Comprehension of Text (PACT) framework to address the unique demands of content area texts. The research team aimed to supplement and support typical social studies content lessons by incorporating evidence-based reading comprehension and vocabulary practices within the instruction. Vaughn and colleagues (2013) studied PACT within three distinct American History units commonly included in eighth-grade standards (Colonial America, the Road to Revolution, and the Revolutionary War). Each unit cycle contains five components: (a) a comprehension canopy that introduces the unit, (b) an essential words routine to teach vocabulary, (c) knowledge acquisition through text-based instruction and critical reading, (d) team-based learning comprehension checks that are first completed individually and then with group-based activities, and (e) team-based learning knowledge application in which students engage in text-based discourse to complete an activity that requires them to articulate new perspectives, solve problems, and present conclusions. Through its five components, PACT emphasizes students’ critical thinking and analytical skills rather than focusing solely on the memorization of factual information.
Several studies have examined the impact of PACT on whole classes of social studies students. Vaughn et al. (2013) found that PACT led to significantly higher scores, as compared to a control group, on measures of social studies content (ES = 0.17), social studies comprehension (ES = 0.29), and even general reading comprehension (ES = 0.20), and the group difference for content knowledge was maintained 4 weeks after the intervention ended (ES = 0.25). Vaughn and colleagues (2015, 2016; Roberts et al., 2023; Vaughn & Wanzek, 2024) also conducted large-scale randomized trials with eighth-grade general education American History classes, which led to significantly positive effects on social studies knowledge and social studies comprehension, with small effects on general reading comprehension as well. Previous research also found significant effects from PACT on social studies knowledge acquisition and comprehension among students with disabilities (ES = 0.26 and 0.34, respectively; Swanson et al., 2015), but Wanzek et al. (2016) found positive effects (ES = 0.26) for students with disabilities only on the social studies knowledge acquisition measure and not on social studies comprehension measures or for more general reading achievement. A randomized trial with 135 teachers and over 7,000 students at 48 schools again found significant effects for content acquisition (ES = 0.13) and social studies comprehension (ES = 0.35) that were maintained 9 weeks later, but the data suggested differential effects at the student level (Roberts et al., 2023).
Vaughn et al. (2019) examined the effects of PACT on 359 eighth-grade students who were identified with reading comprehension difficulties as compared to 331 students with reading comprehension difficulties who did not participate in PACT. Although the students generally scored better on measures of content acquisition, content comprehension, and reading comprehension, the effect was strongest for classrooms with smaller numbers of students with reading difficulties and became negligible to small for classrooms with a high percentage of students with reading comprehension difficulties.
In general, previous research on PACT has shown positive impacts on students’ social studies knowledge and comprehension when their teachers implement PACT strategies. However, the findings from the extant research point to the need for further research to better understand the conditions and mechanisms under which PACT can work and for whom.
Skill-by-Treatment Interaction
Educational researchers have long been interested in identifying students for whom specific interventions would have a higher likelihood for success, but most research efforts focused on assessing student aptitudes and were not successful in finding an aptitude-by-treatment interaction (Burns et al., 2016; Stuebing et al., 2009, 2015). A recent alternative to aptitude-by-treatment interaction is called skill-by-treatment interaction because it focuses on measuring skills that are malleable to instruction rather than stable psychological characteristics (Burns et al., 2010). The premise of skill-by-treatment interaction was based on previous research that found a low correlation between measures of cognitive ability and reading growth during reading interventions (r = −.11) but a significantly stronger correlation between reading growth and baseline measures of reading fluency (r = .37) and word attack skills (r = .36; Scholin & Burns, 2012).
The link between preintervention reading scores and intervention effectiveness is well documented (Kirby et al., 2012; Lervåg & Aukrust, 2010), and preintervention data have accurately identified reading interventions that had a higher likelihood for strong effects (Burns et al., 2022). Previous research studied the use of reading fluency data to predict effective fluency interventions (Parker & Burns, 2014; Szadokierski et al., 2017) and reading comprehension responses to predict reading comprehension interventions (McMaster et al., 2012). Course grades in history were also predicted by reading comprehension (r = .40) and reading fluency (r = .35) among students in fourth through ninth grades (Bigozzi et al., 2017). However, no previous skill-by-treatment interaction research examined the effect on content acquisition for content courses like social studies, and examining the relationship between preintervention reading variables like fluency and comprehension may help better understand for which students PACT would be most effective.
Purpose
PACT has been shown to enhance social studies content acquisition with middle-school students (Vaughn et al., 2013, 2015, 2017), but research with students with low reading skills has found inconsistent results (Swanson et al., 2015; Wanzek et al., 2016). Moreover, reading skills seem to moderate the effects of PACT, with lower outcomes among classrooms with high numbers of low readers (Vaughn et al., 2019). Previous research found that students’ skill levels in reading fluency and comprehension predicted intervention effects (McMaster et al., 2012; Szadokierski et al., 2017) and grades in history courses (Bigozzi et al., 2017). Therefore, the current study aimed to test for a skill-by-treatment interaction (Burns et al., 2010) by examining the extent to which preintervention reading scores predicted social studies outcomes associated with the PACT intervention. The following research questions guided the study: (a) What is the association between preintervention reading fluency and comprehension scores and content knowledge acquisition among middle-school students participating in PACT? and (b) To what extent do preintervention reading fluency and comprehension scores accurately identify students with proficient content knowledge acquisition among students participating in PACT?
Method
The current study involved students who participated in the first cohort of a larger, ongoing evaluation of the PACT intervention. The larger study is a randomized control trial (RCT) with four cohorts of middle schools, for a proposed total of approximately 90 schools.
Participants
Cohort 1 of the ongoing study included 13 middle schools in four Midwestern and Southeastern United States school districts. For randomization and analyses purposes, we blocked schools by district. Within each block, we randomly assigned schools to one of two conditions: PACT treatment or business-as-usual control (BAU). Seven schools were randomly assigned to the treatment condition, and six schools were in the BAU condition.
Participants included 1,376 eighth-grade students (52.8% male) enrolled in Cohort 1. There were 738 students in the treatment condition and 643 in the BAU condition. Based on data collected from the districts, 46.4% of the students were White, 12.7% were Black, 7.63% were Hispanic, 5.3% were Multiracial, 2.3% were Asian, 0.9% were other, and 24.7% were unreported. Most students (68.8%) were native English speakers (23.4% unreported) and did not receive English learner (96.3%) or special education services (91.1%).
A total of 14 teachers were in the treatment condition, and 12 were in the BAU condition. The teachers had an average of 13.9 years of experience (range, 0–27 years), and most (61.1%) had a master’s degree. Each district was supported by a coach who was hired and trained by the research team but was based locally near the participating schools. The three coaches (two districts shared a coach) were all veteran teachers.
Measures
Mirroring prior studies of PACT, we used one measure of reading comprehension at pretest, and one measure of social studies content knowledge at posttest. We also administered a measure of reading fluency at the pretest. The three measures used in the current study were a subset of the measures used in the overall RCT. All of the measures were conducted with paper and pencil in the social studies classrooms and were administered by the participating teachers. The participating school districts required that the teachers administer the assessments, rather than independent data collectors, and all teachers received a 2-hr training on how to conduct the assessments approximately 1 to 2 weeks before administering the pretests.
Reading Comprehension
Reading comprehension was measured with the fourth edition of the Gates–MacGinitie general reading comprehension subtest (GMRT-4; MacGinitie et al., 2006). The overall project used the GMRT-4 data for pre- and post-testing, but the current analyses focused on the pretest data. The GMRT-4 is a 35-min timed assessment of reading comprehension that consists of expository and narrative passages ranging in length from 3 to 15 sentences. Students read passages silently and answer three to six multiple-choice questions that increase in difficulty as students progress through the assessment. The internal consistency ranged from .91 to .93, and alternate form reliability ranged from .80 to .87 (MacGinitie et al., 2006).
Reading Fluency
We used the Test of Silent Contextual Reading Fluency: Second Edition (TOSCRF-2; Hammill et al., 2014) as a pretest measure of reading fluency. The TOSCRF-2 measures silent reading fluency for students between the ages of 7 and 25. Students are presented with progressively longer sentences written in all capital letters without punctuation or spaces between words. Students are then asked to place slashes between the words. Alternate-form and test–retest reliability generally exceeded .80 for most age groups (Hammill et al., 2014), and data from the TOSCRF-2 generally correlated well with other measures of reading (Wissinger et al., 2023). The TOSCRF-2 was administered in a group format and the assessment is timed for exactly 3 min. The data consisted of the age-based standard score for each student with a normative mean of 100 and a standard deviation of 15.
Social Studies Knowledge
We used the Assessment of Social Studies Knowledge (ASK; Vaughn et al., 2013) to measure social studies content knowledge acquisition. The ASK is a 42-item, four-option, untimed multiple-choice assessment of a student’s social studies content knowledge for the three units of the intervention. Items with known difficulty parameters were collected with permission from released items from state (Texas and Massachusetts) and Advanced Placement (College Board) social studies tests (Vaughn et al., 2013). Internal consistency of ASK data in previous research equaled α = .89 (Vaughn et al., 2015). The ASK measure was administered as a posttest immediately following the intervention, and as a delayed posttest 13 to 15 weeks after completing units. The data were the number of items answered correctly.
Intervention
Vaughn et al. (2013) initially developed and tested the PACT intervention to improve literacy strategies within social studies instruction. As part of eighth-grade social studies instruction, the intervention was implemented in the treatment classrooms with three United States History units (Colonial America, Road to Revolution, and Revolutionary War). Each PACT unit consists of a 10-day cycle, for a total of 30 instructional sessions that were intended to be 45 min each. Each unit contains five components designed to infuse evidence-based reading comprehension and vocabulary practices within typical classroom history instruction. We describe the five intervention components of PACT below.
Comprehension canopy
Each unit began with an introductory “comprehension canopy” on Day 1 that built motivation and presented an overarching issue or question to guide the purpose of reading and knowledge acquisition throughout the 10-day cycle. For example, students watched a short video, after which the teacher presented a series of questions to build background knowledge and connect new content to prior learning (e.g., “How did events and ideas lead to the signing of the Declaration of Independence?”).
Essential words routine and warm up
On Day 1, essential vocabulary words related to the unit were also taught. Students were introduced to four or five high-frequency keywords that were considered critically important to the content. The lesson for three or four other days then began with a warm-up, which was a brief review of the words through visual representations and turn-and-talk activities.
Text-based instruction and critical reading of text
Students read informational texts during focused sessions throughout each 10-day cycle; sessions were whole-class, small-group, paired, and/or individual silent reading. Teachers’ activities included providing a brief introduction, sharing a video clip, presenting a map to establish the context, or supplementing students’ understanding of text and content through lectures, PowerPoint presentations, and supporting visual and/or auditory materials. Periodically, teachers asked direct questions about content or questions that required inferencing. Throughout the activity, students recorded notes in a learning log that assisted with their organization of key information.
Team-based learning comprehension checks
In each cycle, the students completed two short comprehension checks (five multiple-choice questions) and one long comprehension check (10 multiple-choice questions and one open-ended writing question). For both types of checks, students first completed them individually without the use of text as a measure of individual accountability of content knowledge. Students then moved into a team-based learning (TBL) activity in which they worked in heterogeneous pairs or small groups and completed the same check with text materials and peer discourse.
TBL knowledge application
On Day 9, students engaged in text-based discourse to complete an activity that required them to articulate new perspectives, solve problems, and present conclusions. The teachers evaluated and provided immediate feedback and prompted student groups to extend their thinking and collaboration. The team recorded ideas in a graphic organizer that provided structure for the discourse and held students accountable for contributing text-based evidence to the discourse. Teams shared their final written product and received immediate feedback.
Procedure
Teachers received 10 to 12 hr of training, which focused on implementing the five intervention components and classroom procedures to facilitate student use of textual evidence. Teachers were all trained by coaches who were hired to support PACT implementation, and the training occurred at the district level approximately 1 to 3 weeks before beginning the first unit (Colonial America). Coaches also provided support for individual teachers (a) before each unit to establish goals and assess teacher understanding; (b) during each unit to model, co-teach, provide feedback, and answer questions; and (c) after each unit to debrief implementation and examine if the goals established for the unit were met. The teachers also received an additional half-day training on assessment administration.
After being trained, the teachers collected pretest data by group administering the GMRT and TOSCRF-2 over two sessions. The teacher then implemented the PACT intervention during their Colonial America, Road to Revolution, and Revolutionary War units, which were either the first three units or three of the first five units covered in eighth-grade social studies. The aforementioned coaching occurred during each unit. The teachers administered the ASK assessment within 2 weeks of completing the final unit. Teachers administered this assessment again 13 to 15 weeks later as a delayed posttest.
Fidelity
Treatment fidelity was assessed by having each teacher audio record two randomly selected components from each unit. The teachers were given a handheld audio recorder and were instructed to record only the PACT component during the lesson. After recording the lesson, the file was uploaded to a secure website. Two members of the research team, who were PhD students in school psychology and special education, coded the audio recordings.
The trained coders listened to the audio recordings and used a checklist protocol that was established in previous PACT research (Roberts et al., 2023; Swanson et al., 2015, 2017). The checklist contained steps for each PACT component and allowed for the degree of fidelity to be established. Specifically, there were five steps or items for comprehension canopy, seven items for essential words, two components for warm-up, six for critical reading of text, seven for team-based learning knowledge comprehension check, and nine for team-based learning knowledge application check. Each item was coded as 0 (absent) or 1 (observed). Each component was then given an overall rating of 0 (no components observed), 1 (components observed but required elements not observed), 2 (completes a few [i.e., less than 50%] of the required elements), 3 (completes a majority [i.e., more than 50%] of the required elements), or 4 (completes all required elements).
The coders completed a 3-hr training session on the use of the fidelity checklists. After completing the training, the coders individually coded the same audio-recorded lessons, and the codes were compared to an expert’s master coding with a point-by-point comparison. Before beginning the fidelity coding, each coder had to reach 90% exact interrater agreement with the expert coding criterion. Then, 33% of codings were dual coded to continue to assess interrater reliability. PACT component ratings (0–4) between the two coders correlated at r = .97, p < .001, and resulted in a weighted κ of .82 (95% CI = [.68, .94], p < .001), which suggested adequate interrater agreement.
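As a minimal sketch of how a weighted kappa on the 0 to 4 component ratings can be computed, the following function assumes linear disagreement weights by default; the weighting scheme used in the study is not stated, and the function name and the example ratings are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, categories, power=1):
    """Weighted Cohen's kappa for two raters.

    power=1 uses linear disagreement weights, power=2 quadratic weights.
    The choice of weights here is an assumption, not taken from the study.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of items.")
    n = len(rater_a)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}

    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row[i] * col[j]
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical ratings from two coders on eight recordings (0-4 scale).
coder_1 = [4, 4, 3, 3, 2, 4, 3, 4]
coder_2 = [4, 3, 3, 3, 2, 4, 4, 4]
kappa = weighted_kappa(coder_1, coder_2, categories=[0, 1, 2, 3, 4])
```

Weighted kappa gives partial credit when two coders disagree by only one rating point, which is why it suits an ordinal 0 to 4 scale better than exact-agreement kappa.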
As shown in Table 1, the mean fidelity score was 3.49 (SD = 0.50), with 87.9% of the component observations being at least a 3.00. There was high fidelity across the three units. The units were implemented in order, with Colonial America being first, Road to Revolution second, and Revolutionary War third. There was an inverse relationship between unit order and mean fidelity score, which could suggest that teachers’ implementation of the components improved over time and with repetition. The control teachers also submitted two audio recordings per unit, which were also scored with the PACT fidelity tool and resulted in a mean rating of 0.94 (SD = 0.82), which was significantly lower (t = 9.75, p < .001, g = 3.95) than the rating for the treatment teachers. This suggests that BAU instruction did not generally reflect the approach or components of PACT.
Table 1. Summary of Fidelity for Each Intervention Component.
Analyses
Data were obtained as part of an ongoing randomized evaluation of PACT, but the data for the current research questions were examined with regressions. Before conducting analyses to address the research questions, we followed the Aguinis et al. (2013) recommendation to remove outliers (0.0%–5.0%) that fell outside the expected range of ±2.24 standard deviations. Furthermore, we removed 182 (13.2%) TOSCRF-2 scores that equaled standard scores of 160 because the assessment appeared to have been erroneously administered by six teachers (23.1%), who allowed students to complete the entire assessment rather than stop after 3 min as required by standardized administration procedures.
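The ±2.24 SD screening rule can be sketched as follows. This is an illustration only: it assumes the mean and sample standard deviation are computed once on the full set of scores for a variable, and the function name and example scores are hypothetical.

```python
from statistics import mean, stdev


def trim_outliers(scores, cutoff=2.24):
    """Keep scores within `cutoff` sample SDs of the mean.

    Assumes a single pass: the mean and SD are computed on all scores
    before any are removed (an assumption, not stated in the study).
    """
    center = mean(scores)
    spread = stdev(scores)
    return [x for x in scores if abs(x - center) <= cutoff * spread]


# Hypothetical standard scores with one extreme value.
example_scores = [98, 102, 100, 101, 99, 97, 103, 100, 99, 101, 140]
trimmed = trim_outliers(example_scores)
```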
Missing data
We examined missingness across our variables of interest and found that missing data ranged between 25% and 44%. As such, we used STATA17 (StataCorp, 2021) to conduct multivariate imputation by chained equations (MICE) to impute missing data on all predictor and outcome variables of interest. We imputed data for all variables using predictive mean matching with 10 nearest neighbors and imputed 40 data sets, consistent with the White et al. (2011) recommendations. We accounted for the nested nature of the data and imputed by treatment condition. We used the final pooled, multiply imputed data set to conduct all analyses in STATA17 (StataCorp, 2021) to answer our research questions. We also conducted a sensitivity analysis in which we ran the analyses using the non-multiply imputed data. Non-multiply imputed data were generally equivalent to multiply imputed data, with results following the same trends as the multiply imputed data for all outcomes.
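For readers unfamiliar with how estimates from multiply imputed data sets are combined, the sketch below applies the conventional pooling rules (Rubin's rules) to one coefficient. We assume this is the pooling the software performs; the function name and the numbers are hypothetical and are not study results.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets with Rubin's rules.

    estimates: the coefficient estimated in each imputed data set
    variances: the squared standard error from each imputed data set
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average sampling variance
    between = variance(estimates)   # variance of estimates across imputations
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total ** 0.5


# Hypothetical values from three imputed data sets (the study used 40).
pooled_b, pooled_se = pool_rubin([4.0, 5.0, 6.0], [1.0, 1.0, 1.0])
```

The between-imputation term is what carries the extra uncertainty due to missing data into the pooled standard error.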
We present the results obtained from the multiply imputed data because this technique helps reduce bias, unlike listwise or pairwise deletion methods, and is a recommended approach for addressing missing data in behavioral research (Woods et al., 2021). Moreover, research in public health suggests that multiple imputation produces less biased estimates even with high missing percentages (Lee & Huber, 2021). We also conducted a full information maximum likelihood (FIML) analysis and a mean-substitution analysis (replacing missing values with the mean) to check the multiple imputation procedure. All methods of addressing missingness produced similar results, which supported the robustness of the multiple imputation procedure.
Research Question 1—initial reading skill and PACT treatment
Students’ ASK Content scores at the post-test and delayed post-test were the outcomes of interest for the first research question. We conducted a series of multivariate multiple regression analyses to answer the question using robust standard errors. We entered students’ treatment conditions, pretest fluency scores, and pretest comprehension scores into the regression model and grand-mean centered all continuous variables. We also created two interaction terms: (a) treatment condition and pretest fluency, and (b) treatment condition and pretest comprehension. Finally, we controlled for district-fixed effects.
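One way to write the model described above for a single outcome is sketched here. The notation is an assumption introduced for illustration: T is the treatment indicator, F and C are grand-mean-centered pretest fluency and comprehension, and γ is a fixed effect for the district d(i) of student i.

```latex
\mathrm{ASK}_{i} = \beta_0 + \beta_1 T_i + \beta_2 F_i + \beta_3 C_i
  + \beta_4 (T_i \times F_i) + \beta_5 (T_i \times C_i)
  + \gamma_{d(i)} + \varepsilon_i
```

Under this parameterization, β4 and β5 carry the skill-by-treatment interactions, and because F and C are centered, β1 is the treatment difference for a student at the mean of both pretests.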
Research Question 2—accurate identification of proficient content knowledge
To answer our second research question and gain a deeper understanding about for whom the PACT intervention was most successful, we conducted a receiver operating characteristics (ROC) analysis on the ASK Content posttest outcome. ROC curve analyses are frequently used in social sciences to identify the score that best balances false-positive and false-negative errors (VanDerHeyden et al., 2017). In other words, we explored whether the data suggested a potential pretest fluency and comprehension cut score threshold that would suggest poor performance on the outcome criterion (i.e., ASK Content).
We set the ASK Content posttest score cut point for the ROC analyses at 70% accuracy, or a raw score at or above 29. We then determined the optimal cut scores for the pretest fluency and comprehension assessments using Youden’s index J (see Baker et al., 2015; Schisterman et al., 2008) and included the full sample of treatment students to maximize both sensitivity (i.e., the proportion of students who were correctly identified as at risk of not meeting the standard) and specificity (i.e., the proportion of students who were correctly identified as not at risk of not meeting the standard). We reported an area under the curve estimate as a summary of how well the data identified students who scored at or above the outcome criterion. Finally, we used the pooled multiply imputed data set to run a regression analysis with pretest condition (i.e., at or above cut score on both pretests, at or above on comprehension only, at or above on fluency only, below cut score on both pretests) as the independent variable, which was dummy-coded. Our dependent variable was students’ ASK Content posttest scores. We controlled for district-fixed effects.
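The cut-score search with Youden's index (J = sensitivity + specificity − 1) can be sketched as follows. This is a simplified illustration, not the study's procedure: it assumes a student is flagged as at risk when the pretest score falls below the candidate threshold, it ignores multiple imputation, and the function name and data are hypothetical.

```python
def youden_cut_score(pretest, met_criterion):
    """Return (threshold, J, sensitivity, specificity) maximizing Youden's J.

    pretest: pretest scores, one per student
    met_criterion: True if the student met the outcome criterion
                   (e.g., ASK raw score at or above 29)
    A student is flagged as at risk when pretest < threshold.
    """
    at_risk_total = sum(1 for m in met_criterion if not m)
    not_at_risk_total = sum(1 for m in met_criterion if m)
    if at_risk_total == 0 or not_at_risk_total == 0:
        raise ValueError("Both outcome groups must be present.")

    best = None
    for threshold in sorted(set(pretest)):
        true_pos = sum(1 for s, m in zip(pretest, met_criterion)
                       if s < threshold and not m)
        true_neg = sum(1 for s, m in zip(pretest, met_criterion)
                       if s >= threshold and m)
        sensitivity = true_pos / at_risk_total
        specificity = true_neg / not_at_risk_total
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (threshold, j, sensitivity, specificity)
    return best


# Hypothetical pretest scores and outcome flags for eight students.
scores = [10, 15, 20, 22, 26, 28, 30, 35]
met = [False, False, False, False, True, False, True, True]
cut, j_value, sens, spec = youden_cut_score(scores, met)
```

Ties are resolved in favor of the lowest threshold here, which is a design choice of this sketch.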
Results
Table 2 presents descriptive statistics for student pretest and posttest scores (all, intervention, and control). Correlations between initial skills revealed that students’ pretest fluency was positively and significantly correlated with their pretest reading comprehension skills (r = .21, p < .001). This suggests that, at the pretest, students with higher fluency tended to also have higher comprehension, although this correlation was relatively small.
Table 2. Descriptive Data for Study Variables.
Notes. Reading comprehension was measured with the Gates–MacGinitie general reading comprehension subtest (GMRT-4; MacGinitie et al., 2006), reading fluency was measured with the Test of Silent Contextual Reading Fluency: Second Edition (TOSCRF-2; Hammill et al., 2014), and ASK = Assessment of social studies knowledge (Vaughn et al., 2013).
Association Between Initial Reading Skill and PACT Treatment
The first research question inquired about the relationship between initial reading skill and PACT treatment. The results of the regression analyses are presented in Table 3.
Table 3. Social Studies Knowledge Acquisition Scores (ASK) Regressed Onto Student Variables and Preintervention Scores.
Notes. gmc = grand mean centered, reading comprehension was measured with the Gates–MacGinitie general reading comprehension subtest (GMRT-4; MacGinitie et al., 2006) and reading fluency was measured with the Test of Silent Contextual Reading Fluency: Second Edition (TOSCRF-2; Hammill et al., 2014).
Pretest reading comprehension
There was a significant pretest comprehension by treatment interaction for the ASK Content posttest. Therefore, we probed the interaction at 1 SD above and below the mean of students’ pretest comprehension scores. As shown in Figure 1, there was a significant difference between treatment and control students’ scores on the ASK Content posttest at 1 SD below the mean (3.40, p < .001), at the mean (4.80, p < .001), and at 1 SD above the mean (6.20, p < .001). These p-values remained significant after applying the Benjamini–Hochberg (1995) correction for the false discovery rate. The differences in the post-test were also practically meaningful, with effect sizes of d = 0.39, 0.49, and 0.63, respectively.
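Probing an interaction at ±1 SD amounts to evaluating the treatment difference at chosen values of the centered moderator. The sketch below expresses the moderator in SD units. The per-SD interaction value of 1.40 is back-calculated from the differences reported above (6.20 − 4.80) for illustration and is not a coefficient taken from Table 3; the function name is hypothetical.

```python
def conditional_treatment_effect(b_treatment, b_interaction, moderator_value):
    """Treatment-control difference at a given value of the centered moderator."""
    return b_treatment + b_interaction * moderator_value


# Illustrative values: difference at the mean and implied change per 1 SD
# of pretest comprehension (back-calculated, see the lead-in).
B_TREATMENT = 4.80
B_INTERACTION_PER_SD = 1.40

effects = {
    z: round(conditional_treatment_effect(B_TREATMENT, B_INTERACTION_PER_SD, z), 2)
    for z in (-1, 0, 1)
}
```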

Figure 1. Residual Analysis of Social Studies Content Acquisition (ASK) by Comprehension (GMRT-4).
As shown in Table 4, the differences between treatment and control groups followed the same trend and were significant for the ASK Content delayed posttest 6 to 12 weeks later. At the delayed posttest, there were significant and practically meaningful differences between treatment and control students’ scores at 1 SD below the mean (2.47, p = .002, d = 0.25), at the mean (3.86, p < .001, d = 0.39), and at 1 SD above the mean (5.24, p < .001, d = 0.54). The p-values remained significant after applying the Benjamini–Hochberg (1995) correction for false discovery rate.
Table 4. Effect Size Estimates (d) Comparing Content Acquisition (ASK) Scores Between the Treatment and Control Groups by Preintervention Reading Score Group.
Notes. Reading comprehension was measured with the Gates–MacGinitie general reading comprehension subtest (GMRT-4; MacGinitie et al., 2006) and reading fluency was measured with the Test of Silent Contextual Reading Fluency: Second Edition (TOSCRF-2; Hammill et al., 2014). ASK = Assessment of social studies knowledge (Vaughn et al., 2013). The below average group fell at least 1 SD below the mean of the preintervention measure, the average group fell between 1 SD below and 1 SD above the mean on the preintervention measure, and the above average group had a score that was at least 1 SD above the mean on the preintervention measure.
p < .01.
Pretest reading fluency
The pretest fluency by treatment interaction was nearing significance at the ASK Content posttest and reached statistical significance at the delayed posttest. Thus, we probed the significant interaction for the posttest to replicate the analyses completed with the reading comprehension pretest above. As shown in Figure 2, there was a significant difference between treatment and control students’ scores on the ASK Content posttest at 1 SD below the mean (6.38, p < .001), at the mean (4.92, p < .001), and at 1 SD above the mean (3.45, p < .001). These p-values remained significant after applying the Benjamini–Hochberg (1995) correction for the false discovery rate. The differences were meaningful, with effect sizes of d = 0.65, 0.50, and 0.35, respectively.

Figure 2. Residual Analysis of Social Studies Content Acquisition (ASK) by Fluency (Test of Silent Contextual Reading Fluency Second Edition [TOSCRF-2]).
These differences between treatment and control followed the same trend and remained significant at the ASK Content delayed posttest 6 to 12 weeks later. At the delayed posttest, there were significant and practically meaningful differences between treatment and control students’ scores at 1 SD below the mean (5.30, p < .001, d = 0.54), at the mean (3.94, p < .001, d = 0.40), and at 1 SD above the mean (2.58, p = .011, d = 0.26). These p-values remained significant after applying the Benjamini–Hochberg (1995) correction for false discovery rate.
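The probing of the fluency by treatment interaction at 1 SD below the mean, at the mean, and at 1 SD above the mean can be illustrated with a short sketch. The model coefficients are not reported in this section, so the values below are back-calculated from the reported posttest differences under the assumption of a linear interaction with a standardized pretest score. The function and variable names are hypothetical.

```python
def conditional_effect(b_treatment, b_interaction, pretest_z):
    """Treatment-control difference at a given standardized pretest score."""
    return b_treatment + b_interaction * pretest_z


# Reported posttest difference at the mean of pretest fluency.
b_treatment = 4.92
# Assumption: change in the difference per 1 SD of pretest fluency, implied by
# the reported differences at +1 SD (3.45) and -1 SD (6.38).
b_interaction = (3.45 - 6.38) / 2

for z in (-1, 0, 1):
    effect = conditional_effect(b_treatment, b_interaction, z)
    print(f"pretest z = {z:+d}: difference = {effect:.3f}")
```

The negative implied slope reflects the pattern reported above: the treatment advantage shrinks as pretest fluency increases.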
Accurate Identification of Students With Proficient Content Knowledge Acquisition
With the ASK Content posttest cut point for the ROC analyses set at 70% accuracy (a raw score at or above 29), we found that 37.9% of our treatment sample fell at or above this criterion (M = 34.20, SD = 0.22), and 62.1% fell below it (M = 17.38, SD = 0.34). Pretest reading comprehension resulted in an area under the curve estimate of .77 (95% CI = [.76, .78]) and identified a threshold score of at or above 26 (sensitivity = 73.94%, specificity = 70.02%). Pretest reading fluency resulted in an area under the curve estimate of .56 (95% CI = [.55, .57]) and the threshold standard score was at or above 90 (sensitivity = 70.94%, specificity = 39.80%). These thresholds equated to approximately the 30th percentile on each measure.
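The quantities reported in the ROC analysis (area under the curve, sensitivity, and specificity at a threshold) can be computed as in the sketch below. The data are invented toy values, not study data. The rule used here to select a threshold, Youden's J (sensitivity + specificity - 1), is an assumption, because the selection rule is not described in this section. All function names are hypothetical.

```python
def sensitivity_specificity(scores, proficient, threshold):
    """Classify 'score >= threshold' as predicted proficient; return (sens, spec)."""
    pairs = list(zip(scores, proficient))
    tp = sum(1 for s, y in pairs if y and s >= threshold)
    fn = sum(1 for s, y in pairs if y and s < threshold)
    tn = sum(1 for s, y in pairs if not y and s < threshold)
    fp = sum(1 for s, y in pairs if not y and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def area_under_curve(scores, proficient):
    """Probability that a proficient student outscores a non-proficient one (ties count half)."""
    pos = [s for s, y in zip(scores, proficient) if y]
    neg = [s for s, y in zip(scores, proficient) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_threshold(scores, proficient):
    """Assumed selection rule: the observed score that maximizes Youden's J."""
    candidates = sorted(set(scores))
    return max(candidates,
               key=lambda t: sum(sensitivity_specificity(scores, proficient, t)) - 1)


# Toy pretest scores and posttest proficiency flags (illustrative only).
pretest = [18, 22, 24, 26, 27, 28, 30, 33, 35, 38]
proficient = [False, False, False, False, True, True, True, False, True, True]

auc = area_under_curve(pretest, proficient)
threshold = youden_threshold(pretest, proficient)
sens, spec = sensitivity_specificity(pretest, proficient, threshold)
print(auc, threshold, sens, spec)
```

An area under the curve near .50 means the pretest ranks proficient and non-proficient students no better than chance, whereas values closer to 1.0 indicate better discrimination.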
We then used students’ pretest scores to dummy-code pretest conditions using the identified thresholds for each pretest measure. We found that 37.67% of our treatment sample performed at or above the threshold on both measures, 11.13% performed at or above the threshold on the comprehension measure only, 27.25% performed at or above the threshold on the fluency measure only, and 23.95% performed below the threshold on both pretest measures. The mean and standard deviation for each pretest condition are shown in Figure 3.
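The assignment of students to the four pretest conditions can be sketched as below, using the reported thresholds (comprehension score at or above 26, fluency standard score at or above 90). The condition labels, function names, and the choice of the "both" group as the reference category are assumptions made for illustration.

```python
COMPREHENSION_THRESHOLD = 26  # reported pretest comprehension threshold
FLUENCY_THRESHOLD = 90        # reported pretest fluency standard score threshold

CONDITIONS = ("both", "comprehension_only", "fluency_only", "neither")


def pretest_condition(comprehension, fluency):
    """Return the (hypothetically labeled) pretest condition for one student."""
    comp_ok = comprehension >= COMPREHENSION_THRESHOLD
    flu_ok = fluency >= FLUENCY_THRESHOLD
    if comp_ok and flu_ok:
        return "both"
    if comp_ok:
        return "comprehension_only"
    if flu_ok:
        return "fluency_only"
    return "neither"


def dummy_codes(condition, reference="both"):
    """0/1 indicators for every condition except the reference category."""
    return {c: int(c == condition) for c in CONDITIONS if c != reference}


# Example: a student at threshold on fluency but below it on comprehension.
condition = pretest_condition(comprehension=22, fluency=90)
print(condition, dummy_codes(condition))
```

With "both" as the reference, each regression coefficient on an indicator would represent the difference in outcome between that condition and students at or above both thresholds.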

Regression of Assessment of Social Studies Knowledge (ASK) Data Using Pretest Reading Conditions.
As seen in Figure 3, there was a statistically significant difference between the ASK Content posttest scores of students who were at or above the threshold on both pretests compared to students who were below the threshold on both pretests (9.64, p < .001) and students who performed at or above the threshold on the fluency measure only (9.51, p < .001). Furthermore, both differences were practically meaningful, with effects of d = 0.94, respectively. There was not a statistically significant or practically meaningful difference between students who were at or above the threshold on both pretests and students who performed at or above the threshold on the comprehension measure only (1.23, p = .383, d = 0.18). Thus, if students perform at or above the 30th percentile on a pretest measure of reading comprehension, they are likely to benefit from the PACT intervention.
Discussion
The current study examined the association between preintervention reading fluency and comprehension scores and social studies content knowledge acquisition among middle-school students participating in PACT. We were interested in how well preintervention reading scores identified students who demonstrated proficient content knowledge on a posttest that addressed the three units that were taught. As summarized in Table 4, the PACT intervention led to significantly higher social studies content acquisition scores at posttest and delayed posttest for students at all preintervention reading levels, that is, those who scored below average, average, and above average on the preintervention fluency and comprehension measures. Thus, the intervention appeared to be effective for students with varying reading skills, which was consistent with previous research conducted with students with disabilities (ES = 0.26, Swanson et al., 2015; ES = 0.26, Wanzek et al., 2016).
The current data showed significant variability in social studies content acquisition based on preintervention reading skills, which was consistent with previous research that found differential effects at the student level (Roberts et al., 2023). There appeared to be a relationship between preintervention reading fluency and comprehension and content knowledge acquisition. Students generally completed 70% of the content acquisition measure correctly if they scored at or above the 30th percentile on preintervention measures of reading fluency and comprehension. Previous research also found relationships between preintervention reading proficiency and postintervention reading outcomes (Burns et al., 2022; Szadokierski et al., 2017), but the current investigation was the first study to examine the relationship with content area knowledge such as social studies.
Although both pretest fluency and comprehension scores were associated with social studies content acquisition, the comprehension score seemed to matter more, because students who scored low on the comprehension measure but not on the fluency measure had lower outcome scores (M = 19.25, SD = 8.70) than students who scored low on the fluency measure but not the comprehension measure (M = 27.61, SD = 9.14, d = 0.94). Previous research also found that initial comprehension skills predicted reading intervention outcomes (McMaster et al., 2012). Reading fluency was closely related to reading comprehension among adolescent students (Trapman et al., 2014), and students in middle school with difficulties in reading comprehension often also demonstrated difficulties with reading fluency (Clemens et al., 2017). The area under the curve estimate of approximately .50 suggested poor discrimination in the outcome score based on reading fluency, even though fluency was a significant predictor. Moreover, the fluency score was much more accurate in predicting who earned a criterion score on the outcome (sensitivity = 70.94%) than who did not (specificity = 39.80%), which suggests an area for future research.
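The effect size comparing the two single-threshold groups can be approximately reproduced from the reported means and standard deviations. The sketch below assumes a pooled standard deviation that weights the two groups equally. The article does not state which pooling formula was used, and the groups differed in size, so this is an approximation for illustration only.

```python
import math


def cohens_d(mean_1, sd_1, mean_2, sd_2):
    """Cohen's d with an equal-weight pooled SD (assumed, not confirmed by the source)."""
    pooled_sd = math.sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_1 - mean_2) / pooled_sd


# Reported outcome scores: low fluency only vs. low comprehension only.
d = cohens_d(mean_1=27.61, sd_1=9.14, mean_2=19.25, sd_2=8.70)
print(round(d, 2))  # approximately 0.94, in line with the reported effect size
```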
Implications for Practice
The current data may have some implications for practice. The current and previous research found that PACT was an effective tool for increasing students’ acquisition of social studies content knowledge, even among students who scored below average on a measure of reading fluency or reading comprehension. Thus, practitioners could consider adopting the program to better help students who may experience difficulty in social studies courses, potentially due to reading difficulties. However, given that the 30th percentile on preintervention measures of reading fluency and comprehension predicted differentiated results, students with reading skills below that criterion may require additional support to be proficient in social studies content.
Limitations and Directions for Future Research
Although the current data may be of interest to practitioners and researchers, they should be considered within the context of the limitations inherent to the study design. First, we selected 70% accuracy as the criterion because that is generally considered passing in most grading schemes and seemed to be a reasonable goal to select. However, it was arbitrarily selected and using a higher standard may result in different outcomes of the research. Future researchers could replicate the design with different standards for the criterion measure. Second, the assessments used for the study were related to but external to the school curriculum and were not related to school-based evaluations of student performance in any way. Thus, we cannot be assured that students always put forth their best effort on the measures.
The remaining limitations and directions for future research are related to the participating teachers. The third limitation was that the classroom teachers collected the data after being trained. We did not conduct any ongoing measure of assessment fidelity and there were some noted errors in administration. Future researchers could either collect data themselves or conduct ongoing assessments of measurement fidelity. Relatedly, the rate of missing data could be considered a study limitation despite using multiple imputations to address the missing data. Fourth, we did not record the certification of the participating teachers and it is not known how many may have been certified to teach English language arts, which could have affected how well the teachers incorporated literacy practices into their curriculum, but that is only a hypothesis for additional research. Fifth, the fidelity coding indicated high fidelity across the three units, but the units were not equally represented in the data because the teachers submitted fewer audio recordings for the final unit (Revolutionary War) than for the other two units. However, a total of 49 audio recordings were coded, which was consistent with the number coded in previous PACT research (e.g., n = 55, Swanson et al., 2015).
In addition to addressing the limitations of the current design, future researchers could build on the current study to examine other malleable skills that could potentially relate to success within the PACT program or other similar interventions designed to increase content knowledge and comprehension. Researchers could also consider potential intervention components to add to the PACT program to enhance its effect with students with low preintervention reading skills. The current study was the first in a potential line of inquiry that could replicate the current criteria, study additional or different criteria, or examine how to enhance PACT for students who do not meet the criteria on preintervention reading measures. For example, future researchers could intensify the comprehension canopy or critical reading of text components of PACT for students with low preintervention comprehension or fluency skills, respectively. Those are the two PACT components that conceptually most closely align with the foundational reading skills, but this is merely a hypothesis for future research.
Conclusion
Reading comprehension and content knowledge acquisition are highly related and both are concerns for adolescent learners. The current study again found PACT to be an effective framework to increase social studies content knowledge, but was the first to examine variability in social studies content acquisition based on preintervention reading skills. There was a significant skill-by-treatment interaction given that students demonstrated higher social studies content proficiency if their preintervention reading skills approximated the 30th percentile. However, students with the lowest preintervention reading fluency skills had the largest effect size between treatment and control groups, which also suggested an interaction effect. Thus, additional research is needed to better understand for which students PACT is most effective and how to predict intervention outcomes. Given the number of middle-school students who demonstrate difficulty with reading comprehension and social studies content acquisition, and the positive initial results found here, the additional research seems warranted.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by grant R305R200002 from the National Center for Education Research in the Institute of Education Sciences. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Institute of Education Sciences or the Department of Education.
