Effects of a University-Led High-Impact Tutoring Program on Low-Achieving High School Students: A 3-Year Randomized Controlled Trial

Abstract

This study used a randomized controlled trial to examine a university-led high-impact tutoring program at seven high schools. The treatment group (n = 525) participated in high-impact tutoring (i.e., groups of 2:1 or 3:1) while the control group (n = 438) attended a remedial mathematics course. The treatment group showed a difference of nearly a half-year of learning (0.13 SD) compared with the control group. We also found no evidence that 2:1 student–tutor groups were more effective than 3:1 groups. Although the university-led program produced strong effects, it was delivered at a high cost. Future work is needed to investigate strategies for reducing the cost of high-impact tutoring while maintaining effectiveness.

Keywords

high-impact tutoring, high-dosage tutoring, remedial instruction, instructional time, randomized controlled trial

Introduction

Low performance in mathematics during high school can limit students’ academic and life opportunities (Dougherty & Fleming, 2012; Joensen & Nielsen, 2009). High school students who underperform in mathematics are less likely to enroll in advanced courses that prepare them for college and high-earning majors (Aughinbaugh, 2012; Cha, 2015; Long et al., 2009; Trusty & Niles, 2003), and high school mathematics achievement has been linked to income, employment, and incarceration rates in adulthood (Chetty et al., 2014; Duncan et al., 2007; Heckman et al., 2006). Despite evidence underscoring the significance of mathematics achievement, a large percentage of U.S. students reach high school scoring well below proficiency benchmarks (National Assessment of Educational Progress [NAEP], 2024). Disruptions to learning caused by the COVID-19 pandemic have exacerbated this problem. On the NAEP, often referred to as the “Nation’s Report Card,” post-pandemic results for high school–bound eighth grade students have shown mathematics scores plummeting to levels not observed since 1990 (NAEP, 2024). These declines have also come with increased socioeconomic achievement gaps (NAEP, 2022, 2024).

Because it is low-income students who have disproportionately fallen behind their peers, poor mathematics achievement may ultimately diminish the potential for the education system to serve as an equalizer of economic and social opportunity (Kotok, 2017; NAEP, 2024; Tyson et al., 2007). If learning declines are not remediated to pre-pandemic levels, Hanushek (2023) estimates that the post-pandemic cost of a lower-skilled workforce will lead to an aggregate loss of $28 trillion for the U.S. economy. Such striking forecasts indicate that aggressive measures may be needed to raise the achievement of students before they exit the K–12 education system. Responding to this problem is challenging. Students in need of academic support in high school typically have less access to quality instructional experiences that might help to remediate academic skills (Flores, 2007).

Under the Coronavirus Aid, Relief, and Economic Security (CARES) Act, the federal government allocated Elementary and Secondary School Emergency Relief (ESSER) funds for post-pandemic academic recovery efforts (Office of State and Grantee Relations [OESE], 2024). Many districts used these funds for “high leverage” instructional programs that increased instructional time, with ~40% of school districts using their ESSER funds to deliver variations of high-impact tutoring models. High-impact tutoring programs (also referred to as high-dosage tutoring) have emerged as one of the most promising strategies for remediating the outcomes of low-achieving students (Guryan et al., 2021). These programs are characterized by intensive small-group instruction during the school day in which one trained tutor works with one to four students over at least a 10-week period (Robinson et al., 2021). To be considered a high-impact tutoring program, sessions must occur at least three times each week and last for 30–60 minutes per session (Kraft et al., 2024). Rigorous evaluations of high-impact tutoring programs have shown strong gains in mathematics and reading, student attendance, and social and life outcomes among academically underperforming students. In a recent meta-analysis of 282 randomized controlled trials, researchers reported a pooled effect size of 0.42 SD on academic achievement for high-impact tutoring programs (Kraft et al., 2024). While these overall results are impressive, the academic benefits of large tutoring programs are not as strong as those of small programs, suggesting that bringing quality high-impact tutoring programs to scale may be difficult.
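The program features summarized above can be written as a simple check. The sketch below is illustrative only: the class name, field names, and example values are hypothetical, and the thresholds are those quoted in this paragraph (Robinson et al., 2021; Kraft et al., 2024).

```python
from dataclasses import dataclass


@dataclass
class TutoringProgram:
    """Hypothetical container for the program features named in the text."""
    students_per_tutor: int
    weeks: int
    sessions_per_week: int
    minutes_per_session: int


def meets_high_impact_criteria(p: TutoringProgram) -> bool:
    """Apply the thresholds as summarized in the text."""
    return (
        1 <= p.students_per_tutor <= 4          # one tutor with one to four students
        and p.weeks >= 10                       # at least a 10-week period
        and p.sessions_per_week >= 3            # at least three sessions per week
        and 30 <= p.minutes_per_session <= 60   # 30-60 minutes per session
    )


# Illustrative values only (assumed, not taken from any study).
example = TutoringProgram(students_per_tutor=3, weeks=36,
                          sessions_per_week=3, minutes_per_session=50)
print(meets_high_impact_criteria(example))
```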

Key gaps also remain in the literature. At the high school level, there is comparatively less experimental research (de Ree et al., 2021; Guryan et al., 2021). In addition, few studies disentangle the effects of high-impact tutoring from the added instructional time that it provides, leaving questions about potential opportunity costs. These questions are important because research has found that supplementary instructional time in core subjects can be beneficial for low-achieving high school students (Angrist et al., 2016; Cattaneo et al., 2017; Cohodes & Parham, 2021). High-impact tutoring also demands considerable human and financial resources. If gains from high-impact tutoring are mostly attributable to providing students with increased instructional time, high schools might choose less resource-intensive approaches to academic remediation.

The purpose of this study was to investigate the effects of a university-led model of high-impact tutoring for high school students by using a randomized controlled trial design for three separate cohorts of ninth grade students at seven high schools in Oklahoma. The analyses in this study addressed the following questions:

Research Question 1: Does a university-led high-impact tutoring program produce stronger academic outcomes than a remedial mathematics course for students who enter ninth grade achieving below grade level in mathematics?

Research Question 2: Is there a difference in academic outcomes for tutored students based on student–tutor group size (i.e., 2:1 vs. 3:1), hours tutored, and student background characteristics?

In the pooled sample, students (n = 525) in the treatment group participated in high-impact tutoring (i.e., groups of 2:1 or 3:1) three class periods per week for an entire academic year. In the control group, students (n = 438) attended a remedial mathematics course. This study's design built on previous experimental studies of high-impact tutoring by using a remedial mathematics class for the control condition. This control condition allowed us to distinguish the effects of high-impact tutoring from the added instructional time that it offers. It also created a higher threshold for observing program effects than studies using business-as-usual control conditions that do not provide instructional time in mathematics. The analyses further tested for differences in outcomes between 2:1 and 3:1 student–tutor group sizes, offering evidence on whether larger student–tutor group sizes can be used to expand the reach of high-impact tutoring. These findings have implications for research and policy given the expansion of high-impact tutoring as an academic remediation strategy nationwide.

Self-Determination Theory and High-Impact Tutoring

The principles of self-determination theory offer compelling support for why high-impact tutoring can accelerate student learning (Coyle et al., 2014; Howard et al., 2021; Taylor et al., 2014). Self-determination theory contends that human motivation stems from meeting innate psychological needs for autonomy, competence, and relatedness (Deci & Ryan, 2012). Autonomy refers to one's need to feel in control over the actions one takes while competence describes the need to feel effective by mastering tasks and learning new skills (Deci & Ryan, 2012). The notion of relatedness emphasizes an inherent need to have a connection with others and to feel a sense of belonging, in which one feels supported and understood (Deci & Ryan, 2012).

In applying self-determination theory to the context of high-impact tutoring, it is conceivable that tutors can respond to individual interests, concerns, and questions to a greater degree than what is possible in whole-classroom settings. Tutors can assess individual skill levels and tailor instruction accordingly, enabling tutors to teach at the right level for each student (Banerjee et al., 2016). As a result, the work of tutors can help to increase students’ sense of purpose, ownership, and overall autonomy in ways that heighten motivation to learn (Adams & Khojasteh, 2018; Gershenson, 2016). By expanding instructional time, high-impact tutoring offers students opportunities to become competent in mathematics by practicing problems and mastering new skills that they are prepared to learn (Cerasoli & Ford, 2014).

High-impact tutoring also has a relational component. Students who study with a tutor for a sustained period may be able to form a personal bond with their tutor. In programs that employ tutors who are near peers, tutors may be able to relate to tutees, building strong relationships with them over time (Colvin, 2007; Duran, 2017). Such interpersonal connections can hypothetically fulfill developmental needs pertaining to social attachment and relatedness with others, which then can be converted into other assets, such as greater motivation to learn (Adams & Khojasteh, 2018; Lohmeier & Lee, 2011).

Empirical Research on High-Impact Tutoring

In the empirical literature, high-impact tutoring has shown robust evidence of being an effective personalized instructional intervention (Kraft et al., 2024). In comprehensive reviews, rigorous experimental studies have indicated that high-impact tutoring has large positive effects on student achievement in mathematics and reading (Kraft et al., 2024; Nickow et al., 2020). Kraft et al. (2024) combined the results of 282 randomized controlled trials of high-impact tutoring programs, finding an overall pooled effect size of 0.42 SD. These results were driven by large effects from literacy tutoring programs during elementary school. Results from this meta-analytic work found that large programs had smaller effects, ranging from 0.21 SD for tutoring programs serving 400–499 students to 0.16 SD for programs serving 1,000 students or more. Among effective program characteristics, high-impact tutoring seems to be most beneficial when delivered in person in “doses” consisting of three or more sessions per week (Robinson et al., 2021). Many effective high-impact tutoring programs operate for an entire school year (Guryan et al., 2021) and use 3:1 student–tutor ratios or lower (Kraft et al., 2024). Along with these characteristics, the use of quality instructional materials that are aligned with course content seems to create the conditions for tutoring success (Robinson et al., 2021).

While outcomes from high-impact tutoring are encouraging, few U.S.-based randomized studies have evaluated the effects of high-impact tutoring on mathematics achievement during high school. Remediating mathematics skills in high school has proven challenging (Heinrich et al., 2019). Low-achieving students tend to have significant gaps in foundational knowledge that prevent them from being successful in high school mathematics classes (Siegler & Braithwaite, 2017). At the high school level, existing studies of high-impact tutoring programs have shown smaller effects than those at the elementary school level (Hickey et al., 2019; Nickow et al., 2020), although high school programs that offer more frequent tutoring have exhibited stronger results (Kraft et al., 2024). For example, de Ree et al. (2021) tested 50-minute daily tutoring sessions over a 16-week period and recorded large gains in mathematics test scores (0.44–0.72 SD) in a small sample of 98 high school students in the Netherlands. Students in the tutoring treatment group replaced their regularly scheduled classes (with the exception of their mathematics classes) with a high-impact tutoring class within the school day, whereas control group students took their regular classes. In a major U.S.-based study, Guryan et al. (2021) performed two large-scale randomized controlled trials involving high school students in Chicago who received 60-minute tutoring sessions each day of the school week in a 2:1 instructional format for an entire school year. Tutored students in this intensive program made substantial gains in mathematics test scores and course grades compared with control group students. However, the difference between the treatment and control groups became considerably smaller when the control condition received additional minutes of mathematics instruction as opposed to control samples that included students who took an elective course.

Effective high-impact tutoring programs require training, logistical coordination, and financial resources. To understand how to optimize high-impact tutoring, there remains a need to investigate varying elements of tutoring programs, such as student–tutor group sizes, length of programs, and instructional models. At the high school level, few U.S.-based experimental studies have explored high-impact tutoring models that use near-peer tutors. This gap is significant because existing evidence has indicated that there are academic benefits from near-peer or peer tutoring (Colvin, 2007; Duran, 2017; Karsenty, 2010; Roscoe & Chi, 2007). Prior work has suggested that research is needed not only to examine different school contexts and features of high-impact tutoring but also to distinguish the effects of high-impact tutoring from the added instructional time it provides. In most experimental studies of high-impact tutoring, control group students participated in a business-as-usual scenario that did not provide instruction in mathematics. It is thus difficult to conclude that high-impact tutoring as an instructional format is the primary mechanism behind observed academic gains rather than the supplementary instructional time that students receive from these programs (Kraft, 2020; Kraft & Novicoff, 2024). If gains are attributable to added instructional time, the use of resources for high-impact tutoring may need to be reconsidered.

Remedial Instruction: A Potential Alternative to High-Impact Tutoring

There are challenges to bringing in-person high-impact tutoring programs to scale. Logistically, recruiting sufficient numbers of high-quality tutors for part-time work and finding tutors to commute long distances to small towns and rural areas can be difficult. Furthermore, virtual models that might extend the reach of high-impact tutoring to rural areas have exhibited smaller academic gains than in-person models (Kraft et al., 2024). Cost is another barrier. In one influential high school level analysis, researchers found large achievement effects for a program consisting of 1 hour of 2:1 tutoring per day for the entire academic year, but the cost of this program was $3,200–4,800 per student each year (Guryan et al., 2021). This amount is as much as one third of average annual per-pupil funding ($14,000) in the United States (National Center for Education Statistics [NCES], 2022).
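The cost comparison above can be checked with a short calculation. This sketch uses only the figures quoted in the paragraph; the variable names are illustrative.

```python
# Figures quoted in the text: per-student tutoring cost (Guryan et al., 2021)
# and average annual per-pupil funding (NCES, 2022).
cost_low, cost_high = 3_200, 4_800
per_pupil_funding = 14_000

# Tutoring cost as a share of annual per-pupil funding.
share_low = cost_low / per_pupil_funding
share_high = cost_high / per_pupil_funding

print(f"Share of per-pupil funding: {share_low:.0%} to {share_high:.0%}")
```

The upper end of the range works out to roughly one third of per-pupil funding, consistent with the statement in the text.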

Compared with high-impact tutoring, extended instructional time with a classroom teacher is a less resource-intensive approach that may help to accelerate academic growth over the long run. In a well-performed meta-analysis, researchers analyzed 74 causal studies and found that increased instructional time was associated with a range of small to medium effects on student achievement (Kraft & Novicoff, 2024). Other studies have reported that added remedial instructional time in core subjects may be particularly beneficial for low-achieving high school students (Angrist et al., 2016; Cattaneo et al., 2017; Cohodes & Parham, 2021; Nomi & Allensworth, 2009). When logistical and financial barriers limit the reach of high-impact tutoring, remedial instruction could be an approach to raising the mathematics achievement of low-achieving high school students. However, the effects of remedial instruction relative to high-impact tutoring can only be inferred imprecisely by comparing aggregated results across studies (Kraft & Novicoff, 2024). Little work has been done that compares results in the same study samples, which would give more precise estimates of the difference between high-impact tutoring and alternative approaches to remediation.

Methods

Design of Treatment and Control Conditions

In this study, we compared the outcomes of ninth grade students who were randomly assigned to either a remedial mathematics class providing high-impact tutoring (treatment) or to a remedial mathematics class delivered by a classroom teacher only (control). The primary focus of both the high-impact tutoring treatment course and the control group course was to develop pre-skills for Algebra I (i.e., pre-Algebra). For ninth grade students, Algebra I is a critical gateway to higher-level mathematics classes, and it is required for high school graduation in most states (Chetty et al., 2014; Duncan et al., 2007; Rose & Betts, 2004). Because both groups were enrolled in the same remedial mathematics course, treatment and control group students within schools received approximately the same amount of extra instructional time for the academic year, with class periods typically being 50–55 minutes in duration. Class sections for both study conditions were capped at 22 students. In the control group, the average class size was 19 students over the 3-year period of analysis.

The same teacher within schools was usually responsible for leading both treatment and control group sections. Among these teachers, six were novice emergency-certified teachers while six other teachers held standard certification in high school mathematics. In the high-impact tutoring treatment group, one tutor worked with two or three low-achieving ninth grade students for an entire class period three times a week over the course of the school year. On school days when treatment group students were not tutored, they worked with a classroom teacher in a whole-group environment. Oklahoma state law requires a teacher to oversee all classrooms even if tutors are present. It is important to note that the remedial mathematics classes did not replace students’ regular Algebra I course that is required for graduation. Students in both the treatment and control groups were double blocked in two mathematics classes (i.e., remedial pre-Algebra and Algebra I) for the duration of the school year.

Four faculty and four support staff in the College of Education at the University of Oklahoma participated in the high-impact tutoring program. Three faculty members were mathematics education scholars while program staff comprised undergraduate and graduate research assistants. The program's tutoring director organized tutor recruitment and hiring, tutor training sessions, school site coordination, and data management. Tutor trainers had either high school teaching experience or experience teaching high school mathematics. To support instructional coherence, program faculty created a common pacing guide and instructional materials laying out the sequence, activities, and curricular goals for the high-impact tutoring treatment group and the remedial mathematics control group. The pacing guide was adapted from Big Ideas—a Common Core–aligned curriculum for middle and high school students (Larson & Boswell, 2019).

Tutors were undergraduate and graduate students from the University of Oklahoma. To be eligible for a position, tutors were required to pass an interview, hold a 3.0 grade-point average or higher, and have a B grade or higher in a college-level algebra or calculus course. As near peers, university students represent a potentially important resource for reaching high school students with tutoring programs because they often have the requisite skills to teach ninth grade mathematics, and many can accept part-time work (Colvin, 2007; Duran, 2017; Kraft et al., 2024; Roscoe & Chi, 2007). Tutors received a compensation package (i.e., $4,700 per semester) that was higher than that of regular part-time work that university students seek near campus. Program staff monitored tutor performance at sites, responding to performance challenges as necessary (e.g., incomplete lesson plans, tardiness, and professional conduct). During the 3-year study, three tutors were released midsemester because of performance issues, and an average of five tutors per semester were not asked to return between semesters because of performance concerns. Graduation, scheduling conflicts, and tutor health issues were the main reasons most tutors stopped tutoring at the end of a semester or an academic year. Table 1 presents the characteristics of tutors. Approximately 87% served at least 1 academic year (two semesters) while 33% worked as tutors for 2 academic years (four semesters).

Table 1

Characteristics of Tutors

Characteristic	Mean/proportion (SD)
Male	0.47 (0.50)
Female	0.53 (0.50)
Graduate	0.06 (0.23)
Undergraduate	0.94 (0.23)
International student	0.27 (0.44)
Grade-point average (4.0)	3.7 (0.31)
Semesters tutored	2.92 (1.42)
Tutors serving two semesters	0.87 (0.33)
Tutors serving four semesters	0.33 (0.47)
Major/minor in education	0.04 (0.20)
Total tutors	228

Note. We collected data on the race/ethnicity of tutors in year 3. Tutors were 41% White, 17% Black, 15% Asian, and 27% other background or multiracial.

We developed our high-impact tutoring program based on the following pillars.

Reflective Instruction and Personalized Learning

The program's instructional strategies were grounded in the principles of mathematical proficiency—conceptual understanding, procedural fluency, strategic competence, adaptive reasoning, and productive disposition—as outlined by the National Research Council and Mathematics Learning Study Committee (2001) and National Council of Teachers of Mathematics (2014). To develop these five strands of mathematical proficiency, tutors used various communication strategies, such as describing one's thinking and peer–peer teaching. Tutors also used models and manipulatives (e.g., fraction strips, base-10 blocks, and algebra tiles) to strengthen students’ conceptual knowledge. These types of instructional approaches were integrated into tutors’ lessons with the aim of promoting deep understanding and application (National Council of Teachers of Mathematics, 2014). For each session, tutors developed lesson plans that incorporated opportunities to address the specific learning needs of their students.

Continuous Training and Monitoring

In early August, before the academic year began, tutors attended a 4-day tutoring “boot camp.” These sessions were followed by weekly 90-minute training and lesson-planning sessions. During weekly training sessions, program staff trained tutors to build the pre-skills needed for grade-level mathematics (i.e., Algebra I). Training sessions introduced strategies for increasing high school students’ self-concept, differentiating instruction, and engaging students through instructional games. Drawing from a common pacing guide, tutors submitted lesson plans each week and received feedback on them from program staff. Program staff also observed tutors at school sites and provided formative feedback on weekly tutor training days.

Relationship Building and Mentoring

Program staff trained tutors to build relationships with high school students based on trust and psychological safety. Training sessions emphasized the importance of creating supportive learning environments where students would feel comfortable asking questions and taking risks. Training sessions contained modules to foster cultural competence, mentoring, and positive tutor–student interactions. These modules aimed to help tutors cultivate asset-based mindsets by having tutors reflect on the backgrounds of tutees at school sites. Because low-achieving ninth grade students tend to begin high school with different individual learning needs, each tutor committed to working with the same ninth grade students for the semester while program staff consistently worked with tutors to tailor instruction to match students’ knowledge and pace of learning.

School Site Support and Coordination

Tutors provided instruction three times a week and classroom teachers did so on the remaining 2 days of the school week. To assist with coordinating instruction, program staff visited sites and held coordination sessions with teachers.

Sample and Study Setting

Three separate cohorts of ninth grade students at seven high schools participated in this study from 2021 to 2024. The research team recruited public and charter high schools that were within a 30- to 45-minute drive from the University of Oklahoma's campus, which is in central Oklahoma and ~20 miles from the state's largest metropolitan area, Oklahoma City. This recruitment decision was made to minimize travel costs and give tutors time to return to campus for their college classes. Within the radius of potential school sites, seven of the 10 school districts that we contacted chose to participate in the study. Three school districts declined our invitation to participate in the study because they were unable to agree to a randomized controlled trial design. There were no direct costs for schools to participate in the study, but schools had to allocate classroom space for the program and assign a classroom teacher to both treatment and control group class sections.

Table 2 presents the characteristics of each school site. Two schools were rural and five were midsize or large urban schools. The seven schools were characterized by a high degree of racial/ethnic diversity. One high school was a charter school, and the other high schools were district-run public schools. Nearly every school had a high percentage of students eligible for free or reduced-price lunch, with two schools having >90% of their students eligible. Oklahoma is a low-achieving state that ranks among the poorest-performing states on the National Assessment of Educational Progress (NAEP, 2024). Even so, three of the schools in the study were below the state average for academic proficiency, and all study participants were below the state average at baseline.

Table 2

Study Sites of Participating Schools

Factor	School 1	School 2	School 3	School 4	School 5	School 6	School 7
School type	District-run public	District-run public	District-run public	District-run public	District-run public	Public charter	District-run public
Grades	9–12	9–12	9–12	9–12	9–12	9–12	9–12
Academic proficiency (%)^a	10–20	50–60	40–50	60–70	70–80	50–60	0–10
Free or reduced price lunch (%)	60–70	60–70	30–40	50–60	40–50	90–100	90–100
Native American (%)	0–10	0–10	0–10	0–10	0–10	0–10	0–10
Latino/a (%)	0–10	10–20	10–20	20–30	10–20	90–100	70–80
White (%)	0–10	60–70	60–70	40–50	50–60	0–10	0–10
Black (%)	80–90	0–10	0–10	0–10	0–10	0–10	0–10
Other (%)	0–10	10–20	10–20	10–20	20–30	0–10	0–10
Locale	Large urban	Rural	Rural	Midsize urban	Midsize urban	Large urban	Large urban

Note. All schools were located in Oklahoma. These descriptive school-level data come from the Oklahoma State Accountability System (Oklahoma Educational Quality and Accountability, 2024). We provide a range for school information to avoid identifying individual school sites.

^a Academic proficiency is a composite measure of the percentage of students who scored basic, proficient, or advanced on state tests of English language arts, math, and science. The state average on this index was 49%.

Our initiative was funded to increase tutoring capacity each year. In year 1, 223 ninth grade students were randomized to treatment (i.e., high-impact tutoring) and control groups (i.e., remedial mathematics course with a classroom teacher only) at one high school where there were three treatment class sections and seven control group sections. In year 2, we expanded the initiative to five schools where 370 students were randomized to treatment and control conditions with 10 high-impact tutoring class sections and nine remedial mathematics class sections. In year 3, 520 students were randomized to treatment and control conditions at seven high schools where there was a total of 19 high-impact tutoring class sections and 11 remedial mathematics class sections. One small school site was unable to establish a control group in year 3, so robustness checks were performed, excluding this school.

Attrition analyses showed no statistically significant differences in attrition between the high-impact tutoring treatment group and control group in the pooled sample and within years (see Appendix Table 1A). In the 3-year pooled sample, attrition was 7% in the treatment group and 6% in the control group. By year, attrition in the treatment and control groups, respectively, was 4% and 6% in year 1, 5% and 6% in year 2, and 10% and 6% in year 3. These attrition rates for the treatment and control groups indicate a moderate level of potential bias from differential attrition (Institute of Education Sciences, 2022; Moffatt, 2020). Attrition was related to certain demographic characteristics, so data were not missing at random: students eligible for free or reduced-price lunch were more likely to be lost to attrition than those who were not eligible (odds ratio [OR] = 2.55, SE = 0.74, p < .01).
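Differential attrition is the gap between treatment and control attrition rates. The sketch below computes it from the rounded percentages reported in this paragraph; the dictionary layout and function name are illustrative, and the results inherit the rounding of the published rates.

```python
# Attrition rates (proportions) as reported in the text.
attrition = {
    "pooled": {"treatment": 0.07, "control": 0.06},
    "year 1": {"treatment": 0.04, "control": 0.06},
    "year 2": {"treatment": 0.05, "control": 0.06},
    "year 3": {"treatment": 0.10, "control": 0.06},
}


def differential_attrition(rates):
    """Treatment minus control attrition, in percentage points."""
    return round((rates["treatment"] - rates["control"]) * 100, 1)


differentials = {sample: differential_attrition(r) for sample, r in attrition.items()}
for sample, diff in differentials.items():
    print(f"{sample}: {diff:+.1f} percentage points")
```

Positive values indicate higher attrition in the treatment group. The largest gap in absolute terms is in year 3.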

At school sites, we sought to strengthen study participation by holding information nights and by communicating routinely with teachers, families, and students about the program. We used a multistage passive consent process that may have supported high study participation rates. Only eight students opted out of the study in year 2, and four students opted out of the analysis in year 3.

Experimental Design and Randomization Procedure

To estimate causal effects, we used a two-stage clustered randomization procedure with the high-impact tutoring treatment being administered at the classroom level within schools (see Figure 1). We administered the Northwest Evaluation Association (NWEA) Measures of Academic Progress (MAP) mathematics assessment to students at the end of eighth grade to identify eligible study participants (i.e., low and low-average performance) (Thum & Kuhfeld, 2020). At each school site, study eligibility cut scores varied based on school size and tutor availability. Average baseline achievement was at the 25th percentile, with school averages ranging between the 17th and 30th percentiles in six of the seven schools. One small school that entered the study in year 3 started at the 39th percentile, which was a higher baseline average than those of the other schools in the study sample. With this school excluded, Appendix Table 2A shows results that are consistent with the main results.

Figure 1.

Randomization procedures.

In late spring, eligible eighth grade students were randomized to remedial mathematics class sections at each high school for the following academic year. These remedial class sections were then randomly selected to be either a high-impact tutoring treatment group or a remedial mathematics class control group taught solely by a classroom teacher. All study participants were also enrolled in an Algebra I class required for graduation. The study design is strengthened by having eligible students randomly assigned to remedial course sections and then having these course sections randomly assigned to be either high-impact tutoring or a remedial mathematics class taught solely by a classroom teacher. In year 3 of the study, we included an additional treatment condition within the high-impact tutoring course sections by randomly assigning students to either 2:1 or 3:1 student–tutor group sizes.
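The two-stage procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual randomization code: the function name, the seed, and the school in the example (100 eligible students, five sections, two of them tutored) are all hypothetical.

```python
import random


def two_stage_randomization(student_ids, n_sections, n_treatment_sections, seed=0):
    """Stage 1: shuffle eligible students into sections of near-equal size.
    Stage 2: randomly label whole sections as tutoring or control."""
    rng = random.Random(seed)

    # Stage 1: random assignment of students to remedial class sections.
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    sections = {s: shuffled[s::n_sections] for s in range(n_sections)}

    # Stage 2: random assignment of sections to study conditions.
    treated = set(rng.sample(range(n_sections), n_treatment_sections))
    condition = {s: ("tutoring" if s in treated else "control") for s in sections}
    return sections, condition


# Hypothetical school: 100 eligible students, 5 sections, 2 of them tutored.
sections, condition = two_stage_randomization(
    range(100), n_sections=5, n_treatment_sections=2
)
for s in sections:
    print(s, condition[s], len(sections[s]))
```

Because treatment is assigned to whole sections, students in the same section share a condition, which is why the design is clustered at the classroom level.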

At the start of the academic year, all ninth grade students took the NWEA MAP assessment, providing us with baseline achievement data in mathematics. This assessment was administered again at the end of the academic year, enabling us to determine the effects of high-impact tutoring on students’ mathematics test scores during the school year. Our program staff trained teachers to administer the NWEA MAP assessment and visited sites on testing days to assist with administering and monitoring the assessment. Table 3 presents summary statistics for each variable of analysis for the control group, treatment group, and nonstudy student samples (i.e., students who tested out of the study in eighth grade). Nonstudy students had much higher baseline mathematics scores, were more likely to be White, and were less likely to be eligible for free or reduced-price lunch or to be English language learners.

Table 3

Summary Statistics

Variable	Control group sample				Treatment group sample				Nonstudy ninth grade students
Variable	Mean/prop.	SD	Min	Max	Mean/prop.	SD	Min	Max	Mean/prop.	SD	Min	Max
Fall baseline math RIT score	213.31	10.77	173	243	213.12	11.44	167	251	220.79	15.05	150	263
Winter math RIT score	215.84	11.49	176	245	216.64	12.03	173	268	221.85	15.18	159	264
EOY math RIT score	218.53	12.65	176	251	219.39	12.75	180	282	224.98	15.72	169	290
Fall-winter growth	2.53	7.60	−32	33	3.66	8.02	−35	48	1.26	7.53	−24	57
Fall-spring growth	5.22	8.89	−27	49	6.27	9.22	−24	85	4.19	8.97	−44	74
EOY GPA	2.52	0.93	0	4.03	2.56	0.94	0	4.08	2.78	1.00	0	4.08
Female	0.51	0.50	0	1	0.52	0.50	0	1	0.49	0.50	0	1
FRL	0.80	0.40	0	1	0.76	0.43	0	1	0.63	0.48	0	1
Latino/a	0.63	0.48	0	1	0.46	0.50	0	1	0.35	0.48	0	1
Black	0.05	0.21	0	1	0.08	0.27	0	1	0.08	0.28	0	1
White	0.17	0.38	0	1	0.30	0.46	0	1	0.37	0.48	0	1
Native	0.06	0.24	0	1	0.07	0.26	0	1	0.06	0.24	0	1
Other race	0.09	0.29	0	1	0.09	0.29	0	1	0.13	0.34	0	1
ELL	0.41	0.49	0	1	0.27	0.45	0	1	0.18	0.38	0	1
Special education	0.15	0.35	0	1	0.22	0.41	0	1	0.13	0.33	0	1
No. of students	438				525				1,774

Note. prop. = proportion; RIT = Rasch unit; EOY = end of year; GPA = grade-point average; FRL = free or reduced-price lunch; ELL = English language learner. Nonstudy students were generally higher-performing students who tested out of the study when they took the Northwest Evaluation Association Measures of Academic Progress mathematics assessment at the end of eighth grade.

Table 4 compares the baseline characteristics of the treatment and control groups for each of the 3 years of analysis and in the pooled sample. In the pooled sample, control group students were more likely to be Latino/a (p < .001) and English language learners (p < .001). Treatment group students were more likely to be White (p < .001). Based on within-year comparisons, baseline differences observed in the pooled sample seemed to be driven by differences that occurred in year 3 of the study sample.

Table 4

Comparison of the Baseline Characteristics of the Treatment and Control Groups

Variable	Pooled sample		Year 1		Year 2		Year 3
	Control	Treatment	Control	Treatment	Control	Treatment	Control	Treatment
	Mean/ prop. (SE)	Mean/ prop. (SE)	Mean/ prop. (SE)	Mean/ prop. (SE)	Mean/ prop. (SE)	Mean/ prop. (SE)	Mean/ prop. (SE)	Mean/ prop. (SE)
Fall baseline math RIT score	213.31 (0.51)	213.12 (0.49)	212.75 (0.95)	214.13 (1.56)	214.21 (0.89)	212.14 (0.84)	212.92 (0.84)	213.49 (0.67)
Female	0.51 (0.02)	0.52 (0.02)	0.47 (0.04)	0.50 (0.06)	0.52 (0.04)	0.55 (0.04)	0.53 (0.04)	0.50 (0.03)
FRL	0.80 (0.02)	0.76 (0.02)	0.95 (0.02)	0.96 (0.02)	0.69 (0.04)	0.68 (0.03)	0.79 (0.03)	0.75 (0.03)
Latino/a	0.63*** (0.02)	0.46 (0.02)	0.86 (0.03)	0.93 (0.03)	0.50 (0.04)	0.41 (0.04)	0.55*** (0.04)	0.37 (0.03)
Black	0.05 (0.01)	0.08* (0.01)	—	—	0.03 (0.01)	0.07 (0.02)	0.10 (0.02)	0.10 (0.02)
White	0.17 (0.02)	0.30*** (0.02)	—	—	0.31 (0.04)	0.31 (0.03)	0.18 (0.03)	0.37*** (0.03)
Native	0.06 (0.01)	0.07 (0.01)	—	—	0.08 (0.02)	0.06 (0.02)	0.10 (0.02)	0.10 (0.02)
Other race	0.09 (0.01)	0.09 (0.01)	0.14 (0.03)	0.07 (0.03)	0.07 (0.02)	0.14 (0.03)	0.06 (0.02)	0.07 (0.02)
ELL	0.41*** (0.02)	0.27 (0.02)	0.50 (0.04)	0.47 (0.06)	0.35* (0.04)	0.24 (0.03)	0.38** (0.04)	0.24 (0.03)
Special education	0.15 (0.02)	0.22 (0.02)	0.17 (0.03)	0.18 (0.05)	0.17 (0.03)	0.24 (0.03)	0.11 (0.03)	0.21** (0.02)
No. of students	438	525	133	68	150	177	155	280

Note. prop. = proportion; RIT = Rasch unit; FRL = free or reduced-price lunch; ELL = English language learner. To compare baseline characteristics between students in the treatment and control groups, t test comparisons were performed in the pooled sample and for separate years.

*p < .05; **p < .01; ***p < .001.
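As a rough plausibility check on the pooled-sample comparisons in Table 4, the sketch below applies a two-proportion z test to the reported proportions and group sizes. This is a normal approximation computed from rounded table values, not the authors' t tests, and the function name is a hypothetical helper.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z test; returns (z, two-sided p-value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Pooled sample from Table 4: control n = 438, treatment n = 525.
z_latino, p_latino = two_proportion_z(0.63, 438, 0.46, 525)  # Latino/a
z_ell, p_ell = two_proportion_z(0.41, 438, 0.27, 525)        # ELL

print(f"Latino/a: z = {z_latino:.2f}, p < .001: {p_latino < 0.001}")
print(f"ELL:      z = {z_ell:.2f}, p < .001: {p_ell < 0.001}")
```

Under these assumptions, both differences fall below the p < .001 threshold, in line with the significance markers reported in the table.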

Math Rasch Unit Score and Student Grade-Point Average

The MAP assessment was our primary outcome of interest. Developed by NWEA, the MAP assessment is a computer adaptive test that is used to evaluate mathematics ability and growth throughout the school year. More than 9,700 schools use the MAP assessment in 145 countries, and NWEA has a 40-year history of developing such assessments. In this study, the MAP was administered three times each year (i.e., early fall, winter, and late spring). In ninth grade, the MAP assessment is designed to measure students’ growth and achievement in the areas of algebraic thinking, numbers and operations, measurement and data, and geometry (NWEA, 2019). MAP results are reported using Rasch Units (RITs) that range from 100 to 350 points. The reliability evidence (test–retest reliability and marginal reliability) for the MAP is strong. For the ninth grade mathematics assessment, test–retest reliability is >0.9, and marginal reliability is 0.96 (NWEA, 2019). External alignment studies offer further evidence to support the validity of the MAP assessment (Egan & Davidson, 2017). These studies have shown that 97% of MAP items are aligned with Common Core State Standards (ninth grade mathematics: r = .72). In addition to analyzing the NWEA MAP assessment, we investigated students’ grade-point average at the end of the year (scale 0–4.0). We originally collected data on student tardiness and absences, but we excluded these indicators after learning from our program staff that these data often were inconsistently recorded at school sites.

Administrative Data

Data were collected on the following student background characteristics: gender, free or reduced-price lunch status, special education status, English language learner status, and racial/ethnic background. Tutors submitted weekly logs that recorded the total amount of time they spent tutoring each student during the three tutoring sessions of the week. Tutored students could receive a maximum of about 4,100 minutes of instruction from a tutor during the academic year. Table 5 presents a breakdown of tutoring minutes for students in the high-impact tutoring treatment group. Approximately 50% of students who remained in the study for the ninth grade school year received more than 3,000 minutes of tutoring. Variations in tutoring minutes were due to individual student absences, suspensions, and school closures (e.g., ice/snow days).

Table 5

Minutes Tutored in High-Impact Tutoring Treatment Group

Range of minutes tutored	Percentage of tutored students
1–1,000	3
1,001–2,000	7
2,001–3,000	41
3,001–3,500	39
3,501–4,000	11

Note. Approximately 0.5% of students received slightly more than 4,000 minutes of tutoring. Minutes tutored calculations are for 508 ninth grade students who participated in tutoring for the entire school year. Percentages do not sum to 100 because of rounding.

Data Analytic Strategy

To estimate the effects of high-impact tutoring, we analyzed data from a pooled sample of three separate annual cohorts of ninth grade students from 2021 to 2024. We estimated regression models for intent-to-treat (ITT) and treatment-on-treated (TOT) effects (Moffatt, 2020). For TOT effects, treatment group students were counted as compliant if they were enrolled in the high-impact tutoring section for the academic year, whereas control group students had to be enrolled in the remedial pre-Algebra class all year and must not have received our program's tutoring within the school day. For the remedial mathematics class control group, class rosters were checked at the start, midpoint, and end of the academic year. Weekly tutor logs allowed us to monitor participation in the treatment group during the academic year. The ITT sample additionally included students who were randomly assigned to the high-impact tutoring treatment or the remedial mathematics course but left their assigned section, typically for a different elective class within the same school. Students were not permitted to transfer between the treatment and control group sections.

For the main analyses, we estimated the following regression model in Stata, clustering standard errors at the school level:

\begin{aligned} \text{Outcome}_{ij} = {} & \beta_{0} + \beta_{1}\,\text{High\_Impact\_Tutoring}_{ij} + \beta_{2}\,\text{Baseline\_MathScore}_{ij} \\ & + \beta_{3}\,\text{Sex}_{ij} + \beta_{4}\,\text{FRL}_{ij} + \beta_{5}\,\text{Latino/a}_{ij} + \beta_{6}\,\text{Black}_{ij} \\ & + \beta_{7}\,\text{Native}_{ij} + \beta_{8}\,\text{Other\_race}_{ij} + \beta_{9}\,\text{ELL}_{ij} \\ & + \beta_{10}\,\text{IEP}_{ij} + \gamma_{j} + \delta_{t} + \varepsilon_{ij} \end{aligned}

where Outcome_ij is the academic performance (i.e., math Rasch Unit [RIT] score or grade-point average) of student i in school j, High_Impact_Tutoring_ij is a binary variable indicating whether a student was assigned to high-impact tutoring, Baseline_MathScore_ij represents the student’s baseline math RIT score at the start of ninth grade on the NWEA MAP assessment, Sex_ij is an indicator for student sex, FRL_ij is a binary variable indicating eligibility for free or reduced-price lunch (FRL), and the race/ethnicity indicators are Latino/a, Black, Native (American), and Other_race (i.e., biracial/multiracial backgrounds), with White as the reference category. IEP_ij denotes whether the student had an Individualized Education Program (IEP), and ELL_ij indicates whether the student was an English language learner (ELL). The term γ_j is the school fixed effect, and δ_t is the cohort year fixed effect accounting for year-specific influences. Finally, ε_ij is the error term. Standard errors are robust and clustered at the school level to account for the nonindependence of students within the same school. We did not account for clustering by class section because the same teacher taught both treatment and control sections at five of the seven school sites. Instead, fixed effects for each school site were included to account for school-level differences. We further estimated effects separately for each year of the study, removing the fixed effect for cohort year. Our statistical models also tested for heterogeneous associations based on the number of tutoring hours a student received during the school year.

In subsequent models, we performed moderator analyses to examine whether treatment effects varied based on student and school background characteristics (see Appendix Table 4A). As in the main models, we analyzed results for mathematics RIT score as a function of treatment status, baseline math RIT scores, student demographic characteristics, school attended, and cohort year, with standard errors clustered at the school level. To assess subgroup differences, we then interacted treatment status with the following subgroup indicators: FRL status, Latino/a, Black, ELL status, special education status, low-performing student, charter school, and rural school. In year 3 of the study, students were randomly assigned to 2:1 or 3:1 tutoring groups, enabling us to test for potential differences based on student–tutor group size.
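One way to write such a moderator model, extending the main specification, is sketched below. The exact form used for Appendix Table 4A is not reproduced in this section, so the notation is an assumption: Subgroup_ij stands for any one of the listed indicators, and X_ij with coefficient vector θ is shorthand introduced here for the baseline score and demographic covariates.

```latex
\begin{aligned}
\text{Outcome}_{ij} = {} & \beta_{0} + \beta_{1}\,\text{Tutoring}_{ij} + \beta_{2}\,\text{Subgroup}_{ij}
  + \beta_{3}\,(\text{Tutoring}_{ij} \times \text{Subgroup}_{ij}) \\
& + \mathbf{X}_{ij}^{\top}\boldsymbol{\theta} + \gamma_{j} + \delta_{t} + \varepsilon_{ij}
\end{aligned}
```

In this form, β_3 captures how the treatment effect differs for the subgroup relative to other students. For school-level indicators such as charter or rural status, the main effect β_2 would be absorbed by the school fixed effect γ_j, leaving only the interaction identified.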

Results

Descriptive Patterns

In the pooled sample, students in the treatment group grew 6.27 RIT points, whereas students in the control group grew 5.22 RIT points on the NWEA MAP assessment during the academic year. To put this growth into context, average projected growth for ninth grade students on this assessment is 3.6 RIT points (Thum & Kuhfeld, 2020). However, these comparisons with national norms are purely descriptive and do not represent causal effects. Figure 2 shows that students in the high-impact tutoring treatment group had more substantial gains than control group students in the remedial mathematics class. Sixty-four percent of treated students gained more than their expected annual growth in mathematics, whereas 59% of students in the control group did so. Additionally, 43% of treated students gained more than double their expected annual growth, compared with 37% of control group students.

Figure 2.

Rasch Unit score growth on the NWEA MAP assessment in mathematics (%).

Treatment Effects: 3-Year Pooled Sample

Table 6 presents ITT and TOT results for students’ end-of-year mathematics achievement and grade-point averages. The ITT effect on mathematics achievement was 1.62 RIT points (p < .05), and the TOT effect was 1.59 RIT points (p < .05) on the NWEA MAP assessment. Because average expected growth in ninth grade was 3.6 RIT points on the NWEA MAP assessment, these differences amount to nearly a half-year of learning in mathematics (Thum & Kuhfeld, 2020). We also expressed the treatment–control difference in standard deviation units, yielding an effect size of 0.13 SD between the high-impact tutoring treatment group and the remedial mathematics class control group (Imbens & Rubin, 2015). The similarity between the ITT and TOT effects could be due to the overall high study participation rate and low attrition in the treatment and control groups in this study. In these models, we observed no statistical differences in grade-point averages between the treatment and control groups.
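As a rough check on these magnitudes, the sketch below reproduces the arithmetic from the ITT estimate in Table 6 and the expected-growth norm of 3.6 RIT points. The standardizing unit is an assumption: the text does not state which SD was used, so the control group's end-of-year SD of 12.65 from Table 3 is used here for illustration.

```python
# Values taken from the text and tables.
itt_effect_rit = 1.62          # ITT estimate, Table 6
expected_annual_growth = 3.6   # norm-based ninth grade growth (RIT points)

# Assumption: standardize by the control group's EOY SD from Table 3.
control_eoy_sd = 12.65

effect_size_sd = itt_effect_rit / control_eoy_sd
share_of_year = itt_effect_rit / expected_annual_growth

print(f"Effect size: {effect_size_sd:.2f} SD")
print(f"Share of a year of expected growth: {share_of_year:.2f}")
```

Under this assumption the ratio rounds to 0.13 SD, and the estimate corresponds to about 45% of a year of expected growth, consistent with the description "nearly a half-year."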

Table 6

Estimated Effects on Mathematics Achievement and Grade-Point Average

Variable	Intent to treat (ITT)		Treatment on treated (TOT)
Variable	EOY math RIT score	EOY GPA	EOY math RIT score	EOY GPA
High-impact tutoring treatment	1.62* (0.78)	0.08 (0.08)	1.59* (0.80)	0.08 (0.07)
Baseline math RIT score	0.78*** (0.03)	0.03*** (0.00)	0.79*** (0.03)	0.03*** (0.00)
Demographic factors	Y	Y	Y	Y
School fixed effects	Y	Y	Y	Y
Cohort year fixed effects	Y	Y	Y	Y
Control group mean	218.11 (0.43)	2.50 (0.04)	217.98 (0.43)	2.49 (0.04)
No. of students	963	963	936	936
No. of schools	7	7	7	7

Note. EOY = end of year; RIT = Rasch Unit; GPA = grade-point average. Robust standard errors are in parentheses. EOY math score is the math RIT score on the Northwest Evaluation Association (NWEA) Measures of Academic Progress (MAP) mathematics assessment. EOY GPA is the end-of-year grade point average (4.0 scale). Baseline math score is the fall math RIT score on the NWEA MAP assessment. Demographic factors are sex, free or reduced-price lunch (FRL) status, race/ethnicity, English language learner (ELL) status, and Individualized Education Program (IEP) status. School fixed effects are binary indicators for schools in the analysis. Cohort year fixed effects represent binary indicators for the years of analysis.

*p < .05; **p < .01; ***p < .001.

Table 7 presents TOT results based on the number of hours tutored. In the first column, there was a small positive association between hours tutored and the mathematics RIT score (0.03 RIT points per hour, p < .05), and the second column shows no difference for grade-point average. In the third and fourth columns, a quadratic term was introduced to test for potential nonlinear effects of tutoring time on academic outcomes, but there was no statistical evidence of a curvilinear relationship between tutoring time received and mathematics RIT score. In the fourth column, grade-point average showed a negligible negative association with hours tutored (−0.01, p < .05). In Appendix Table 3A, we estimated effects by 2,000, 2,600, and 3,000 minutes tutored. These estimates also exhibited little to no difference based on these varying levels of tutoring time.
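For reference, a dosage model with a quadratic term can be written as follows. This is a sketch of the general form rather than the authors' exact specification: Hours_ij denotes hours tutored, and X_ij with coefficient vector θ is shorthand introduced here for the baseline score and demographic covariates.

```latex
\begin{aligned}
\text{Outcome}_{ij} &= \beta_{0} + \beta_{1}\,\text{Hours}_{ij} + \beta_{2}\,\text{Hours}_{ij}^{2}
  + \mathbf{X}_{ij}^{\top}\boldsymbol{\theta} + \gamma_{j} + \delta_{t} + \varepsilon_{ij} \\
\frac{\partial\,\text{Outcome}_{ij}}{\partial\,\text{Hours}_{ij}} &= \beta_{1} + 2\beta_{2}\,\text{Hours}_{ij}
\end{aligned}
```

A β_2 estimate that is statistically indistinguishable from zero means the marginal return to an additional hour does not detectably change with the amount of tutoring already received.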

Table 7

Effect of High-Impact Tutoring on Treated (Ninth Grade) Students by Hours Tutored

Variable	EOY math RIT score	EOY GPA	EOY math RIT score	EOY GPA
High-impact tutoring treatment (hours)	0.03* (0.02)	0.00 (0.00)	0.08 (0.07)	−0.01* (0.01)
High-impact treatment² (hours)			−0.00(0.00)	0.00(0.00)
Baseline math RIT score	0.79***(0.03)	0.03***(0.00)	0.79***(0.03)	0.03***(0.00)
Demographic factors	Y	Y	Y	Y
School fixed effects	Y	Y	Y	Y
Cohort year fixed effects	Y	Y	Y	Y
Control group mean	218.34(0.61)	2.51(0.05)	218.34(0.61)	2.51(0.05)
No. of students	936	936	936	936
No. of schools	7	7	7	7

Note. EOY = end of year; RIT = Rasch Unit; GPA = grade-point average. Robust standard errors are in parentheses. Columns 3 and 4 include a quadratic term for hours tutored. EOY math score is the math RIT score on the Northwest Evaluation Association (NWEA) Measures of Academic Progress (MAP) mathematics assessment. EOY GPA is the end-of-year GPA (4.0 scale). Baseline math score is the fall math RIT score on the NWEA MAP assessment. Demographic factors are sex, free or reduced-price lunch (FRL) status, race/ethnicity, English language learner (ELL) status, and Individualized Education Program (IEP) status. School fixed effects are binary indicators for schools. Cohort year fixed effects are binary indicators for the years of analysis.

*p < .05; **p < .01; ***p < .001.

Effects by Year, Tutor–Student Group Size, and Student Subgroup

Table 8 presents ITT results by year. The largest point estimate for the high-impact tutoring treatment group occurred in year 3, when there was a difference of 2.56 RIT points (p < .10) between the treatment and control groups. By contrast, the estimated advantage for tutored students was smaller in year 1 (0.71 RIT points, not statistically significant) and year 2 (1.09 RIT points, p < .01). Consistent with analyses of the pooled sample, no differences were observed between the treatment and control groups for end-of-year grade-point averages.

Table 8

Estimated Effects (Intent to Treat) on Mathematics Achievement and Grade-Point Average by Year

Variable	EOY math score (year 1)	EOY math score (year 2)	EOY math score (year 3)	EOY GPA (year 1)	EOY GPA (year 2)	EOY GPA (year 3)
High-impact tutoring treatment	0.71(1.13)	1.09**(0.39)	2.56^†(1.48)	0.08(0.11)	0.11(0.13)	0.06(0.05)
Baseline math RIT score	0.93***(0.05)	0.72***(0.04)	0.72***(0.05)	0.04***(0.01)	0.03***(0.01)	0.03***(0.00)
Demographic factors	Y	Y	Y	Y	Y	Y
School fixed effects	N	Y	Y	N	Y	Y
Control group mean	220.52(0.69)	218.60(0.21)	216.39(0.96)	2.71(0.07)	2.45(0.07)	2.43(0.03)
No. of students	201	327	435	201	327	435
No. of schools	1	5	7	1	5	7

Note. EOY = end of year; GPA = grade-point average; RIT = Rasch Unit. Robust standard errors are in parentheses. EOY math score is the math RIT score on the Northwest Evaluation Association (NWEA) Measures of Academic Progress (MAP) mathematics assessment. EOY GPA is the end-of-year GPA (4.0 scale). Baseline math score is the fall math RIT score on the NWEA MAP assessment. Demographic factors are sex, free or reduced-price lunch (FRL) status, race/ethnicity, English language learner (ELL) status, and Individualized Education Program (IEP) status. School fixed effects are binary indicators for schools.

^†p < .10; *p < .05; **p < .01; ***p < .001.

Table 9 presents TOT results, exhibiting similar patterns. In year 3, there was a difference of 2.53 RIT points (p < .10) between the treatment and control groups, which amounts to about three quarters of a year of learning in mathematics for ninth grade students. There were no statistical differences for end-of-year grade-point averages.

Table 9

Estimated Effects (Treatment on Treated) on Mathematics Achievement and Grade-Point Average by Year

Variable	EOY math score (year 1)	EOY math score (year 2)	EOY math score (year 3)	EOY GPA (year 1)	EOY GPA (year 2)	EOY GPA (year 3)
High-impact tutoring treatment	0.79(1.13)	1.00*(0.44)	2.53^†(1.49)	0.08(0.11)	0.09(0.13)	0.07(0.05)
Baseline math RIT score	0.92***(0.05)	0.75***(0.06)	0.72***(0.05)	0.04***(0.01)	0.03***(0.01)	0.03***(0.00)
Demographic factors	Y	Y	Y	Y	Y	Y
School fixed effects	N	Y	Y	N	Y	Y
Control group mean	220.29(0.69)	218.33(0.26)	216.41(0.96)	2.70(0.07)	2.46(0.07)	2.43(0.03)
No. of students	199	303	434	199	303	434
No. of schools	1	5	7	1	5	7

^†p < .10; *p < .05; **p < .01; ***p < .001.

In year 3, students assigned to high-impact tutoring were randomly assigned to either a 2:1 or 3:1 student–tutor group within their high-impact tutoring class sections. Table 10 indicates no evidence that 2:1 student–tutor groups outperform 3:1 groups. By performing a Wald test, we found a statistically significant positive effect for 3:1 over 2:1 student–tutor groups (p < .05), though the 3:1 sample is relatively small.

Table 10

Effects by 3:1 and 2:1 Student–Tutor Ratios

Variable	EOY math score	EOY GPA
High-impact tutoring treatment (2:1)	1.10(0.67)	0.05(0.09)
High-impact tutoring treatment (3:1)	4.08**(1.27)	0.23*(0.09)
Fall baseline math RIT score	0.79***(0.03)	0.03***(0.00)
Demographic factors	Y	Y
School fixed effects	Y	Y
Cohort year fixed effects	Y	Y
Control group mean	218.06(0.33)	2.49(0.05)
No. of students (3:1)	108	108
No. of students (2:1)	417	417
No. of schools	7	7

Note. EOY = end of year; GPA = grade-point average; RIT = Rasch Unit. Robust standard errors are in parentheses. EOY math score is the math RIT score on the Northwest Evaluation Association (NWEA) Measures of Academic Progress (MAP) mathematics assessment. EOY GPA is the end-of-year GPA (4.0 scale). Baseline math score is the fall math RIT score on the NWEA MAP assessment. Demographic factors are sex, free or reduced-price lunch (FRL) status, race/ethnicity, English language learner (ELL) status, and Individualized Education Program (IEP) status. School fixed effects are binary indicators for schools. Cohort year fixed effects are binary indicators for the years of analysis.

*p < .05; **p < .01; ***p < .001.
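The logic of a Wald test for equal 2:1 and 3:1 coefficients can be approximated from the Table 10 estimates. The sketch below is not the authors' test: it assumes zero covariance between the two coefficient estimates, which the table does not report, and it uses a normal reference distribution. The function name is a hypothetical helper.

```python
import math


def wald_equal_coefficients(b1, se1, b2, se2, cov=0.0):
    """Test H0: b1 == b2; returns (z, two-sided p-value).

    cov is the covariance between the two estimates. It is not reported
    in Table 10, so the default of 0.0 is an assumption.
    """
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2 - 2 * cov)
    z = (b1 - b2) / se_diff
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# EOY math RIT score estimates (standard errors) from Table 10.
b_31, se_31 = 4.08, 1.27   # 3:1 student-tutor groups
b_21, se_21 = 1.10, 0.67   # 2:1 student-tutor groups

z, p = wald_equal_coefficients(b_31, se_31, b_21, se_21)
print(f"z = {z:.2f}, p < .05: {p < 0.05}")
```

Under these assumptions the statistic is roughly 2.1, which falls below the .05 threshold and agrees in direction with the reported Wald result. A test that used the full covariance matrix could differ.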

Appendix Table 4A presents analyses of ITT effects across student subgroups for students eligible for free or reduced-price lunch, Latino/a students, Black students, English language learners, students with special needs, students achieving at the 15th percentile and below, charter school students, and students in rural schools. The interactions between these subgroups and the high-impact tutoring treatment are not statistically significant except in the case of students who were eligible for free or reduced-price lunch, who showed a statistically significant interaction of 1.85 RIT points (p < .05).

Cost-Effectiveness

High-impact tutoring programs require substantial financial resources (Guryan et al., 2021). Therefore, we estimated how the overall effect (0.13 SD) of this study's university-led high-impact tutoring program compared with its cost. Appendix Table 5A indicates that the per-student cost of high-impact tutoring was $5,207, with 84% of total costs arising from tutor compensation. The remedial mathematics course cost only $467 per student, so the additional per-student cost of high-impact tutoring to achieve a 0.13 SD effect on mathematics RIT scores was $4,740. Based on an analysis of effect sizes from >700 randomized controlled trials (see Kraft, 2020), our study's high-impact tutoring program yielded a medium effect size but had a high cost per student. However, the additional cost of the high-impact tutoring program would decline to $3,495 per tutored student if we had delivered our model using only 3:1 student–tutor group sizes.
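The cost comparison reduces to simple arithmetic, sketched below with the figures reported above. The cost per 0.1 SD is an illustrative summary introduced here, not a quantity reported in the study, and the variable names are hypothetical.

```python
# Per-student figures reported in the text (Appendix Table 5A).
tutoring_cost = 5207      # high-impact tutoring, USD
remedial_cost = 467       # teacher-led remedial course, USD
tutor_share = 0.84        # share of tutoring cost from tutor compensation
effect_size_sd = 0.13     # pooled effect relative to the remedial course

# Additional cost of tutoring over the remedial course.
incremental_cost = tutoring_cost - remedial_cost

# Illustrative metric (an assumption, not reported in the study):
# incremental dollars per 0.1 SD of achievement gained.
cost_per_tenth_sd = incremental_cost / (effect_size_sd / 0.1)

# Dollar amount of per-student cost attributable to tutor compensation.
tutor_compensation = tutor_share * tutoring_cost

print(f"Incremental cost per student: ${incremental_cost:,}")
print(f"Incremental cost per 0.1 SD:  ${cost_per_tenth_sd:,.0f}")
print(f"Tutor compensation per student: ${tutor_compensation:,.2f}")
```

Because tutor compensation dominates the total, changes to the student–tutor ratio are the main lever on per-student cost.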

Discussion

A robust evidence base suggests that high-impact tutoring programs are effective (Kraft et al., 2024; Nickow et al., 2020). Yet, few U.S.-based studies have investigated outcomes at the high school level, and even less research has isolated the effects of high-impact tutoring from the added instructional time that it provides. Addressing these gaps is significant from a policy standpoint because there can be considerable financial and logistical barriers to bringing high-impact tutoring programs to scale (Kraft & Falken, 2021). In this study, we advanced the literature by randomly assigning low-achieving ninth grade students to either a high-impact tutoring treatment group or a control group that took a remedial mathematics course delivered solely by a classroom teacher. Pooled results from three separate ninth grade cohorts showed that students in the high-impact tutoring group gained approximately a half year of additional learning (0.13 SD) over students in the remedial mathematics control group. Evidence also indicated that students in 2:1 student–tutor groups did not outperform students in 3:1 student–tutor groups, suggesting that 3:1 student–tutor ratios might be used for high-impact tutoring during high school with no detrimental effects on academic performance.

This study offers key contributions to the literature (Kraft & Falken, 2021; Robinson et al., 2021). In previous randomized controlled trials, high-impact tutoring at the high school level has exhibited larger positive effects than the effects estimated in this study (de Ree et al., 2021; Guryan et al., 2021). Although this difference could be driven by varying factors, one plausible contributing factor is that the control group in our study was assigned to receive the same amount of additional instructional time in mathematics as the treatment group, whereas other studies have tended to use business-as-usual control conditions (e.g., elective courses) that did not provide added instructional time in the subject area. As an example, de Ree et al. (2021) reported large effects on mathematics achievement (0.44–0.72 SD) for a small sample of 49 high school students participating in high-impact tutoring, but the control group in that study did not receive additional minutes of mathematics instruction. In a larger analysis, Guryan et al. (2021) found strong positive effects on mathematics achievement when tutored high school students were compared with control group students enrolled in an elective course, but effects became smaller when tutored students were compared with control group students enrolled in a remedial mathematics course. Along with these findings, we did not find statistical differences across student subgroups in the sample except in the case of students eligible for free or reduced-price lunch, who appeared to benefit more from high-impact tutoring than their peers.

In comparison with the literature, our results cohere with findings demonstrating that larger high-impact tutoring programs generate smaller effects. For our relatively large program, the pooled effect (0.13 SD) is similar in magnitude to the effects (0.16 SD) presented in recent meta-analytic work for other large high-impact tutoring programs (Kraft et al., 2024). Our study also contributes to the existing literature by testing for differences between 2:1 and 3:1 student–tutor group sizes. In prior work, there were only marginal observed differences in the effects of 1:1, 2:1, and 3:1 high-impact tutoring models (Kraft et al., 2024), but very little work has examined such differences among high school students, who potentially have a greater capacity to work independently within groups than elementary and middle school students do (Nickow et al., 2020). Because tutor compensation is the greatest expense for tutoring programs, increasing student–tutor group sizes by even one high school student can reduce program costs considerably. In this study, we found no evidence that raising the student–tutor ratio from 2:1 to 3:1 had detrimental effects on student achievement in our sample. Yet, this finding does warrant cautious interpretation. While we randomized students to 2:1 and 3:1 group sizes in year 3 of the study, our estimates were also strongest in year 3, leaving some uncertainty about how overall program effectiveness may have affected the two group sizes during this year.

For research at the high school level, other studies have reported positive effects of high-impact tutoring on grade-point average (Guryan et al., 2021). Our analyses exhibited mostly no differences in grade-point average between treatment and control groups. The one exception was the group receiving 3:1 tutoring in year 3, which showed a slightly higher grade-point average than the control group. It is possible that students’ overall grade-point averages are less sensitive to a mathematics intervention in a single course. Alternatively, grade-point average may be a less objective measure of academic growth, being influenced by variability in grading standards and subjectivity in assessments, school norms, and other nonacademic factors (Randall & Engelhard, 2010).

In this study, delivering a remedial mathematics class with a classroom teacher was less costly and human resource intensive than high-impact tutoring. Rigorous studies have shown strong academic effects for supplementary instructional time in core subject areas (Angrist et al., 2016; Cattaneo et al., 2017; Cohodes & Parham, 2021; Cortes et al., 2015). Future work is thus needed to test whether remedial high school mathematics courses might be a cost-effective alternative to high-impact tutoring programs. Financial, logistical, and human resource constraints are likely to make high-impact tutoring programs difficult to deliver for many schools over a sustained period. In these cases, other cost-effective approaches are needed (Kraft & Novicoff, 2024).

Study Limitations

There are limitations to this study that must be highlighted. The randomization procedure that we used strengthens internal validity, but this study may have limited external validity because our findings were derived from seven high schools (urban, midsize urban, and rural) in a state that ranks in the bottom quintile nationally for academic performance (NAEP, 2024). While the sociodemographic profiles match many schools across the country, these schools may be shaped by the context for public education in Oklahoma in ways that make our results less generalizable to high schools in other states. For example, each high school in the study experienced significant challenges recruiting and retaining teachers. Nearly half the teachers overseeing the treatment and control group classes were novice emergency-certified teachers. Schools in states with stronger teacher labor markets could find that high-impact tutoring programs are less effective than remedial mathematics classes with more experienced teachers. Conversely, our results might be more generalizable to schools in other states with teacher labor market challenges because tutor effectiveness could increase when tutors are compared with less experienced teachers. It must be acknowledged further that variability across schools may limit how broadly the findings of this study apply to other school settings.

Another important limitation to this study is that our research design did not allow us to make comparisons with students who received no remedial support because we did not have a business-as-usual control group. The unadjusted academic growth for both the high-impact tutoring group and the remedial mathematics group was strong, but our study design only allowed us to compare these two groups against each other. Year 1 of the study was also conducted at a single school, which may have exerted disproportionate influence on our pooled estimates. Estimates from that year were drawn from a comparatively small sample that could be more sensitive to school characteristics.

When comparing the treatment and control groups, the largest treatment effects occurred in year 3. Funding for our model was allocated to expand the reach of tutoring each year, but this expansion reduced the number of control group students relative to those who participated in high-impact tutoring at school sites by year 3. Although the precise mechanisms behind the year 3 results are unclear, there could have been a stigma associated with participating in the relatively smaller group that may have negatively influenced student performance in the control group in that year. Conversely, year 3 results could be a result of the program's administrative staff and trainers enhancing coordination, training lessons, and tutor recruitment strategies over time. If this latter scenario were the case, new high-impact tutoring programs may need to launch with an improvement mindset, understanding that there will be a learning curve along with ongoing refinements to the program. Moreover, there was a significantly higher percentage of English language learners and Latino/a students in the control group during year 3, possibly biasing the estimated treatment effect upward. At the same time, there was a larger share of students with disabilities in the treatment group in year 3, which could have put downward pressure on test scores in this group. Future research is needed to explore programmatic and implementation improvements as well as how control group dynamics might influence observed program effects.

In addition to these limitations, we used varying cut scores for study eligibility at school sites because of capacity and school size differences. Even though all cut scores required eligible students to be classified as low or low-average on the MAP assessment, slight differences in cut scores across sites remain a limitation of this study. We also relied on a strong independent measure (i.e., MAP assessment) to evaluate program effects, but we did not employ researcher-designed assessments for comparison because there were concerns about overtesting students. Such researcher-designed assessments can nonetheless be useful, providing evidence of mastery of mathematical skills at given points in time and insight into learning generated from specific instructional approaches.

Spillover effects are a potential limitation as well. The classroom teacher for the high-impact tutoring treatment group class section usually also taught the remedial mathematics control group class. Teachers used the same pacing guide for both treatment and control groups. They also supervised their classrooms on tutoring days, so the pedagogical and instructional strategies used for high-impact tutoring sessions may have affected how teachers ultimately instructed students in their control group sections. This type of spillover is plausible because many teachers in the study were novices. As students in the remedial class control group may have benefited from spillover effects, our estimates for the high-impact tutoring treatment group may be biased downward and therefore conservative.

Finally, there was no statistical evidence of differential attrition between treatment and control groups, and study participation rates were high in this study compared with other prominent analyses of high-impact tutoring at the high school level (Guryan et al., 2021). Still, overall attrition after initial assignment ranged from 10% to 16% across the 3 years of analysis, which could compromise internal validity and limit our ability to extrapolate effects to broader populations.

Implications for Policy and Practice

This study's findings have policy and practice implications given the rapid growth of high-impact tutoring programs. While our results indicated that high-impact tutoring can be an effective intervention, complex logistical, financial, and administrative factors must coalesce to make high-impact tutoring available in schools. In tutoring programs, tutor recruitment, training, and compensation barriers can limit the reach of quality in-person tutoring provided within the school day (Robinson et al., 2021). University-led programs represent a promising way to overcome expansion obstacles of this nature by recruiting university students to serve as tutors. University students often accept part-time employment, possess foundational high school mathematics skills, and are near peers who can become positive role models for academically struggling high school students (Colvin, 2007; Duran, 2017; Roscoe & Chi, 2007). However, university students have less formal training and familiarity with school procedures than the paraprofessionals and teachers who commonly serve as tutors. It is therefore vital that faculty and staff overseeing university-led tutoring programs undertake regular training, supervision, and coordinating activities at school sites. Participation in university-led tutoring programs may also increase tutors’ interest in pursuing teaching as a career, and future work might examine whether tutoring experience leads university students to consider teaching as a career path.

In efforts to expand high-impact tutoring, distance is a barrier that university programs may also be capable of overcoming. Recruiting tutors to travel beyond a 30- to 40-minute radius to rural areas can be difficult for companies providing tutoring services because of time, gas expenses, and other transportation costs. Universities located outside of major urban centers might help to reach schools that otherwise do not have access to in-person high-impact tutoring in their areas. State and regional universities might coordinate efforts and resources to provide tutor training, instructional materials, and communication with schools. Combining expertise and resources among higher education institutions within states could ease capacity challenges faced by small regional universities when launching new high-impact tutoring programs. There are also potential benefits to university students themselves, who can obtain reasonable part-time compensation and quality work experience.

Key program design decisions can influence both tutoring capacity and effectiveness. That is, higher student–tutor ratios can expand the reach of programs by decreasing costs, but larger student–tutor ratios may not be as effective as smaller ratios are. Optimizing this ratio is critical because tutor compensation is the greatest expense for most programs. Together with a growing number of studies, our evidence suggests that group sizes can be expanded to 3:1 without diminishing effectiveness (Kraft et al., 2024). If 1:1 and 2:1 programs move to 3:1 student–tutor group sizes, they can serve more students with existing funds. In our study, moving from 2:1 to 3:1 group sizes reduced the total cost per student of the program by 26%. The number of tutoring sessions provided per week is another design decision that can affect capacity. This study's program delivered tutoring 3 days a week, which, for the purposes of recruiting university students to work as tutors, could be a more scalable model than 5-day programs.

Among tutoring models, this study's university-led program shared certain design features with prominent programs, such as Saga Tutoring. Like Saga Tutoring's widely used program, this study's university-led program emphasized small-group instruction, structured lesson planning, frequent monitoring, and alignment with students’ current skill levels rather than grade-level pacing. However, whereas Saga Tutoring uses common curricula under a national implementation strategy, our university-led program was faculty designed and tailored to specific district contexts. The program in this study also used near-peer undergraduate and graduate tutors supervised by university faculty and staff, integrating routine lesson planning feedback, site-based observations, and relationship building. For these reasons, this study's program is aligned with core principles of effective high-dosage tutoring but reflects a locally grounded, university-led approach to curriculum design, tutor development, and school partnerships.

Finally, continued program improvements are likely to be critical to long-term success (Bryk et al., 2015). At the outset of this multiyear study, program staff established tutor recruitment and hiring procedures, training strategies, instructional materials, and lesson structures. In the years that followed, the team collected feedback from tutors and held annual listening sessions with school site personnel. By using this information, the tutoring team was able to refine components of the program each year, identifying lessons learned and responding with program enhancements. Nonetheless, the precise mechanisms that drive ongoing improvement for university-led as well as other models of high-impact tutoring warrant systematic investigation. As tutoring providers seek to bring their models to scale, future studies will be needed to examine the mechanisms underlying the effectiveness of different programs.

Appendix

Appendix Table 5A

Cost-Effectiveness Analysis of University-Led High-Impact Tutoring

Factor	Price/unit ($)	High-impact tutoring ($)	Remedial math class ($)
Teacher compensation	9,300	269,700	204,600
Tutor stipend & scholarship (2:1)	9,400	1,959,900	
Tutor stipend & scholarship (3:1)	9,400	338,400	
Tutor management	80,000	80,000	
Tutor training	40,000	80,000	
Supplies	500	500	
Training space	5,000	5,000	
Total cost		2,733,500	204,600
Total cost/student		5,207	467
Additional cost/student of university-led high-impact tutoring		4,740	
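
The totals in Appendix Table 5A can be checked with simple arithmetic. The sketch below is an illustration, not the authors' costing procedure: it sums the line items transcribed from the table and divides by the group sizes reported in the abstract (n = 525 for high-impact tutoring, n = 438 for the remedial mathematics class). The use of those sample sizes as denominators is an assumption, and the variable names are hypothetical.

```python
# Line items transcribed from Appendix Table 5A (US dollars).
high_impact_costs = {
    "teacher_compensation": 269_700,
    "tutor_stipend_scholarship_2to1": 1_959_900,
    "tutor_stipend_scholarship_3to1": 338_400,
    "tutor_management": 80_000,
    "tutor_training": 80_000,
    "supplies": 500,
    "training_space": 5_000,
}
remedial_costs = {
    "teacher_compensation": 204_600,
}

# Assumption: per-student costs use the group sizes from the abstract.
n_treatment = 525
n_control = 438

# Total cost of each condition is the sum of its line items.
hit_total = sum(high_impact_costs.values())
remedial_total = sum(remedial_costs.values())

# Per-student cost and the incremental cost of high-impact tutoring.
hit_per_student = hit_total / n_treatment
remedial_per_student = remedial_total / n_control
additional_per_student = hit_per_student - remedial_per_student

print(f"Total cost, high-impact tutoring: ${hit_total:,}")
print(f"Total cost, remedial math class:  ${remedial_total:,}")
print(f"Cost per student, high-impact:    ${hit_per_student:,.0f}")
print(f"Cost per student, remedial:       ${remedial_per_student:,.0f}")
print(f"Additional cost per student:      ${additional_per_student:,.0f}")
```

Under these assumptions, the rounded results match the table's total cost, total cost per student, and additional cost per student rows.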

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded with generous support from the Randall and Lenise Stephenson Family Foundation.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Daniel Hamlin is the Presidential Professor of Education Policy at the University of Oklahoma. He examines the effects of education policies with an emphasis on school climate, family engagement, educational choice, and academic interventions.

Corey Peltier is an associate professor of special education at the University of Oklahoma. His research focuses on identifying effective interventions and assessment practices to improve mathematics outcomes for students with disabilities and on using systematic reviews and meta-analyses to advance understanding of effective interventions.

Stacy Reeder is Dean of the Jeannine Rainbolt College of Education at the University of Oklahoma. In her work, she investigates mathematics teaching and learning, problem solving, and teacher education.

References

Adams, & Khojasteh (2018). Igniting students’ inner determination: The role of a need-supportive climate. Journal of Educational Administration, 56(4), 382–397. https://doi.org/10.1108/JEA-04-2017-0036

Angrist, J. D., Cohodes, S. R., Dynarski, S. M., Pathak, P. A., & Walters, C. R. (2016). Stand and deliver: Effects of Boston's charter high schools on college preparation, entry, and choice. Journal of Labor Economics, 34(2), 275–318. https://doi.org/10.1086/683665

Aughinbaugh (2012). The effects of high school math curriculum on college attendance: Evidence from the NLSY97. Economics of Education Review, 31(6), 861–870. https://doi.org/10.1016/j.econedurev.2012.06.004

Banerjee, Banerji, Berry, Duflo, Kannan, Mukherji, Shotland, & Walton (2016). Mainstreaming an effective intervention: Evidence from randomized evaluations of “Teaching at the Right Level” in India (NBER Working Paper No. w22746). National Bureau of Economic Research. https://doi.org/10.3386/w22746

Bryk, A. S., Gomez, L. M., Grunow, & LeMahieu, P. G. (2015). Learning to improve: How America's schools can get better at getting better. Harvard Education Press.

Cattaneo, M. A., Oggenfuss, & Wolter, S. C. (2017). The more, the better? The impact of instructional time on student performance. Education Economics, 25(5), 433–445. https://doi.org/10.1080/09645292.2017.1315055

Cerasoli, C. P., & Ford, M. T. (2014). Intrinsic motivation, performance, and the mediating role of mastery goal orientation: A test of self-determination theory. Journal of Psychology, 148(3), 267–286. https://doi.org/10.1080/00223980.2013.783778

Cha, S. H. (2015). Exploring disparities in taking high level math courses in public high schools. KEDI Journal of Educational Policy, 12(1), 3–17. https://doi.org/10.22804/kjep.2015.12.1.001

Chetty, Friedman, J. N., & Rockoff, J. E. (2014). Measuring the impacts of teachers II: Teacher value-added and student outcomes in adulthood. American Economic Review, 104(9), 2633–2679. https://doi.org/10.1257/aer.104.9.2633

Cohodes, S. R., & Parham, K. S. (2021). Charter schools’ effectiveness, mechanisms, and competitive influence (NBER Working Paper No. w28477). National Bureau of Economic Research. https://doi.org/10.3386/w28477

Colvin, J. W. (2007). Peer tutoring and social dynamics in higher education. Mentoring & Tutoring, 15(2), 165–181. https://doi.org/10.1080/13611260601086345

Cortes, K. E., Goodman, J. S., & Nomi (2015). Intensive math instruction and educational attainment: Long-run impacts of double-dose algebra. Journal of Human Resources, 50(1), 108–158. https://doi.org/10.3368/jhr.50.1.108

Coyle, T. R., Purcell, J. M., Snyder, A. C., & Richmond, M. C. (2014). Ability tilt on the SAT and ACT predicts specific abilities and college majors. Intelligence, 46, 18–24. https://doi.org/10.1016/j.intell.2014.04.008

de Ree, Maggioni, M. A., Paulle, Rossignoli, & Walentek (2021). High impact tutoring in pre-vocational secondary education: Experimental evidence from Amsterdam. SocArXiv. https://doi.org/10.31235/osf.io/r56um

Deci, E. L., & Ryan, R. M. (2012). Self-determination theory. In P. A. M. Van Lange, A. W. Kruglanski, & E. T. Higgins (Eds.), Handbook of theories of social psychology (pp. 416–436). Sage. https://doi.org/10.4135/9781446249215.n21

Dougherty, & Fleming (2012). Getting students on track to college and career readiness: How many catch up from far behind? ACT Research Report Series. https://files.eric.ed.gov/fulltext/ED542022.pdf

Duncan, G. J., Dowsett, C. J., Claessens, Magnuson, Huston, A. C., Klebanov, Pagani, L. S., Feinstein, Engel, Brooks-Gunn, Sexton, Duckworth, & Japel (2007). School readiness and later achievement. Developmental Psychology, 43(6), 1428–1446. https://doi.org/10.1037/0012-1649.43.6.1428

Duran (2017). Learning-by-teaching. Evidence and implications as a pedagogical mechanism. Innovations in Education and Teaching International, 54(5), 476–484. https://doi.org/10.1080/14703297.2016.1156011

Egan, K. L., & Davidson, A. H. (2017, November 14). Alignment of the NWEA MAP growth and MAP growth K–2 to the common core state standards: English language arts and mathematics. EdMetric.

Flores (2007). Examining disparities in mathematics education: Achievement gap or opportunity gap? The High School Journal, 91(1), 29–42. https://www.jstor.org/stable/40367921

Gershenson (2016). Linking teacher quality, student attendance, and student achievement. Education Finance and Policy, 11(2), 125–149. https://doi.org/10.1162/EDFP_a_00180

Guryan, Ludwig, Bhatt, M. P., Cook, P. J., Davis, J. M., Dodge, Farkas, Frye, R. G., Jr., Mayer, Pollack, & Steinberg (2021). Not too late: Improving academic outcomes among adolescents (NBER Working Paper No. w28531). National Bureau of Economic Research. https://doi.org/10.3386/w28531

Hanushek, E. A. (2023). Generation lost: The pandemic's lifetime tax. Education Next. https://www.educationnext.org/generation-lost-the-pandemics-lifetime-tax/

Heckman, J. J., Stixrud, & Urzua (2006). The effects of cognitive and noncognitive abilities on labor market outcomes and social behavior. Journal of Labor Economics, 24(3), 411–482. https://doi.org/10.1086/504455

Heinrich, C. J., Darling-Aduana, Good, & Cheng (2019). A look inside online educational settings in high school: Promise and pitfalls for improving educational opportunities and outcomes. American Educational Research Journal, 56(6), 2147–2188. https://doi.org/10.3102/0002831219838776

Hickey, A. J., & Flynn, R. J. (2019). Effects of the TutorBright tutoring programme on the reading and mathematics skills of children in foster care: A randomised controlled trial. Oxford Review of Education, 45(4), 519–537. https://doi.org/10.1080/03054985.2019.1607724

Howard, J. L., Bureau, Guay, Chong, J. X., & Ryan, R. M. (2021). Student motivation and associated outcomes: A meta-analysis from self-determination theory. Perspectives on Psychological Science, 16(6), 1300–1323. https://doi.org/10.1177/1745691620966789

Imbens, G. W., & Rubin, D. B. (2015). Causal inference in statistics, social, and biomedical sciences. Cambridge University Press.

Institute of Education Sciences. (2022). What works clearinghouse standards handbook (Version 5.0). U.S. Department of Education. https://ies.ed.gov/ncee/WWC/Docs/referenceresources/Final_WWC-HandbookVer5_0-0-508.pdf

Joensen, J. S., & Nielsen, H. S. (2009). Is there a causal effect of high school math on labor market outcomes? Journal of Human Resources, 44(1), 171–198. https://doi.org/10.3368/jhr.44.1.171

Karsenty (2010). Nonprofessional mathematics tutoring for low-achieving students in secondary schools: A case study. Educational Studies in Mathematics, 74, 1–21. https://doi.org/10.1007/s10649-009-9223-z

Kotok (2017). Unfulfilled potential: High-achieving minority students and the high school achievement gap in math. High School Journal, 100(3), 183–202. https://www.jstor.org/stable/90024211

Kraft, M. A. (2020). Interpreting effect sizes of education interventions. Educational Researcher, 49(4), 241–253. https://doi.org/10.3102/0013189X20912798

Kraft, M. A., & Falken, G. T. (2021). A blueprint for scaling tutoring and mentoring across public schools. AERA Open, 7(1), 1–21. https://doi.org/10.1177/23328584211042858

Kraft, M. A., & Novicoff (2024). Time in school: A conceptual framework, synthesis of the causal research, and empirical exploration. American Educational Research Journal, 61(4), 724–766. https://doi.org/10.3102/00028312241251857

Kraft, M. A., Schueler, & Falken (2024). What impacts should we expect from tutoring at scale? Exploring meta-analytic generalizability (EdWorkingPaper No. 24-1031). Annenberg Institute at Brown University. https://doi.org/10.26300/zygj-m525

Larson, & Boswell (2019). Big ideas math: Algebra 1. A common core curriculum. Big Ideas Learning.

Lohmeier, J. H., & Lee, S. W. (2011). A school connectedness scale for use with adolescents. Educational Research and Evaluation, 17(2), 85–95. https://doi.org/10.1080/13803611.2011.597108

Long, M. C., Iatarola, & Conger (2009). Explaining gaps in readiness for college-level math: The role of high school courses. Education Finance and Policy, 4(1), 1–33. https://doi.org/10.1162/edfp.2009.4.1.1

Moffatt (2020). Experimetrics: Econometrics for experimental economics. Bloomsbury Publishing.

National Assessment of Educational Progress (NAEP). (2024). NAEP US math score trends. https://www.nagb.gov/naep/mathematics.html

National Assessment of Educational Progress (NAEP). (2022). Reading and mathematics scores decline during the pandemic. https://www.nationsreportcard.gov/highlights/ltt/2022/

National Center for Education Statistics (NCES). (2022). Revenues and expenditures for public elementary and secondary education: FY 20. U.S. Department of Education. https://nces.ed.gov/pubs2022/2022306.pdf

National Council of Teachers of Mathematics. (2014). Principles to actions: Ensuring success for all. National Council of Teachers of Mathematics. https://www.nctm.org/Store/Products/Principles-to-Actions–Ensuring-Mathematical-Success-for-All/

National Research Council and Mathematics Learning Study Committee. (2001). Adding it up: Helping children learn mathematics. National Academies Press.

Nickow, Oreopoulos, & Quan (2020). The impressive effects of tutoring on pre-K–12 learning: A systematic review and meta-analysis of the experimental evidence (NBER Working Paper No. w27476). National Bureau of Economic Research. https://doi.org/10.3386/w27476

Nomi, & Allensworth (2009). Double-dose Algebra as an alternative strategy to remediation: Effects on students’ outcomes. Journal of Research on Educational Effectiveness, 2(2), 111–148. https://doi.org/10.1080/19345740802676739

Northwest Evaluation Association (NWEA). (2019). MAP Growth technical report. https://www.nwea.org/uploads/2021/11/MAP-Growth-Technical-Report-2019_NWEA.pdf

Office of State and Grantee Relations (OESE). (2024). Elementary and secondary school emergency relief fund. Formula Grant. https://oese.ed.gov/offices/education-stabilization-fund/elementary-secondary-school-emergency-relief-fund/

Oklahoma Educational Quality and Accountability. (2024). Oklahoma school report cards. https://schoolreportcards.ok.gov/

Randall, & Engelhard (2010). Examining the grading practices of teachers. Teaching and Teacher Education, 26(7), 1372–1380. https://doi.org/10.1016/j.tate.2010.03.008

Robinson, C. D., Kraft, M. A., Loeb, & Schueler, B. E. (2021). Accelerating student learning with high impact tutoring. EdResearch for Recovery Project. https://files.eric.ed.gov/fulltext/ED613847.pdf

Roscoe, R. D., & Chi, M. T. (2007). Understanding tutor learning: Knowledge-building and knowledge-telling in peer tutors’ explanations and questions. Review of Educational Research, 77(4), 534–574. https://doi.org/10.3102/0034654307309920

Rose, & Betts, J. R. (2004). The effect of high school courses on earnings. Review of Economics and Statistics, 86(2), 497–513. https://doi.org/10.1162/003465304323031076

Siegler, R. S., & Braithwaite, D. W. (2017). Numerical development. Annual Review of Psychology, 68, 187–213. https://doi.org/10.1146/annurev-psych-010416-044101

Taylor, Jungert, Mageau, G. A., Schattke, Dedic, Rosenfield, & Koestner (2014). A self-determination theory approach to predicting school achievement over time: The unique role of intrinsic motivation. Contemporary Educational Psychology, 39(4), 342–358. https://doi.org/10.1016/j.cedpsych.2014.08.002

Thum, Y. M., & Kuhfeld (2020). NWEA 2020 MAP growth: Achievement status and growth norms tables for students and schools. NWEA. https://teach.mapnwea.org/impl/MAPGrowthNormativeDataOverview.pdf

Trusty, & Niles, S. G. (2003). High-school math courses and completion of the bachelor's degree. Professional School Counseling, 7(2), 99–107. https://www.jstor.org/stable/42732549

Tyson, Lee, Borman, K. M., & Hanson, M. A. (2007). Science, technology, engineering, and mathematics (STEM) pathways: High school science and math coursework and postsecondary degree attainment. Journal of Education for Students Placed at Risk, 12(3), 243–270. https://doi.org/10.1080/10824660701601266