Abstract
Whole-school reforms have received widespread attention, but a critical limitation of the current literature is the lack of evidence around whether these extensive and costly interventions improve students’ long-term outcomes after they leave reform schools. Leveraging Tennessee’s statewide turnaround reforms, we use difference-in-differences models to estimate the effect of attending a turnaround middle school on student outcomes in high school, including test scores, attendance, chronic absenteeism, disciplinary actions, drop out, and high school graduation. We find little evidence to support improved long-run student outcomes—mostly null effects that are nearly zero in magnitude. Our results contribute to a broad call for educational researchers to examine whether school reforms meaningfully affect student outcomes beyond short-term improvements in test scores.
Introduction
Policymakers have made substantial investments in building organizational capacity to support improvements in the public sector. In education, one salient example involves the Obama administration’s efforts to support chronically low-performing schools by encouraging states and districts to adopt federally approved school turnaround policies (Duncan, 2009). These reform efforts have received billions of dollars in resources through Race to the Top (RttT) and School Improvement Grants or SIGs (Dragoset et al., 2019). More recently, under No Child Left Behind (NCLB) waivers and the current Every Student Succeeds Act (ESSA), the federal government has committed to continuing the long-standing national interest in reforms aimed at persistently low-performing schools. Supporting chronically low-performing schools is particularly important now because these schools serve disproportionately high shares of low-income and racially minoritized students, and they experienced the largest decreases in student achievement as a result of the COVID-19 pandemic (Dorn et al., 2020; Kogan & Lavertu, 2021; Kuhfeld et al., 2020). The long history of investment, current requirements under ESSA, and historically high need exacerbated by COVID-19 suggest that school reforms will continue to prominently factor into national efforts to support equitable educational opportunities.
As part of these federal initiatives, states and districts across the country have now implemented whole-school reform policies for over a decade. While there is a long and growing list of studies that have examined the short-term effects of these reforms, few studies have examined long-term effects (for reviews, see Redding & Nguyen, 2020; Schueler et al., 2021). Among the smaller set of studies that have examined long-term effects, most of the interest has been in following new cohorts of schools placed into an existing reform model (Pham et al., 2020) or effects in reform schools after interventions have ended (Sun et al., 2021). Another broadly important, but understudied, aspect of long-term effects is the impact on student outcomes after students leave the reform school. Our study sheds light on this much less understood aspect of long-term effects. To our knowledge, the only studies to examine this type of long-term effect focus on the Recovery School District (RSD) reforms that essentially converted all schools in New Orleans into charter schools following Hurricane Katrina (Glenn & Harris, 2020; Harris & Larsen, 2023). While the RSD studies document positive long-term effects, the RSD is designed as a wholesale market-based reform, so these results may not readily apply to models that rely less on school choice. Additionally, given current ambiguities in the extant literature, we develop and contribute an overarching conceptual framework for delineating different types of long-term effects, with concrete nomenclature that will sharpen future academic and policy discourse on this topic.
Examining long-term effects on students is broadly consequential in educational research because interventions producing only short-term gains may not merit investment if they do not meaningfully improve students’ educational trajectory and life opportunities. Additionally, effects across a large range of interventions, including many outside of school reform, may only be realized in the long run. For instance, in the charter school literature, studies have suggested that charter schools may not have positive effects on test scores in the short run but can have long-term effects on educational attainment and labor market outcomes (Booker et al., 2011; Sass et al., 2016). However, the disruptive nature of school reform could also have adverse effects in the long run if students are receiving lower-quality instruction while the school implements major changes. Examining this issue has implications for a wide array of educational interventions because reforms focusing solely on short-term outcomes, such as end-of-year test scores, could ultimately be detrimental to students’ longer-term educational opportunities. Finally, much of the existing literature focuses on student achievement with few studies using outcomes beyond test scores.
To help address these gaps in knowledge, we study Tennessee’s two statewide reform models: the state-led Achievement School District (ASD) and district-led Innovation Zones (iZones). Tennessee is a highly informative and generalizable case because the state’s reform models overlap substantially with current ESSA-aligned reform models used across the country in states such as Michigan, North Carolina, Florida, Illinois, Louisiana, and Massachusetts (Dragoset et al., 2019; Jochim, 2016; U.S. Department of Education, 2012). Initiated in 2012–2013, Tennessee’s reforms began when school turnaround, as defined by the SIG program, was the dominant approach to whole-school reform. Although states’ current reform plans under ESSA are no longer dictated by the SIG-prescribed turnaround models, many states, including Tennessee, continue to use reform plans developed for turnaround (Rentner et al., 2017; Tennessee Department of Education [TDOE], 2018). Substantial overlap between Tennessee’s turnaround models and other reform efforts nationwide suggests that our results can inform similar initiatives in other states.
We describe both of Tennessee’s turnaround models in detail below but note that the ASD relies on dramatic changes in school governance where chronically low-performing schools are removed from local districts, placed under state governance, and restarted, mainly by charter management organizations (CMOs). In contrast, iZone schools remain under the governance of their local district but are placed into an intradistrict network that is supported by a team of full-time district staff. Although the ASD and iZones differ in their governance structures, both models replace teachers and principals when schools first begin implementing interventions, and the long-term effects of both can be estimated because they have been in place for over a decade. Previous research has examined short-term effects over time as these reforms matured (Pham et al., 2020; Zimmer et al., 2017). Specifically, previous research found that iZone reforms had positive effects on test scores, while ASD reforms produced null results (Pham et al., 2020). However, these prior studies have not examined whether effects either persist or materialize after students exit turnaround schools.
Our specific objective is to examine student outcomes in high school after they exit turnaround middle schools. We use a difference-in-differences (DiD) model to obtain intent-to-treat (ITT) effect estimates for students assigned to turnaround middle schools on their outcomes in high school. We examine a wide range of student outcomes when they reach high school, including attendance rates, chronic absenteeism, test scores, disciplinary outcomes, drop out, and graduation. We focus on middle schools, rather than elementary schools, because middle school is most proximate to meaningful long-term student outcomes in high school (e.g., graduation). Moreover, fewer studies have examined effects from turnaround interventions that occurred during middle school (Redding & Nguyen, 2020). To address concerns with estimating dynamic treatment effects (Callaway & Sant’Anna, 2021; Goodman-Bacon, 2021; Roth et al., 2023), we estimate DiD models using both a canonical approach and a “heterogeneity-robust” estimator recently developed by Callaway and Sant’Anna (2021). Results across both approaches lead to the same conclusions.
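To make the DiD logic concrete, the following is a minimal sketch of the canonical two-group, two-period comparison. It is not the specification estimated in this study, which involves staggered adoption across cohorts. The function name `did_2x2` and all numbers are hypothetical and serve only to illustrate how the estimate is formed from four cell means.

```python
from statistics import mean


def did_2x2(rows):
    """Canonical two-group, two-period DiD computed from cell means.

    rows: iterable of (treated, post, outcome) tuples, where treated and
    post are 0/1 flags. Returns the pre-to-post change for the treated
    group minus the pre-to-post change for the comparison group.
    """
    cells = {}
    for treated, post, outcome in rows:
        cells.setdefault((treated, post), []).append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


# Hypothetical attendance rates (illustrative only, not study data).
toy_rows = [
    (1, 0, 0.90), (1, 0, 0.92),  # assigned to turnaround, pre-reform cohorts
    (1, 1, 0.91), (1, 1, 0.93),  # assigned to turnaround, post-reform cohorts
    (0, 0, 0.89), (0, 0, 0.91),  # comparison, pre-reform cohorts
    (0, 1, 0.90), (0, 1, 0.92),  # comparison, post-reform cohorts
]
estimate = did_2x2(toy_rows)
```

In this toy example both groups improve by the same amount, so the estimate is zero up to floating-point error. The comparison group's change stands in for what would have happened to the treated group without reform, which is why the parallel trends assumption matters.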
Overall, we find no evidence to support improved long-term student outcomes after they leave a turnaround middle school, with mostly null effects in high school for students assigned to either ASD or iZone middle schools. If anything, we find suggestive evidence that assignment to an ASD middle school produced negative effects on high school test scores, though these estimates are not consistently significant across all models. Likewise, assignment to iZone middle schools produced almost entirely null results except for a negative effect on math scores that is marginally significant at the 10% level. In the next section, we describe the turnaround context in Tennessee and review relevant literature. Then, we describe our data and methods. The final sections present results and conclusions.
School Turnaround in Tennessee
Tennessee’s current turnaround models began under the state’s 2010 First to the Top (FttT) legislation. FttT requires the TDOE to intervene in the state’s priority schools, which are the lowest-performing 5% of schools. Beginning in 2012–2013, Tennessee’s priority schools were placed into the ASD, integrated into a district iZone, closed, or received no interventions. The ASD is Tennessee’s boldest turnaround model—a statewide school district that removes priority schools from their local district to be governed directly by TDOE. If chosen for the ASD, priority schools cannot opt out, are required to replace the principal and at least 50% of teachers, and are restarted under new management. Most ASD schools are restarted under a CMO, except five that are directly managed by TDOE (Zimmer et al., 2017). The ASD relies on CMOs because TDOE leaders wanted to develop a model that would remove bureaucratic oversight and give school leaders flexibility in adapting reforms to individual school needs.
As the ASD began operating schools in summer 2012, TDOE also approved the creation of local iZones in Shelby County Schools (Memphis) and Metro-Nashville Public Schools (TDOE, 2012). Local iZones differ from the ASD primarily in that iZone schools remain part of their local education agency (LEA) and are placed into a district-within-a-district with other priority schools. Like the ASD, iZone schools must also replace the principal. In contrast to the ASD, iZone schools are not required to replace teachers, but in practice, iZone schools did replace at least 50% of teachers in the first year of reform (Henry et al., 2014). After the first year of principal and teacher replacements, iZone school leaders have broad autonomy to manage daily school operations with support and oversight from a dedicated iZone office within the local district. Since 2012, two additional districts have opened iZones: Hamilton County Schools (Chattanooga) and Knox County Schools (Knoxville).
Importantly, neither ASD nor iZone schools are schools of choice because they are required to continue enrolling students from their local catchment area. This distinction sets our study apart from previous work examining long-term student outcomes in the New Orleans RSD, where market-based competition is a key feature of the model (Harris & Larsen, 2016). Instead, Tennessee’s models are rooted in a belief that a set of disconnected interventions is insufficient to produce swift and dramatic improvements in student performance, which aligns with federal school reform models that emphasize coordinated, schoolwide changes in how schools are governed, managed, and operated (Herman et al., 2008). These bold changes include teacher and principal replacement, flexibility and support for school leaders, and in the case of the ASD, even removal of chronically low-performing schools from local district governance.
Table 1 shows the number of schools in each turnaround model by school level (elementary, middle, and high). Tennessee identified an initial list of 83 priority schools in 2012 (TN Press Release Center, 2012). A second list of 85 priority schools was released in 2014, though only 33 of these 85 schools were not already on the 2012 list (Caroll, 2014). Only schools on these two priority lists were eligible to join the ASD or an iZone, so as additional cohorts of schools were added to the ASD and iZones in each year, the remaining priority schools not receiving interventions grew smaller over time. Our analysis only compares students who were assigned to ASD or iZone middle schools, based on the feeder pattern of their elementary school, with students who were assigned to a non-ASD, non-iZone priority middle school, which we call comparison schools. Table 1 shows that five cohorts of schools joined the ASD or an iZone between 2012–2013 and 2016–2017. After 2016–2017, both models continued to operate existing schools, but no new schools were added to either model during the time of our study. As of 2018–2019, the final year of data used in our analysis, no school selected for the ASD or iZones had exited. Thus, the ASD and iZone treatments are an absorbing state, such that once a school is treated, it remains treated through the last year of the panel. Below, we describe how we address potentially heterogeneous treatment effects across staggered turnaround adoption in the five ASD/iZone cohorts. However, it is important to highlight that the turnaround interventions did not change across the five cohorts. All ASD schools were removed from their local district; all iZone schools were placed into a district-led network that received additional support from district staff; and both implemented principal and teacher replacement.
Table 1. Number of Schools in the Sample by ASD, iZone, and Comparison Groups and by School Level
Note. Elem. stands for elementary school. The ASD and iZone rows are schools added from lists of priority schools released by TDOE in either 2012–2013 or 2014–2015. The comparison school rows represent the remaining priority schools that have not joined either the ASD or an iZone in each year. The number of comparison schools decreases each year as more schools join the ASD or an iZone, except in 2014–2015 when additional priority schools were added. During this time period, the ASD opened new-start schools that did not exist previously and were not named a priority school by Tennessee on its 2012 or 2014 priority list. Also, in these years, the ASD began operating in untested grades in some priority schools. New-start schools and schools where the ASD had not yet begun operating tested grades are not included in this study.
The number of comparison schools at all levels does not equal the sum of elementary, middle, and high schools because some comparison schools fit into an “other” level category (e.g., K–8 schools). The numbers of priority schools closed in each of the five cohorts are 1, 3, 8, 3, and 7, respectively.
Two schools were placed into the Nashville iZone in 2012–2013 even though they were not on the 2012 priority list. These two schools are on the 2014 priority list.
The number of priority schools increases in 2014–2015 because Tennessee released a new list of priority schools in 2014.
Two schools in the fourth iZone cohort were closed in the following year.
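The staggered, absorbing treatment timing described above can be sketched in code. This is a minimal illustration under stated assumptions, not the study's data pipeline: the function name `treatment_indicator`, the school identifiers, and the convention that 2012 denotes the 2012–2013 school year are all hypothetical.

```python
def treatment_indicator(first_treated_year, years):
    """Build school-by-year treatment flags for an absorbing treatment.

    first_treated_year: dict mapping school id -> first year of reform,
        or None for comparison schools that never join a reform model.
    years: iterable of years in the panel.
    Returns a dict mapping (school, year) -> 0 or 1.
    """
    years = list(years)
    panel = {}
    for school, start in first_treated_year.items():
        for year in years:
            # Absorbing state: treated in the start year and every later year.
            panel[(school, year)] = int(start is not None and year >= start)
    return panel


# Hypothetical schools; 2012 denotes 2012-2013, and so on.
cohorts = {"school_a": 2012, "school_b": 2014, "school_c": None}
panel = treatment_indicator(cohorts, range(2010, 2019))
```

Because no treated school exits during the panel, the flag for a given school never switches from 1 back to 0. Schools that adopt in different years contribute different numbers of treated periods, which is the source of the staggered-adoption concerns raised above.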
To better understand how priority schools are selected to join either the ASD or an iZone, we interviewed leaders of both reform models. Our interviews suggested that there were no systematic inclusion criteria used to select schools, with the exception of targeting existing feeder patterns. However, in practice, even this criterion does not appear to strongly drive ASD or iZone school selection. Only 4 of the 26 ASD schools receive students from other ASD schools as part of the same feeder pattern, and only 8 of the 42 iZone schools receive students from other iZone schools as part of the same feeder pattern. Specifically, only three ASD and four iZone middle schools receive students from ASD and iZone elementary schools, respectively. Similarly, only one ASD and four iZone high schools receive students from ASD or iZone middle schools. Overall, our interviews with administrators suggested that priority schools were not systematically selected for specific turnaround strategies, but we cannot rule out potential selection based on unobserved school characteristics related to long-term student outcomes. To address these concerns, we show evidence below of baseline equivalence and parallel trends between ASD, iZone, and comparison schools.
Literature Review
Whole-school reforms under RttT and SIGs emphasized school turnaround as federally specified models for supporting chronically low-performing schools (Dragoset et al., 2019; U.S. Department of Education, 2009). These turnaround models emphasized rapidly improving school performance and relied on bold interventions, for example, replacing the principal and at least half of all teachers (Aladjem et al., 2010; Herman et al., 2008; Peurach & Neumerski, 2015).
Two recent meta-analyses of the school turnaround literature (Redding & Nguyen, 2020; Schueler et al., 2021) report mixed results and show that the literature almost exclusively examines short-term outcomes, especially test scores. Redding and Nguyen (2020) review 35 studies of turnaround and find that these reforms are associated with improved test scores, student attendance, and graduation rates, but the latter two outcomes have been examined by a much smaller set of studies. All studies in Redding and Nguyen’s meta-analysis examined outcomes while students were still in the turnaround school, not after they left the turnaround school. In a second meta-analysis, Schueler et al. (2021) report a more expansive analysis that includes evaluations of any whole-school reform implemented since the adoption of NCLB. They find a moderate positive effect on math but no effect on English Language Arts achievement on high-stakes exams, positive impacts on low-stakes exams in STEM (science, technology, engineering, and mathematics) and humanities subjects, and no effect among the small set of studies that have examined non-test-score outcomes. Among the studies reviewed by Redding and Nguyen (2020) and Schueler et al. (2021) are two evaluations of Tennessee’s ASD and iZone reform efforts (Pham et al., 2020; Zimmer et al., 2017). These two studies found no effect in either the first 3 or 6 years of implementation for the ASD but modest positive effects in both the first 3 and 6 years of implementation for iZone schools, though the positive iZone effects are concentrated in the initial cohorts, with later cohorts showing no positive effects.
Collectively, the research examining turnaround reforms has helped to illuminate their short-term effects. However, with the exception of research on the RSD in New Orleans (Glenn & Harris, 2020; Harris & Larsen, 2016, 2023), research has not examined the long-term impact of these reform policies for students after they exit a turnaround school. The ASD and iZone models in Tennessee share commonalities with multiple other reform models across the country. First, Tennessee identifies its lowest-performing schools, and the public attention to these schools aligns with accountability systems in states such as North Carolina (Henry & Harbatkin, 2020), Florida (Chiang, 2009; Rouse et al., 2013), and New York (Rockoff & Turner, 2010). Second, both the ASD and iZone models emphasize efforts to staff low-performing schools with effective principals and teachers, which mirrors turnaround models across the country including initiatives in California (Dee, 2012; Strunk et al., 2016; Sun et al., 2021), Ohio (Carlson & Lavertu, 2018), and Washington, D.C. (Dee & Wyckoff, 2015), among many others (Dragoset et al., 2019). Furthermore, the ASD model, where low-performing schools are removed from local district governance, has been a topic of wide-ranging interest, with parallel policies in place across 34 states (Jochim, 2016). Comparable examples include charter conversion models in Boston and Baton Rouge (Abdulkadiroğlu et al., 2016) and state takeovers of entire school districts that have occurred in 28 states since the 1980s (Schueler & Bleiberg, 2022). At the district level, the iZones’ district-within-a-district approach has been growing in popularity in districts such as Chicago, Denver, and Indianapolis (Iyengar et al., 2017).
Furthermore, less attention has been given to the effects of turnaround reforms implemented in middle schools (Redding & Nguyen, 2020; Schueler et al., 2021). Across their comprehensive review of this literature, Redding and Nguyen (2020) found only 4 of 35 studies (11%) that specifically focused on turnaround reforms implemented in middle schools. Of those four studies, only one met Redding and Nguyen’s criteria for a high-quality study with a low risk of bias. Additionally, researchers have long documented heterogeneity in the effects of school interventions by grade level (Edgerton & Desimone, 2019; Firestone & Herriott, 1982), so increased attention to reforms in middle schools will advance our collective understanding of how effective interventions differ across grade levels.
Overall, we contribute to the extant literature by:
developing a novel and broadly applicable framework for differentiating the currently ill-defined types of long-term effects in educational research;
focusing on turnaround models in Tennessee that are more comparable to reforms in other states than the whole-district, market-based RSD model;
estimating effects on student outcomes after they leave a turnaround school;
examining the effects of reforms implemented during middle school on high school outcomes; and
examining student outcomes beyond test scores to include important outcomes such as high school graduation.
Conceptual Framework for Differentiating Long-Term Effects
In reviewing literature on long-term effects of educational interventions, we find that the current academic conversation lacks clarity because prior literature conflates four distinct types of long-term effects: program maturation effects, dosage effects, institutionalized effects, and persistent/latent effects. To sharpen academic discourse in this area, we developed a conceptual framework for understanding and delineating these different types of long-term effects, as shown in Figure 1. Although we tailor our discussion to long-term effects of whole-school reforms, this conceptual framework can be adapted to a broad range of educational interventions. We begin by separating long-term effects along two dimensions: (a) a temporal dimension that considers whether the interventions are still active or whether they have ended and (b) a level dimension that considers whether the research study is interested in examining effects on the school as a whole or effects on students.

Figure 1. Conceptual framework for defining long-term effects of whole-school reform.
While interventions are actively being implemented, two types of long-term effects of interest include school-level program maturation effects and student-level dosage effects. Program maturation effects primarily capture the experience of new schools (or new cohorts) that join an existing reform model. Although these schools may be implementing reforms that are new to them, the reform program itself is not new because it has been implemented in other schools. Therefore, program maturation effects capture the long-run effect of a reform model as it grows, matures, and scales up in more schools over time. One study of program maturation effects is the evaluation of the Public School Choice Initiative (PSCI) in Los Angeles Unified School District (Strunk et al., 2016). Using data across three cohorts of PSCI schools, Strunk et al. (2016) reported variation across cohorts: null effects on student achievement in Cohort 1 schools that began implementing PSCI reforms in 2010–2011, positive effects on Cohort 2 schools that began in 2011–2012, and negative effects on Cohort 3 schools that began in 2012–2013. Strunk and colleagues could not extend their evaluation beyond three cohorts because PSCI was terminated after 2012–2013; however, their discussion of program maturation effects provided a rich understanding of how early implementation hurdles diminished potential effects in Cohort 1, while abrupt changes in program structure likely led to negative effects in Cohort 3. A growing number of evaluations have reported maturation effects across cohorts (Berends et al., 2002; Burns et al., 2023; Dragoset et al., 2019; Papay et al., 2022; Pham et al., 2020, 2024; Zimmer et al., 2017).
In contrast, dosage effects capture the impact on students who have attended reform schools over multiple years. Students who attend reform schools over a longer time period will have received a higher “dose” of the interventions, and some researchers have pointed out that reforms have a better chance of success if students experience increased supports for longer (Peurach & Neumerski, 2015). Abdulkadiroğlu et al. (2016) provide an example of dosage effects in their study of reforms that closed low-performing traditional public schools and reopened them as charter schools. Overall, they found positive effects on student achievement from the charter conversions. However, when they examined students exposed to 1, 2, 3, and 4 years of reforms following the charter conversion, they found that effects were larger for students after 1 or 2 years of exposure to reforms, relative to students exposed to 3 or more years of reforms. Thus, their examination of dosage effects at the student level highlighted diminishing returns to the charter conversion model. Within the school reform literature, dosage effects are relatively common, though the vast majority of studies only report dosage through 2 to 4 years (Dee & Dizon-Ross, 2019; Henry & Harbatkin, 2020; Papay & Hannon, 2018; Schueler et al., 2017; Sun et al., 2017). An important area of need is for future studies to estimate dosage effects over a longer time period.
After a turnaround program’s core interventions have been completed, institutionalized effects occur at the school level and refer to whether school operations have changed in lasting ways without ongoing support from external actors. Institutionalized effects are relevant because an implicit goal of many reform models is to improve organizational characteristics (e.g., culture, climate, practices, shared expectations) in low-performing schools so these schools can continue effectively serving students after external interventions and resources are no longer available. Sun et al. (2021) provide an illustrative example of institutionalized effects. Using data from Boston, San Francisco, North Carolina, and Washington, the authors examined schools that received SIG funding to implement reforms. They found that SIG funds improved student achievement during the 3 years when the SIG reforms were actively implemented. After the SIG reforms ended, Sun et al. (2021) found that positive student achievement effects in these schools diminished but did not completely fade away and continued to be positive 4 years after the reforms ended. The authors explain that schools implementing more dramatic reforms later demonstrated a greater ability to maintain positive effects after the SIG reforms ended. Estimates of institutionalized effects are much less common than either maturation or dosage effects, but some studies of earlier whole-school reform models have documented diminished institutionalized effects after active supports ended (e.g., Bifulco et al., 2005).
In contrast to the three previous types of long-term effects, this article focuses on the fourth category of long-term effect, which we call persistent/latent effects at the student level. Persistent/latent effects focus on how students fare after they leave reform schools. When reforms produce positive short-term effects that students continue to experience after leaving the reform school, we consider this a persistent long-term effect. On the other hand, it is also possible that a reform model does not produce short-term effects, but the latent effect becomes apparent after students leave the reform school. For example, reforms that focus on building foundational academic skills may not produce immediate short-term effects but may improve academic gains later, after students have had time to master the skills. Although persistent/latent effects are rarely studied, they are arguably more meaningful than short-term effects, because these persistent/latent effects would show that the reforms successfully changed students’ long-run educational trajectories in consequential ways, even after students leave the reform school. There are very few examples of persistent/latent effects in the school reform literature. One example involves studies of the RSD in New Orleans. Examining postsecondary outcomes after students leave RSD schools, Harris and Larsen (2023) find that students are more likely to attend college, and Glenn and Harris (2020) find that students are more likely to attend higher-quality colleges. Outside of the school reform literature, studies of persistent/latent effects are more common in the preschool fadeout or convergence literature, which examines student outcomes after they leave preschool (e.g., Bassok et al., 2019; Lipsey et al., 2018). For example, Lipsey et al. (2018) find that the positive effects of preschool on student achievement largely disappear by the end of kindergarten and are even reversed by second and third grades.
These studies of the persistent/latent effects of preschool have had a substantial impact on current academic and policy conversations around preschool. We believe other areas of educational research would benefit from similar attention to persistent/latent effects, and by differentiating persistent/latent effects from other long-term effects, our conceptual framework is intended to help increase attention to persistent/latent effects both within and outside of the school reform literature.
We recognize that studies of long-term effects may comprise some mixture or overlap between these four types of effects; however, scholarship on long-term effects would benefit from clearer language to facilitate communication on what types of effects are being reported. Better nomenclature will also clarify future comparisons of long-term effects across different contexts and reform models. Overall, this conceptual framework allows us to better clarify differences between our work and other studies of long-term effects of school turnaround that have examined program maturation effects (Strunk et al., 2016), dosage effects (Abdulkadiroğlu et al., 2016), and institutionalized effects (Sun et al., 2021). However, we reiterate that persistent/latent effects of whole-school interventions are not well explored and should become a more central focus across a wide range of educational evaluations.
Methods
Data
Data are provided by TDOE and managed by the Tennessee Education Research Alliance. The administrative datasets are available for students in all Tennessee public schools. These data contain rich student characteristics and outcomes such as test scores, disciplinary actions, whether students graduate, gender, race, eligibility for free or reduced-price meals (FRPM), multilingual learner status (ML), and special education eligibility (SPED). Data are available from 2006–2007 through 2019–2020, except for test scores, which are only available through 2018–2019. Even though non-test-score outcomes are available in 2019–2020, we primarily use only data through 2018–2019 because the COVID-19 pandemic may have affected students’ educational outcomes in spring 2020 in ways that are still not well understood. In an auxiliary analysis, we show that our results are robust when we add data on all available outcomes in 2019–2020 (see Appendix Table A9 in the online version of the journal).
Sample
Our goal is to estimate the long-term effect of turnaround reforms implemented in middle school on student outcomes in high school. Results in high school are particularly valuable for this study because high school outcomes like graduation are consequential for students’ college and career options. Thus, we focus on turnaround middle schools because they are most proximate to outcomes measured while students attend high school, which makes for a more straightforward definition of the treatment. To construct our sample, we use school feeder patterns to identify the middle school that students are assigned to attend when they complete the final grade offered in their elementary school. Using these middle school assignments, we restrict our analytic sample to include only students who are assigned to an ASD, iZone, or comparison middle school. This sample allows us to obtain ITT estimates based on students’ assigned middle school. We use this ITT sample instead of a treatment-on-the-treated (TOT) sample of students who actually attend ASD or iZone middle schools to account for student-level selection. Student-level selection bias could occur if choosing to attend a turnaround school is endogenous and related to unobserved factors that also affect student outcomes. This possibility is mitigated to some degree because Tennessee’s turnaround schools are residentially assigned, not schools of choice. Nevertheless, parents could make a residential move that would allow their children to attend a nonturnaround school to shield them from reform-induced disruptions. If these parents also tend to invest more resources in education such that their children are likely to have better schooling outcomes, their choice to avoid turnaround schools could negatively bias our estimate of the long-term reform effects.
Our ITT approach addresses this issue by retaining all students who are assigned to ASD or iZone middle schools, regardless of whether they ultimately attend their assigned school. Below, we show that residential assignment to an ASD, iZone, or comparison school strongly predicts attendance in the assigned school (Appendix Table A3 in the online version of the journal).
Thus, our preferred sample compares only students assigned to ASD or iZone middle schools with students assigned to similarly low-performing priority middle schools that receive no turnaround interventions. In auxiliary analyses, we restrict this preferred sample to include only students who can be observed attending all grades offered at an ASD, iZone, or comparison middle school and instrument the indicator for attending a school with the indicator for whether students were assigned to the school. This sample of students who attend all grades at the same middle school allows us to avoid issues from students who move and experience the effects of different middle schools, but it is also a much smaller group of students than our preferred ITT sample: about one-third of students who ever attend a priority middle school in Tennessee attend all grades at that middle school. Results from this more restricted sample of students who attend all grades at a turnaround middle school lead us to similar conclusions, though significance levels vary between the two samples (Appendix B in the online version of the journal). Finally, given available data, we can only examine students who are observed in a Tennessee public high school. Across all years, about 11% of students who are assigned to a priority middle school are not observed in any Tennessee high school. To examine potential bias from these missing students, we estimate the relative likelihood of attending a public high school in Tennessee between students assigned to ASD, iZone, or comparison schools. This analysis finds no significant differences (Appendix Table A10 in the online version of the journal).
Table 2 shows a visual representation of our sample including the number of years we can observe outcome measures in high school for each student cohort. The table shows the four cohorts of students in turnaround middle schools that are included in our analysis, with the years in which they were assigned to an ASD or iZone middle school in gray and their high school years observed in our data in white. For example, the first cohort of students would have started attending middle school in 2012–2013 (the first year of turnaround reforms in Tennessee), attended middle school through 2014–2015, and entered high school in 2015–2016. These students are included in our analysis only between 2015–2016 and 2018–2019 when their outcomes are available in high school. Note that graduation from high school is primarily available only for the first cohort because most students spend 4 years in high school before graduating.
Table 2. Visual Representation of Each Cohort of Students in the Years They Attend Middle and High School Including the Number of Years the Outcomes Can Be Measured
Note. For presentational clarity, the table shows students who attend middle school for 3 years (e.g., sixth through eighth grade), but the logic is similar for students attending middle schools with a different grade configuration.
Part of our primary identification strategy relies on the DiD assumption of parallel trends in outcomes, which we show evidence to support further below. Although this assumption does not require ASD, iZone, and comparison schools to have similar baseline characteristics (only parallel trends), evidence of comparable baseline characteristics further supports our use of non-ASD, non-iZone priority schools as a valid counterfactual. Appendix Table A1 in the online version of the journal shows descriptive means for student demographics and each outcome of interest within ASD, iZone, and comparison middle schools before any reforms began in 2012–2013. The table also shows results from t-tests comparing ASD versus comparison schools and iZone versus comparison schools. These tests show similar mean characteristics between all three groups of priority schools. The only statistically significant difference suggests that a larger proportion of students in iZone middle schools were white (6%) relative to comparison schools (2%). This difference makes sense in the Tennessee context because most priority schools are located in Memphis where most students are not white, but iZones were formed in other districts across the state, and schools in these districts tend to serve more white students. Nevertheless, a difference of 4 percentage points is quite modest, and we control for student race in all models.
Measures
We examine four categories of student outcomes in high school: test scores, attendance, discipline, and drop out/graduation. For test scores, we include scores on both the ACT and on state-required end-of-course (EOC) exams. For the ACT, we use only composite scores for students from the first time they take the exam to avoid bias from differential retake. Since the ACT is a norm-referenced test, we keep these scores on their original scale ranging from 1 to 36. Tennessee’s end-of-course exams are administered in seven subjects: English I, English II, English III, Algebra I, Algebra II, Biology, and Chemistry. Given these different courses, we standardize the EOC scores statewide by course, year, and semester to have a mean of zero and unit variance.
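The within-cell standardization described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the record layout, the function name `standardize_eoc`, and the use of the population standard deviation are all assumptions, and the scores are made up.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_eoc(records):
    """Return one z-score per record, computed within (course, year, semester)."""
    # Collect raw scores for every course-year-semester cell
    cells = defaultdict(list)
    for rec in records:
        cells[(rec["course"], rec["year"], rec["semester"])].append(rec["score"])

    # Mean and population standard deviation of each cell (assumed choice)
    cell_stats = {key: (mean(vals), pstdev(vals)) for key, vals in cells.items()}

    z_scores = []
    for rec in records:
        mu, sigma = cell_stats[(rec["course"], rec["year"], rec["semester"])]
        # A cell without score variation cannot be standardized
        z_scores.append((rec["score"] - mu) / sigma if sigma > 0 else None)
    return z_scores


# Hypothetical records for illustration only
records = [
    {"course": "Algebra I", "year": 2016, "semester": "fall", "score": 310},
    {"course": "Algebra I", "year": 2016, "semester": "fall", "score": 330},
    {"course": "Algebra I", "year": 2016, "semester": "fall", "score": 350},
    {"course": "Biology", "year": 2016, "semester": "fall", "score": 40},
    {"course": "Biology", "year": 2016, "semester": "fall", "score": 60},
]
z = standardize_eoc(records)
```

Standardizing within each cell puts scores from different courses and test administrations on a common scale, so a coefficient can be read in standard deviation units regardless of which exam a student took.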
Attendance outcomes include the student’s annual attendance rate and an indicator for chronic absenteeism. The attendance rate is the proportion of enrolled instructional days in which the student attended school, and chronic absenteeism is an indicator that equals one if students miss more than 10% of enrolled instructional days, the definition used in Tennessee’s accountability policies (TDOE, 2021). Disciplinary actions are measured with indicators for whether the student commits a zero-tolerance offense and whether the student is expelled. Offenses covered by Tennessee’s zero-tolerance policy include possessing and/or using drugs, possessing a firearm, staff battery, and bullying. We chose zero-tolerance actions and expulsion as disciplinary outcomes because they are major offenses that would likely be recorded regardless of individual school or district discipline policies. Finally, we use state graduation records to code an indicator for whether the student drops out of school in each year and an indicator for whether the student receives a high school diploma in any year. Besides these outcomes, we also code whether students are assigned to ASD, iZone, or comparison middle schools based on the feeder pattern of their elementary school. We also include a set of demographic student and school characteristics as covariates, as described below.
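The two attendance measures defined above can be coded as follows. This is a sketch under stated assumptions: the function names and day counts are hypothetical, and the strict "more than 10%" cutoff follows the wording above rather than any official TDOE business rule.

```python
def attendance_rate(days_attended, days_enrolled):
    """Proportion of enrolled instructional days the student attended."""
    if days_enrolled <= 0:
        raise ValueError("days_enrolled must be positive")
    return days_attended / days_enrolled


def chronically_absent(days_attended, days_enrolled, threshold_pct=10):
    """Return 1 if the student missed more than threshold_pct percent of enrolled days."""
    if days_enrolled <= 0:
        raise ValueError("days_enrolled must be positive")
    days_absent = days_enrolled - days_attended
    # Integer comparison avoids floating-point trouble exactly at the cutoff
    return int(days_absent * 100 > threshold_pct * days_enrolled)


# Hypothetical student enrolled for 180 instructional days
rate = attendance_rate(171, 180)
at_cutoff = chronically_absent(162, 180)    # misses exactly 10% of days
over_cutoff = chronically_absent(161, 180)  # misses one more day
```

Because the indicator is defined on enrolled days rather than the full school year, a student who enrolls late is judged only on the days they were expected to attend.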
Analytic Method
Our analytic approach is a DiD model that estimates the ITT effect of assignment to a turnaround middle school. The DiD model allows us to address school-level selection, which could occur if schools chosen for either the ASD or an iZone differ systematically from nonturnaround priority schools prior to implementing any reforms. For example, if schools chosen for turnaround reforms have comparably better systems to improve long-term student outcomes (e.g., more effective school leaders), these preexisting school-level differences could positively bias our estimate of the long-term reform effects.
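The logic by which DiD removes fixed preexisting differences between groups can be illustrated with a two-group, two-period calculation. The function name and all group means below are hypothetical and serve only to show the arithmetic; they are not estimates from this study.

```python
def did_estimate(treat_pre, treat_post, comp_pre, comp_post):
    """Two-by-two DiD: change in the treated group minus change in the comparison group."""
    return (treat_post - treat_pre) - (comp_post - comp_pre)


# Hypothetical mean standardized scores. The treated group starts 0.10 SD higher,
# but both groups change by the same amount, so the DiD estimate is zero.
effect = did_estimate(treat_pre=0.10, treat_post=0.05, comp_pre=0.00, comp_post=-0.05)
```

A level difference that is constant over time, such as schools with stronger leadership being selected for turnaround, cancels in the subtraction. Only a difference in trends between the groups would bias the estimate, which is why the parallel trends assumption matters.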
As a starting point, we model outcomes,
Equation 1 includes indicators for students who were assigned to middle schools that ever implement turnaround (
Additionally, Equation 1 includes vectors of student (
The coefficients of interest in Equation 1 are interpreted as the postreform minus prereform difference in outcomes for students assigned to ASD (
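A canonical DiD specification consistent with the description of Equation 1 can be sketched as follows. The notation is an illustrative assumption, not the authors' own symbols, and their exact specification may include further terms. Here $Y_{ihst}$ is an outcome for student $i$ who was assigned to middle school $s$ and attends high school $h$ in year $t$; $\text{ASD}_s$ and $\text{iZone}_s$ indicate assignment to a middle school that ever implements the respective turnaround model; $\text{Post}_{is}$ indicates that the student began middle school after school $s$ started reforms; $\mathbf{X}_i$ and $\mathbf{S}_{ht}$ are vectors of student and high school characteristics; and $\theta_h$ and $\tau_t$ are high school and year fixed effects.

```latex
\begin{equation}
\begin{aligned}
Y_{ihst} ={}& \beta_0 + \beta_1\,\text{ASD}_s + \beta_2\,\text{iZone}_s
  + \beta_3\,(\text{ASD}_s \times \text{Post}_{is})
  + \beta_4\,(\text{iZone}_s \times \text{Post}_{is}) \\
 &+ \mathbf{X}_i'\boldsymbol{\gamma} + \mathbf{S}_{ht}'\boldsymbol{\delta}
  + \theta_h + \tau_t + \varepsilon_{ihst}
\end{aligned}
\end{equation}
```

Under this sketch, $\beta_3$ and $\beta_4$ are the DiD coefficients of interest for the ASD and iZone models, respectively.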
Recent developments in the DiD literature have extended the canonical model shown in Equation 1 to staggered DiD setups that address bias from potentially heterogeneous treatment effects across cohorts receiving turnaround reforms at different points in time. Because five different cohorts of ASD and iZone schools began reforms in each year between 2012–2013 and 2016–2017, results from Equation 1 are potentially biased from this staggered reform adoption if the long-term effects of turnaround are heterogeneous across the different ASD/iZone cohorts. To address this issue, we follow methods proposed by Callaway and Sant’Anna (2021), hereafter the CS approach. The CS approach begins by estimating separate effects for each treatment cohort in each year compared only to never-treated (or not yet treated) students to avoid problematic comparisons with already-treated students (Goodman-Bacon, 2018). We use only never-treated students who attend comparison middle schools as the control group and rely on the doubly robust DiD estimator (i.e., regression adjustment and inverse probability weighting) from Sant’Anna and Zhao (2020) to obtain each of these cohort-year specific effects. For reporting, we aggregate these separate cohort-year estimates using a simple weighted average of each cohort relative to its frequency in the treated population. We also report event-study parameters that are the weighted average effects for students who were assigned to ASD/iZone middle schools in each year after the school began reforms. For full transparency, we report both results from a canonical DiD model and the CS approach, which we implement using the csdid package in Stata (Rios-Avila et al., 2022). All models cluster standard errors at the school level.
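The aggregation step described above, a weighted average of cohort-year effects with weights proportional to each cohort's share of the treated population, can be sketched as follows. This illustrates the arithmetic only and is not the csdid implementation: the function name, the dictionary layout, and all numbers are hypothetical, and inference on the aggregated estimate is omitted.

```python
def aggregate_simple(cohort_year_effects, cohort_sizes):
    """Weighted average of post-treatment cohort-year effects.

    cohort_year_effects maps (first_reform_year, year) to an estimated effect;
    cohort_sizes maps first_reform_year to the number of treated students.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for (cohort, year), effect in cohort_year_effects.items():
        if year < cohort:
            # Pre-treatment cells inform pretrend checks, not the overall effect
            continue
        weight = cohort_sizes[cohort]
        weighted_sum += weight * effect
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no post-treatment cohort-year effects supplied")
    return weighted_sum / total_weight


# Hypothetical cohort-year effects and cohort sizes
effects = {
    (2013, 2013): -0.02,
    (2013, 2014): 0.01,
    (2014, 2013): 0.05,   # pre-treatment cell for the 2014 cohort, excluded
    (2014, 2014): -0.04,
}
sizes = {2013: 300, 2014: 100}
overall = aggregate_simple(effects, sizes)
```

Because each cohort-year effect is estimated against never-treated students only, the average never uses already-treated students as controls, which is the comparison that can bias the canonical model under staggered adoption.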
Some researchers have argued that controlling for pretreatment, or lagged, outcomes can lead to comparison units that have uncharacteristically low outcomes in the pretreatment period, which leads to bias from mean reversion (Daw & Hatfield, 2018). Therefore, our primary results using both the canonical and CS estimators do not include a pretreatment, or lagged, outcome. This approach simplifies our interpretation of the results as effects on the level of each outcome. However, researchers have also found evidence that controlling for lagged outcomes measured prior to treatment can help reduce bias (Wilkins, 2018). Therefore, Appendix Table A4 in the online version of the journal shows auxiliary analyses that use the CS estimator with controls for lagged outcome measures. To avoid endogeneity from a covariate that may be affected by turnaround interventions in middle school, results in Appendix Table A4 are from a model that includes pretreatment lags measured in the year prior to students attending middle school. Results from models that control for pretreatment lagged outcome measures are interpreted as effects on gains in each outcome, rather than levels. Results from models that control for lagged outcome measures align with our main conclusions, though the coefficients are somewhat larger in magnitude.
To show descriptive evidence supporting the parallel trends assumption, Figure 2 uses our full ITT sample to graph trends in average EOC test scores on the y-axis for students who started middle school in each year before and after the school began turnaround reforms. Panels A through C in Figure 2 graph trends in reading, math, and science EOC scores, with years on the x-axis centered such that Year 0 is the year just before schools begin turnaround and Year 1 is the first year of reforms. Although scores fluctuate from year to year, the figure shows little evidence of diverging trends in the EOC scores between students assigned to ASD, iZone, and comparison middle schools in years prior to reforms. In the postreform years, we again do not see divergent trends in EOC scores between students assigned to ASD and iZone middle schools, relative to students assigned to comparison middle schools. Also, the figures suggest that, prior to reforms, students assigned to ASD middle schools had average EOC scores that were slightly higher than the scores among students assigned to iZone and comparison middle schools. However, after reforms began, EOC scores for students assigned to ASD middle schools are no longer distinguishable from EOC scores among students assigned to iZone and comparison middle schools. To conserve space, we provide plots for the other outcomes of interest in Appendix C in the online version of the journal, all of which support parallel trends between ASD, iZone, and comparison schools. Below, we discuss formal statistical tests for parallel trends (Figure 3).

Figure 2. Trend in average high school EOC scores for students who began middle school in each year before and after turnaround reforms began. Panel A: EOC reading standardized scores. Panel B: EOC math standardized scores. Panel C: EOC science standardized scores.

Figure 3. DiD estimate on test scores by years before and after reforms began. Panel A: EOC reading. Panel B: EOC math. Panel C: EOC science.
Results
Descriptive Results
Table 3 shows average characteristics of students assigned to ASD, iZone, and comparison middle schools in our ITT sample. For ASD and iZone schools, we show student characteristics before and after the school began implementing reforms. For comparison schools that never implement any turnaround interventions, we compare means from before and after 2012–2013, the year when Tennessee first began operating turnaround schools. Tennessee’s priority middle schools primarily serve Black students, ranging from 82% to 93%, depending on the turnaround model and years. Most students are eligible for FRPM (73% to 84%), and a small proportion are MLs (1% to 3%). These student characteristics reflect the fact that most of Tennessee’s priority schools serve historically Black communities in Memphis. Moreover, the table shows that student characteristics of ASD, iZone, and comparison middle schools do not substantially change after reforms began.
Table 3. Descriptive Characteristics of Students Who Attended ASD, iZone, or Comparison Middle Schools in the Years Before and After Turnaround Interventions Began
Note. Standard deviations in parentheses. Students are included in the sample only if they attended all grades at a priority (ASD, iZone, or comparison) middle school during the years either before or after turnaround interventions were implemented.
Table 3 also shows descriptive averages of student outcomes in high school. The table shows that students assigned to ASD middle schools have higher average EOC scores than students assigned to comparison schools in the years before reforms began. However, in the years after reforms began, students assigned to ASD middle schools have average EOC scores that are lower than those of students assigned to comparison middle schools. This suggests that cohorts assigned to ASD middle schools postreform performed worse on their high school EOCs than the cohorts prior to reforms. However, students assigned to iZone middle schools postreform had EOC test scores that were either similar to or slightly higher than those of the prereform cohorts. While it is worth noting the relative achievement levels, what is more important to the validity of our DiD model is the pretreatment trends, which are largely parallel (Figure 2 and Appendix C). Nevertheless, Table 3 suggests that EOC scores largely do not improve for students assigned to ASD and iZone middle schools after reforms began relative to the cohorts before reforms were put in place. Likewise, none of the other long-term outcomes of interest meaningfully improved when comparing the prereform and postreform cohorts of students assigned to turnaround middle schools. In fact, we observe a decrease in ACT scores, an increase in the chronic absenteeism rate, and a decrease in the graduation rate when we compare prereform and postreform cohorts assigned to ASD middle schools.
Difference-in-Differences Results
Before turning to the DiD results, Appendix Table A3 in the online version of the journal shows that assignment to an ASD or iZone middle school strongly predicts whether students attend that middle school. The table estimates our full DiD model (Equation 1) on whether the student attends a school that is (a) ever an ASD school, (b) an ASD school in the years after reforms began, (c) ever an iZone school, and (d) an iZone school in the years after reforms began. The results show that assignment to an ever ASD middle school strongly and significantly predicts attendance in an ever ASD middle school, and assignment to an ASD middle school in the years after the school began reforms significantly predicts attendance in an ASD school postreform. The same is true for assignment and attendance in iZone schools. The results suggest that assignment to an ASD or iZone middle school increases the probability that students will attend these schools by about 30 percentage points. These results support using indicators for assignment to ASD or iZone middle schools as a reduced-form ITT estimate, which can be rescaled to calculate TOT effects.
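The rescaling from ITT to TOT mentioned above is commonly done by dividing the reduced-form effect by the first-stage effect of assignment on attendance (a Wald-type ratio). The sketch below assumes that assignment affects outcomes only through attendance; the function name and the ITT value are hypothetical, while the first stage of 0.30 is the approximate figure reported above.

```python
def rescale_itt_to_tot(itt_effect, first_stage):
    """Wald-type rescaling: reduced-form effect divided by the first-stage effect."""
    if first_stage == 0:
        raise ValueError("first stage must be nonzero to rescale")
    return itt_effect / first_stage


# Hypothetical ITT effect of 0.03, with assignment raising attendance by 30 points
tot_effect = rescale_itt_to_tot(itt_effect=0.03, first_stage=0.30)
```

With a first stage near 0.30, a TOT effect is roughly three times the corresponding ITT effect, so an ITT estimate near zero implies a TOT estimate that is also near zero.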
Table 4 shows results from estimating our canonical DiD model. The coefficients of interest are the DiD effects from interacting an indicator for assignment to an ASD or iZone middle school with an indicator for beginning middle school after reforms are put into place. Overall, the table shows primarily null long-term effects in high school for students who attended either ASD or iZone middle schools, and most of the coefficients are small in magnitude and precisely estimated. Similar to the descriptive results, DiD estimates for students assigned to ASD schools suggest negative effects, but the coefficients are not statistically significant for test scores, disciplinary actions, or dropout/graduation. The only outcome for which we observe a significant result is chronic absenteeism, with a 4 percentage point increase in the probability of being chronically absent. We also observe a potential decrease in attendance rate, but the result is modest (1 percentage point) and only marginally significant. For the iZones, we find no significant result on any outcome of interest, and most coefficients are very nearly zero.
Table 4. DiD Effect of Assignment to ASD or iZone Middle Schools on Outcomes in High School
Note. All models include high school and year fixed effects. Standard errors in parentheses clustered at school level. All estimates are from an ITT sample of students assigned to ASD, iZone, or comparison middle schools based on their elementary school. Student characteristics included as controls include the student’s gender, race, eligibility for FRPM, ML status, and SPED status. High school characteristics included as controls include proportion non-white students, proportion FRPM, proportion ML, and proportion SPED.
p < .1, *p < .05, **p < .01, ***p < .001.
Table 5 shows DiD results based on the CS approach to account for staggered adoption of turnaround reforms across five cohorts of schools. In alignment with results from the canonical DiD specification, we find no statistically significant effect on any test score, attendance, discipline, or dropout/graduation outcomes. Although the results are not statistically significant, we find that the DiD estimate for students assigned to ASD middle schools is consistently negative across all EOC subjects. Similarly, for students assigned to iZone middle schools, the coefficient on EOC math scores is negative and marginally significant. Similar results between the traditional specification and the CS approach suggest that effects were not heterogeneous across different cohorts, which aligns with our understanding that across all five ASD and iZone cohorts, students experienced the same set of interventions, respectively.
Table 5. DiD Effect of Assignment to ASD or iZone Middle Schools Using Callaway and Sant’Anna (2021) Estimator
Note. Standard errors in parentheses clustered at school level. Results in this table use the DiD estimator proposed by Callaway and Sant’Anna (2021), which relies on comparisons with students who never attend an ASD or iZone school, and avoids potentially problematic comparisons with already treated students. All models control for the same student and school characteristics as in Equation 1. Student characteristics included as controls include the student’s gender, race, eligibility for FRPM, ML status, and SPED status. High school characteristics included as controls include proportion non-white students, proportion FRPM, proportion ML, and proportion SPED.
p < .1, *p < .05, **p < .01, ***p < .001.
Next, we use the CS approach to capture effects for students who began middle school in each year before and after turnaround reforms began, akin to an event-history model. Figure 3 plots each of these coefficients for EOC reading, math, and science scores. Similar plots for all other outcomes are in Appendix D in the online version of the journal. These event-history estimates allow us to formally test the parallel trends assumption by examining coefficients for students assigned to ASD or iZone schools prior to reforms. Figure 3 shows that coefficients in each of the prereform years are near zero and nonsignificant. We also conduct a joint significance test for all coefficients in the prereform years and find no significant differences. Furthermore, coefficients are not significant in the years after reforms began, and we do not observe strong trends across the years, except for potential trends downward in science EOC scores for students assigned to ASD middle schools. Year-by-year coefficients for the other outcomes are also primarily not significant (Appendix D).
Finally, the null results may be somewhat surprising for students assigned to iZone middle schools because previous research finds significant effects of iZone interventions on student achievement, at least while they are in the iZone school (Pham et al., 2020; Zimmer et al., 2017). Because results from the prior papers were averaged across all school levels, it is possible that effects for students in iZone middle schools were always statistically insignificant, and the positive effects reported in prior studies were driven by iZone elementary and high schools. To test this theory and better compare results in this article with ASD/iZone effects on students while they were still in the turnaround school, we use methods similar to Pham et al. (2020) to estimate effects on the specific subsample of only middle schools. Results from these subsamples (Appendix Table A5 in the online version of the journal) align with the previously reported findings: null ASD effects and positive iZone effects. These results suggest that iZone middle schools had some positive short-term effects on student achievement while they were in middle school, but the positive effects do not carry over after students leave middle school.
Robustness Checks
A threat to the validity of our results could occur if ASD and iZone schools actively recruited or pushed out students who are systematically different from the types of students who would have attended these schools in the absence of turnaround. While this practice is unlikely because schools in both models are required to continue serving students in their local enrollment zones (TDOE, 2012), we empirically examine this possibility using characteristics of all students who enter and leave these middle schools in the post-turnaround years. Appendix Table A7 in the online version of the journal shows that the average rates of student in-migration and out-migration are similar in ASD, iZone, and comparison schools between 2012–2013 and 2018–2019. Additionally, Appendix Table A8 in the online version of the journal shows that the observed characteristics of incoming and outgoing students are nearly the same in all three groups of schools. These results suggest that our findings are not driven by differential student mobility into and out of turnaround middle schools. Finally, we test the robustness of our results by estimating models that remove all covariates. Appendix Table A6 in the online version of the journal shows that our results are robust to this alternative specification.
Conclusion
While empirical literature on school reform has been growing in recent years (Redding & Nguyen, 2020; Schueler et al., 2021), long-term effects of reform are critically understudied. Indeed, our review of the literature found conceptually distinct categories of long-term effects that were previously commingled, which has muddled current academic and policy discourse between studies focusing on different types of long-term effects. In this study, we add clarity to the study of long-term effects by developing a broadly applicable conceptual framework for delineating four categories of long-term effects: program maturation, institutionalized, dosage, and persistent/latent effects. Guided by this framework, we identified growing attention in the school reform literature to dosage (Abdulkadiroğlu et al., 2020), program maturation (Strunk et al., 2016), and institutionalized effects (Sun et al., 2019). However, we could find no study examining persistent/latent effects outside of the very distinct RSD model (Glenn & Harris, 2020; Harris & Larsen, 2023). Thus, in this study, we estimate persistent/latent effects within the context of Tennessee’s more generalizable statewide school reforms.
Overall, we find very little evidence of positive persistent/latent effects after students leave an ASD or iZone middle school. Results on high school test scores in reading, math, and science are negative and inconsistently significant for students assigned to ASD middle schools, which begins to suggest negative effects but will require additional studies to sufficiently test. We also find no significant changes in ACT scores, nor did students experience any effect on the probability of graduating from high school. Results using the canonical DiD model suggest that students assigned to ASD middle schools have a significantly higher probability of being chronically absent in high school, but this result is not robust to using the CS approach to address staggered treatment timing. Long-term effects for students assigned to iZone middle schools are no more encouraging. None of our results are significant at the 5% level, and most are extremely small in magnitude (i.e., precisely estimated zeroes). The only coefficient that is marginally significant at the 10% level suggests that students assigned to iZone middle schools posted lower math EOC scores than students assigned to comparison middle schools. These null results across multiple student outcomes suggest that the turnaround reforms implemented in Tennessee, which are similar to interventions used in many other states, are showing few signs of improving student outcomes after they leave the turnaround school.
Though only suggestive, evidence of negative effects on EOC scores from students attending ASD middle schools deserves further attention. Previous research found that the ASD reforms produced null short-term (Zimmer et al., 2017) and maturation effects (Pham et al., 2020), and our findings point toward a potentially detrimental latent effect from these reforms. Since the reforms likely disrupted the schools’ staffing, culture, and daily operations when the principal and most teachers were replaced, part of the null or even negative effects in high school may be explained by these instabilities in students’ instructional experiences. Previous research has shown that whole-school reforms requiring teacher and principal replacements can lead to high levels of instability in the school’s daily routines and procedures (Malen et al., 2002; Rice & Malen, 2010). The theory of action for bold turnaround reforms implies that these disruptions could be worth the effort if they can overcome preexisting barriers to improvement and are replaced with effective interventions and supports to quickly improve the school (Herman et al., 2008; Kutash et al., 2010); however, for the ASD, these disruptions were not met with effective interventions to support the improvement process.
The iZone model did improve student achievement while students attended the turnaround school (Pham et al., 2020; Zimmer et al., 2017). Yet the positive iZone effects on achievement do not persist when these students reach high school. Most estimates are nearly zero and statistically insignificant, except a marginally significant negative effect on math test scores. These results are important to consider as the iZone model grows in prevalence in other districts throughout the country (Iyengar et al., 2017) because they suggest that Tennessee’s iZone reforms achieved short-term gains, but the gains faded after students left the turnaround school. A likely explanation is that students in the lowest-performing schools continue to need support, and attending a few years at an improved iZone middle school is insufficient to change their educational trajectories. Our findings suggest that reform policies may need to be designed in a way that is connected across school levels to support students throughout their K–12 educational experience. Importantly, our results suggest that future reform models should put more thought into interventions that support persistent improvements. The emphasis in iZone schools was likely directed toward instructional improvements, but more comprehensive supports at the system level, including supporting students’ social-emotional learning (Carr, 2021), investing in wraparound services (Hill, 2020), and soliciting support from the local community around the school (Anderson-Butcher et al., 2010; Oakes et al., 2017) may be needed for improved outcomes to persist.
This article contributes novel insights to the literature on school reform by highlighting persistent/latent effects on students’ long-run outcomes as an important but understudied aspect of turnaround. We find scant evidence of long-term positive effects from either of Tennessee’s two turnaround models, but a critical avenue for future research in this area is an in-depth explanation of why the iZone model showed positive short-term effects that did not persist. Our results also suggest that systemic reforms (beyond individual school buildings) may be needed for positive effects on students to persist. In general, we emphasize the need for future evaluations of school reform to regularly examine, and clearly differentiate, multiple types of long-term effects. Ongoing research in this area will provide important evidence to help policymakers and educators to fully understand returns to investments in school reform.
Supplemental Material
sj-pdf-1-aer-10.3102_00028312241284026 – Supplemental material for Do the Effects Persist? An Examination of Long-Term Effects After Students Leave Turnaround Schools
Supplemental material, sj-pdf-1-aer-10.3102_00028312241284026 for Do the Effects Persist? An Examination of Long-Term Effects After Students Leave Turnaround Schools by Lam D. Pham, Sean P. Corcoran, Gary T. Henry and Ron Zimmer in American Educational Research Journal
Footnotes
Correction (November 2024):
