Abstract
The long-term impacts of early childhood education (ECE) programs at scale are smaller than those produced by influential demonstration programs half a century ago. Several factors likely contribute, some of which are in the control of policymakers and the designers of ECE programs (e.g., targeting and quality), and some of which are not (quality of conditions for children not served by ECE programs and children's experiences outside of ECE). Current efforts to measure and improve quality at scale are admirable, but this review argues for a complementary approach: the government should pilot a Cadillac public ECE program using the infrastructure of the federal Head Start program, called Head Start Plus. This program would be targeted toward the most disadvantaged children in the U.S. (those with the lowest family income and maternal education, and other dimensions of adversity) and would include several high-quality features; most notably, low child-to-teacher ratios (∼6:1 for preschoolers) and small group sizes. A pilot evaluation of the program would offer insights into the feasibility of achieving impacts and cost-benefit ratios in the range of those from demonstration programs and would provide benchmarks for effect sizes on measures of classroom processes and child outcomes that can guide the design, improvement, and evaluation of existing and future programs.
Social Media
The government should pilot a Cadillac public ECE program, called Head Start Plus. This program would be high quality and targeted toward the most disadvantaged children in the U.S. The aim is to generate large returns on investment similar to demonstration programs.
Key Points
Demonstration programs, such as Perry Preschool and Abecedarian, generated large returns on investment; however, evaluations of more modern programs have generated smaller short-, medium-, and longer-run impacts.
Program quality and targeting are two actionable mechanisms that help explain the inconsistencies between the impacts of demonstration programs and those from modern evaluations of ECE.
A Cadillac public ECE program, Head Start Plus, may offer a promising opportunity to generate larger returns on investment for the most disadvantaged U.S. children.
Targeting based on additional factors, such as maternal education, could increase the potential for larger impacts on important outcomes.
Head Start Plus would be run by the Office of Head Start; approximately 1/4 of children currently eligible for Head Start would be eligible for Head Start Plus, and child-to-teacher ratios would be approximately half of those for Head Start.
Federal and state subsidized early childhood education (ECE) programs likely provide benefits for children, families, and society at large through multiple potential mechanisms, including direct financial benefits to families (Burchinal et al., 2022; Forry, 2009; Yavorsky & Ruppanner, 2022) and opportunities for maternal employment and education (Landivar et al., 2022; Morrissey, 2017). However, much of the focus of ECE's value has been on program benefits conferred to children, both in the near term (Phillips et al., 2017) and on longer-run educational, economic, health, and other life outcomes (U.S. Congress Joint Economic Committee, 2024). Demonstration programs operating in the 1960s and 1970s (the Perry Preschool and Abecedarian programs) were intended to improve long-run outcomes and provide the most prominent evidence of this possibility. The results from these two randomized trials generated optimism that ECE can provide large returns on investment (ROI) for society over the course of children's lives through their improved outcomes (Elango et al., 2016).
Although demonstration programs yielded large ROIs, causally informative evaluations of modern programs have yielded smaller impacts on child outcomes, on average, in the short- and medium-term (Whitaker et al., 2026). Although some of the hypothesized reasons for these differences are outside the control of the ECE program designers (e.g., improved societal and educational experiences of children today compared to a half century ago), several key factors are within their control (for reviews, see Jenkins & Whitaker, 2026; Whitaker et al., 2026). Although the key program features contributing most to this discrepancy have not been empirically established, experimentation with ambitious programs that make major investments in best-bet quality features would be valuable for early childhood education and research.
Our review describes two actionable factors within the control of policymakers and those overseeing subsidized ECE programs that would impact program ROI: (1) the populations targeted by such programs (i.e., targeting), and (2) program quality (including per-pupil spending). Using extant Early Head Start and Head Start infrastructure, which likely represents the closest scaled-up approximation of demonstration programs, we propose piloting a highly targeted, high-quality “Cadillac” ECE program, referred to here as Head Start Plus. From a policy perspective, the goal would be to improve the future outcomes of children in the U.S. most likely to struggle academically and economically as adults by maximizing program impacts prior to school entry. In addition to potentially providing larger individual and societal returns by benefiting the most disadvantaged children, the program would also provide opportunities to study the key features of quality, which have eluded the field for some time, and to set benchmarks for effect sizes that the universe of less targeted, less expensive public programs can expect; this knowledge could then be used to improve existing programs.
Levers for Generating Larger ECE Program Impacts
The Perry Preschool and Abecedarian demonstration programs were targeted and high-quality. But several features make approximating their ROIs implausible in contemporary life. First, families were targeted based on maternal IQ and perceived risk of developmental delay (Ramey, 1974). Second, the programs were delivered at a time when the conditions faced by poor American children were different in many respects, including much lower federal spending on healthcare and nutrition programs, racially segregated public institutions, less access to childcare, and less time spent with parents (Whitaker et al., 2026). Targeting, quality, and contextual factors likely all contributed to these programs’ impressive impacts on the children they served, and their large ROIs (e.g., Campbell et al., 2012; Heckman et al., 2010).
Based on empirical research on demonstration programs and modern ECE efforts, two key factors likely contribute to, and moderate, the magnitude of ECE program impacts: (1) the degree to which a program targets children most likely otherwise to experience negative life outcomes, and (2) the quality of the program. Figure 1 graphs the theorized moderation of program impacts based on population served and ECE quality. The light grey, dashed line represents the extent to which children from the entire population reach their developmental potential for positive life outcomes in the absence of ECE. Children low on this dimension will struggle to fulfill their potential due to resource scarcity (e.g., lack of cognitive stimulation, nutritionally supportive meals, and safe physical spaces for play and exploration). Children high on this dimension will fulfill their potential even in the absence of ECE.

Figure 1. Theoretical Impacts of No ECE, Average Quality ECE, and Higher Quality ECE Based on the Risk for Negative Life Outcomes of the Population.
Imagine scenarios in which otherwise identical children attend one of two alternative ECE programs, either one of average quality (dark grey line) or one of higher quality (black line). For the children most at risk of not achieving their potential, either program would be beneficial (represented by the black and dark grey arrows), but greater potential will be achieved with the higher quality program. On the right side of the distribution, average quality ECE will not likely offer benefits and could even do harm. Although high quality ECE will likely offer benefits to most children, some home environments may offer better cognitive stimulation than even high quality ECE by virtue of child-to-adult ratios (i.e., 1:1).
Holding quality constant, this figure illustrates that as programs become less targeted, they inevitably capture more of the population with less potential for large gains from their ECE experiences (i.e., shrinking of the arrow sizes). Head Start programs today serve the left side of the distribution represented by the dashed vertical line. Head Start Plus, like demonstration programs, would provide a mechanism to target children on the far left-hand side of this distribution (those least likely to reach their potential in the program's absence), represented by the dotted vertical line. Smaller program impacts of today's scaled-up ECE programs are likely due to a combination of less aggressive targeting than demonstration programs (i.e., moving from left to right on the distribution) and a weaker contrast in quality between scaled-up ECE programs and counterfactual conditions (e.g., more available ECE alternatives in the absence of a public option; more accessible health care for children).
Program Targeting
As states have expanded their own pre-k programs over time, scholars have long debated the virtues of offering targeted versus “universal” eligibility (Blau, 2022; Yavorsky & Ruppanner, 2022; Zigler et al., 2011). However, these distinctions are sometimes less salient in practice: children, regardless of eligibility, are often served in the same ECE programs and the same classrooms, with providers blending various funding sources (Botvin et al., 2026; Duer & Jenkins, 2023). This outcome partially reflects political tradeoffs, wherein policymakers expand services to garner the widest support for ECE programs and broaden the coalition of beneficiaries, a decidedly different aim than maximizing child-level impacts. There are good reasons to invest in universal ECE programs, including reducing financial strain on families, boosting maternal employment and career opportunities, and treating early childhood education and care as fundamental societal “infrastructure,” akin to highways and K-12 schooling (Burchinal et al., 2022; U.S. Department of the Treasury, 2021; Yavorsky & Ruppanner, 2022). We do not argue against universal ECE programs here. However, these goals are markedly different from maximizing child-level impacts at the end of preschool so that they resemble those of demonstration programs. Targeting is a central component of such calculations.
Consider the following argument from President Biden's 2023 State of the Union Address: “Studies show that children who go to preschool are nearly 50% more likely to finish high school and go on to earn a two- or four-year degree, no matter their background they came from” (Biden, 2023). Biden was a strong advocate of public investments in ECE, and quoted cost-benefit and impact estimates from demonstration programs several times during his presidency. However, his claim, which may have been an attempt to summarize the results of the quasi-experimental evaluation of the Chicago Child-Parent Center for children who attended preschool in the early 1980s (Jacobson, 2023), is far out of step with the long-term evaluations of more recent ECE programs at scale. For example, a large-scale randomized analysis of the effects of Boston's universal pre-k program on long-term outcomes reported a null effect of less than 1 percentage point on graduating from a college or university, relative to a graduation rate of 21% in the control group (Gray-Lobe et al., 2023). Perhaps the benefits of universal programs, including on parental employment, are sufficient to pay for themselves, or justify their societal expense (e.g., Gibbs et al., 2026). However, optimizing programs for supporting parental employment would likely require a different policy approach than optimizing programs for supporting children's long-run outcomes.
Current scaled-up programs likely fail to match the impacts of demonstration programs in part because the marginal child to whom services have been expanded is at much lower absolute risk for many negative outcomes with high social costs (i.e., moving from left to right on Figure 1). Consider the problem of high school dropout, which is estimated to inflict a social cost of billions of dollars over the course of a single cohort of dropouts’ careers (Belfield & Levin, 2007): 45% of the control group from Perry Preschool graduated high school, compared to over 85% of U.S. children in recent years (NCES, 2024). Even assuming a very large increase in the odds of graduating high school from a contemporary program, total dropout reduction is capped by today's high rates of graduation. In fact, it would be impossible for a truly universal program (i.e., serving all children) to increase high school graduation by 20 percentage points (as observed in Perry) due to the current high rates of graduation for the overall population (i.e., 85%). This is great for society; however, it limits the potential impacts of modern ECE programs on such outcomes, especially when the parts of the population least likely to drop out gain access to public preschool.
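The ceiling argument above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming only the figures quoted in the text (a 45% graduation rate in the Perry control group, roughly 85% for U.S. children overall, and a 20-percentage-point gain as observed in Perry); the function and variable names are hypothetical and introduced for illustration only.

```python
def max_possible_gain(baseline_rate: float) -> float:
    """Largest possible increase, in percentage points, in a graduation rate,
    given that the rate cannot exceed 100%."""
    if not 0.0 <= baseline_rate <= 100.0:
        raise ValueError("baseline_rate must be a percentage between 0 and 100")
    return 100.0 - baseline_rate


# Figures quoted in the text (percent or percentage points).
PERRY_CONTROL_RATE = 45.0   # Perry Preschool control group graduation rate
US_OVERALL_RATE = 85.0      # recent U.S. graduation rate ("over 85%")
PERRY_OBSERVED_GAIN = 20.0  # gain observed in Perry

perry_headroom = max_possible_gain(PERRY_CONTROL_RATE)
universal_headroom = max_possible_gain(US_OVERALL_RATE)

# A truly universal program could match the Perry gain only if the gain
# fits within the headroom left by today's overall graduation rate.
perry_gain_feasible_universal = PERRY_OBSERVED_GAIN <= universal_headroom

print(f"Headroom in the Perry control group: {perry_headroom:.0f} points")
print(f"Headroom for a universal program today: {universal_headroom:.0f} points")
print(f"Perry-sized gain feasible universally: {perry_gain_feasible_universal}")
```

Under these assumptions, the headroom shrinks from 55 points in the Perry control group to at most 15 points for the overall population today, which is why a 20-point gain is out of reach for a program serving all children.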
In turn, one way to increase the magnitude of ECE program effects is to design a very targeted program that serves children facing various forms of adversity and most at risk of not meeting their developmental potential. Consistent with this possibility, analyses of the Head Start Impact Study indicate that children with the lowest early math, reading, and vocabulary skills benefitted most from Head Start in the short-term, with some suggestive evidence of this in the medium-term (Bitler et al., 2014). Because Head Start programs already use a variety of family and child factors, in addition to income, in determining eligibility for services, existing providers have the administrative procedures and expertise on the most critical local hardships facing families with young children. As such, Head Start programs are well-positioned to identify appropriate targeting components for Head Start Plus eligibility.
Maternal Education as an Illustrative Example of Additional Targeting
Social programs use several child, family, and neighborhood conditions to determine eligibility for services. The Head Start program considers these factors holistically, determining eligibility using not only income, but also enrollment in WIC, SNAP, and Medicaid, as well as foster care and housing status of children. One straightforward dimension for additional targeting to increase potential impacts is maternal education. Head Start centers serving children whose mothers did not graduate high school show larger impacts on child outcomes (Walters, 2015). Further, maternal education is a strong predictor of child educational attainment, above and beyond other sociodemographic controls and maternal vocabulary (Duncan et al., 2024), raising the potential for long-term impacts on high school graduation. Two clear patterns emerge when looking at income, maternal education, and degree attainment using data from the Future of Families and Child Wellbeing Study (FFCWS), which intentionally oversampled children of unmarried mothers, resulting in a socioeconomically diverse sample of families. First, the probability of high school or college degree attainment is lower at lower income levels, and second, even below the common income threshold for Head Start eligibility (130% of the Federal Poverty Level), there are drastically different probabilities of high school or college degree attainment based on maternal education (see Supplemental Figure 1). Specifically, for children under the Head Start income eligibility threshold, predicted high school graduation rates are above 80% for children whose mothers completed high school, compared to close to 60% for children whose mothers did not graduate high school. These patterns indicate that targeting on both income and maternal education would increase the potential for larger impacts relative to targeting based on income alone.
In other words, targeting eligibility based on maternal education, along with income, would allow an ECE program to reach children with much lower rates of predicted high school or college graduation.
Quality of ECE Programs
By definition, higher quality ECE programs delivered to the same population in the same settings will generate larger child-level impacts than lower quality ones. What does this mean in practice? On the low end of program quality, imagine an ECE classroom with twelve three-year-olds for every one teacher in the classroom, without meals provided, with limited physical space that is safe and supportive of movement, and without play stations to stimulate cognition and promote exploration. There also may be no curriculum used, or one that has no evidence base. In this classroom, children are unlikely to reach their developmental potential. On the high end of the distribution, imagine four three-year-olds for every teacher in the classroom, a nutritionally-supportive meal plan including breakfast, lunch, and snacks, physical space that is safe and supportive of movement both inside and outside of the classroom, and cognitively stimulating activities, such as block play and a small library. This classroom may also have a developmentally informed curriculum and additional resources for supporting families and the home learning environment. This would be a “Cadillac” ECE program because children are more likely to reach their developmental potential. In other words, all ECE quality exists along a spectrum, one that includes both structural and process-related aspects that research has identified as supportive for children's developmental potential (Farran, 2017).
Clearly, ECE quality is important; however, the task of improving quality in a cost-effective way has proven challenging. One reason for this is that commonly used measures of ECE quality are very weakly associated with children's developmental outcomes (Brunsek et al., 2017, reported small meta-analytic associations between a commonly used quality measure and child outcomes of approximately r = 0.05). In contemporary studies, there is no single measure of preschool program quality that reliably forecasts improvements in child outcomes. Importantly, the labor-intensive nature of ECE (especially for infants and toddlers) implies that reducing child-to-teacher ratios is an effective lever for improving ECE impacts on children (Bowne et al., 2017). Indeed, this is the key mechanism driving the short-, medium-, and long-term impacts of the Tennessee STAR study, which experimentally reduced kindergarten class sizes (Chetty et al., 2011). Yet it is also an expensive societal investment (it would require hiring more teachers) and there exists little modern evidence of impacts.
In turn, policymakers must decide whether to use limited available funds to increase access (enrollment) or to improve the quality of experiences of the children enrolled; this is known as the quality-quantity tradeoff (Moran et al., 2026). For public preschool, expanding access has won the day; enrollment has increased steadily over time, with 37% of 4-year-olds participating in public preschool today (Cascio, 2021; Friedman-Krauss et al., 2026). Yet per-pupil spending averages $7k–$8k for state-funded prekindergarten programs and around $14k–$15k for Head Start, compared with ∼$26k per child per year in Abecedarian (in 2024 dollars; US DHHS, 2026; Garcia et al., 2020; Friedman-Krauss et al., 2026). These substantial differences in per-pupil expenditure play a clear role in expectations of child development impacts.
The intensity of demonstration programs is not scalable to the whole population without a several-fold increase in public ECE expenditures; contemporary scaled-up public preschool programs offer large group sizes, and fewer hours, meals, and comprehensive services. Although incremental quality improvements in large-scale preschool programs are worthwhile, a complementary approach is to design a program that invests substantially in features of quality present in demonstration programs widely agreed to contribute to ECE effectiveness.
Child-to-Teacher Ratios as Quality Lever
Child-to-teacher ratios are a clear quality mechanism under the control of policymakers. The largest associations between ratios and children's outcomes occur below 7.5:1 (child-to-teacher), with larger classes showing more attenuated associations (Bowne et al., 2017). Children in ECE programs rely on adults to provide stimulating and responsive interactions that support their development (Duncan et al., 2024). Indeed, child-to-teacher ratios are a strong predictor of conversational interactions for preschool children. In an analysis of over 500,000 h of ECE audio data, a child in full-time preschool for one year was predicted to have nearly 100,000 more conversational interactions with adults in a classroom with a ratio of 5:1 compared to a classroom with a ratio of 20:1 (Duncan et al., 2026).
There is some risk in going all-in on child-to-teacher ratios: they are expensive, and as noted above, there is little consensus in the field about which program features are the active ingredient(s) generating variation in program effectiveness. Indeed, within Head Start centers, an analysis found no association between student-to-staff ratios and student outcomes (Walters, 2015). However, the meta-analytic work finding that very low ratios may have the largest impacts, along with a strong theory of change, motivates this choice. As noted above, we encourage experimentation with ambitious programs that make major investments in other best-bet quality features (e.g., curriculum, staff qualifications, and professional development) as well. But starting with a pilot study on relatively easy-to-implement levers mitigates the cost of a potential program failure.
Comparisons between Demonstration Programs and Modern Models of ECE
It is useful to directly compare program features of Perry and Abecedarian to modern ECE programs, particularly state- or city-provided preschools and modern versions of Head Start. There are two main points when comparing the demonstration program participants and modern Head Start participants. First, poor children in the 1960s and 1970s faced more challenging contexts than poor children today, including less social welfare, more racial discrimination, larger family sizes, and fewer alternative childcare opportunities (Whitaker et al., 2026). Second, the families served by the demonstration programs were among the most vulnerable to negative life outcomes, characterized by very high poverty and low maternal education levels, among other correlated sociodemographic factors (Campbell et al., 2012; Heckman et al., 2010). One prominent feature of the Perry and Abecedarian samples was that both captured mothers who averaged approximately 9 to 10 years of education (i.e., less than a high-school degree; Campbell et al., 2012; Heckman et al., 2010).
Differences in per-pupil expenses of demonstration programs versus today's scaled-up programs are large, up to 400%, or ∼$17,000 per child per year. What structural features account for these large differences? Abecedarian children enjoyed full-day care and education, available from 7:30–5:30, for 50 weeks a year, with child-to-teacher ratios of 6:1 for preschoolers (3:1 for infants and toddlers); all teachers had bachelor's degrees and were paid salaries competitive with those of local public schools (Masse & Barnett, 2002); they also received ongoing professional development and coaching (Garber et al., 2026). Both the treatment and control groups received family support, social work services, nutritional supplements (meals at centers for treatment group, formula and other nutritional supplements for control), free medical care at university clinics, transportation, disposable diapers, and pay for participating in assessments (Ramey, 1974). Although not relevant to the treatment contrast, the set of services is noteworthy for its comprehensiveness, and in stark contrast to most of today's programs (critically, exclusive of Head Start). Perry Preschool classes met 2.5 h a day, 5 days a week, for a 30-week school year; all teachers had bachelor's degrees, child-to-teacher ratios were 6:1, and the program included weekly home visits (Barnett, 2011). Both programs used novel developmental curricula that evoked teacher-child interactions, child-initiated play and discovery that centered around educational games and individualized development (Campbell et al., 2002; Garcia et al., 2023).
In contrast, many of today's universal programs are designed to serve a larger number of children with less intensive services. State-funded prekindergarten programs offer classroom-based education for 3–6.5 h a day, with child-to-teacher ratios ranging from 10:1 to 12:1 (Friedman-Krauss et al., 2026; although New Jersey's court-mandated Abbott Preschool Program caps class sizes at 15 with a teacher and assistant; Barnett & Jung, 2021). Rarely do they include comprehensive services, transportation or meals, and most have vague curricular requirements. Head Start is the closest to the demonstration model, providing comprehensive family supports and social workers who connect families with community resources, nutritious meals, medical and dental checks, and school-day (8:00–2:00) or full-day (7:30–5:30) coverage, which closely resemble the features of Abecedarian and Perry (hours and scheduling vary by grantee; Office of Head Start, 2024). A notable exception to Head Start's similarity to demonstration programs is in its child-to-teacher ratios, which range from 8:1 to 10:1, closer to those of state preschool programs.
Envisioning Head Start Plus
Head Start differs from universal programs in that it is already targeted based on children's risk of negative life outcomes, largely as indicated by families' financial means. Additionally, Head Start already includes many features of the Perry Preschool and Abecedarian programs that likely contributed to their effectiveness. Head Start Plus would build upon the infrastructure of Head Start in two key ways. First, it would be even more targeted than current Head Start efforts (see the dotted versus dashed line in Figure 1). Based on the nationally representative sample of participants from the Head Start Impact Study, approximately 23% of families in Head Start would fall into a lower income (less than $1500 per month) and lower maternal education strata (less than high-school degree; Puma et al., 2010). Thus, Head Start Plus would target these families and children.
Based on our comparison of manipulable program components between Head Start and classic demonstration programs, the essential quality lever proposed for Head Start Plus is reducing child-to-teacher ratios to 6:1 (for preschool) and 3:1 (for infants/toddlers) to more closely approximate the ratios of Abecedarian and Perry Preschool. Under the assumption that reducing ratios by these amounts would approximately double the per pupil expense (i.e., from approximately $14k to $26k) for the 23% of eligible Head Start children, Head Start Plus would be expected to increase the total Head Start budget from approximately $12 billion to $15 billion per year. Ultimately, Head Start Plus could be even more carefully targeted if considering other child and family characteristics or be even more ambitious by further lowering ratios.
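The budget estimate above can be approximated with a back-of-the-envelope calculation. The following is a minimal sketch, not the authors' costing method: it uses the figures quoted in the text (a roughly $12 billion budget, about $14k per child today, about $26k per child under Head Start Plus, and 23% of children eligible) and additionally assumes, for illustration only, that the number of children served equals the budget divided by the per-child cost. The function and variable names are hypothetical.

```python
def projected_budget(current_budget: float,
                     cost_per_child: float,
                     plus_cost_per_child: float,
                     plus_share: float) -> float:
    """Approximate total budget if a share of currently served children
    moves to the higher per-child cost.

    Assumption (not from the source): children served is approximated as
    current_budget / cost_per_child, ignoring administrative costs and
    differences between Early Head Start and Head Start.
    """
    children_served = current_budget / cost_per_child
    plus_children = plus_share * children_served
    added_cost = plus_children * (plus_cost_per_child - cost_per_child)
    return current_budget + added_cost


# Figures quoted in the text (U.S. dollars per year).
CURRENT_BUDGET = 12e9
COST_PER_CHILD = 14_000
PLUS_COST_PER_CHILD = 26_000
PLUS_SHARE = 0.23

new_budget = projected_budget(CURRENT_BUDGET, COST_PER_CHILD,
                              PLUS_COST_PER_CHILD, PLUS_SHARE)
print(f"Projected budget: ${new_budget / 1e9:.1f} billion per year")
```

Under these simplifying assumptions the projection comes to about $14.4 billion per year, broadly in line with the approximately $15 billion stated above; the exact figure depends on how many children are actually served and on costs not modeled here.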
In essence, Head Start Plus is intended to provide greater resources, in the form of quality, for those children with the greatest needs, without building a separate public program. While a combination of family and child factors are proposed for targeting due to their close associations with children's educational attainment (e.g., maternal education), other ecological and geographical factors are also possible, such as serving neighborhoods with particularly high crime rates, or communities facing environmental disasters.
Besides reducing child-to-adult ratios, Head Start Plus may also include additional trained clinicians to identify children with early developmental challenges and provide supportive services as soon as children are eligible. This would all occur within the Head Start program's existing infrastructure, which includes partnerships with local health agencies to ensure healthy development of children and home environments, and other nonprofit and government agencies supporting families facing homelessness, domestic violence, food insecurity, and the myriad systemic challenges co-occurring with poverty. When extremely high quality ECE is provided to a highly targeted population who are most vulnerable to negative life outcomes, such efforts will more closely resemble early demonstration programs.
Conclusion
There are likely two actionable factors that policymakers could target to boost ROI of modern ECE programs if attempting to replicate findings from demonstration programs: (1) the population served and (2) the quality of the programs. Over the past few decades, increased funding for ECE appears to have been focused primarily on expanding access, particularly in state- or city-level universal programs. To do so, many states have increased their child-to-adult ratios and limited preschool to classroom-based education, with few other services.
Head Start Plus offers an opportunity to leverage the existing Head Start system to try to recreate the large ROIs that generated initial optimism around the potential of ECE. Children of families facing the most substantial adversity, with the lowest incomes and the lowest levels of maternal education, would be strong candidates for the more intensive targeting of Head Start Plus. Reducing the child-to-adult ratio to 6:1 for preschoolers and 3:1 for infants and toddlers (what appears to be the key programmatic differentiator between today's Head Start and yesterday's Perry Preschool and Abecedarian models) will substantially increase the quality of experiences offered to the most vulnerable children to deliver the “Cadillac” model of ECE. This model fosters interactive experiences focused on cognitive stimulation through developmentally appropriate games and curriculum, supports children's physical health through nutrition and physical activity, and links children and families with supportive resources as early as possible.
A pilot evaluation of Head Start Plus, varying both quality dimensions and an offer for a slot in the program, would offer several benefits. First, we would learn about a program designed to improve the lives of the most disadvantaged U.S. children. Second, it would allow for a formal test of whether more aggressive targeting can improve impacts on child outcomes. Finally, the evaluation would provide benchmarks for effect sizes that evaluations of less intensive programs can use in designing additional program features or evaluations thereof. Programs could also attempt to learn from variation in quality features and processes in Head Start Plus classrooms designed to have significantly lower child-to-teacher ratios. In turn, our proposal for Head Start Plus would be universally positive for ECE writ large and for population-level child well-being.
Supplemental Material
Supplemental material, sj-docx-1-bbs-10.1177_23727322261462426 for Head Start Plus: The Case for Piloting a Cadillac Public ECE Program by Robert J. Duncan, Jade M. Jenkins and Drew H. Bailey in Policy Insights from the Behavioral and Brain Sciences
Footnotes
Acknowledgements
We thank W. Steven Barnett, Margaret Burchinal, Kenneth Dodge, Dale Farran, Emma Hart, and Tyler Watts for comments on a previous draft.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
References
