Abstract
Black teachers are critical resources for children and schools. In experimental data, I document large effects of Black upper-elementary teachers on the self-efficacy (0.9 SD) and classroom engagement (0.7 SD) of their Black but not non-Black students, potentially driven by role modeling. Black teachers also improve the test scores (0.2 SD) and reduce the absences (over 20% decrease) of all students, no matter their race/ethnicity, and these effects often persist years later in high school. Furthermore, Black teachers bring unique mindsets and practices to their work (differentiated instruction, growth mindset beliefs, well-organized classrooms) that mediate a moderate to large share of effects on student outcomes. These findings help bridge the quantitative race/ethnicity-matching literature with theoretical discussion and qualitative exploration on why Black teachers matter.
Introduction
For far too long, education systems have failed students of color. Systemic racism—exhibited through school-based segregation (Johnson, 2011), exclusionary discipline (Fenning & Rose, 2007), and limited access to instructional resources (Jackson, 2009), among other sources—has created stark disparities in educational opportunity between Black and other historically marginalized and minoritized students of color versus their White peers. Constrained opportunity impacts and ripples across a range of educational and life outcomes, including academic performance (Fryer & Levitt, 2004), high school graduation (Hernandez, 2011), and success in the labor market (Rivkin, 1995).
Compelling lines of theoretical and empirical research show that one of the most effective levers to better support Black and other students of color is to provide opportunities to learn from a teacher from the same racial or ethnic group. Although the teacher workforce is overwhelmingly White (roughly 80%; U.S. Department of Education, 2019), Black and other teachers of color are described as uniquely positioned to understand and address the social, political, and economic inequalities that students of color face (Ladson-Billings, 1994). Building from this theory, causally oriented studies document substantively meaningful teacher-student race/ethnicity-matching effects on students’ academic outcomes (for reviews, see Bristol & Martin-Fernandez, 2019; Redding, 2019).
In the current analyses, I orient the quantitative teacher-student race/ethnicity-matching literature toward “why” and “how” questions, which is important for at least two reasons. First, the theoretical literature on this topic poses several, likely overlapping hypotheses related to social dynamics inside schools and classrooms that are largely untested quantitatively. Is it that Black teachers serve as role models for their Black students, which in turn drives improved outcomes (Villegas & Lucas, 2004)? Do Black teachers also engage in “culturally relevant” (Ladson-Billings, 1995b), “culturally responsive” (Gay, 2000), and “culturally sustaining” (Paris, 2012) pedagogies that are particularly beneficial for their students of color—or that may benefit all students?
Second, understanding which of these mechanisms—or others—drives effects of Black teachers on student outcomes is critical for policy and practice. As Gershenson et al. (2022) pointed out, if the effects of Black teachers on student outcomes are explained by a set of mindsets, practices, and skills they bring to their work, then it may be possible to train the mostly White teacher workforce in these areas. Alternatively, if effects are driven by role modeling, then the only real option is to engage in different approaches to recruitment and retention of Black and other individuals of color to substantially alter the demographics of the population of public school teachers.
Motivating Literature
Although the research literature linking teacher-level characteristics to student outcomes has crystallized around the benefit of same-race/ethnicity matching, less is known from this same research tradition about the mechanisms driving these effects. Theory, largely grounded in sociological and human development perspectives, suggests three possible pathways. First, Black and other historically marginalized and minoritized students of color benefit from having teachers who look like them as role models, particularly given the way in which their career and training exemplify academic success (Villegas & Lucas, 2004). Seeing a more equitable distribution of power in schools—relative to society more broadly—can help draw students of color into the classroom environment and build a sense of self-efficacy and belonging (Bristol & Martin-Fernandez, 2019).
Second—and not mutually exclusive from the first pathway—Black and other teachers of color may be better equipped than White teachers at teaching students of color. White teachers are not inherently unable to teach students of color but may be more likely to adopt and maintain deficit views and colorblind ideologies that presume that individual factors—rather than systemic racism—are responsible for the academic challenges that students of color may experience (Lewis, 2001). In contrast, Black teachers who situate high expectations for academic success at the “base” of instruction (Ladson-Billings, 1995a, p. 160) can help offset “stereotype threat” and the risk of confirming a negative stereotype about a group (Steele & Aronson, 1995). The culturally relevant pedagogy of Black teachers also is described as one of “opposition,” where teachers use their understanding of students’ culture and lives to guide instruction (i.e., “cultural competence”) and support students to critique cultural norms, values, and institutions that produce and maintain social inequities (i.e., “critical consciousness”; Gay, 2000; Ladson-Billings, 1995b; Paris, 2012).
Third, Black and other teachers of color may be better at teaching all students. Although race-conscious instruction is central to Black scholars’ definition and description of culturally relevant pedagogy, Ladson-Billings (1995a) also argued that the “pedagogical excellence” of Black teachers includes additional features of “good teaching”: differentiating instruction to meet the needs of individual students as they pursue academic goals, ensuring that classrooms are well organized for learning without creating an exclusionary climate, and building strong interpersonal relationships with students to support engagement in the classroom environment. Gay (2000) similarly described “the power of caring” as a key component of culturally responsive teaching. Potentially driven by these practices, Asian, Black, Hispanic, and White students report feeling better cared for and more academically challenged when they have a teacher of color (Cherng & Halpin, 2016).
I summarize this conceptual framework in Figure 1, which links Black teachers to outcomes of Black and non-Black students through both role-modeling channels and specific mindsets and pedagogical practices. In the framework, Black teachers are hypothesized to improve components of students’ social-emotional learning (SEL) first, potentially translating into increased school attendance and ultimately into test scores. Black students are most likely to benefit from Black teachers—inclusive of both role-modeling and pedagogical practices—although non-Black students can benefit, too—primarily through Black teachers’ mindsets and practices. The conceptual framework further lays the groundwork for the current analyses by identifying the measures that are or are not visible in the available data and the pathways that can be tested causally versus in an exploratory way.

Figure 1. Conceptual framework linking Black teachers to student outcomes.
Despite rich theoretical discussion and qualitative exploration on why Black teachers matter, quantitative scholars generally have been quite limited by available data to explore these mechanisms and mediating pathways in any rigorous way. Exploiting the random assignment of teachers to students in the Project STAR/class size experiment in Tennessee from the 1980s, Gershenson et al. (2022) argued in favor of the role-modeling hypothesis given that Black teachers only impacted the test scores, absences, and long-run educational attainment of Black students and not White students. Furthermore, these effects persisted even when accounting for observable background characteristics of teachers (i.e., experience, highest degree attained, status on a career ladder) that the authors suggested may be proxies for good teaching. Edmonds (2022) made a similar argument in a nonexperimental study by estimating effects on test scores and suspensions.
At the same time, making a case for role-modeling effects—which are difficult to observe directly—generally requires ruling out other possible explanations and, thus, demands fairly detailed data on both teachers and students. The background characteristics of teachers available in the large administrative data sets used by Edmonds (2022) and Gershenson et al. (2022) do not align with the theoretical literature on teaching pedagogy that emphasizes specific mindsets and practices. These studies also focus on test scores, school behaviors, and educational attainment measures, which could be influenced by role-modeling channels but likely only insomuch as students first develop positive school experiences and a sense of belonging in the classroom (Bristol & Martin-Fernandez, 2019). A growing number of studies examine links between teacher-student race/ethnicity-matching and SEL measures, including engagement, motivation, and social ties (e.g., Egalite & Kisida, 2018; Rasheed et al., 2020; Wright et al., 2017). However, none of these studies can support robust causal claims through experimental designs, and they do not examine teacher-level perceptions, expectations, and practices that may drive these effects.
To my knowledge, the current study is one of just a handful of experiments to estimate the effects of Black teachers on student outcomes (Constantine et al., 2009; Dee, 2004; Gershenson et al., 2022) and the only experiment to incorporate student SEL measures and teacher mindsets and practices.
Data
Sample and Experimental Design
The sample and data for this study come from a research project called the National Center for Teacher Effectiveness (NCTE), which examined characteristics of effective teachers and effective teaching in upper-elementary classrooms (i.e., fourth and fifth grades). The four anonymized partner districts all are urban contexts, geographically located in the Northeast, Mid-Atlantic, and Southeast regions of the United States. The project was interested primarily in math instruction, and so some of the teacher and student measures focus on this content area. At the same time, all participating teachers were generalists who taught all core subjects, suggesting that measures may generalize across teachers’ work and instruction. Other studies show that teachers who are effective in math, as measured by value added to test scores and classroom observations, also tend to be effective in English language arts (ELA; Cohen et al., 2018; Goldhaber et al., 2013).
In academic year 2012–2013, which is the final year of the 3-year study, the project conducted an experiment that randomly assigned teachers to class rosters within schools. The research team worked with district and school leaders to identify schools and school-grade combinations that were eligible for random assignment, meaning that there were at least two teachers in each school grade and principals considered the set of teachers as capable of teaching any of the rosters of students that they (or their leadership team) created. Out of 91 eligible teachers in the partner schools in the 2012–2013 school year, 71 were randomly assigned to rosters (n = 1,283 students) and constitute the main sample for this analysis. 1
Students and teachers in the experiment look similar to the larger NCTE sample, broader populations in the partner districts, and urban school districts across the United States (see Appendix Table A1). Over 35% of students are Black, and 25% are Hispanic, with at least two-thirds of students eligible for free or reduced-price lunch. Roughly one-fourth of teachers are Black, and 70% are White. Black versus White teachers in the experiment also have similar background characteristics to each other, including undergraduate major in education, certification pathway, advanced degree, and teaching experience (p = .396 on joint test of significance; Supplemental Table 1 available on the journal website). 2 The moderately sized experimental sample includes only a handful of Asian and Hispanic teachers, who I keep in the analysis to ensure fidelity of the randomized design. However, I cannot draw strong inferences about these groups.
Measures
Student outcomes
I examine effects of Black versus White teachers on six student outcomes. Three SEL measures come from a student survey administered in the spring as part of the NCTE project (see Appendix Table A1 for descriptive statistics, Supplemental Table 2 available on the journal website for item text, and Blazar & Kraft, 2017, for exploratory factor analyses): (a) Student-reported self-efficacy captures students’ effort, initiative, and perception that they can complete tasks (10 items, internal consistency reliability [α] = 0.76); (b) engagement and happiness in class asks students about their affect, happiness in, and enjoyment of class activities (five items, α = 0.82); and (c) self-regulation captures the extent to which students regulate their behavior to align with teachers’ expectations (three items, α = 0.74).
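Internal consistency reliabilities of this kind are conventionally computed as Cronbach's alpha. The following is a minimal sketch of that formula, not the project's code: the function name `cronbach_alpha` and the toy responses are hypothetical, and the use of population variances is an assumption (the ratio is the same with sample variances).

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows, one row of k item scores per student."""
    k = len(responses[0])
    # Variance of each item across students.
    item_vars = [pvariance([row[j] for row in responses]) for j in range(k)]
    # Variance of the summed scale score across students.
    total_var = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses from four students to three items (not study data).
# The items move together perfectly, so alpha reaches its maximum of 1.
toy_responses = [[1, 2, 1], [2, 3, 2], [3, 4, 3], [4, 5, 4]]
alpha = cronbach_alpha(toy_responses)
```

Values near 0.7 to 0.8, like those reported for the three SEL scales, indicate that items within a scale covary substantially but not perfectly.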
District records include three additional outcomes: state assessments in (d) math and (e) ELA and (f) absences from school. Although the surveys were available in just 1 year, district records were available during the experiment, prior school years, and all subsequent years through 2018–2019 (i.e., the year before Covid interrupted district data collection). Therefore, I estimate effects on measures captured at the end of the experimental year when students were in elementary school and up to 6 years later in high school. 3 I standardize test scores and survey responses to have a mean of 0 and a standard deviation of 1. Given the skewed nature of the absence data (see Appendix Table A1), I take the natural log of absences plus 1. Correlations between the student-level measures show that SEL dimensions, absences, and test scores all correlate with each other and across time points (see Supplemental Table 3 available on the journal website).
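The two transformations described above can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the function names and toy values are hypothetical, and the text does not say whether standardization used the population or sample standard deviation or was done within district or grade, so the sketch simply standardizes one list with the population standard deviation.

```python
import math
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def log_absences(days_absent):
    """Natural log of absences plus 1, which is defined for students with zero absences."""
    return [math.log(d + 1) for d in days_absent]


# Hypothetical test scores and absence counts (not study data).
scores = [410, 450, 500, 530, 610]
z_scores = standardize(scores)

absences = [0, 2, 3, 5, 40]  # right-skewed: one student with many absences
log_abs = log_absences(absences)
```

Adding 1 before taking the log keeps students with zero absences in the sample, and the log compresses the long right tail so that a few chronically absent students do not dominate the estimates.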
Teacher mindsets and practices
The NCTE project also collected a range of teacher-level measures from both a teacher survey administered in the fall and videotaped lessons. Teachers contributed an average of three lessons per school year, spaced across the year, which trained raters scored on the Classroom Assessment Scoring System (CLASS) observation instrument (Pianta et al., 2012). Across these two sources of data, I focus on five total measures that align to theoretical discussion on “good” teaching practices and mindsets likely to show up in the classrooms of Black teachers. The survey includes three measures (see Appendix Table A1 for descriptive statistics, and Supplemental Table 4 available on the journal website for survey item text): (a) Teacher-reported growth mindset beliefs captures the extent to which teachers view student intelligence as malleable versus fixed (seven items, α = 0.82), which aligns with Ladson-Billings’s (1995b) observation that culturally relevant teachers hold and then act on beliefs that “knowledge is not static” (p. 481); (b) preparation for instruction (15 items, α = 0.78) identifies the amount of time teachers spend planning for instruction and collecting formative assessment data, with some items specific to math and some not, and the extent to which they use this information to deliver differentiated instruction that attends to individual students’ needs (for connections to literature on culturally responsive teaching, see Kieran & Anderson, 2019; Ladson-Billings, 1995a); and (c) relationships with students and families (four items, α = 0.63) includes the rapport teachers develop with students inside and outside of the classroom and the amount of time teachers spend talking with parents and families about students’ learning and behavior (Gay, 2000). 4
Observations of classrooms include two additional measures (see Supplemental Table 5 available on the journal website for item text and Blazar et al., 2017, for exploratory and confirmatory factor analyses) 5 : (d) Classroom support focuses on teachers’ interpersonal relationships with students around classroom activities and content, including creating a positive classroom climate, and teachers’ sensitivity to and respect for student ideas and perspectives (nine items, α = 0.90, adjusted intraclass correlation [ICC] of between- vs. within-teacher variation = 0.63), and (e) classroom organization captures teachers’ behavior management skills and the extent to which teachers’ approach to addressing student (mis)behaviors avoids creating a negative classroom culture (three items, α = 0.72, ICC = 0.47); this measure aligns with scholars’ observation that the behavior, physical movements, and language of minoritized students often are misunderstood (Fenning & Rose, 2007; Gay, 2000). Reliability indices are similar to other large-scale video studies of classroom instruction (e.g., Kane & Staiger, 2012).
I standardized all teacher-level measures to have a mean of 0 and a standard deviation of 1. Correlations between the teacher mindset and practice measures follow expected patterns (see Supplemental Table 6 available on the journal website). For example, outside observers’ assessment of teachers’ classroom support is most strongly correlated with teacher-reported preparation for instruction (r = .37) and relationships with students and families (r = .22).
Empirical Strategy
Average or Total Effects of Black Versus White Teachers on Student Outcomes
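The randomized design allows for a straightforward approach to estimate the average or total effect of Black versus White teachers on student outcomes. I begin with the following model:

$$Y_{ijsg} = \beta_0 + \beta_1 \text{Black}_j + \beta_2 \text{AsianHispanic}_j + \delta_{sg} + \varepsilon_{ijsg} \qquad (1)$$

where $Y_{ijsg}$ is an outcome for student $i$ randomly assigned to teacher $j$ in school $s$ and grade $g$; $\text{Black}_j$ and $\text{AsianHispanic}_j$ are dummy indicators for the race/ethnicity of the randomly assigned teacher, with White as the left-out category; and $\delta_{sg}$ are school-grade fixed effects. Some models add background teacher and student characteristics. The coefficient of interest is $\beta_1$, and robust standard errors are clustered at the teacher level.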
Estimates from Equation 1 provide evidence of the effect of Black versus White teachers, on average, across students. I probe the differential effect of Black teachers on the outcomes of Black students (i.e., race-matching effects) versus on non-Black students by dividing the sample, reestimating effects for each subgroup, combining the variance-covariance matrices, and conducting post hoc Wald tests of coefficient equivalence. These subgroup analyses provide insight into whether Black teachers are more effective than White teachers overall or whether Black teachers serve as role models or engage in pedagogies that are uniquely beneficial to Black students.
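A post hoc Wald test of coefficient equivalence can be sketched as below. This is a minimal illustration, not the study's procedure in full: the function name `wald_equal` and the input values are hypothetical, and the covariance between the two subgroup estimates defaults to zero here, whereas combining the variance-covariance matrices would supply an estimated covariance (the same teachers appear in both subgroups).

```python
import math


def wald_equal(b1, se1, b2, se2, cov=0.0):
    """Wald test of H0: b1 == b2 for two estimated coefficients.

    Returns the Wald statistic and its p-value from a chi-square
    distribution with 1 degree of freedom.
    """
    diff = b1 - b2
    var_diff = se1 ** 2 + se2 ** 2 - 2 * cov
    wald = diff ** 2 / var_diff
    # For 1 degree of freedom, P(chi2 > w) equals erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(wald / 2))
    return wald, p_value


# Hypothetical subgroup estimates (not values from the study).
wald_stat, p = wald_equal(b1=1.0, se1=0.3, b2=0.0, se2=0.4)
```

A small p-value indicates that the effect for one subgroup differs from the effect for the other by more than sampling error would suggest.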
The internal validity of resulting estimates rests on two assumptions, both of which are met in this study (see Appendix Table A2). First, I confirm baseline balance by showing that pretreatment student characteristics are unrelated to the race of their randomly assigned teacher (p = .313 on joint test of significance). Second, I show that noncompliance (i.e., students moving out of their randomly assigned teachers’ classroom) and missing data do not lead to imbalanced groups (p = .466 on joint test of significance for noncompliance, ps = .103–.397 across tests for missing data on each data source). 6
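The within school-grade comparison that underlies the average effect estimate can be sketched in a few lines. This is a simplified illustration under loud assumptions, not the study's estimation code: it keeps only a single Black-teacher indicator (as if the sample held only Black and White teachers), omits covariates, and computes teacher-clustered standard errors without a small-sample correction. All names and data are hypothetical.

```python
import math
from collections import defaultdict


def within_demean(values, groups):
    """Subtract each group's mean, which absorbs group fixed effects."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def fe_effect(y, black_teacher, school_grade, teacher_id):
    """Fixed-effects estimate of the Black-teacher coefficient with
    standard errors clustered at the teacher level."""
    y_t = within_demean(y, school_grade)
    x_t = within_demean(black_teacher, school_grade)
    sxx = sum(x * x for x in x_t)
    beta = sum(x * yy for x, yy in zip(x_t, y_t)) / sxx
    residuals = [yy - beta * x for x, yy in zip(x_t, y_t)]
    # Sum the score contributions within each teacher before squaring.
    scores = defaultdict(float)
    for x, e, t in zip(x_t, residuals, teacher_id):
        scores[t] += x * e
    se = math.sqrt(sum(s * s for s in scores.values())) / sxx
    return beta, se


# Hypothetical data: two school-grades, each with one Black and one White teacher.
school_grade = ["A", "A", "A", "A", "B", "B", "B", "B"]
teacher_id = ["t1", "t1", "t2", "t2", "t3", "t3", "t4", "t4"]
black_teacher = [1, 1, 0, 0, 1, 1, 0, 0]
outcome = [0.9, 0.7, 0.2, 0.0, 0.1, 0.5, -0.4, -0.2]

beta_hat, se_hat = fe_effect(outcome, black_teacher, school_grade, teacher_id)
```

In this toy example the estimate equals the average of the two within school-grade gaps between the Black and White teachers' classrooms, which is the comparison that random assignment within schools makes credible.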
Mediating Pathways
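In a set of exploratory analyses, I further examine two types of mediation outlined in the conceptual framework. The first examines whether the effect of Black versus White teachers on proximal outcomes (such as components of students’ SEL) mediates effects on more distal student outcomes (such as test scores), using the following equation:

$$Y_{ijsg} = \omega_0 + \omega_1 \text{Black}_j + \omega_2 \text{AsianHispanic}_j + \lambda M_{ijsg} + \delta_{sg} + \varepsilon_{ijsg} \qquad (2)$$

where $M_{ijsg}$ is the proximal student outcome included as a mediator, $\delta_{sg}$ are school-grade fixed effects, and $\omega_1$ is the direct effect of Black versus White teachers on the more distal outcome $Y_{ijsg}$.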
Following a mediation framework, the difference between the direct effect of Black versus White teachers from Equation 2, ω1, and the average or total effect, β1, from Equation 1 is interpreted as the mediated or indirect effect (VanderWeele, 2015). If this difference is zero, then there is no mediating effect. If the difference is large and negative, this indicates mediation because the more proximal student outcomes reduce the magnitude of the total effect and, thus, explain some of the effect of Black versus White teachers on the more distal outcome. In the results presented in the following, I consider the percentage of the total effect that is explained by the mediated effect, or (ω1 – β1) / β1.
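The percentage change calculation can be sketched as follows. The function name and the input values are hypothetical and are not estimates from the study; the second example is included only to show how a small total effect in the denominator can push the percentage beyond 100% in absolute value.

```python
def pct_change_mediated(total_effect, direct_effect):
    """Percentage change in the total effect after adding a mediator:
    100 * (direct - total) / total. Negative values indicate mediation."""
    if total_effect == 0:
        raise ValueError("Total effect of zero leaves the ratio undefined.")
    return 100 * (direct_effect - total_effect) / total_effect


# Hypothetical: a total effect of 0.40 SD falls to 0.30 SD once the mediator is added.
moderate = pct_change_mediated(total_effect=0.40, direct_effect=0.30)  # about -25

# Hypothetical: a small, noisy total effect makes the ratio unstable.
unstable = pct_change_mediated(total_effect=0.05, direct_effect=-0.02)  # about -140
```

Because the total effect is the denominator, the same absolute change of a few hundredths of a standard deviation reads as modest mediation in the first case and as more than full mediation in the second.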
Similarly, I assess mediating pathways that run through the set of observed teacher mindsets and practices by including these measures in place of the student-level mediators in Equation 2.
The mediation analyses are only suggestive of the mechanisms that drive effects of Black versus White teachers on student outcomes because of potential biases caused by unobserved confounders and the failure of “sequential ignorability” (Imai et al., 2011). Because it is not possible to randomly assign self-efficacy, engagement, self-regulation, absences, and so on to students, these student-level mediators may be affected by other unobserved factors (e.g., parental involvement). Similarly, it is not possible to randomly assign teacher mindset and practice measures to teachers. At the same time, in this study, students were randomly assigned to teachers who vary not only in their race/ethnicity but also in the knowledge and skills they possessed up until that point. Therefore, to limit potential biases in the teacher-level mediation analyses, I focus on mindset and practice measures captured in years prior to the experiment.
Results
Average or Total Effects of Black Teachers on Student Outcomes
In Table 1, I present estimates of the average or total effect of Black versus White teachers on the set of short- and longer-term student outcomes, on average, across students and in subgroups of Black and non-Black students. 7 Like Gershenson et al. (2022) and Edmonds (2022), I find that teacher background characteristics do not change estimates meaningfully. The same is true for inclusion versus exclusion of student characteristics, which is a useful test of internal validity and consistent with the baseline balance test.
Table 1. Average Effects of Black Versus White Teachers on Student Outcomes
Note. Estimates in each cell come from a separate regression model that regresses the outcome listed in each row on dummy indicators of whether students’ randomly assigned teacher was Black or Asian or Hispanic (estimates for Asian or Hispanic teachers not shown; see supplemental materials available on the journal website), with White as the left-out category, and school-grade fixed effects. Some models include background teacher characteristics: gender, bachelor’s in education, traditional certification, master’s degree, years of teaching experience, and math knowledge. Some models include background student characteristics: gender, race/ethnicity, eligibility for free or reduced-price lunch, eligibility to receive special education services, limited English proficiency status, and prior-year absences, suspensions, and achievement in math and ELA. All outcomes other than absences are standardized. Robust standard errors clustered at the teacher level are in parentheses. ELA = English language arts.
*p < .1. **p < .05. ***p < .01.
I find very large effects of Black versus White teachers on intrapersonal components of students’ SEL that are localized to Black students. Assignment to a Black versus a White teacher increases Black students’ self-reported self-efficacy by 0.85 SD and their engagement and happiness in class by 0.69 SD. For both outcomes, effects of Black versus White teachers for Black students are statistically significantly larger than effects for non-Black students. The magnitude of the coefficient of the effect of Black teachers on the self-regulation of their Black students is potentially meaningful (0.22 SD) but not statistically significantly different from zero.
Expanding to other outcome measures, I find that Black teachers have large effects on short-term absenteeism of all students, although the effect is larger for Black students (47% decrease, calculated by exponentiating the coefficient that is captured in log units) compared to non-Black students (22% decrease). Effects on Black students’ absences persist to a large degree up to 6 years later, when students are in high school (43% decrease). In the short term, Black teachers also improve the math test scores of both Black and non-Black students (0.24 SD, on average, across students, with no differential effect between subgroups). None of the effects on math achievement in high school are statistically significantly different from zero. However, point estimates suggest possible persistence over time for Black students (0.12 SD) but not non-Black students. The reverse is true for effects on ELA achievement. In the short term, Black teachers have very large impacts on the ELA test scores of Black students (0.38 SD), with potential persistence over time (0.17 SD, although not statistically significantly different from zero). For non-Black students, short-term effects are positive but small, whereas effects in high school are larger (0.07 SD vs. 0.22 SD, not statistically distinguishable).
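The conversion between a coefficient in log units and a percentage change can be sketched as below. The function names are hypothetical, and the log coefficient shown is back-calculated from the reported 47% decrease purely for illustration; it is not a coefficient taken from the tables. Because the outcome is the log of absences plus 1, the percentage strictly applies to absences plus 1, which is an approximation for absences themselves.

```python
import math


def pct_change_from_log_coef(coef):
    """Percentage change in the outcome implied by a coefficient on a logged outcome."""
    return 100 * (math.exp(coef) - 1)


def log_coef_from_pct_change(pct):
    """Inverse mapping: the log-unit coefficient implied by a percentage change."""
    return math.log(1 + pct / 100)


# A 47% decrease corresponds to a log-unit coefficient of roughly -0.63.
implied_coef = log_coef_from_pct_change(-47)
round_trip = pct_change_from_log_coef(implied_coef)
```

For large effects the log coefficient and the percentage change diverge noticeably, which is why exponentiating is preferable to reading the coefficient directly as a percentage.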
The large number of estimates presented from the same randomization could result in seeing false positives due to multiple hypothesis testing. To address this concern, I consider a Benjamini-Hochberg (Benjamini & Hochberg, 1995) adjustment that accounts for the number of tests conducted (n = 63) and an allowable false discovery rate, which I set at 20%. The resulting critical value for statistical significance of .09 is above the standard threshold of .05. This is possible given that a large share of the estimates reported in Table 1 are statistically significant. 8 I further examine the robustness of the main findings to potential lingering concerns regarding differential attrition and missing data by reestimating results across 10 multiply imputed data sets. Magnitudes of estimates are very similar to the main findings but often estimated less precisely (see Supplemental Table 8 available on the journal website).
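The Benjamini-Hochberg step-up rule can be sketched as follows. This is a generic illustration of the procedure rather than the study's code: the function name and the toy p-values are hypothetical.

```python
def bh_threshold(p_values, fdr=0.20):
    """Benjamini-Hochberg step-up rule.

    Returns k, the number of hypotheses rejected, and the critical
    value k / m * fdr, where k is the largest rank whose ordered
    p-value is at or below rank / m * fdr.
    """
    m = len(p_values)
    k = 0
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= rank / m * fdr:
            k = rank  # keep the largest rank that satisfies the rule
    return k, k / m * fdr


# Hypothetical p-values from six tests (not values from the study).
toy_p = [0.001, 0.01, 0.03, 0.04, 0.2, 0.5]
n_rejected, critical_value = bh_threshold(toy_p, fdr=0.20)
```

As a consistency check on the numbers in the text, and not a figure reported in the study, a critical value near .09 with 63 tests and a 20% false discovery rate corresponds to roughly 28 rejected hypotheses, since 28 / 63 × 0.20 ≈ 0.089. The critical value rises with the number of small p-values, which is how it can exceed .05.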
Mediating Pathways
Next, I explore the extent to which effects of Black versus White teachers on proximal student outcomes mediate effects on more distal outcomes (Table 2) and potential mediation through teacher mindset and practice measures aligned to “good teaching” (Table 4, discussed in the following). In both tables, estimates are percentage changes in the total effect (see Table 1) after accounting for a given mediator (see Supplemental Tables 9 and 10 available on the journal website for direct effect estimates used to calculate percentage change and for student- and teacher-level mediation analyses, respectively). The sample sizes are slightly different between Tables 2 and 4 because in student-level mediation analyses, I construct a consistent sample of students who have data on each outcome and all relevant mediators.
Table 2. Percentage Change in Effects of Black Versus White Teachers on Student Outcomes After Accounting for Student-Level Mediators
Note. “—" indicates the student outcome listed in each column, which all are captured in upper-elementary grades, is not used as a mediator when predicting the outcomes in each row. The percentage change estimates are calculated by dividing the direct effect of Black versus White teachers on student outcomes in models that include the mediator minus the total effect, divided by the total effect. Cells are highlighted in gray to reflect gradations of mediation: white for small negative, zero, or positive percentage change estimates; light gray for negative percentage change greater than 0% and less than 10%; medium gray for negative percentage change greater than or equal to 10% and less than 50%; and dark gray for negative percentage change greater than 50%. Mediating pathways are excluded when the total effect is not statistically significant and less than 0.1 (in absolute value). ELA = English language arts.
Following the conceptual framework and the main results, I allow mediating pathways to vary between Black and non-Black students. But I exclude several mediating pathways when the average or total effects are not statistically significant and below 0.1 (in absolute value). The technical literature on causal mediation argues that mediation can occur only when there is a statistically significant average or total effect (VanderWeele, 2015). I set a slightly lower bar (at the 0.1 effect size threshold) given the moderate sample size and the exploratory nature of the mediation analyses. Readers should interpret these results with caution. Percentage change estimates can get quite large when the total effect is small and estimated with noise because the total effect is the denominator in the percentage change equation. Cells are highlighted in gray-scale to identify larger versus smaller degrees of mediation.
Results in Table 2 suggest that the effects of Black versus White teachers on the self-efficacy and engagement of their Black students meaningfully mediate effects on all subsequent outcomes. Each of these two measures mediates the effect of the other, with large percentage change estimates (54% and 81% of the total effect explained by the mediator) because self-efficacy and engagement are strongly correlated (r = .65; see Supplemental Table 3 available on the journal website) and because Black teachers have large effects on both outcomes. These two measures also mediate effects on Black students’ self-regulation, with percentage change estimates above 100%. This is one instance where a smaller total effect that is substantively meaningful but not statistically significant may lead to large percentage change estimates. Effects of Black teachers on self-efficacy and engagement further mediate effects on short- and longer-term absences (13% and 10%) and short-term math test scores (over 100%), as well as longer-run math and ELA test scores (43% to 58%). Short-term effects on absences mediate the largest share of longer-run effects on absences (39%) and the largest share of longer-run effects on math test scores (59%). For non-Black students, I observe very little to no mediation through their self-efficacy and engagement, which is largely mechanical: Black teachers do not impact these outcomes of non-Black students. In contrast, short-term effects of Black teachers on the test scores of non-Black students mediate a very large share of effects on longer-run test scores (over 70%). This finding may explain why the effect of Black teachers on non-Black students’ ELA achievement grows over time.
To explore teacher-level mediation, I first show differences in the mindset and practice measures between Black and White teachers (Table 3). Given the exploratory nature of these analyses, I present estimates for the experimental sample and the broader sample of teachers from the NCTE project to inform generalizability. I estimate these between-group differences using the same set of controls used in the student-level regressions because the teacher mindset and practice measures also are included in the student-level analyses. Black teachers outperform their White colleagues on most measures, although differences are not always statistically significant given the limited sample size. In the experimental sample, between-group differences are largest for classroom support (0.51 SD, preferencing the model with school-grade fixed effects and background student and teacher characteristics) and classroom organization (0.58 SD). Estimates for growth mindset beliefs and preparation for instruction also are substantively meaningful (roughly 0.2 SD in the preferred model) but not statistically significant. The pattern is reversed in the nonexperimental sample, with large between-group differences on growth mindset beliefs and preparation for instruction (0.5 SD to 0.8 SD) and smaller but still meaningful differences for classroom support and classroom organization (0.2 SD to 0.4 SD). In both samples, differences between Black and White teachers on relationships with students are smaller and more sensitive to control set.
Differences in Mindsets and Practices Between Black and White Teachers
Note. Estimates in each cell come from a separate regression model that regresses the teacher mindset or practice listed in each row on dummy indicators of whether the teacher is Black or Asian or Hispanic (not shown), with White as the left-out category. Some models include background teacher characteristics: gender, bachelor’s in education, traditional certification, master’s degree, years of teaching experience, and math knowledge. Some models include background student characteristics: gender, race/ethnicity, eligibility for free or reduced-price lunch, eligibility to receive special education services, limited English proficiency status, and prior-year absences, suspensions, and achievement in math and English language arts. All outcomes other than absences are standardized. Robust standard errors are in parentheses.
*p < .1. **p < .05. ***p < .01.
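The between-group comparison described in the note can be sketched in a few lines. This is a minimal illustration only, not the article's estimation code: it assumes the teacher measure is already standardized, omits the school-grade fixed effects and background controls, ignores standard errors, and uses made-up scores. The function name `race_gap_estimates` and the group labels are hypothetical.

```python
import numpy as np


def race_gap_estimates(measure, race, reference="White"):
    """Regress a teacher measure on race/ethnicity dummies via OLS.

    Returns {group: coefficient}, where each coefficient is the difference
    relative to the left-out (reference) category. No controls or fixed
    effects are included in this simplified sketch.
    """
    measure = np.asarray(measure, dtype=float)
    race = np.asarray(race)
    # One dummy per non-reference group; the reference is absorbed by the intercept.
    groups = [g for g in sorted(set(race.tolist())) if g != reference]
    columns = [np.ones(len(measure))] + [(race == g).astype(float) for g in groups]
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, measure, rcond=None)
    return dict(zip(groups, beta[1:]))


# Hypothetical standardized scores on one mindset or practice measure.
scores = [0.6, 0.4, -0.2, 0.0, -0.4, 0.3]
groups = ["Black", "Black", "White", "White", "White", "Asian or Hispanic"]

estimates = race_gap_estimates(scores, groups)
print({g: round(b, 3) for g, b in estimates.items()})
```

Without controls, the coefficient on the Black indicator is simply the difference in mean scores between Black and White teachers; adding controls or fixed effects would change that interpretation to a conditional difference.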
In Table 4, I find that teacher-reported preparation for instruction is a fairly consistent mediator of the effect of Black versus White teachers across a range of student outcomes, for Black and non-Black students, but primarily in the short term. This pattern makes sense given that preparing for and differentiating instruction can change classroom experiences immediately. Preparation for instruction mediates a sizeable portion of the effect of Black teachers on Black students’ self-efficacy (42%), engagement (26%), short-run absences (11%), and math test scores (24%). I observe similar patterns of mediation for non-Black students. In contrast, growth mindset beliefs is a strong mediator of longer-run outcomes (e.g., 73% for Black students’ ELA achievement), with more moderate mediation in the short term (e.g., 39% for this same outcome). It may be that teacher mindsets take longer to influence students’ own mindsets and beliefs. Classroom organization is a strong mediator of both short- and longer-run outcomes (e.g., roughly 40% to 50% for Black students’ short-term SEL measures and over 100% for Black students’ short- and longer-run math achievement). That said, percentage change estimates go in the “wrong” direction (i.e., positive) for short-term absences, potentially reflecting a trade-off between organizational and engagement-oriented classroom practices. Indeed, the classroom organization of Black teachers is a very strong predictor of Black students’ self-efficacy (0.9 SD) but is associated with more short-term absences (see Supplemental Table 11 available on the journal website for estimates linking the teacher mindset and practice measures to student outcomes).
Percentage Change in Effects of Black Versus White Teachers on Student Outcomes After Accounting for Teacher-Level Mediators
Note. The percentage change estimates are calculated as the direct effect of Black versus White teachers on student outcomes in models that include the mediator (not shown) minus the total effect, divided by the total effect. Cells are highlighted in gray to reflect gradations of mediation: white for small negative, zero, or positive percentage change estimates; light gray for negative percentage change greater than 0% and less than 10%; medium gray for negative percentage change greater than or equal to 10% and less than 50%; and dark gray for negative percentage change greater than 50%. Mediating pathways are excluded when the total effect is not statistically significant and less than 0.1 (in absolute value). ELA = English language arts.
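To make the percentage change calculation in the note concrete, the sketch below computes it for two made-up pairs of effects. The function name `percent_change` and all numbers are hypothetical and are not taken from Table 4. A negative value means that the effect shrinks once the mediator is included, and a value below -100% arises when the direct effect changes sign, which is more likely when the total effect is small.

```python
def percent_change(total_effect: float, direct_effect: float) -> float:
    """Percentage change in the teacher effect after adding a mediator.

    total_effect:  estimate from the model without the mediator
    direct_effect: estimate from the model that includes the mediator
    """
    if total_effect == 0:
        raise ValueError("total effect must be nonzero")
    return 100.0 * (direct_effect - total_effect) / total_effect


# Hypothetical case 1: the mediator absorbs part of a moderate effect.
print(round(percent_change(total_effect=0.20, direct_effect=0.12), 1))

# Hypothetical case 2: a small total effect whose direct effect flips sign,
# giving a percentage change beyond -100%.
print(round(percent_change(total_effect=0.05, direct_effect=-0.01), 1))
```

Because the total effect sits in the denominator, the same absolute shift in the estimate produces a much larger percentage change when the total effect is close to zero.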
Finally, the two measures related to teacher-student relationships and rapport explain only a small share of the effect of Black versus White teachers on student outcomes and often just for non-Black students. One explanation is that differences between Black and White teachers on relationships with students and families are small (see Table 3). Another explanation is that even though Black teachers outperform White teachers on classroom support, this measure is associated with worse student outcomes in several instances (see Supplemental Table 11 available on the journal website). This pattern similarly reflects a possible trade-off between engagement-oriented versus other types of practices.
I describe the mediation analyses as exploratory because I cannot definitively rule out potential confounders. At the same time, both student- and teacher-level mediation analyses pass a robustness test proposed by Imai et al. (2011) that examines whether unobserved factors that show up in the error term are correlated across the system of equations. More specifically, I correlate residuals from a model that regresses a given mediator on treatment and a model that regresses student outcomes on treatment and the same mediator. Imai et al. argued that “sequential ignorability” holds when the correlation is zero. I find that correlations often are zero (to three decimal places) and no higher than 0.13 (in absolute value; see Supplemental Table 12 available on the journal website). In a similar application of Imai et al.’s procedure, Tran and Gershenson (2021) interpreted a correlation of –0.09 as “relatively small” because correlations are bounded between –1 and 1 (p. 194).
Discussion and Conclusion
The results of this study generally confirm long-standing theory and qualitative inquiry regarding the importance of Black teachers to students’ classroom experiences and outcomes, which—to my knowledge—have not been fully tested in quantitative and experimental research. First, I find that Black teachers have very large effects on components of their Black students’ SEL (upward of 0.85 SD). These effects compare quite favorably to other analyses of SEL-oriented interventions in schools (roughly 0.23 SD; for a meta-analysis, see Durlak et al., 2011) and to natural gains in these sorts of measures from one year to the next (roughly 0.02 SD to 0.03 SD for upper-elementary students; Soland et al., 2022). Mediation analyses further suggest that the effects on Black students’ SEL likely translate into subsequent effects on short- and longer-term school attendance and test scores, with a remarkable degree of persistence over time particularly in the effects on absences. This pattern of persistence is almost unheard of in education research (Bailey et al., 2017).
Second, I find that Black teachers benefit all students—with meaningful decreases in absenteeism and increases in test scores—not just students who look like them. This finding is consistent with theory (Villegas & Lucas, 2004) and some exploratory quantitative research on students’ perceptions of teachers of color (Cherng & Halpin, 2016) but differs from prior experimental analyses showing that Black teachers impact Black students only (Gershenson et al., 2022). One explanation may be that most of the non-Black students in this study are Hispanic and Asian—reflecting demographic diversity across the four districts that come from the Northeast, Mid-Atlantic, and Southeast regions—whereas most of the non-Black students in the Project STAR experiment from Tennessee in the 1980s and analyzed by Gershenson et al. (2022) are White. Black teachers may benefit other minoritized students of color more than they impact White students. That said, when I limit the analysis sample to roughly 20% of the student sample that is White, I find some evidence of positive effects of Black teachers on White students (e.g., 30% decrease in short-term absences), although precision is limited substantially (hence not shown or discussed in the main analyses). It also is possible that results may differ in other geographic contexts with different demographic makeups and different racialized histories of schooling.
Third, helping to explain the two patterns described previously, I find suggestive evidence that the effects of Black versus White teachers likely are driven both by role-modeling channels and by the unique set of mindsets and practices that Black teachers bring to their work. I cannot directly observe and prove the existence of role modeling. However, the fact that Black teachers impact the self-efficacy and classroom engagement of their Black students but not non-Black students points in this direction. As a psychological construct related to self-image and sense of self, role modeling may be most likely to impact these sorts of intrapersonal competencies.
I also provide evidence on the “pedagogical excellence” of Black teachers (Ladson-Billings, 1995b) that helps generalize beyond small-scale qualitative studies. The teacher-level mediating pathways are not always consistent, with most mediators associated with increases in some student outcomes but decreases in others. It also is puzzling that the two measures capturing teachers’ relationships with students explain very little of the effects of Black teachers on student outcomes, given discussion of a “culture of caring” being so central to some definitions of culturally responsive teaching (Gay, 2000). Consistent with the theory, Black teachers do exhibit caring for their students, with scores on classroom support higher than those for White teachers. The fact that these scores are negatively associated with some student outcomes may seem counterintuitive but aligns with other mixed-methods explorations in the same data set (Blazar & Pollard, 2023). Instead, the teacher-level mediation findings focus attention on growth mindset beliefs, preparation for instruction, and classroom organization, which all mediate sizeable shares of the effect of Black teachers on student outcomes. These patterns underscore Ladson-Billings’s (1995a) argument for situating high expectations for academic success at the “base” of instruction (p. 160).
Of course, there also is room to explore additional mediators, particularly for student outcomes where I observe only a moderate degree of teacher-level mediation (e.g., short-term absences). One place to focus is to measure and include “opposition” pedagogies that are central components of culturally responsive teaching and are thought to drive Black students’ classroom engagement but are not observed in this study. There may also be indirect paths that run through students’ families and communities that also are not observed in this study in a comprehensive way. The extent to which Black teachers engage families in schooling activities may play a key role in explaining effects on student absences in upper-elementary grades, for example, where students likely have less control compared to their parents and guardians. Markowitz et al. (2020) showed that in the context of Head Start, teacher-child race/ethnicity matching is associated with increased parental engagement and attendance.
Ultimately, the experimental and mediation analyses provide guidance for policy and practice related to teacher training and recruitment. The findings suggest that both approaches are necessary. At the same time, knowing what the policy and practice goals are does not make them simple or straightforward to achieve. On the recruitment side, increasing teacher diversity is a numbers problem at a bare minimum given very large demographic mismatches between Black teachers and Black students. In the full NCTE study, 22% of Black students had a Black teacher compared to 83% of White students who had a White teacher. More challenging, the task of shifting demographics requires school systems to wrestle with systemic racism in workforce policies, including the fact that Black teachers were systematically ushered out of schools following school integration efforts in the mid-20th century (Thompson, 2022) and current Black teachers often are underappreciated for their work (Griffin & Tackie, 2017). Furthermore, although I concur with other scholars who advocate for training of the current teacher workforce that is mostly White (Gershenson et al., 2022), the skills that Black teachers have more than White teachers, on average, and that benefit student outcomes may not be teachable, or at least not easily taught. Racial biases—instantiated, for example, in exclusionary discipline (e.g., Fenning & Rose, 2007)—are not easy to overcome.
Because of these challenges, I conclude by reiterating the fundamental points of this article: Experimental evidence shows that assignment to a Black teacher produces some of the largest effects on student outcomes across all of the education research literature (Fryer, 2017). These effects show up in SEL, school attendance, and test scores. They persist over time, and they extend to Black and non-Black students alike. We must use these findings to compel large changes in policy and in practice.
Supplemental Material
Supplemental material, sj-pdf-1-edr-10.3102_0013189X241261336 for Why Black Teachers Matter by David Blazar in Educational Researcher
Appendix
Internal Validity Assumptions
| Baseline student characteristics (interacted with noncompliance or missing data indicators in Columns 2–7) | (1) Baseline balance | (2) Noncompliance | (3) Missing data: survey (end of year) | (4) Missing data: absences (end of year) | (5) Missing data: test scores (end of year) | (6) Missing data: absences (high school) | (7) Missing data: test scores (high school) |
|---|---|---|---|---|---|---|---|
| Female | 0.003 (0.012) | 0.008 (0.024) | –0.020 (0.031) | 0.007 (0.036) | 0.016 (0.035) | –0.005 (0.025) | 0.003 (0.023) |
| Asian | 0.036 (0.027) | 0.055 (0.045) | 0.057 (0.058) | 0.110 (0.074) | 0.111 (0.069) | 0.117** (0.045) | 0.099** (0.040) |
| Black | –0.032** (0.014) | 0.007 (0.035) | –0.050 (0.050) | –0.065 (0.061) | –0.065 (0.056) | 0.007 (0.033) | 0.008 (0.032) |
| Hispanic | –0.026 (0.019) | 0.043 (0.051) | –0.011 (0.056) | –0.003 (0.070) | 0.002 (0.065) | –0.001 (0.039) | –0.027 (0.039) |
| White | –0.059 (0.038) | 0.068 (0.052) | –0.017 (0.079) | 0.024 (0.100) | –0.009 (0.104) | –0.006 (0.063) | 0.059 (0.053) |
| Free or reduced-price lunch | –0.001 (0.010) | –0.011 (0.041) | –0.041 (0.042) | –0.030 (0.060) | –0.030 (0.059) | 0.004 (0.043) | –0.005 (0.042) |
| Special education | –0.025 (0.041) | –0.030 (0.090) | 0.159** (0.073) | 0.149 (0.108) | 0.128 (0.094) | 0.067 (0.068) | 0.099 (0.063) |
| Limited English proficiency | 0.013 (0.025) | 0.033 (0.062) | 0.095 (0.058) | 0.042 (0.074) | 0.030 (0.066) | –0.008 (0.032) | 0.043 (0.031) |
| Prior absences (log + 1) | 0.012 (0.008) | 0.050* (0.029) | 0.013 (0.032) | 0.060 (0.081) | 0.084 (0.056) | –0.016 (0.012) | 0.001 (0.018) |
| Prior suspensions (log + 1) | 0.028 (0.029) | –0.090 (0.139) | –0.037 (0.163) | 0.006 (0.518) | –0.017 (0.476) | –0.040 (0.137) | –0.059 (0.129) |
| Prior math achievement | –0.004 (0.009) | –0.059* (0.032) | –0.068** (0.027) | –0.114*** (0.041) | –0.117*** (0.038) | –0.021 (0.020) | –0.012 (0.022) |
| Prior ELA achievement | –0.003 (0.010) | 0.064* (0.033) | 0.075** (0.033) | 0.120*** (0.043) | 0.122*** (0.042) | 0.022 (0.023) | 0.016 (0.028) |
| p value on joint test of significance | .313 | .466 | .359 | .185 | .103 | .397 | .103 |
| Noncompliance/missing data rates for | | | | | | | |
| Full sample | NA | 0.312 | 0.296 | 0.207 | 0.218 | 0.421 | 0.483 |
| Black teachers | NA | 0.295 | 0.279 | 0.148 | 0.154 | 0.352 | 0.418 |
| White teachers | NA | 0.327 | 0.311 | 0.232 | 0.247 | 0.445 | 0.505 |
| p value on difference | NA | .667 | .670 | .218 | .159 | .106 | .133 |
| Students | 1,283 | 1,283 | 1,283 | 1,283 | 1,283 | 1,283 | 1,283 |
Note. Estimates in each column come from separate regression models that predict a dummy indicator for whether students’ randomly assigned teacher is Black as a function of baseline student characteristics and school-grade fixed effects that are equivalent to randomization block. In Column 2, baseline student characteristics are interacted with an indicator of whether students moved classrooms after random assignment. In Columns 3 through 7, baseline student characteristics are interacted with an indicator for whether students have outcome data on each source. To isolate comparison between the students of Black versus White teachers, models also include indicators for whether students’ randomly assigned teacher is Asian or Hispanic (one group; estimates not shown) and interactions with baseline student characteristics. Models that assess noncompliance and missing data further condition on the interaction between these student-level indicators, the additional teacher race/ethnicity dummy, and baseline student characteristics. Joint tests of significance test the null hypothesis that baseline student characteristics (Column 1) or baseline student characteristics interacted with noncompliance/missing data indicators (Columns 2–7) are jointly equal to zero. Robust standard errors clustered at the teacher level are in parentheses. ELA = English language arts.
*p < .1. **p < .05. ***p < .01.
Author
DAVID BLAZAR, EdD, is an associate professor at the University of Maryland College Park, College of Education, 2311 Benjamin Building, University of Maryland College of Education, College Park, MD, 2074; dblazar@umd.edu. His research focuses on the equitable allocation of educational resources, with a particular focus on teachers and teaching.
