Abstract
A renewed call for replications has emerged in social science research. An important form of replication involves exploring the extent to which findings from a given study hold in other contexts. This study draws on opinion polling data to replicate key findings across time and space based on an original study in one location analyzing attitudes toward public school assignment policies. The replication finds that many of the original findings hold, though one important exception reflects the changing context. We note that the increasing availability of relatively inexpensive methods of quantitative data production facilitates replication and comment on how the temporal interval between the original study and the replication may influence the extent to which findings replicate. We argue that largely successful replications help to clarify the conditions under which findings replicate, and that sociologists are in the early stages of determining which strategies work best for replicating which findings.
A crisis of confidence in several areas of social science research has led to a renewed call for replication in its many forms (Open Science Collaboration, 2015). Strict forms of replication, verification or reproduction in Clemens’ terminology or verifiability in Freese and Peterson’s terminology, are often challenging. However, exploring the extent to which findings identified in one study hold in other contexts represents an important form of replication, robustness to Clemens (2017), and generalizability to Freese and Peterson (2017), as well as an opportunity for theory construction and refinement (Lucas et al., 2013; Small, 2009; Zussman, 2004). Replicability across studies generally is a concern shared across fields and research designs, such as in the subfields of social psychology and in political science, where scholars debate the conditions under which laboratory experiments are or are not limiting (Druckman & Kam, 2011; Webster, 2016).
Despite these concerns, sociological examples of replication remain relatively rare, and language concerning replication often refers to experimentation (Freese & Peterson, 2017). In contrast, in this study, we demonstrate several features of using opinion polling data, one of several emerging, relatively inexpensive data collection strategies, for replicating findings in a non-experimental setting. We illustrate these features by replicating results of a recent study concerning attitudes toward public school assignment policies (Parcel & Taylor, 2015). Using multiple-groups confirmatory factor analysis (CFA), we find that the same attitudinal dimensions underlying public opinion in the original site replicated in that site 4 years later, as well as in four other locales in the latter time period. We note, however, that the relationship between two dimensions, neighborhood school support and support for diversity, did substantially shift over time in ways that reflect the changing context. We then use multiple-groups structural equation modeling (SEM) to show that predictors of relevant attitudes replicated both temporally and cross-sectionally, with some findings replicating more successfully than research expectations had suggested. Our demonstration illustrates the ways in which replication can be used as a tool not only for assessing the broader external validity of a set of research findings but also for adding to our knowledge of the social phenomenon under investigation.
We first discuss in more detail the crisis of replication in the social sciences. We also define what would be a “successful” replication for non-experimental quantitative research. We then develop the rationale for the replication across time and space, provide background and findings from the original study, and explain the methods used in the replication. We describe the findings that replicated and the one that did not. We then provide comments regarding the usefulness of this strategy for other studies with similar design elements. We conclude by identifying limitations to our approach and describing the extent to which these findings speak to concerns about replicability of social science research more generally.
The Crisis of Replication in Social Sciences
Whether findings produced in a given study can be replicated is a concern across several fields in social science (Bollen et al., 2015; Clemens, 2017; Freese & Peterson, 2017). Replicating one hundred psychological studies, one multi-site study found effects that were roughly half the size of the original studies’ effects (Open Science Collaboration, 2015). Specifically, 97% of these original psychological studies found statistically significant results, but less than 40% of the effects replicated, calling into question the validity of many of the original findings. These authors argue that replication should be an increasingly valued scientific activity, allowing us to observe whether the same findings hold over time by either confirming or clarifying previous conclusions.
On a related note, one definition of external validity asks whether findings produced in one context can be replicated in another (Campbell & Stanley, 2015). Many studies in psychology use this definition, with researchers sometimes attempting to replicate basic findings under different conditions. Studies limited to male participants may be replicated with females and studies using subjects from colleges and universities may be replicated using adults from more general populations (e.g., see Burger, 2009). Sidman (1960) refers to this as systematic replication. To the extent that original findings replicate in different populations or contexts, researchers can be more confident in the broader generalizability of the underlying processes or mechanisms. On the other hand, if a replication of this sort fails either partly or entirely, the investigator derives important information regarding the conditions under which such findings do and do not hold.
What is a Successful Replication of Non-Experimental Quantitative Results?
There are two sources of ambiguity in defining whether a replication of non-experimental quantitative results is “successful.” The first source stems from the perspective of the original study; in this case, a successful replication is one in which the primary findings are found to hold in another context. From the perspective of the social scientific community, however, a successful replication is one that adds knowledge to our understanding of the social phenomenon investigated in the original study. This knowledge may be that key findings hold in another context, but it may also be that some findings do not hold, which provides a sense of the limits of generalizability of the original study.
The second source of ambiguity lies in the scope of the replication. Many non-experimental quantitative studies involve a package of findings, including multiple key findings. In our standard discourse, we typically think of a study replicating or not, but in many cases, particularly in non-experimental quantitative research, it may make more sense to think of specific findings replicating. As we will illustrate in our example, some key findings replicate and one does not, though it fails to replicate in ways that we can understand. As such, we might say either that the original study as a whole “largely replicates” or we might refer to specific key findings that replicate and others that do not. Non-experimental quantitative studies may be especially likely to largely replicate, if only because social changes may have understandably resulted in some differences between the original study and any that follow. We return to the issue of social change within the context of replication below.
Background of the First Study: Public School Assignments
The return to racial and socioeconomic segregation among public schools across the Upper South, an area that once was desegregated, presents challenges to educators, parents, and policy makers. This issue has been the subject of many studies over time, with relevant analyses based on a long tradition within the sociology of education that investigated the dynamics of school desegregation and resegregation in specific urban and suburban school districts across the country. Analyses are of places such as Richmond (Ryan, 2010), Cleveland (Saatcioglu, 2010), Austin (Cuban, 2010), Rock Hill, South Carolina (Smith, 2010; Smith et al., 2008), and Charlotte, North Carolina (Mickelson et al., 2015), to name a few. Parcel and Taylor (2015) built upon this tradition by investigating policy change in assigning children to public schools in Raleigh, North Carolina, with their study coming in 2011, a time proximate to heated debate in Raleigh regarding the relative importance of neighborhood schools and diverse schools, as well as controversy involving public school assignment policies more generally.
Despite these findings and the associated rich tradition of studying school desegregation across locales, quantifying the similarities and differences across relevant studies has proved elusive (Yin, 2014). One difficulty is that these past analyses were heterogeneous in approach; authors framed questions and invoked theory differently, with some studies using theory to guide their investigations and others not. Even in Smrekar and Goldring (2009), where there were detailed reports on seven studies of school desegregation, most chapters eschewed theory entirely, thus rendering cumulation of findings to inform theory modest at best. A second difficulty is that, although they addressed the same topics, the studies frequently did not produce the same data. Such a strategy tailors findings to the particular location but does not facilitate comparisons across these locales (e.g., Frankenberg and Orfield, 2012).
Partial exceptions include studies of more than one locale; Grant (2009) compared Syracuse and Raleigh and Parcel et al. (2016) compared Raleigh and Charlotte. Such investigations are designed to promote comparison, thus assisting readers looking for replicability. But comparisons across these sources are much more problematic because theories and methods vary notably across them. For example, we can compare the segregation dynamics of Syracuse (Grant, 2009) with Charlotte (Mickelson et al., 2015) only in very general terms.
In addition to their study of Raleigh, Parcel and Taylor (2015) also attempted to go further by comparing findings across 23 of these individual studies. However, they were limited in what they could conclude owing to the abovementioned differences in theory and data. Their result understandably fell far short of the quantifiable precision that we should require to claim knowledge of replication or lack thereof. We therefore undertook the replication we report here to improve upon these limitations. We now sketch the methods and findings of the original study as a foundation for describing the current study.
Methods and Findings of the Original Study
Parcel and Taylor (2015) studied attitudes toward school assignment policies at a time of heated debate regarding how local children should be assigned to schools. They constructed a questionnaire that was administered by interactive voice response (IVR) to a sample of Raleigh adults in 2011. They over-sampled African Americans to obtain a large enough number of cases for this racial group to facilitate subgroup analysis. Because IVR poll respondents are self-selected, they are not fully representative of the Raleigh population; hence, the project used post-stratification weighting based on results from the 2010 U.S. Census to adjust the survey percentages to be representative (Groves, 2006; Kalton, 1983).
Both popular and academic treatments of attitudes toward school assignments often imply that there is one dimension along which support for diversity/neighborhood schools would fall, with diversity at one end of the continuum and support for neighborhood schools at the other. However, Parcel and Taylor (2015) identified two separate dimensions: support for neighborhood schools and support for diversity. Findings suggested that many people supported neighborhood schools while only a subset of those respondents also supported diversity. Thus, rather than a strong negative correlation between the two dimensions, the study reported a moderate negative correlation (r = −.40). Findings suggested that those with conservative political philosophies and registered Republicans endorsed neighborhood schools more strongly than Democrats did. Findings also showed that those with more liberal philosophies and registered Democrats more strongly supported diversity as a principle for school assignment.
Beyond the two key dimensions, the original study also identified three dimensions of concern with children’s school reassignments: challenges, dangers and uncertainties. It is logical to expect moderate interrelationships among these three dimensions such that those who were more concerned about the uncertainties surrounding reassignment processes were also more concerned with its perceived dangers. Indeed, Parcel and Taylor (2015) found that those who were more concerned with the uncertainties surrounding children’s school reassignments were also more concerned with dangers to child learning and friendships. They also found that higher levels of social class appeared to insulate respondents from some of these concerns, given that higher levels of income and education would permit parents to mitigate reassignment uncertainties, and compensate for any perceived disruptions to learning and friendships by enriching child cognitive and social environments in other ways. Women were more concerned with challenges, dangers, and uncertainties because, if they are mothers, part of their overall responsibility for child well-being includes managing children’s school assignments (Parcel et al., 2016). Finally, they found whites were more acutely concerned with these sentiments, possibly because they have less self-interest in school arrangements implementing diversity plans (Taylor & Parcel, 2019).
In this replication, modeling includes analysis of five dependent variables: support for neighborhood schools, support for diverse schools, and concerns with the challenges, dangers, and uncertainties of student reassignment. Predictors include political attitudes and affiliation; sociodemographic background, including gender and race; and socioeconomic resources, including income and education. Expectations for the replication follow those from the original study: we expect the same relationships among key variables, plus predictable effects of controls including gender, race, and social class.
Methods Used in the Replication
Study Design
We fielded opinion polls in five school districts in 2015. We pursued replication over time by building on the earlier Parcel and Taylor (2015) survey, produced in 2011, to investigate whether Raleigh citizens in 2015 were still as concerned with challenges, dangers, and uncertainties as they had been, as well as whether the negative, although not perfectly inverse, relationship between support for diversity and for neighborhood schools continued. This allowed us to evaluate whether the same sentiments identified in the 2011 poll data were also present in 2015 and whether these sentiments were trending in any way. Even if they remained present, predictors of each might have changed, which is important information for assessing the extent of the replication. Thus, this design enables us to discern whether findings produced in 2011 were found 4 years later, after the events that had triggered the political debate had settled down but not disappeared.
In addition, we investigated whether the findings produced for Raleigh in 2011 were found elsewhere. If so, this promotes replication by suggesting that similar social processes were also present in other sites. To address this, we fielded very similar questionnaires in four additional southern school districts. While almost all of the questions were the same across the five 2015 questionnaires, we also used our knowledge of the respective areas from earlier literature to include questions unique to each area and to modify standard questions making them appropriate to each case. This meant that each of the five surveys could be analyzed on its own. Fielding five surveys also allowed us to expand the number of survey respondents in our study so that we have not only survey respondents from diverse areas but also more data to support multivariate analyses.
Choosing Sites for Our Study
We chose the new locations in part because each had been the subject of prior investigations, thus providing a base on which we could draw for further analysis. These locations included Charlotte, North Carolina (Mecklenburg County); Rock Hill, South Carolina; Louisville, Kentucky (Jefferson County); and Nashville, Tennessee (Davidson County). All of the sites except Rock Hill, including Raleigh (Wake County), have county-wide school systems; Rock Hill has a city-wide school system, and York County, South Carolina, where it is located, is home to three other school systems as well. Respective prior studies of each of these locales constituted the “pilots” for each poll, as well as the sources of guidance for polling questions, as noted above.
We also chose these five sites because they are similar to one another in some ways but different in others (see Campbell & Parcel, 2010; Parcel et al., 2012 for similar arguments). For example, all districts are in the upper South because it is a region that permits analysis of resegregation, in contrast to other parts of the country where decades of white flight have rendered this question moot. In other respects the sites differ. Although Charlotte and Raleigh are in the same state and are both large, white-collar cities, Raleigh’s desegregation efforts have been more sustained and pro-diversity than Charlotte’s. This is true, in part, because Raleigh was never under a court order to desegregate, having pursued school desegregation voluntarily (Parcel et al., 2016). Rock Hill and Raleigh are an interesting comparison because they have both undergone intense public desegregation debates and both have sustained desegregation over time. Their contrasting sizes may help us to understand the role of size in these social dynamics.
Like Charlotte, Nashville has a contentious desegregation history. Smrekar and Goldring (2009) and Smrekar (2013) show how Nashville schools resegregated after the district was declared unitary in 1998. Nashville remains formally committed to diversity but with its wealthier residents having fled the system, it is finding it very difficult to meet its goals (see also Houston, 2012). We expect this locale to be similar to Charlotte given its turbulent history of desegregation/resegregation. In contrast, Phillips et al. (2009) find that despite a rocky start in the mid-1970s, by 1978, Louisville embraced integration, a path it continued to follow through the 2000s. We expect this locale to be similar to Raleigh, both in terms of the structure of sentiments toward school assignments and the predictors of these dimensions. We also expect it to be different from Charlotte and Nashville. Evidence subsequent to the Phillips et al. (2009) investigation, however, cautions us regarding these expectations because of more recent contention regarding reassigning children across schools, thus suggesting Louisville to be more similar to Charlotte than Raleigh (Phillips, private communication).
Recall that we sought both replication at a later time point and replication across sites. We use post-stratification weighting for each location to render its polling sample representative of the area from which it was derived. This is an important foundation for comparing the success of replication across the sites. Post-stratification weighting adjusts each survey to be representative of the given area, thus reducing the possibility that study results would be skewed by unrepresentative samples in one or more of the areas, which would interfere with judging the success or failure of replication.
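The logic of post-stratification weighting can be illustrated with a minimal sketch. It assumes a single stratifying variable and uses made-up sample counts and population shares; the function names, the cells, and the numbers are hypothetical and are not the weights or census figures used in the study.

```python
from collections import Counter


def poststratification_weights(sample_cells, population_shares):
    """Return one weight per respondent.

    Each weight is the population share of the respondent's cell divided
    by that cell's share of the sample, so over-represented cells are
    weighted down and under-represented cells are weighted up.
    """
    n = len(sample_cells)
    counts = Counter(sample_cells)
    return [population_shares[cell] / (counts[cell] / n) for cell in sample_cells]


def weighted_share(sample_cells, weights, target):
    """Weighted proportion of the sample that falls in the target cell."""
    total = sum(weights)
    in_cell = sum(w for c, w in zip(sample_cells, weights) if c == target)
    return in_cell / total


# Hypothetical sample with an oversample of one group (40% of respondents)
# relative to an assumed population share of 25%.
sample = ["african_american"] * 40 + ["other"] * 60
population = {"african_american": 0.25, "other": 0.75}

weights = poststratification_weights(sample, population)
share = weighted_share(sample, weights, "african_american")
```

After weighting, the oversampled group contributes its assumed population share to any weighted estimate, while the larger raw count remains available for subgroup analysis.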
Questionnaire Design
Our approach involved a replication of the Parcel and Taylor (2015) questionnaire used for their 2011 data production in Raleigh, with additions and changes needed to render the questionnaires suitable to locales that varied in some ways. We included the original Parcel and Taylor questions that were relevant to all five sites. However, of the five sites, only Raleigh had year-round schools, which meant that questions involving year-rounds were not asked in the other four areas. Also, given increased interest in school choice between 2011 and 2015, we added questions on preferences for both magnet and charter schools to the 2015 questionnaires. Questionnaires for each of the five areas are available from the first co-author upon request.
Survey Implementation
Our survey design also improved upon the Parcel and Taylor approach by including a separate cell phone sample. The number of households with cell phones only (CPO) continues to grow (Kempf & Remington, 2007). CPO usage varies by state, with those in the West more likely to be CPO (Idaho, 44.6%, and Arkansas, 44.4%, in 2011 vs. 16.5% in New Jersey and 15% in Rhode Island; Blumberg et al., 2012). The states in which we polled respondents are in the 33–36% range, thus suggesting moderately high but similar levels of usage. But given that men, those of lower SES, and Latinos are more likely to be CPO users (Blumberg & Luke, 2012), adding this component to the 2015 data collection attempted to increase the number of poll respondents with these characteristics. Our goal was to obtain an additional set of 200 live polls from each site from cell-phone-only users.
The overall response rate for the entire sample was 2.8% based on American Association for Public Opinion Research Response Rate (AAPOR RR) #5 and 8.7% based on AAPOR RR #6. These modest response rates are not atypical for polling results and underscore the importance of using post-stratification weighting to render the results obtained representative of the larger population.
In sum, our survey strategy employed automated polling to landlines and live cell phone interviews with randomly selected respondents in five school districts associated with Raleigh, Charlotte, Louisville, Rock Hill, and Nashville. For each of these interviewing strategies, we included an oversample of African Americans to allow large enough Ns for subgroup analyses. With the exception of Rock Hill, the school districts are county-wide systems, and we refer to each by the name of its largest city.
Assessing Replication of Five Attitudes
Analytic Strategy
In the following analyses, we assess two forms of replicability. First, we reproduce the original 2011 quantitative case study findings from Raleigh by developing measurement models for each dimension and examining the correlates using structural equation models (SEMs) (Bollen, 1989). The general form of the SEM for this analysis is given by
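In the standard notation of Bollen (1989), and as an assumption about the symbols intended here rather than a transcription of the authors' equations, an SEM of this kind pairs a measurement model for the indicators with a latent variable model for the dimensions:

```latex
\begin{align}
\mathbf{y}_i &= \boldsymbol{\nu} + \boldsymbol{\Lambda}\,\boldsymbol{\eta}_i + \boldsymbol{\varepsilon}_i, \\
\boldsymbol{\eta}_i &= \boldsymbol{\alpha} + \boldsymbol{\Gamma}\,\mathbf{x}_i + \boldsymbol{\zeta}_i,
\end{align}
```

In this sketch, $\mathbf{y}_i$ holds the observed indicators for respondent $i$, $\boldsymbol{\eta}_i$ the latent dimensions, and $\mathbf{x}_i$ the observed correlates; $\boldsymbol{\nu}$ and $\boldsymbol{\Lambda}$ are indicator intercepts and factor loadings, $\boldsymbol{\alpha}$ and $\boldsymbol{\Gamma}$ are latent intercepts and regression coefficients, and $\boldsymbol{\varepsilon}_i$ and $\boldsymbol{\zeta}_i$ are measurement errors and latent disturbances with covariance matrices $\boldsymbol{\Theta}$ and $\boldsymbol{\Psi}$.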
Second, we explore whether the results hold over time in the same geographic context by conducting a multiple-group analysis (Bollen, 1989) based on comparing the original public opinion poll data from Raleigh in 2011 and new data from Raleigh obtained in 2015. The multiple-group analysis represents an extension of the SEM given in equations (1) and (2) as
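Writing $\mathbf{y}$ for the indicators, $\boldsymbol{\eta}$ for the latent dimensions, and $\mathbf{x}$ for the correlates (notation assumed, following Bollen, 1989), a multiple-group model can be sketched by letting every parameter carry a group superscript $g$, here indexing time points or sites:

```latex
\begin{align}
\mathbf{y}_i^{(g)} &= \boldsymbol{\nu}^{(g)} + \boldsymbol{\Lambda}^{(g)}\,\boldsymbol{\eta}_i^{(g)} + \boldsymbol{\varepsilon}_i^{(g)}, \\
\boldsymbol{\eta}_i^{(g)} &= \boldsymbol{\alpha}^{(g)} + \boldsymbol{\Gamma}^{(g)}\,\mathbf{x}_i^{(g)} + \boldsymbol{\zeta}_i^{(g)},
\qquad \operatorname{Var}\!\big(\boldsymbol{\varepsilon}_i^{(g)}\big) = \boldsymbol{\Theta}^{(g)} .
\end{align}

% Nested invariance constraints, each adding to the previous one:
\begin{align*}
\text{metric:}    &\quad \boldsymbol{\Lambda}^{(g)} = \boldsymbol{\Lambda} \ \text{ for all } g, \\
\text{scalar:}    &\quad \boldsymbol{\Lambda}^{(g)} = \boldsymbol{\Lambda},\ \boldsymbol{\nu}^{(g)} = \boldsymbol{\nu}, \\
\text{factorial:} &\quad \boldsymbol{\Lambda}^{(g)} = \boldsymbol{\Lambda},\ \boldsymbol{\nu}^{(g)} = \boldsymbol{\nu},\ \boldsymbol{\Theta}^{(g)} = \boldsymbol{\Theta}.
\end{align*}
```

The three constraint sets correspond to equal factor loadings, equal indicator intercepts, and equal error variances across groups. With intercepts held equal, latent means are identified by fixing $\boldsymbol{\alpha}$ to zero in one reference group and estimating it freely in the others.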
Third, we explore whether the results hold across social contexts by conducting a multiple-group analysis based on comparing data obtained from five cases (Charlotte, Louisville, Nashville, Rock Hill, and Raleigh) in 2015. This analysis has the same structure as the multiple-group analysis for the two time periods.
All of the models incorporate post-stratification weights specific to the sites and address missing data with a maximum likelihood estimator (Arbuckle, 1996). The indicators of the dimensions and most of the correlates are missing data for less than 1% of cases. The exceptions are political ideology, which is missing for 3–5% of cases, and income, which is missing for 15–20% of cases over time and across sites. All of the indicators of the dimensions are ordinal with five response categories. We treat these measures as continuous in our measurement models, though we explored treating them as categorical in preliminary analyses and found substantively similar results. All analyses were conducted in Stata 15 and Mplus 7 (Muthén & Muthén, 2012; StataCorp, 2017), and the code is maintained on a publicly available GitHub repository (identifying link omitted).
Results
Preliminary Analysis of Raleigh 2011
We begin by reproducing the results of the original study with latent variables for each of the dimensions. Neighborhood school support, support for diversity, and dangers of reassignment are each measured by four indicators, while challenges and uncertainties of reassignment are each measured by two indicators (see Appendix A, Table A1 for question wording and item-level descriptive statistics). Preliminary investigations of measurement models for each of the dimensions revealed that the models for support for diversity and dangers of reassignment required specifying two correlations among indicator error terms to account for shared variances in the indicators beyond the latent variable. In addition, with only two indicators each, the measurement models for challenges and uncertainties of reassignment were identified by constraining the factor loadings to equal one. Table A2 in Appendix A provides selected parameter estimates for the measurement models for all of the dimensions. In terms of model fit, the model for neighborhood school support was the only measurement model with degrees of freedom available to test model fit; it fits the data well, with a chi-square p-value of .02, a BIC of 7.31, an RMSEA of .04, and a CFI and TLI both greater than .96.
The correlation between the latent measures of neighborhood school support and support for diversity is −.57, which is somewhat larger in magnitude than the correlation of −.40 found in the original study but is still considered moderate. It is likely that accounting for random measurement error as part of the CFA explains the strengthened correlation. In addition, the estimates of the associations between the correlates and each of the latent dimensions reveal patterns similar to those found in the original study. For instance, in general, the correlates of neighborhood school support are in the opposite direction of the correlates of support for diversity. Republicans and those with conservative political philosophies are more supportive of neighborhood schools, while Democrats and those with more liberal political philosophies are more supportive of diversity as a principle guiding public school assignments. Whites are more concerned with the challenges, dangers, and uncertainties of student reassignments than are non-whites, and women are more concerned with these dimensions than are men. Replication of these associations provides a foundation for the analyses described below.
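The way random measurement error weakens an observed correlation can be illustrated with the classical correction for attenuation. This is a minimal sketch, not the CFA used in the study: the reliabilities of .70 are hypothetical values chosen for illustration, and the function name is an assumption.

```python
import math


def disattenuate(r_observed, reliability_x, reliability_y):
    """Classical correction for attenuation.

    Divides the observed correlation by the square root of the product
    of the two measures' reliabilities to estimate the correlation
    between the underlying error-free variables.
    """
    return r_observed / math.sqrt(reliability_x * reliability_y)


# Observed correlation of -.40 with hypothetical reliabilities of .70
# for each scale; the reliabilities are NOT reported in the source.
r_latent = disattenuate(-0.40, 0.70, 0.70)
```

Under these assumed reliabilities the corrected correlation is close to −.57, which shows that a shift of this size is within the range that measurement error alone could produce.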
Replication 1: Raleigh Over Time: 2011 and 2015
We next explore the extent to which the relationships between the various dimensions of support for public school assignments and the sociodemographic correlates of these dimensions hold over time in the same geographic context. Our first step involves conducting a multiple-group analysis to test the various forms of measurement invariance discussed above for each of the latent dimensions. We find clear evidence of metric, scalar, and factorial invariance for all dimensions with the exception of neighborhood school support (see Table A3 in Appendix A). For neighborhood school support, the evidence is mixed: the chi-square tests are significant, but the BICs and all other model fit statistics indicate that the models with factor loadings, intercepts, and error variances constrained to be equal over the two time points have a reasonable fit with the data.
Figure 1 illustrates the means and variances of the latent dimensions across the two time points. The means for 2015 are all fixed to 0 for model identification. For neighborhood school support, we see clear evidence of an increase in the mean and a decrease in the variance from 2011 to 2015. This indicates a shift toward greater support for neighborhood schools over the 4-year period. With the exception of reassignment uncertainties, the other measures also show increases in the mean over the 4-year period, but none are as dramatic as with neighborhood school support. The variances of the other dimensions all remain largely similar between 2011 and 2015. The correlation between neighborhood school support and support for diversity, however, weakened substantially, from −.57 in 2011 to −.19 in 2015. This suggests that polarization between favoring neighborhood schools and diversity as the basis for school assignments dropped over the interval, possibly a function of the Raleigh School Board’s decision to reduce inter-school reassignments, which had produced considerable controversy.
[Figure: Estimates of latent means and variances over time. NS = neighborhood school support, DS = diversity support, CR = challenges of reassignment, RD = dangers of reassignment, and RU = reassignment uncertainties.]
Figure 2 illustrates the regression coefficients for the predictors of each latent dimension in 2011 (top) and 2015 (bottom) along with 95% confidence intervals. The asterisks indicate estimates that are significantly different between the two time points. Perhaps not surprisingly given the changing distribution over time, we find quite a few differences in the coefficients predicting support for neighborhood schools between 2011 and 2015. In particular, the associations for having a very favorable view of Martin Luther King, holding conservative or moderate ideologies relative to liberal ideologies, and being white relative to non-white all substantially attenuated from 2011 to 2015. The regression coefficients for the other latent dimensions remained largely similar across the two time points with only a few statistically significant changes and no particular pattern to the shifts.
[Figure: Estimates of latent means and variances across sites in 2015. NS = neighborhood school support, DS = diversity support, CR = challenges of reassignment, RD = dangers of reassignment, and RU = reassignment uncertainties.]
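The source does not state which test produced the asterisks. One common approximation for comparing a coefficient estimated in two independent samples is a z-test on the difference, sketched below with hypothetical coefficients and standard errors; in a multiple-group SEM the same question is often addressed instead by testing an equality constraint on the coefficient.

```python
import math


def coefficient_difference_test(b1, se1, b2, se2):
    """Two-sided z-test for the difference between two coefficients.

    Assumes the two estimates come from independent samples, so the
    variance of the difference is the sum of the squared standard errors.
    """
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical estimates for one predictor at two time points;
# these numbers are illustrative and do not come from the study.
z, p = coefficient_difference_test(0.50, 0.10, 0.20, 0.10)
```

With these illustrative inputs the difference would be flagged at the .05 level but not at the .01 level.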
Our assessment of replication with respect to the two time points reveals a high degree of stability in four of the five key dimensions related to public school assignments but substantial change in one of the key dimensions, neighborhood school support. This manifests in a changing distribution for the latent variable, a changing relationship with support for diversity, and substantial shifts in the predictors of support for neighborhood schools. Nonetheless, the high degree of stability in four key dimensions of public opinion in Raleigh with respect to school assignments policy advances the replicability of findings over time.
Replication 2: Five Locations in 2015
Our second examination of replication explores the extent to which the relationships between the various dimensions of support for public school assignments and the sociodemographic correlates of these dimensions hold across geographic contexts. As with our analysis across time points, we begin with a multiple-group analysis testing various forms of measurement invariance for each of the measurement models for the latent dimensions. We find that metric, scalar, and factorial invariance hold for all dimensions with the exception of neighborhood school support (see Table A4 in Appendix A). For neighborhood school support, evidence is mixed for metric and scalar invariance but broadly points toward models imposing these constraints as having a good fit with the data. The evidence is less clear for factorial invariance, so we relax the constraint that the error variances are equal across sites for the measures of neighborhood school support in the following analyses.
Figure 3 illustrates the means and variances of the latent dimensions across the five cases in 2015. The means for Raleigh are all fixed to 0 for model identification. For neighborhood school support, we see some variation in the means across sites with Charlotte standing out as having the lowest average level of support for neighborhood schools. In addition to neighborhood school support, we find evidence of variation across cases in the challenges and dangers of reassignment with Charlotte and Rock Hill showing more concern on average with reassignment than, in particular, Raleigh in 2015. The remaining two dimensions, support for diversity and reassignment uncertainties, have broadly similar means across cases. In contrast to the means, the latent variances are all generally similar for each dimension across cases. Finally, the correlations between neighborhood school support and support for diversity ranged from −.19 in Raleigh to .18 in Nashville with several not reaching statistical significance (including Nashville). These correlations are all much lower than the correlation observed in Raleigh in 2011. As noted above, this suggests less polarization regarding the principles by which students are assigned to schools in the 2015 period relative to the 2011 period in Raleigh.
Estimates of correlates of each latent dimension over time. Unstandardized estimates with 95% confidence intervals. Asterisks indicate covariates with statistically significant different estimates over time. ∗p < .05, ∗∗p < .01, ∗∗∗p < .001.
Figure 4 illustrates the regression coefficients for the predictors of each latent dimension across sites along with 95% confidence intervals. In contrast to the findings over time, we see limited evidence of difference in the relationships across sites in 2015 with the exception of dangers of reassignment. For this dimension, the relationship between income and dangers of reassignment has a stronger negative association in Nashville than in the other sites. In addition, age has a stronger positive association with dangers of reassignment in Charlotte than in other sites.
Estimates of correlates of each latent dimension across sites in 2015. Unstandardized estimates with 95% confidence intervals. Asterisks indicate covariates with statistically significant different estimates across sites. ∗p < .05, ∗∗p < .01, ∗∗∗p < .001.
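One common way to judge whether a coefficient differs between two groups is a Wald-type z test on the difference between the two estimates. The sketch below assumes independent samples and approximately normal estimates; it is not necessarily the test behind the asterisks in the figures, which can also be obtained by comparing constrained and unconstrained multiple-group models. The function name and the numeric inputs are hypothetical and do not come from our data.

```python
import math


def coef_difference_z(b1: float, se1: float, b2: float, se2: float):
    """z statistic and two-sided p-value for b1 - b2, assuming independent estimates."""
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = (b1 - b2) / se_diff
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical unstandardized coefficients and standard errors for two groups.
z, p = coef_difference_z(b1=0.60, se1=0.10, b2=0.20, se2=0.10)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

With more than two groups, as in the five-site comparison, pairwise tests of this kind would ordinarily be replaced by an omnibus test of the equality constraint, or adjusted for multiple comparisons.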
Our assessment of replicability with respect to the five cases reveals a relatively high degree of comparability in the five key dimensions related to public school assignments during 2015. Differences remain from the original study for neighborhood school support in terms of the distribution of the latent variable, the changing relationship with support for diversity, and the predictors of neighborhood school support across geographic contexts as well as time. It may be that neighborhood school support as a construct is more volatile owing to changes in school assignment policies across time than are the other dimensions that we studied. This speaks to the complex relationship between social change and replication and, as suggested earlier, does not necessarily imply either that the replication was implemented improperly or that the original findings were not true. We return to this issue below.
Discussion
How Successful Was the Replication?
In our illustrative analysis, we found considerable evidence for replication of key findings from Parcel and Taylor’s (2015) study of attitudes toward pupil assignment in Raleigh, North Carolina. We have demonstrated that the dimensions of sentiment relevant to public school assignments in Raleigh in their 2011 data collection continued to be relevant there 4 years later, and that the determinants of these attitudes behaved in predictable ways in both time periods. In addition, we found that the sentiments observed in Raleigh in 2011 and 2015 were relevant in four additional southern locations and that there are many commonalities in predictors of key sentiments across them.
These latter findings also mean that some of the specific findings we expected, such as the similarity of Charlotte and Nashville, or the possible dissimilarity between Raleigh and Louisville, were not apparent. Thus, in general, our replication was more successful than we had expected because locational differences we had expected did not appear. What we did find, however, is that the sentiments surrounding neighborhood school support shifted over time and this had some impact on the correlates of this construct. In our view, this does not mean that our replication was unsuccessful. Rather, it signals that social changes had occurred, even within the 4-year interval between 2011 and 2015, resulting in these differences. Thus, although not every finding from the original study replicated, in the aggregate, these findings suggest that the attitudinal and value dynamics regarding school desegregation and resegregation are applicable beyond the original study. More importantly, our findings offer a strategy for addressing the prospect of replicating additional findings such as these in the future, possibly with data from additional locations, and/or repeated analyses of the locations studied here. We conclude that our replication was largely successful.
Substantively, we now have greater evidence than we had before this replication that support for neighborhood schools and support for diverse schools are not polar opposites, each anchoring one end of a single dimension. Rather, findings suggest that although neighborhood schools appear popular with a wide variety of respondents, a subset of those who favor neighborhood schools are also very supportive of diverse schools and classrooms. In addition, three further dimensions of concern are important across time and across locales: school reassignments prompt concerns about the challenges experienced by parents whose children are reassigned; such reassignments may expose both child learning and child friendships to perceived dangers; and the processes families experience when reassignments are pending cause worrisome uncertainty. Regardless of the sites in which future studies of school desegregation and resegregation are conducted, researchers would be wise to consider expanding their investigations to include dimensions of sentiment beyond support for diversity and neighborhood schools.
Limitations
Resources limited us to five cases for our replication. Although we have argued that these cases contain some variation in past experience with school desegregation/resegregation, we fall short of exploring attitudes surrounding these policies across a national sample or population of case studies of reassignment policies. In part, this goal may be inappropriate. As noted earlier, the dynamics of school assignment concerns we have studied are not relevant in places such as highly segregated northern urban school districts where decades of white flight have resulted in majority–minority districts (see Darling-Hammond, 2010). In addition, although we have found considerable stability in the correlates involving the five dependent variables we studied over time and across sites, it is possible that analyses of other attitudes might show varying results.
Recall that we selected our cases for replication based on location, with those locations being confined to school districts in the Upper South. We identified some similarities across the districts, but also some differences in terms of district size and relevant school desegregation histories. Our empirical work occurred 4 years after the original survey. We have found substantial replication despite these differences, which we interpret as contributing to spatial and temporal consistency in the findings.
Theoretically, these findings suggest that the social forces underlying attitudes toward school assignment policy are enduring, at least for the 4-year interval in Raleigh, and generalizable to the several districts we have studied here. Parcel and Taylor (2015) used a social capital framework in their study of Raleigh. Our findings suggest that this framework is also useful for interpreting the results reported here (see Parcel, 2021).
We also recognize that a longer time interval from the original fieldwork might reveal greater differences so that a survey in Raleigh in 2025, for example, might reveal different findings both from Parcel and Taylor (2015) and from the findings we report here. And, as we have signaled, the same dimensions of concern in these five locations might resonate less well elsewhere, especially if those locations have very different histories of school desegregation/resegregation. These issues await future research.
Broader Considerations
The methods we have used here are easily applicable to other forms of survey data that researchers may want to access in the interest of replication. For example, the General Social Survey contains a core set of questions asked of respondents across many years (Marsden et al., 2020; see also Freese & Peterson, 2017). A specification estimated for one year could be replicated with the same data from other years, thus tracing whether the process of predicting any given dependent variable replicates over time, or across subgroups. The common data production of the International Social Survey Programme extends this notion even further, allowing attempts at cross-cultural replication.
Perhaps replications have low priority because, given limited resources, investigators prefer to generate new findings instead of replicating existing ones. But replication need not be costly. Some firms, such as Amazon via Mechanical Turk and Prolific Academic, can be used to obtain convenience samples at very low cost (Shank, 2016). And a recent study found that the lack of representativeness of samples obtained from Amazon’s Mechanical Turk did not lead to appreciable bias across a range of studies (Weinberg et al., 2014). Other firms and organizations, such as the GfK KnowledgePanel and NORC’s AmeriSpeak (which is used for the popular Time-sharing Experiments for the Social Sciences platform), provide similar capabilities. YouGov and various public opinion firms can produce data from either nationally representative probability samples or probability samples designed to be representative of smaller geographic areas (Weinberg et al., 2014). Firms and organizations such as these make it much easier and more affordable than in the past for individual researchers or small research teams to gather new data for a variety of purposes. Such data can be designed to explore the replicability of quantitative findings on a variety of topics, which could be analyzed in ways similar to those we have illustrated here. It is likely that post-stratification weighting, as we have argued, will remain important to render samples produced using these methods representative of desired populations.
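The basic logic of post-stratification weighting can be sketched in a few lines: each respondent in a demographic cell receives a weight equal to that cell's population share divided by its sample share. The example below is a minimal sketch, not the weighting procedure developed for our data; the function name, the age cells, and all counts and shares are hypothetical illustration values.

```python
def poststratification_weights(sample_counts, population_shares):
    """Return one weight per cell: population share divided by sample share."""
    if set(sample_counts) != set(population_shares):
        raise ValueError("sample and population must use the same cells")
    n = sum(sample_counts.values())
    total_share = sum(population_shares.values())
    weights = {}
    for cell, count in sample_counts.items():
        if count == 0:
            raise ValueError(f"no respondents in cell {cell!r}; collapse cells first")
        sample_share = count / n
        pop_share = population_shares[cell] / total_share  # normalise to sum to 1
        weights[cell] = pop_share / sample_share
    return weights


# Hypothetical sample that over-represents older respondents.
sample_counts = {"18-34": 100, "35-54": 300, "55+": 600}
population_shares = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}

weights = poststratification_weights(sample_counts, population_shares)
weighted_n = {cell: sample_counts[cell] * weights[cell] for cell in sample_counts}
print(weights)
print(weighted_n)
```

In practice, cells are usually defined by crossing several characteristics, and sparse cells are collapsed or handled by raking to marginal totals rather than by full cross-classification.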
Additional strategies to pursue replication remain possible. For example, when multiple investigators use a common database, such as Fragile Families or the National Longitudinal Survey of Youth (NLSY), to conduct secondary analyses, they are most likely to be conducting at least partial replications of one another’s work because subsets of researchers are likely to be modeling the same dependent variables with overlapping sets of predictors; in many cases, they will adopt the same measurement decisions for these variables. These are valuable endeavors that also contribute to demonstrating replication and accumulation of findings. Such data also can be used as we have here to determine if models estimated on one subgroup hold when estimating the same models on other subgroups, such as children of different ages or groups that vary by race and ethnicity.
Conclusions
We believe our findings contribute to broader conversations regarding replication (Clemens, 2017; Freese & Peterson, 2017). A key issue in replication is that the time interval between the original analysis and the replication may be a factor in the extent to which the original study’s findings replicated. Too short an interval may not permit sufficient social change to occur that, had the interval been longer, might reduce the extent of replication; this short interval may lead researchers to falsely infer a more substantial replication than might otherwise have been observed. Too long an interval may result in so much intervening social change that it is difficult to parse out why some findings replicated and others did not. This problem may be more acute in disciplines such as sociology and political science, especially when social and political changes are rapid. In this analysis, we have noted that the replication of the 2011 Raleigh findings in 2015 was done across only a 4-year interval, thus reducing the possibility that actual social changes occurred between the two phases of work, changes in sentiments toward neighborhood schools noted above notwithstanding. Such findings prompt the question of whether additional findings might fail to replicate should the interval between the original and additional data production be longer. As we have noted, this issue may be less acute in some subfields, such as social psychology, where researchers focus on fundamental social processes potentially less influenced by macro-level events. In the case of our illustration, we were able to tie findings that did not replicate to understandable social changes that had occurred in the interval between the first and second studies.
We believe that our field is in the early stages of discovering which methods are most appropriate for evaluating the replicability of findings produced across a wide variety of studies. We look forward to additional demonstrations involving these and differing methods useful in pursuing these important scientific objectives. Such investigations will provide additional opportunity for our field to both confirm and refine findings that heretofore could not be evaluated in this way. Surely, this work should have high priority on our collective research agendas in the years ahead.
Footnotes
Acknowledgments
We thank Robert Johnson for assistance in developing post-stratification weights and Andrew Taylor for serving as Guest Editor for this paper. We appreciate helpful comments on an earlier draft from Joe Whitmeyer, Paul von Hippel, Kenneth Land, Murray Webster, Annette Lareau, and Elliot Weininger.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Preparation of this paper was supported by grants from the National Science Foundation SES-1528559, 1527285 and 1527762. Earlier versions of portions of the paper were presented at the Annual Meetings of the American Sociological Association, 2017, in Montreal, Canada, and the Southern Sociological Society Meetings, 2016, in Atlanta, GA.
Appendix
Model Fit Statistics for Multiple Group Measurement Models Over Sites; N = 5301.
| Model | Chi-sq | df | p-value | BIC | RMSEA | CFI | TLI |
|---|---|---|---|---|---|---|---|
| Neighborhood school support | | | | | | | |
| Model 1: loadings and intercepts | 57.39 | 34 | .007 | −234.19 | .03 | .93 | .94 |
| Model 2: residual variances | 129.25 | 50 | .000 | −299.55 | .04 | .75 | .85 |
| Diversity support | | | | | | | |
| Model 1: loadings and intercepts | 27.41 | 24 | .286 | −178.41 | .01 | 1.00 | 1.00 |
| Model 2: residual variances | 53.62 | 48 | .268 | −358.02 | .01 | 1.00 | 1.00 |
| Reassignment challenges | | | | | | | |
| Model 1: intercepts | 8.08 | 4 | .089 | −26.22 | .03 | .99 | .98 |
| Model 2: residual variances | 59.82 | 12 | .000 | −43.09 | .06 | .85 | .94 |
| Reassignment dangers | | | | | | | |
| Model 1: loadings and intercepts | 20.34 | 24 | .677 | −185.48 | .00 | 1.00 | 1.00 |
| Model 2: residual variances | 67.49 | 48 | .033 | −344.15 | .02 | .99 | .99 |
| Reassignment uncertainty | | | | | | | |
| Model 1: intercepts | 9.71 | 4 | .046 | −24.59 | .04 | .99 | .98 |
| Model 2: residual variances | 20.61 | 12 | .056 | −82.30 | .03 | .98 | .99 |
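
The BIC and RMSEA columns can be approximately recomputed from the chi-square, degrees of freedom, and N. The sketch below assumes that BIC is defined as the chi-square minus df times ln(N), and that RMSEA follows a common multiple-group form with a sqrt(G) correction for G = 5 sites. Both are assumptions about how the indices were computed rather than something the table states, and the function names are hypothetical.

```python
import math

N = 5301      # total sample size reported in the table title
GROUPS = 5    # number of sites (assumed to be the groups in the model)


def bic_from_chi2(chi2: float, df: int, n: int = N) -> float:
    """Chi-square-based BIC: negative values favour the model over the saturated one."""
    return chi2 - df * math.log(n)


def rmsea_multigroup(chi2: float, df: int, n: int = N, groups: int = GROUPS) -> float:
    """RMSEA with a sqrt(G) multiple-group correction (one common convention)."""
    excess = max(chi2 - df, 0.0)  # truncated at zero when chi2 < df
    return math.sqrt(groups) * math.sqrt(excess / (df * (n - 1)))


# Neighborhood school support rows, copied from the table.
rows = {
    "Model 1: loadings and intercepts": (57.39, 34),
    "Model 2: residual variances": (129.25, 50),
}
for label, (chi2, df) in rows.items():
    print(f"{label}: BIC = {bic_from_chi2(chi2, df):.2f}, "
          f"RMSEA = {rmsea_multigroup(chi2, df):.2f}")
```

Under these assumptions the recomputed values agree with the tabled ones to within rounding, with small discrepancies in the second decimal of BIC that would be expected from rounding in the reported chi-square values.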
