Abstract
In recent decades, with the rapid marketization of educational resources in K–12 education, private tutoring has become extremely popular in China. Yet previous research has not reached a consensus on the impact of private tutoring on academic outcomes, and has overlooked the influence of individual choice under the exam-oriented educational system in China. Using data drawn from the China Education Panel Survey, this study examines the heterogeneous treatment effect of private tutoring on 8th graders’ academic performance from the perspective of individual choice. The results show that the propensity to attend private tutoring can differentiate the class disparities among middle school students in terms of individual, family, school, and other factors. Furthermore, the results indicate that private tutoring generally has limited impacts on academic achievement, net of all background factors. However, students whose propensity to attend private tutoring lies in the intermediate range tend to benefit the most from private tutoring. For both students from socioeconomically disadvantaged families with a low propensity and their advantaged counterparts with a high propensity, private tutoring has little effect on their academic performance. Sensitivity analysis further shows that the heterogeneous effects of private tutoring differ across cognitive ability, subject types, and tutoring periods. These findings have important implications for understanding the consequences of China's policy interventions in terms of reducing educational inequality.
Keywords
Research question
China's market-oriented economic reform, beginning in the late 1970s, has influenced education in multiple ways, including the public education system, school management models, and the distribution of educational resources (He, 2009; Ross and Gibson, 2007). It has generated new forms of education that were not previously part of the public education systems. The demand for individual educational opportunities, including private tutoring, has thus become widespread. Over the past decade, the private tutoring industry has developed rapidly in China. An increasing number of students in the compulsory education stage (the first nine years of schooling) have participated in private tutoring in order to improve their grades and thus be more competitive for examinations. According to a survey on private tutoring conducted by the Chinese Society of Education (CSE), more than 80% of the parent participants indicated that they strongly or somewhat agreed that private tutoring was an essential part of education during the elementary and middle school stages (Chinese Society of Education, 2016).
As private tutoring has come to prevail and exert a strong impact on the public education system in China, the public has become concerned about the issue of educational inequality (Yan, 2019). In July 2021, the General Office of the Communist Party of China Central Committee and the General Office of the State Council jointly issued guidelines designed to ease the burden of excessive homework and off-campus tutoring for students engaged in compulsory education. The guidelines outlined the so-called “double reduction” policy and sought proactively to control the disorderly expansion of private tutoring institutions. In this context, examining the impact of private tutoring on child development from a sociological perspective may help to clarify the mechanisms by which individual education choices are made and the influences of private tutoring itself, which provides insights to understand how individual and social factors influence private tutoring and then lead to educational inequality.
The relationship between private tutoring (usually called “shadow education” in this field) and educational equality has been widely discussed in the field of the sociology of education (Stevenson and Baker, 1992). On the one hand, according to the critical theories, although marketization has provided more choices beyond public education for most families, it will also lead to a monopoly over the highest-quality resources by the privileged groups if left uncontrolled. The privileged groups would even be able to legitimize a dominant power in the moral and ideological domains by influencing the socialization processes, thereby solidifying the social reproduction processes and exacerbating inequality (Bourdieu, 1973; Bowles and Gintis, 1976; Qian, 1997). Supporters of this theory oppose the excessive marketization of education, arguing that private tutoring will deepen social reproduction and lead to educational inequality.
On the other hand, some scholars have argued that the ways in which schools traditionally allocated and utilized educational resources were inefficient due to a lack of competition, and needed to be improved by introducing the market mechanism. They believe that the imbalance and inefficiency in resources allocation within the public education systems are the real reason why the interests of disadvantaged groups cannot be protected. Therefore, strengthening public education is not the best way to alleviate educational inequality. On the contrary, the market mechanism can allocate and utilize the resources more effectively. This will benefit disadvantaged groups and therefore help to promote social mobility (Hanushek, 1986; Howell and Peterson, 2006). In short, the supporters of this theory believe that private tutoring benefits disadvantaged families, boosts social mobility, and reduces educational inequality.
Both of these arguments provide important theoretical perspectives for studying the private tutoring phenomenon in China. However, which one is more convincing in terms of understanding the relationship between private tutoring and educational equality is yet to be discussed. An increasing number of studies have focused on the relationship between private tutoring and students’ academic achievement, as nationally representative data have become available in recent years (Ding and Xue, 2016). However, existing research has not reached a consensus on whether participating in private tutoring helps students to improve their academic performance. The motives of parents and their children for participating in private tutoring are complex and influenced by many factors. Under the examination-oriented system, parents and children do not passively accept the arrangements of the system but fully exercise their autonomy and reasonably seek to maximize their interests within the existing rules. Previous studies have shown that differences in individual, family, and school background factors can lead to heterogeneity in the effect of private tutoring. However, previous studies have not clarified the influence of the propensity of self-selection on the effect of private tutoring.
This study raises two important and interrelated research questions regarding the context of education marketization reform in China. First, how do individual, family, and school factors affect the propensity of students to attend private tutoring? Second, how do the effects of private tutoring on students' academic performance vary across levels of the propensity to attend private tutoring? An exploration of these two questions will enable us to understand individuals’ preferences regarding private tutoring and the heterogeneity of its impact on educational outcomes. It will also contribute to clarifying the impact of private tutoring on educational equality, in light of the influence of the marketization of educational resources in China.
Literature review
Institutional and individual factors influencing private tutoring
To understand why private tutoring is essential, let us begin with the educational system and the mechanism of individual selection in China. Private tutoring is a “product” of the examination-oriented education system. 1 According to an international comparative study, private tutoring is more popular in countries where examination-oriented education is the mainstream (Peng, 2008). China's examination-oriented education represents an effective way to allocate a limited amount of educational resources. However, it cannot satisfy the need for high-quality education for all families, especially those of higher socioeconomic status. Therefore, private tutoring is a market response that is designed to help families to cope with examination-oriented education. Through private tutoring, parents expect their children to be more advantaged in terms of obtaining higher grades, so that they can access more educational resources and opportunities.
Private tutoring is a choice that parents often make following a comprehensive evaluation of both the family's wealth and their children's learning ability, and is based on the mechanism of individual selection. 2 There are two main reasons why parents make this choice: First, according to self-interest theory, parents believe that private tutoring brings benefits, such as compensating for their children's learning deficiencies and improving their ability to pass examinations. It is therefore a rational choice for them. Second, from the perspective of peer pressure, parents choose to have their children attend private tutoring because they feel the stress of complying with the social norms. Faced with the general pressure to enter higher education, most parents expect their children to enter high-quality schools. Therefore, even though their children may be performing relatively well at school, parents may still seek out extracurricular education in order to strengthen their advantages. Thus, private tutoring is actually a product of families’ educational competition.
According to previous studies, children from families with higher socioeconomic status are more likely to participate in private tutoring (Chu, 2009; Qian and Tang, 2009) since their parents have more resources to invest in their education. A series of factors have a significant influence on children's participation (or non-participation) in private tutoring: area of residence (urban or rural); parents’ educational and occupational backgrounds; parents' expectations about their children's education; and family income, structure, number of children, etc. (Liu and Chen, 2013; Tsang et al., 2010; Wang, 2017). At the school level, the characteristics of the schools and classes also affect the possibility of students participating in private tutoring. Researchers have found that factors including school quality, teachers’ capabilities, the average socioeconomic status among the class, and whether other students participate all affect students’ participation in private tutoring (Xi and Li, 2020; Xue and Fang, 2019). The more interaction parents have with the teachers and other parents, the higher the likelihood of participating in private tutoring and also the more money will be spent on it (Zhou and Wu, 2018).
Students’ academic performance also affects their parents’ decisions about whether or not they should be privately tutored (Li and Hu, 2017). Private tutoring is both a “compensation” and an “enhancement”: the former is especially important for students who perform poorly, because as a remedial measure it helps them to better understand what they have learned in school; meanwhile, the latter relates to students who perform relatively better, as it can further improve their ability to handle examinations. Therefore, within the examination-oriented education system, students with different levels of academic abilities may all have a need to attend private tutoring.
In conclusion, individual, family, and school factors are interconnected and work jointly to affect the possibility of participating in private tutoring. They make individual choices about private tutoring complicated, and this complexity affects the causal effect of private tutoring on academic performance in different contexts.
The private tutoring effect and its heterogeneity
Does private tutoring help students to improve their academic performance? There is no consistent conclusion about this in the existing studies. Some studies state that private tutoring enhances grades and increases the possibility of attending better high schools or universities (Hu et al., 2015; Wu and Chen, 2014; Xue, 2016; Zhang, 2018; Zheng et al., 2020). However, others have found that private tutoring has no significant positive effect, and sometimes even a negative one, on academic performance (Pang et al., 2017; Sun and Tang, 2019; Zhang et al., 2015). They conclude that private tutoring only has a “placebo effect”, and can also lead to intense competition among students and academic anxiety.
There are several reasons explaining the inconsistent conclusions. In terms of analytical methods, there are two important problems to consider when examining the causal effect of private tutoring. First, the endogeneity problem, caused by selectivity bias, makes it impossible for conventional regression analyses to obtain unbiased estimates of the effect of private tutoring. Students who attend private tutoring and those who do not differ significantly in their backgrounds. The opportunities to participate are not randomly assigned but are affected by many unpredictable factors that influence academic performance. Meanwhile, the relationship between private tutoring and academic performance is reciprocal, since the former influences students’ scores, while changes in test scores might further affect students’ enthusiasm to participate in private tutoring. 3
Second, private tutoring is an educational practice chosen by families under marketized conditions, and its effects are highly heterogeneous. Because parents and students have individual preferences, the outcomes of participating in private tutoring will differ for students with different propensities to attend private tutoring. Therefore, the assumption of a homogeneous treatment effect leads to only a partial understanding of its impact. This assumed effect homogeneity is also likely to misguide education policy. In such a context, it is important to analyze under what circumstances private tutoring can lead to benefits or problems, and which groups of students would benefit the most from participating in it.
Existing studies have identified several main causes for the heterogeneous effects of private tutoring. 4 The first reason is the differences between social classes. Some studies have found that students from families of relatively lower socioeconomic status benefit more from private tutoring (Hu et al., 2015; Li and Hu, 2017; Zhang et al., 2020). As a result, private tutoring could help to reduce the achievement gaps in terms of socioeconomic status, which would enhance educational equality. In contrast, other studies have found that private tutoring had little effect in terms of improving the academic performance of students from disadvantaged families (Yang, 2020), particularly those living in rural areas (Sun and Tang, 2019). The second reason is differences in school performance rankings. For example, students who attend schools with lower performance rankings are more likely to improve their academic scores by participating in private tutoring (Hou, 2020; Zhang, 2018). Third, private tutoring can have varying effects on scores in different subjects. Studies have identified considerable positive effects for private tutoring in English and mathematics, but not for Chinese (Li, 2018; Li et al., 2020; Liu and Yao, 2018; Wang, 2020).
Research hypotheses
The conventional analytical method for examining heterogeneous effects of private tutoring is to add to a regression model an interaction term between “private tutoring” and a variable that may alter its effect, assuming that the selection mechanism behind the effect heterogeneity can be attributed to a single factor. However, more than one potentially influential factor may produce varying effects of private tutoring. Focusing on only one or a few factors risks overlooking the complex individual selection mechanism. Also, the inclusion of many interaction terms in the regression model can lead to model estimation problems when the data are limited, and strong theoretical support is needed to explain those interactions. Moreover, the role of interaction terms in the conventional regression model is based on the assumption of a linear relationship, and it cannot examine the possibility of any non-linear interactions (Brambor et al., 2006; Hainmueller et al., 2019). 5
Through the widespread use of propensity score analysis methods, researchers have gained a deeper understanding of how self-selection processes affect heterogeneous treatment effects (Hu et al., 2021; Xie et al., 2012; Zhou and Xie, 2020). Studies have also shown that the propensity to participate in private tutoring shapes the effects of private tutoring on academic achievement (Li, 2016). 6
“Propensity” refers to the probability that students will participate in private tutoring against the background of individual choice and other influencing factors. The main purpose in analyzing the heterogeneous treatment effect based on the propensity score is to explore how the effect of private tutoring varies by propensity scores. This analytical method is unique and helpful for exploring the effect heterogeneity of private tutoring. First, when investigating the opportunities for participation, propensity scores can distinguish differences in the students’ backgrounds, such as their family and school characteristics, as well as the ways in which their parents invest in their education. It will therefore be helpful for determining the characteristics of the groups that will benefit the most from private tutoring, providing insights for the educational policies on private tutoring. Second, by reducing the dimensionality of these complex influencing factors, the private tutoring effect and its propensity score will constitute a simplified two-dimensional system, in which the interaction will no longer be simply a linear relationship, and possible non-linear relationships can be examined using non-parametric and semiparametric smoothing methods (Zhou and Xie, 2020).
To theorize the heterogeneous effects of private tutoring by propensity scores, three different hypotheses can be derived based on the understandings of equity in educational opportunity, educational process, and educational outcome (Brand and Xie, 2010; Guo and Zhou, 2020; Li, 2016). They have provided different perspectives for illustrating the relationship between private tutoring and educational inequality.
The “positive selection hypothesis” suggests that individuals who are most likely to take action benefit the most. Private tutoring requires a substantial investment of time and financial resources; as such, it is not really affordable for families of lower socioeconomic status. Therefore, they lack opportunities to participate in private tutoring. Depending on the predominant evaluation processes of school education, whether or not students attend private tutoring is often seen as representing the extent to which parents value their children's education. In an evaluation system that is dominated by elitism, teachers are more likely to favor students with a higher propensity for attending private tutoring (Lyu, 2006). Children from advantaged families have more opportunities and resources to attend private tutoring, benefit more from school education, and therefore perform better academically. This hypothesis reflects the process of social reproduction in education, especially the way in which privileged groups maintain their advantaged status, which leads to increasing educational inequality. Therefore, this study proposes the following hypothesis:
In contrast, the “negative selection hypothesis” suggests that individuals who are least likely to take action benefit the most. It is not only rational choice but also peer pressure that leads parents to choose private tutoring. For those of higher socioeconomic status, having their children attend private tutoring is not simply an act to improve their children's ability to pass examinations, but also a response to peer pressure, based on their concerns about their children's education amidst increasingly intense educational competition. Conversely, relatively disadvantaged families make more rational choices about private tutoring due to the constraints arising from their economic situation. At the same time, since the public education system emphasizes equity issues, the school evaluation system should be more inclined to encourage and reinforce participation by children from disadvantaged families, as they would benefit more from parental investment and gain better grades were they able to access private tutoring. This hypothesis reflects the possibility of social mobility among disadvantaged groups through participating in private tutoring. Therefore, this study proposes the following hypothesis:
The “medium selection hypothesis” suggests that individuals with a medium likelihood of participating in private tutoring benefit the most from it. Unequal opportunities to participate in private tutoring might lead to differences in the private tutoring effect, but the existing differences in educational outcomes will affect the relationship between the effect of private tutoring and its propensity even further. The higher academic achievement of students from advantaged families results from their family's comparatively intensive investment and high-quality school education. Participating in private tutoring, therefore, is simply a manifestation of the intense competition in family investment in education, and has a very limited influence on their actual academic performance. In contrast, students from disadvantaged families face a structural constraint due to the uneven distribution of educational resources, which makes their academic performance lower than that of students from advantaged families. Moreover, this structural constraint on educational resources cannot be fully overcome through private tutoring. It is also hard for this group to access high-quality tutoring, which reduces its effect on test scores. Therefore, for the students who are either the most or least likely to participate (due to coming from advantaged or disadvantaged families, respectively), the effect of private tutoring on improving their grades is not significant.
Regarding the possibility of participating in private tutoring, those from middle-class families, with a medium possibility of participating, might benefit the most from it. First of all, parents in these families will endeavor to access relatively high-quality private tutoring opportunities despite their limited resources, rather than acquiesce to the social-structural constraints caused by the uneven distribution of education resources. Meanwhile, as students from such families are likely to have average scores, the “compensation” and “enhancement” effects may be more significant. These families, especially average middle-class families, have relatively sufficient resources and, partly because of educational anxiety and social pressure, a strong willingness and high expectations to improve their children's academic performance. Therefore, this study proposes the following hypothesis:
Research data, variables, and methods
Data
The data for this study were drawn from the China Education Panel Survey (CEPS), conducted by the National Survey Research Centre at Renmin University of China. CEPS is a large-scale, longitudinal survey project with national representation. Using a multi-stage probability sampling design, it randomly selected 438 classes from 112 schools from 28 county-level units for investigation. Within the baseline survey in the 2013–2014 school year, the sample size of the 7th- and 9th-grade students was 19,487. CEPS conducted questionnaire surveys of the students, their parents/guardians, teachers, and school principals, collecting information such as family background, family education, school characteristics, and academic performance. The sample for this study consists of the Grade 7 students (the first year of middle school) covered by CEPS. 7
CEPS used the 2013–2014 school year as the baseline, and conducted the first follow-up of the Grade 7 students in the 2014–2015 school year. Ninety-two percent of the students completed the follow-up, making the effective sample size 9449. This study conducted multiple imputation based on chained equations for missing data. The final sample for analysis was 9330, due to the missing values in the students’ academic performance data (the outcome variable). 8
Variables
Academic performance
CEPS collected the students’ mathematics, Chinese, and English test scores for the 2013–2014 and 2014–2015 school years, as provided by the schools. This study standardizes the test scores on every subject according to the distribution of scores in each school. The average standardized scores for these three subjects are then examined as the outcome variable. Considering the reverse causality between private tutoring and academic performance, the analyses include Grade 7 test scores as an independent variable when predicting the propensity scores. When investigating the effect of private tutoring on academic performance, the dependent variable is the Grade 8 test scores. The study further examined the influences of private tutoring on students’ cognitive ability in the sensitivity analysis.
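The within-school standardization and averaging described above can be sketched as follows. This is a minimal illustration rather than the study's actual code: the function names and the (school_id, score) record layout are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_within_school(records):
    """Convert raw scores to z-scores using each school's own mean and SD.

    records: list of (school_id, raw_score) pairs (hypothetical layout).
    Returns z-scores in the same order as the input.
    """
    by_school = defaultdict(list)
    for school, score in records:
        by_school[school].append(score)
    # Assumption: population SD; a sample SD would work the same way.
    stats = {s: (mean(v), pstdev(v)) for s, v in by_school.items()}
    z_scores = []
    for school, score in records:
        m, sd = stats[school]
        # A school with identical scores has no spread; assign 0 there.
        z_scores.append(0.0 if sd == 0 else (score - m) / sd)
    return z_scores


def composite_score(z_math, z_chinese, z_english):
    """Average the three standardized subject scores for each student."""
    return [(a + b + c) / 3 for a, b, c in zip(z_math, z_chinese, z_english)]
```

Standardizing within schools removes differences in test difficulty and grading scales between schools, so the composite reflects a student's relative standing among schoolmates rather than a raw mark.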
Private tutoring
Whether to attend private tutoring is the core variable of this study. To analyze the effect of private tutoring on academic performance, “tutoring participation” (whether attending all Chinese, mathematics, and English tutoring) is considered the core dichotomous independent variable (“participating” = 1, “not participating” = 0). Due to data limitations, it should be noted that this variable does not reflect the quality of the private tutoring. Also, as shown in previous studies, the effects and duration of private tutoring both differ between different subjects. Therefore, in the sensitivity analysis, this study further analyzes the influences of single-subject tutoring and tutoring during holiday periods.
Covariates
The individual, family, and school factors can affect students’ propensity to attend private tutoring, and therefore affect their academic performance. Such factors are thus included in the analysis as covariates in order to explore their impact on private tutoring choices and their effects.
The individual-level variables include: standardized test scores at Grade 7; gender (“male” = 0, “female” = 1); age (years); ethnic group (“non-minority” = 0, “ethnic minority” = 1); whether the student is the only child in the family (“non-only child” = 0, “only child” = 1); hukou (household registration) status (“urban” = 0, “rural” = 1), and migrant status (“local”, “within-province migrant”, “inter-provincial migrant”).
The family-level variables include: family structure (“co-residing with both parents”, “co-residing with only the mother”, “co-residing with only the father”, “co-residing with neither parent”); parents’ marital status (“unmarried, divorced, widowed” = 0, “married” = 1); whether the students are co-residing with their grandparent(s) (“not living together” = 0, “living together” = 1); family economic status (“poor”, “medium”, “rich”); parents’ highest educational level (years); and parents’ occupation (“leader of government, enterprise, and institution”, “professional and technical staff”, “service worker, employee, and factory worker”, “peasant”, “self-employed”, “retired, unemployed, and other”).
The model also includes multiple factors about students’ interaction with their parents at home and schools. These include: parents’ expectations for their children's education (years); parent–child interactions (latent factor analysis); parents’ education and supervision (latent factor analysis); whether parents help with homework (“no” = 0, “yes” = 1); whether parents attend parent–teacher meetings (“no” = 0, “yes” = 1); frequency of parent–teacher communication (latent factor analysis); level of student–teacher interaction (latent factor analysis); and positive or negative peer influence (latent factor analysis).
In addition, the school-level variables include: boarding status of school (“non-boarding school” = 0, “boarding school” = 1); school type (“public school”, “private school”, “private school for the children of migrants”); the school's ranking within the county (“below average”, “average”, “above average”); the location of the school (“urban areas”, “peripheral city or urban–rural fringe”, “township or village”); and the proportion of students within the school who participate in private tutoring.
Analytical strategies
The data analysis proceeded in four steps. The first step was the descriptive analysis. In order to examine whether statistically significant differences in background factors existed between the groups that participated in private tutoring and those that did not, the mean comparison was conducted among those two groups.
The second step aimed to predict the propensity score of private tutoring. This analysis explores the influence of individual, family, and school factors on the propensity to attend private tutoring. Based on the logistic regression model (see Equation (1)), di = 1 means “attending private tutoring” and di = 0 means “not attending private tutoring”. For each student observed in the sample, the propensity score to attend private tutoring is the predicted probability under the logistic regression model (Rosenbaum and Rubin, 1983). In order to increase the accuracy of the prediction of the propensity score, the propensity model used in this study, which was developed based on prior research, includes a large number of observable variables at the individual, family, and school levels. It also takes into account the nonlinear relationships between the variables.
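A logistic propensity model of the usual form, logit P(d_i = 1 | X_i) = b0 + X_i b, can be sketched as below. This is a simplified illustration under stated assumptions: it is fitted by Newton-Raphson in plain Python, the function names are hypothetical, and it omits the survey weights, the full covariate list, and the nonlinear terms that the study's model includes.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_logit(X, d, n_iter=50, tol=1e-10):
    """Maximum-likelihood logit coefficients (intercept first)."""
    rows = [[1.0] + list(x) for x in X]
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for row, di in zip(rows, d):
            p = sigmoid(sum(b * v for b, v in zip(beta, row)))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (di - p) * row[a]
                for c in range(k):
                    hess[a][c] += w * row[a] * row[c]
        step = solve_linear(hess, grad)  # Newton-Raphson update
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


def propensity_scores(X, d):
    """Predicted probability of attending tutoring for each student."""
    beta = fit_logit(X, d)
    return [sigmoid(sum(b * v for b, v in zip(beta, [1.0] + list(x)))) for x in X]
```

Each returned value is a student's estimated probability of attending private tutoring given the covariates, which is the quantity the later steps treat as the propensity score.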
The fourth step was an exploration of the heterogeneous treatment effects of private tutoring based on propensity scores. Xie et al. proposed three propensity-score-based analysis methods for heterogeneous treatment effects (Xie et al., 2012; Zhou and Xie, 2020). The first is the “stratification-multilevel method”, in which the calculated propensity scores are divided into different strata, then the private tutoring effects are estimated for each stratum, and finally the treatment effect calculated for each stratum is used as the dependent variable to calculate the slope of the treatment effect trend using the weighted least squares method. This method is based on the assumption of linearity in parameter estimation; that is, a linear correlation exists between the treatment effects and the propensity scores. The second is the matching-smoothing method (MSM), in which the treatment effect is calculated for each observed sample by matching the propensity scores, then a curve is fitted to the matched treatment effect and propensity scores, to observe how the treatment effect changes according to the propensity score. The third is the smoothing-differencing method (SDM), in which a curve is fitted to the dependent variable values of each observed sample in the experimental and control groups separately as the propensity scores change, then the difference between the two curves is calculated to obtain an estimate of the heterogeneous treatment effects. The latter two semi-parametric estimation methods are based on the assumption that a nonlinear relationship exists between treatment effects and propensity scores, thus allowing the researcher to explore the nonlinear relationship between them.
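The stratification-multilevel logic described above can be sketched in a few lines. This is an illustrative simplification, not the authors' implementation: the function names and cutpoints are hypothetical, the within-stratum effect is a raw difference in means with no covariate adjustment, and the trend is fitted against each stratum's mean propensity score (stratum rank is a common alternative).

```python
from statistics import mean, variance


def stratified_effects(ps, d, y, cutpoints):
    """Estimate the treatment effect within each propensity stratum.

    Returns a list of (mean propensity, effect, variance of effect).
    """
    edges = [0.0] + list(cutpoints) + [1.0]
    results = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        idx = [i for i, p in enumerate(ps)
               if lo <= p < hi or (hi == edges[-1] and p == hi)]
        treated = [y[i] for i in idx if d[i] == 1]
        control = [y[i] for i in idx if d[i] == 0]
        if len(treated) < 2 or len(control) < 2:
            continue  # stratum lacks enough cases in one of the groups
        effect = mean(treated) - mean(control)
        var = variance(treated) / len(treated) + variance(control) / len(control)
        if var <= 0:
            continue  # cannot form an inverse-variance weight
        results.append((mean(ps[i] for i in idx), effect, var))
    return results


def wls_trend(strata):
    """Inverse-variance weighted least squares line through stratum effects.

    Returns (intercept, slope); the slope is the linear trend in the
    treatment effect across propensity strata.
    """
    w = [1.0 / v for _, _, v in strata]
    x = [m for m, _, _ in strata]
    e = [eff for _, eff, _ in strata]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ebar = sum(wi * ei for wi, ei in zip(w, e)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (ei - ebar) for wi, xi, ei in zip(w, x, e))
    slope = sxy / sxx
    return ebar - slope * xbar, slope
```

A positive slope would be consistent with positive selection and a negative slope with negative selection. A single linear slope cannot represent an inverted-U pattern, which is why the semi-parametric methods are needed to examine the medium selection hypothesis.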
Considering the three methods above, the heterogeneous treatment effects by propensity score focus on exploring the interaction between propensity scores and private tutoring interventions. Theoretically, this does not differ from the interaction in the traditional regression models, but researchers should be aware of three important issues. The first is whether propensity score models can accurately predict the propensity scores of the intervention variables (Xie et al., 2012); that is, whether the propensity score model setting is influenced by the missing variable. On the one hand, the propensity score model is affected by unobserved confounders, whereas, on the other, the nonlinear relationship between the background factor variables and intervention variables may affect the accuracy of the model prediction, so the estimation of the propensity score model is based on the aforementioned ignorability assumption. The inclusion of individual-, family-, and school-level control variables in the analysis of this paper is intended to bring it closer to meeting the ignorability assumption. Second, researchers should be aware of the theoretical validity of the assumption of linearity in the interaction between the propensity scores and intervention variables (Hainmueller et al., 2019). If the assumption of linearity holds, a traditional linear regression model can analyze the interaction effects; that is, by including an interaction term for the intervention variable and the propensity score in the model; if the interaction is nonlinear, a curve fit to the nonlinear relationship can be performed using nonparametric or semiparametric methods.
Third, the sample distribution of the intervention variable and propensity score can affect the actual utility of the interaction effects (Hainmueller et al., 2019). For each observed propensity score, there should be a sufficient number of nearby observations; that is, the propensity score should be approximately continuous over its range. In addition, at the observed propensity scores, the intervention variable should vary; that is, both intervention and control cases should be present, a condition also known as common support. Based on the studies of Xie et al. and Hainmueller et al. (Hainmueller et al., 2019; Xie et al., 2012; Zhou and Xie, 2020), this paper employs a nonlinear estimation method, interactive kernel smoothing, to estimate the nonlinear heterogeneous treatment effects of private tutoring.
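The common support condition can be checked with a simple overlap rule. The sketch below uses the min-max rule, which is one common convention and is an assumption here; the text does not say which rule the authors applied, and the function names and toy scores are hypothetical.

```python
def common_support(scores_treated, scores_control):
    """Return the (low, high) propensity interval covered by both groups (min-max rule)."""
    low = max(min(scores_treated), min(scores_control))
    high = min(max(scores_treated), max(scores_control))
    if low > high:
        raise ValueError("treated and control propensity scores do not overlap")
    return low, high

def trim_to_support(records, low, high):
    """Keep only (propensity, treated, outcome) records inside the overlap interval."""
    return [r for r in records if low <= r[0] <= high]

# Toy scores, for illustration only.
low, high = common_support([0.3, 0.5, 0.9], [0.1, 0.4, 0.7])
print(low, high)
```

Estimates outside this interval rest on extrapolation, which is one reason effects at the extremes of the propensity score distribution should be read with caution.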
The kernel density smoothing estimation is based on the following semiparametric interaction regression model:
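The displayed equation did not survive in this copy of the text. For orientation only, the kernel estimator described by Hainmueller et al. (2019), with the propensity score taken as the moderator, is built on a model of roughly the following form. The symbols are assumptions chosen for illustration: Y is the outcome, D the private tutoring indicator, P the estimated propensity score, Z any additional covariates, and f, g, and gamma smooth functions. The authors' own notation and covariate set may differ.

```latex
Y_i = f(P_i) + g(P_i)\, D_i + \gamma(P_i)^{\top} Z_i + \varepsilon_i
```

Under this form, g(p) is the treatment effect at propensity score p, so plotting an estimate of g against p traces out how the effect varies with the propensity score.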
Results
Descriptive analysis
Based on the CEPS-weighted sample analysis outlined in Table 1, the percentage of Grade 7 students who participated in private tutoring was 35%. Table 1 shows that there exist significant differences between the background factors of the group that participated in private tutoring and the group that did not. First, the average standardized composite scores of the students who participated in private tutoring were significantly higher than those of the students who did not. Second, children without siblings and urban students were more likely to attend private tutoring. Students who participated in private tutoring had significantly better family economic conditions, parental education level, and parental occupational status than those who did not. At the same time, the students who participated in private tutoring had higher parental expectations and higher levels of parent–child interaction, educational supervision, assistance with homework, and parent–teacher communication. The students who participated in private tutoring were also more likely to engage in interactions with their teachers. Finally, there were also significant differences between the school factors of the two groups. The students who participated in private tutoring were more likely to attend non-boarding schools, urban schools, and schools with lower or higher within-county rankings. In addition, the percentage of students who participated in private tutoring at the school level was higher for the participating students group.
Descriptive statistics and mean comparison by private tutoring status.
Note: Data analysis is weighted by China Education Panel Survey (CEPS) sampling weights. Hukou: household registration. *p < 0.05, **p < 0.01, ***p < 0.001.
Factors affecting private tutoring
Table 2 reports the regression coefficients and average marginal effect (AME) for the logit model. With other factors being controlled, it was found that there was a nonlinear effect of early-education achievement on students’ likelihood of participating in private tutoring. The negative and statistically significant quadratic regression coefficients for the composite scores of 7th graders suggest an inverse U-shaped relationship between academic performance and a propensity to attend private tutoring; that is, students with medium academic performance were more likely to attend private tutoring than students with a low or high level of academic performance. At the same time, female students were more likely to attend private tutoring than male students. Students from rural areas and inter-provincial migrants were less likely to attend private tutoring. In terms of family factors, students living with grandparents were also more likely to attend private tutoring. It is worth noting that, with other factors being controlled, family economic conditions and parental education level and occupational status had a limited influence on participation in private tutoring. Also, the higher the level of parental educational supervision, parent–teacher communication, and student–teacher interaction, the more likely the students were to attend private tutoring. Regarding school factors, the results showed that students attending private schools or private schools for the children of migrant workers were more likely to attend private tutoring compared to students attending public schools. Also, students attending schools with lower within-county performance rankings were more likely to attend tutoring. The percentage of private tutoring participation in a student's school had a greater impact on student's propensity to attend tutoring compared with other factors. 
For each standard deviation increase in the percentage of private tutoring participation in the student's school, the student's probability of participating in private tutoring increased by 20% (p < 0.001). This suggests that the competitive environment within the school, influenced by peer group pressure, plays an important role in shaping students’ participation in private tutoring.
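The average marginal effect reported alongside the logit coefficients can be illustrated with a short sketch. The coefficients and data below are hypothetical and are not taken from Table 2; the one-standard-deviation version is one plausible reading of how the AME of a continuous variable is defined, not a confirmed reproduction of the authors' calculation.

```python
import math

def predict(intercept, coefs, row):
    """Predicted probability from a logit model."""
    z = intercept + sum(b * x for b, x in zip(coefs, row))
    return 1.0 / (1.0 + math.exp(-z))

def ame_discrete(intercept, coefs, rows, j):
    """Average change in predicted probability when column j moves from 0 to 1."""
    total = 0.0
    for row in rows:
        on = list(row)
        on[j] = 1
        off = list(row)
        off[j] = 0
        total += predict(intercept, coefs, on) - predict(intercept, coefs, off)
    return total / len(rows)

def ame_one_sd(intercept, coefs, rows, j, sd):
    """Average change in predicted probability when column j rises by one SD."""
    total = 0.0
    for row in rows:
        up = list(row)
        up[j] = row[j] + sd
        total += predict(intercept, coefs, up) - predict(intercept, coefs, row)
    return total / len(rows)

# Hypothetical coefficients and data, for illustration only.
intercept, coefs = -1.0, [0.5, 0.8]
rows = [[0, -1.0], [1, 0.0], [0, 1.0], [1, 0.5]]
print(ame_discrete(intercept, coefs, rows, 0))
print(ame_one_sd(intercept, coefs, rows, 1, sd=1.0))
```

Because the logit link is nonlinear, the same coefficient implies a different change in probability for each student, which is why the effect is averaged over the sample rather than read directly from the coefficient.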
Logistic model predicting propensity scores of attending private tutoring (n = 9330).
Note: *p < 0.05, **p < 0.01, ***p < 0.001. Standard errors are shown in parentheses. The average marginal effect (AME) marks the change in the predicted probability of participating in private tutoring. The AME of a categorical variable is calculated as the probability of a discrete change from 0 to 1. The AME of a continuous variable is calculated as the probability of a continuous change with one standard deviation.
Homogeneous treatment effect of private tutoring
Table 3 lists the results of the homogeneous treatment effect of private tutoring, estimated by three matching algorithms (nearest neighbor matching, radius matching, and kernel matching). The estimates indicate that participation in private tutoring had no significant effect on the standardized composite test scores of eighth graders. With the effects of individual, family, and school factors on the selectivity of private tutoring being controlled, the homogeneous treatment effect of private tutoring on academic performance was not significant. In addition, the students who participated in private tutoring had higher gains than those who did not (i.e. ATT > ATU). The fact that ATT is greater than ATU also suggests that the positive selection hypothesis (Hypothesis 1) holds; that is, that students who are more likely to attend private tutoring benefit more from it. However, simply comparing ATT and ATU does not fully enable us to explore the nonlinear relationship between the effect of private tutoring and the propensity score.
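The distinction between ATT and ATU can be made concrete with a small matching sketch. It assumes one-to-one nearest neighbor matching on the propensity score with replacement, which is only one of the three algorithms named above and may differ in detail from the authors' settings; the function names and toy data are hypothetical.

```python
from statistics import mean

def nearest(p, pool):
    """Return the record in pool whose propensity score is closest to p."""
    return min(pool, key=lambda r: abs(r[0] - p))

def att_atu(records):
    """Estimate ATT and ATU from (propensity, treated, outcome) triples.

    ATT: each treated unit is compared with its nearest control.
    ATU: each control unit is compared with its nearest treated unit.
    """
    treated = [r for r in records if r[1] == 1]
    control = [r for r in records if r[1] == 0]
    att = mean(y - nearest(p, control)[2] for p, _, y in treated)
    atu = mean(nearest(p, treated)[2] - y for p, _, y in control)
    return att, atu

# Toy data, for illustration only.
records = [(0.8, 1, 3.0), (0.4, 1, 1.5), (0.7, 0, 2.0), (0.3, 0, 1.25), (0.75, 0, 2.5)]
print(att_atu(records))
```

ATT and ATU each collapse the effect into a single average over one group, so a pattern in which the middle of the propensity distribution gains the most can be hidden in either number.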
Propensity score matching estimates of private tutoring on standardized test scores (n = 9330).
Note: *p < 0.05, **p < 0.01, ***p < 0.001. Standard errors are shown in parentheses. Bootstrapped standard errors are computed for ATU.
Heterogeneous treatment effects of private tutoring
Using kernel smoothing to fit the interaction between participation in private tutoring and the propensity score, Figure 1 illustrates the propensity score-oriented estimates of the heterogeneous treatment effects of private tutoring and reveals how the effect of private tutoring on the standardized composite scores of 8th graders varies with the propensity score. The results show that the relationship between the private tutoring effect and the propensity score is nonlinear, which also explains to some extent why the homogeneous treatment effect of private tutoring is not significant. The effect of private tutoring differed across propensity score strata. According to the strength and direction of the effect, the propensity scores can be roughly divided into three strata: in the 0–0.4 stratum, the overall effect of private tutoring on students’ performance is negative; in the 0.4–0.8 stratum, the effect of private tutoring is significantly positive; and in the 0.8–1 stratum, the effect of private tutoring shifts from significantly positive to ineffective. This trend suggests that students with an intermediate likelihood of participating receive higher returns from private tutoring than those with the highest and lowest likelihood of participation. This result confirms Hypothesis 3. Also, the actual distribution of the sample according to the propensity scores shows that 64% of the sample is in the 0–0.4 stratum, 28% in the 0.4–0.8 stratum, and only 8% in the top stratum. This suggests that, for the students with a low or high likelihood of participating in private tutoring, who together represent more than two-thirds of the overall sample (72%), private tutoring is not effective in improving their performance. For students with a medium likelihood of participation, slightly less than one-third of the total, private tutoring has a positive effect in terms of improving their composite test scores.

Heterogeneous treatment effect of private tutoring on the standardized composite scores of eighth graders (interactive kernel smoothing method).
The delineation of the propensity score intervals, based on differences in the private tutoring effect, allows the further profiling of the group that would benefit most from private tutoring. Table 4 displays a comparison of the average values of the individual, family, and school factors across different private tutoring propensity score strata. The results indicate that the propensity scores may distinguish between different student groups in terms of their sociodemographic background factors. Stratum 1 (the 0–0.4 interval) indicates that families with a lower propensity to attend private tutoring are more likely to be those of a relatively disadvantaged socioeconomic status. In this group, the following characteristics were identified: the proportions of rural families, single-parent families, and families in which the child co-resides with neither parent were higher than in the other groups; family socioeconomic status and parental education level were lower; the proportions of parents whose occupations were service worker, labor worker, employee, and peasant were higher; the level of parental involvement in children's education was relatively low; the majority of the students in this group attended rural schools and a higher percentage were from low-ranked schools. Compared to Stratum 1, Stratum 2 (the 0.4–0.8 interval) largely had an average middle-class family background. The sample of students in this group had a moderate level of both family socioeconomic status and parental involvement in children's education. This group had the highest proportion of parents whose occupations were service workers, labor workers, employees, and self-employed. Stratum 3 (the 0.8–1 interval), on the other hand, represents the characteristics of families with a relatively high socioeconomic status. The students in Stratum 3 had the highest composite scores compared to the other groups. 
A higher proportion of the families in this group were financially well-off, had parents with higher education levels than those in the other groups, and had a higher proportion of parents whose occupations were government cadres and technicians. Also, the parents in Stratum 3 had the highest educational expectations and showed the highest level of involvement in communicating and interacting with their children, educational supervision, helping with homework, parent–teacher meeting attendance, and communication with teachers. Combining these contextual factors and family differences, the heterogeneous treatment effects of private tutoring allow us to infer that private tutoring has a positive effect on improving the academic performance of children from average middle-class families, whereas, for the relatively disadvantaged families (the majority) and advantaged families (the minority), the effect of private tutoring in terms of improving the students’ composite scores is minimal.
Mean comparison by propensity score strata of attending private tutoring.
Note: The data analysis has been weighted according to the China Education Panel Survey (CEPS). Hukou: household registration. *p < 0.05, **p < 0.01, ***p < 0.001.
Sensitivity analysis
Alternative methods estimating heterogeneous treatment effects
Different estimation methods for the nonlinear heterogeneous treatment effects can lead to different outcomes (Hu et al., 2021). This study uses Yu Xie's MSM and SDM to examine the impact of other estimation methods on the nonlinear heterogeneous treatment effects, the results of which are presented in Figure 2. As the figure shows, the results in Figures 1 and 2 are relatively consistent. In terms of the propensity to attend private tutoring, for students whose propensity lies in the middle range, private tutoring has a positive effect on their scores. For the majority of students with a low propensity and the minority of students with a high propensity, it has little effect on their scores.
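The smoothing-differencing idea can be sketched in a few lines: smooth the outcome against the propensity score separately for the treated and control groups, then take the difference of the two curves. The sketch substitutes a Nadaraya-Watson smoother with a Gaussian kernel for whatever smoother the original analysis used; the function names, the bandwidth, and the toy data are assumptions.

```python
import math

def kernel_mean(p0, points, bandwidth):
    """Nadaraya-Watson estimate of the mean outcome at propensity p0 (Gaussian kernel)."""
    weights = [math.exp(-0.5 * ((p - p0) / bandwidth) ** 2) for p, _ in points]
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, points))
    return weighted_sum / sum(weights)

def sdm_effect(records, grid, bandwidth=0.1):
    """Difference between the smoothed treated and control curves at each grid point.

    records: (propensity, treated, outcome) triples.
    """
    treated = [(p, y) for p, d, y in records if d == 1]
    control = [(p, y) for p, d, y in records if d == 0]
    return [
        (p0, kernel_mean(p0, treated, bandwidth) - kernel_mean(p0, control, bandwidth))
        for p0 in grid
    ]

# Toy data, for illustration only: outcomes rise with the propensity score in both groups.
records = [(i / 10, 1, 1.0 + i / 10) for i in range(1, 10)]
records += [(i / 10, 0, 0.5 + i / 20) for i in range(1, 10)]
for p0, effect in sdm_effect(records, [0.2, 0.5, 0.8]):
    print(p0, round(effect, 3))
```

The bandwidth controls how much neighboring observations are pooled: a larger value gives a smoother but flatter curve, so the apparent location of the peak effect can shift with this choice.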

The heterogeneous treatment effects of private tutoring on the composite test scores of 8th graders.
Heterogeneous treatment effects of private tutoring on cognitive ability
Because of unobservable variables, including the school teaching environment and examination rules and systems, examination scores cannot provide a complete picture of students’ academic performance. The study therefore further analyzes the influence of private tutoring on students’ cognitive ability. CEPS includes a test of students’ cognitive ability; rather than focusing on rote learning, this test measures students’ logic and problem-solving skills, which are more standardizable and comparable at the national level. The test involves phrase analogy, verbal reasoning, graphical pattern analysis, paper folding, geometric applications, and numerical and logical reasoning components. Table 5 presents the homogeneous effect of private tutoring on students’ cognitive ability, as estimated by the different matching algorithms. As it shows, the effect is insignificant for 8th-grade students. Figure 3 presents the effect heterogeneity. It shows that private tutoring has more positive effects on students with a higher propensity to participate, especially for those whose propensity score is between 0.6 and 0.8. The outcome, compared to Figure 1, is more consistent with the positive selection hypothesis (Hypothesis 1) and shows that private tutoring helps to enhance the cognitive ability of students from a relatively advantaged background.

The heterogeneous treatment effects of private tutoring on the cognitive ability of 8th graders (interactive kernel smoothing method).
Propensity score matching estimates of private tutoring on students’ cognitive ability (n = 9449).
Note: *p < 0.05, **p < 0.01, ***p < 0.001. Standard errors are in parentheses. Bootstrapped standard errors are computed for ATU.
Heterogeneous treatment effects of different subjects and holiday tutoring
Studies have shown that the effects of private tutoring vary with subject types and tutoring periods. Sensitivity analysis is used to analyze the heterogeneous treatment effects of tutoring in different subjects and tutoring during school holidays. Table 6 reports the homogeneous effects of tutoring in different subjects and holiday tutoring on students’ subject scores as well as overall performance, as estimated by different matching algorithms. The result shows that tutoring in different subjects and holiday tutoring have an insignificant influence on 8th-grade students’ average scores. Figures 4 and 5 further report the disparity between the heterogeneous treatment effects of tutoring in different subjects and holiday tutoring. The result shows that the heterogeneous treatment effects of tutoring in different subjects follow different nonlinear patterns. For Chinese and mathematics tutoring, the pattern is consistent with the medium selection hypothesis (Hypothesis 3). The effect is positive for students with a medium propensity but negative for students with the highest propensity. In the high-propensity interval, the sample size is smaller and the sample distribution more uneven, which may increase the margin of error. For English tutoring, the effect heterogeneity is consistent with the positive selection hypothesis (Hypothesis 1). Students with the highest propensity benefit the most from English tutoring, which indicates that English tutoring is more helpful for students from advantaged families. Furthermore, the heterogeneous treatment effects of holiday tutoring are consistent with the negative selection hypothesis (Hypothesis 2). Students with the lowest propensity benefit most from holiday tutoring, which means that holiday tutoring is more helpful for students from less-advantaged families. Overall, the treatment effect heterogeneity of tutoring in different subjects and holiday tutoring reflects the complexity of private tutoring's effect on test scores.

Heterogeneous treatment effects of different subjects and holiday tutoring on the academic performance of eighth graders (interactive kernel smoothing method).

Distribution of the propensity score of different subjects and holiday tutoring.
Propensity score matching estimates of different subjects and holiday tutoring on academic performance (n = 9930).
Note: Standard errors are in parentheses. Bootstrapped standard errors are computed for ATU. *p < 0.05, **p < 0.01, ***p < 0.001.
Discussion and conclusion
This study uses nationally representative longitudinal data on junior high school students, drawn from the CEPS, and employs propensity score-based estimation of heterogeneous treatment effects to examine the effects of private tutoring on students’ academic performance.
We draw three main conclusions. First, individual, family, and school factors jointly affect students’ participation in private tutoring. Different from previous research findings, this study finds that family socioeconomic status, parental occupation, and parental education level exert a limited influence on whether or not students attend private tutoring. After controlling for a more comprehensive set of family background variables, students of different family socioeconomic statuses have a somewhat similar propensity to engage in private tutoring. Meanwhile, students with a medium level of academic performance, compared to those with high and low performance, are more likely to attend private tutoring, although, overall, students with high performance are more likely to participate than those with low performance. The higher the proportion of students participating in private tutoring in a school, the more likely a student at the school is to attend private tutoring, indicating the effect of peer pressure. To sum up, individual, family, and school factors jointly influence students’ willingness to attend private tutoring.
Second, private tutoring has an insignificant influence on academic performance. After accounting for individual, family, and school factors, we find that private tutoring cannot effectively improve examination scores, which contradicts previous findings on the positive effects of private tutoring (Hu et al., 2015; Xue, 2016) and supports other studies’ findings regarding its insignificant effect (Pang et al., 2017). Studies of the effect homogeneity of private tutoring have also ignored its effect heterogeneity on academic performance. As the propensity of individuals to attend private tutoring is influenced by various factors, the underlying individual selection mechanism can influence the heterogeneity of private tutoring's effects.
Third, in terms of the propensity to attend private tutoring, students with a moderate propensity benefit the most, which is consistent with the medium selection hypothesis (Hypothesis 3). It also indicates that students from middle-class families benefit more than those from most of the socioeconomically disadvantaged families and a small proportion of the advantaged families, which contradicts the previous finding that students from low-socioeconomic-status families benefit more (Hu et al., 2015; Li and Hu, 2017; Zhang et al., 2020). In contrast, this study shows that disadvantaged students with the lowest propensity, who constitute nearly two-thirds of the CEPS sample, receive negligible benefits, possibly because this group can rarely attend high-quality private tutoring due to a lack of resources. As a result, simply encouraging students from disadvantaged families to participate in tutoring cannot effectively bridge the education gap between families of different socioeconomic statuses. Furthermore, for the advantaged families, private tutoring is a choice motivated by the status culture and peer pressure rather than a rational choice based on students’ practical needs. Therefore, private tutoring has an insignificant influence on these students’ academic performance.
The sensitivity analysis further reveals the complexity of the heterogeneous effects of private tutoring. First, the influence of private tutoring on students’ cognitive ability is consistent with the positive selection hypothesis (Hypothesis 1). Although private tutoring has a limited effect on the examination scores of students from advantaged families, it has a positive effect on their cognitive ability, possibly because advantaged families find it easier to access high-quality private tutoring and are more concerned about children's non-test abilities.
Second, the heterogeneous treatment effects of private tutoring vary widely according to the subject and form of the tutoring. Specifically, the positive effect of Chinese and mathematics tutoring on students with an intermediate participation propensity suggests that the students who benefit from the tutoring are more likely to come from average middle-class families. Moreover, in line with the positive selection hypothesis, students who are the most likely to attend English tutoring benefit most from it, suggesting that the unequal access to English tutoring is likely to increase socioeconomic status differences in terms of academic achievement and also deepen educational inequality. For students who are least likely to attend tutoring during school holidays, holiday tutoring helps to improve their scores the most, which is consistent with previous research findings that private tutoring is more beneficial to students from disadvantaged families. Overall, the effect of private tutoring on academic achievement is complex, which will ultimately influence educational inequality.
The heterogeneous treatment effects of private tutoring by propensity scores need to be interpreted from the perspective of social class. The social class classification, based on education, profession, and income, cannot distinguish and reflect all of the differences between the characteristics of the groups by propensity score. The medium selection hypothesis (Hypothesis 3) suggests that those who benefit from private tutoring are more likely to come from average middle-class families, but this does not mean that all of the students from those families will necessarily benefit from it. The difference between the groups with a lower and higher propensity is not exactly parallel to the difference between socioeconomically disadvantaged and advantaged families. The effect of private tutoring is not significant for groups with a lower or higher propensity, suggesting that private tutoring, to a large extent, fails to enhance the examination scores of students from disadvantaged and advantaged families. This indicates that those who benefit the least from it are more likely to come from disadvantaged or advantaged families but that does not mean that all of the students from these families necessarily do not receive any benefit at all from private tutoring. Therefore, the characteristic difference between the groups by propensity score helps to reveal the heterogeneous treatment effects of private tutoring from the perspective of social class.
There are several limitations. First, because it relies on the ignorability assumption, our analysis cannot fundamentally solve the intrinsic problem of causal inference about private tutoring's effect, a problem shared by quantitative analyses using observational data. Although the model controls many observable variables including individual, family, and school factors, it cannot control all of the unobservable variables that influence both private tutoring and test scores. These unobservable variables, such as parents’ partiality, academic capability, and social networks, cannot be reflected in the data and thus may influence the accuracy of the estimates. Second, the quality of the private tutoring may affect the students’ likelihood of participation and thus influence the causal inference. If the participation and quality of private tutoring are positively correlated, the estimated effect of private tutoring is positively biased; that is, the analysis would overestimate its actual effect. Finally, the CEPS data are based on the education of junior high school students only, so it is difficult to use the findings of this study to infer the effect of private tutoring at other educational levels, such as kindergarten, primary school, and senior high school. All of the above limitations need to be addressed in future research.
Overall, the findings carry important policy implications. The primary problem concerning educational equity in China today is the monopolized and exclusive allocation of educational resources by the private tutoring industry, which exacerbates inequality by leaving disadvantaged groups without access to subsidies and support. Therefore, a crucial aspect of the ongoing education reform is to regulate policies with recognition of the group disparities resulting from individual choices, and promote a fair and organized distribution of educational resources across diverse groups.
Footnotes
Acknowledgements
I am grateful to Mengchen Liu for her support and encouragement of this research, and to Yan Wang, Anning Hu, and the anonymous reviewers of Chinese Journal of Sociology (Chinese version) for their comments. The author takes sole responsibility for his views.
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Social Science Fund of China (grant number: 20CSH035).
