Abstract
Meta-analyses of peer tutoring have rarely examined its effect on tutors’ academic gain. Some previous analyses are dated and have methodological or theoretical limitations. Hence, the present study fills this gap with an updated and comprehensive meta-analysis that identifies determinants of best practice in peer tutoring for tutors’ academic achievement. Additionally, role theory and equity theory in peer tutoring were tested. The present meta-analytic study examined 16 articles using the Comprehensive Meta-Analysis software programme and an SPSS macro for analyses. The weighted mean effect size was 0.43 (p < 0.001). Moreover, the crucial parameters for optimizing the effectiveness of peer tutoring interventions were identified as follows: tutees with low academic ability; tutors coming from secondary school; fewer tutor training sessions per week; shorter tutor training time per session; choosing mathematics as subject content; random assignment of tutees and tutors; structured peer tutoring; same-age non-reciprocal peer tutoring; same-sex dyad grouping; and more weekly tutoring sessions and longer tutoring time for each session.
An updated meta-analysis on the effect of peer tutoring on tutors’ academic achievement
Although several meta-analyses have shown that peer tutoring is effective in promoting tutors’ academic achievement, some of these meta-analyses are dated and have methodological or theoretical limitations, and some studies are confounded by other types of interventions (e.g., cooperative learning or adult-led tutoring). Furthermore, some studies failed to consider the academic achievement of tutee and tutor separately. The present study sought to synthesize previous findings on peer tutoring to address these limitations, examine the effects of peer tutoring on tutors’ academic achievement, and examine certain determinants of optimal effectiveness of peer tutoring on tutors’ academic achievement.
Previous meta-analyses and limitations
Similar to traditional reviews (e.g., McMaster, Fuchs, & Fuchs, 2006; Robinson, Schofield, & Steers-Wentzell, 2005; Spencer, 2006; Topping, 1996, 1998, 2005; Topping & Lindsay, 1992), previous meta-analyses have shown that peer tutoring can promote tutors’ academic gain and that its effectiveness is moderated by several programme features, such as participant characteristics and tutoring duration. However, some of these studies are dated (e.g., Cohen, Kulik, & Kulik, 1982; Cook, Scruggs, Mastropieri, & Casto, 1985) and do not capture methodological advancements such as correcting for sample size when calculating the magnitude of the effect, utilizing weighting procedures that account for sample size, selecting the appropriate unit of analysis, and conducting homogeneity analyses to examine moderator variables and group differences. Additionally, some studies were confounded with other types of tutoring, including teacher-led and other adult-led tutoring (e.g., Cohen et al., 1982). Even rather recent meta-analyses that examine peer tutoring by adopting some of these advancements still have methodological limitations. For example, Rohrbeck, Ginsburg-Block, Fantuzzo, and Miller (2003) conducted a meta-analysis on studies adopting reciprocal peer tutoring in which students take turns acting as the tutee and tutor. This study includes academic achievement for both tutee and tutor. The results demonstrated that peer tutoring has a positive effect; furthermore, certain moderators such as subject content, student characteristics and intervention features were identified. However, this study was confined to a specific population (such as elementary school children), limited to same-age reciprocal peer tutoring and confounded with other types of peer learning (such as cooperative learning). Moreover, it did not employ proper methods for addressing publication bias. Another example is the meta-analysis done by Leung (2015). 
That study is the most recent and comprehensive meta-analysis; it overcame most of these limitations by adopting advanced methodology and was not confounded with other types of peer-assisted learning. Moreover, a wide range of participants and subject content was included to establish the effects of peer tutoring. Because the meta-analysis included reciprocal peer tutoring, its findings also indicate the effect of peer tutoring on tutors’ academic achievement. However, similar to the meta-analysis by Rohrbeck and colleagues (2003), Leung (2015) did not consider the achievement scores separately for tutor and tutee in some studies because it included studies that used reciprocal peer tutoring.
Hence, there is a need to conduct an updated and comprehensive meta-analysis adopting advanced methodology to establish the effects of peer tutoring on tutors’ academic achievement by computing the tutor’s achievement scores only. To this end, only studies adopting same-age, non-reciprocal peer tutoring and cross-age peer tutoring were considered.
In addition, role theory and equity theory were evaluated in the present investigation. Role theory suggests that people act according to the roles assigned to them. Hence, female tutors may find it difficult to act as a tutor because that role confers greater authority and superior status to the female tutor over the male tutee (Eagly, Wood, & Diekman, 2000). This situation is inconsistent with traditional gender roles, and therefore, female tutors in mixed-sex pairs have negative responses to tutoring (Fogarty & Wang, 1982). Moreover, male tutees may be uncomfortable with the subordinate role as tutee when female tutors possess the superior status. Hence, in the current study, it was hypothesized that same-sex dyads would produce greater achievement gains than would mixed-sex dyads.
According to equity theory, tutees may perceive same-age, non-reciprocal peer tutoring as inequitable because being tutored by a peer of the same age runs contrary to the norm of self-reliance. Hence, it was suggested that cross-age peer tutoring would be more equitable than same-age non-reciprocal peer tutoring and might not elicit negative attitudes in tutees. Cohen (1986) explained why cross-age peer tutoring is expected to surpass same-age peer tutoring: the legitimate status of an older tutor can eliminate the competition between peers that hinders teaching. Additionally, adopting cross-age peer tutoring can protect the tutee’s self-esteem by providing a clear rationale for the tutee’s role. A study investigating the effects of being assigned the role of tutor or tutee on attitude in same-age peer tutoring revealed that participants showed positive attitudes when assigned the role of tutor but negative attitudes when assigned the role of tutee (Bierman & Furman, 1981). In sum, based on equity theory, in the current study it was hypothesized that cross-age peer tutoring would be more effective in promoting academic gains than same-age non-reciprocal peer tutoring.
Aims of the present investigation
The present investigation aimed to:
(1) Examine the overall effectiveness of peer tutoring on tutors’ academic achievement;
(2) Conduct moderator analyses on the effectiveness of peer tutoring on tutors’ academic achievement;
(3) Suggest certain determinants of the effectiveness of peer tutoring on tutors’ achievement; and
(4) Validate the predicted effects of certain intervention parameters derived from role theory and equity theory as postulated in the previous sections.
Method
Literature search procedures and inclusion criteria
Various sources were used for the literature search. PsycINFO and the Educational Resources Information Centre (ERIC) were the two main online databases for searching articles using key terms, including ‘peer tutoring, peer tutee, peer tutor, tutoring, tutor, and tutee’. Other relevant articles from previous meta-analytic studies and reference lists from the review articles of peer tutoring were also examined. Additionally, a manual search was conducted in select journals that have published peer tutoring articles (e.g., Journal of Educational Psychology, American Educational Research Journal). There were several inclusion criteria, including (1) being in a peer-reviewed journal that was published in 2018 or before, (2) the peer tutoring was conducted in a school setting, (3) participants were students, (4) the targeted subject matter was academic achievement, (5) the outcome data available in the article were amenable to the calculation of effect sizes, (6) the studies employed a treatment-comparison (quasi-experimental) or treatment-control (true experimental) group design, (7) tutoring was confined to same-age, non-reciprocal peer tutoring and cross-age peer tutoring, and (8) non-English materials were excluded due to limited language abilities of the author. Conference papers, dissertations, reports, and book chapters were excluded due to bias (see also Ferguson & Brannick, 2012). In the present study, 16 articles were retained for further analysis because they met these inclusion criteria. A complete list of articles can be found in the online supplemental material.
Coding of studies
The coding features were adopted mainly from Leung’s (2015) meta-analysis and included (1) report information, (2) participant characteristics, (3) methodology, (4) intervention features, and (5) outcome assessment. Two coders who had obtained Master of Education degrees conducted the coding. They received two hours of training that reviewed key concepts in meta-analysis and the coding procedures. To achieve a common understanding of the parameters used in the coding sheet, two pilot coding sessions were conducted in which the two coders identified and resolved any disparities and reached consensus. All 16 articles were then double coded independently. Inter-rater reliability was high: the average percentage of agreement was 90.0% and the kappa coefficient was 0.92.
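The two agreement indices reported above can be illustrated with a minimal sketch. The function name and the ten hypothetical codes below are assumptions for illustration only; they are not the coding data of the present study, and the sketch covers a single coded variable rather than an average across variables.

```python
from collections import Counter

def agreement_and_kappa(codes_a, codes_b):
    """Return (proportion of agreement, Cohen's kappa) for two coders.

    Assumes both lists are the same length and that chance agreement is
    below 1 (otherwise kappa is undefined).
    """
    n = len(codes_a)
    # Observed agreement: share of items given the same code by both coders.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement: product of the coders' marginal proportions, summed
    # over all categories.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical codes for one variable (S = structured, U = unstructured).
coder_1 = ["S", "S", "U", "S", "U", "S", "U", "S", "S", "U"]
coder_2 = ["S", "S", "U", "S", "U", "S", "U", "S", "S", "S"]
observed, kappa = agreement_and_kappa(coder_1, coder_2)
```

Kappa discounts the agreement that two coders would reach by chance given how often each uses every category, which is why it is usually reported alongside the raw percentage.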
Data analysis
Computation of standardized effect size and moderator analyses
The standardized effect size was computed by dividing the difference between the treatment and control/comparison group means by the pooled standard deviation of the two groups (Hedges, 1981). The standard procedure for correcting small sample size bias was used because effect size is positively biased in small samples (Hedges & Olkin, 1985). The Comprehensive Meta-Analysis software programme (Version 2.0; Borenstein, Hedges, Higgins, & Rothstein, 2005) and an SPSS macro (see also Lipsey & Wilson, 2001) were used to calculate the mean effect size, variance and 95% confidence interval estimates and to conduct moderator analyses.
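As an illustration of this computation, the following sketch derives a bias-corrected standardized mean difference from group summary statistics. It uses the common approximation to the small-sample correction factor, J = 1 - 3/(4df - 1), and the usual large-sample variance formula; the function name and the input values are hypothetical and are not taken from any included study or from the software used here.

```python
import math

def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Return (g, variance of g) for two independent groups."""
    df = n_t + n_c - 2
    # Pooled standard deviation of the treatment and comparison groups.
    sd_pooled = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_t - mean_c) / sd_pooled
    # Approximate small-sample correction factor (Hedges, 1981).
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Large-sample variance of d, scaled by j**2 to give the variance of g.
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    return g, j**2 * var_d

# Hypothetical tutor post-test scores (illustrative values only).
g, var_g = hedges_g(mean_t=75.0, mean_c=70.0, sd_t=10.0, sd_c=10.0,
                    n_t=20, n_c=20)
```

With these values the uncorrected difference is 0.50 standard deviations, and the correction shrinks it slightly, as expected for small samples.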
Shifting approach for moderator analyses
Generally, one effect size was calculated for each study. However, when there were subgroups within a study, a shifting unit approach was adopted (Cooper, 1998) such that for each subgroup within a single study, each effect size was coded as if it were an independent unit of analysis.
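A minimal sketch of how a shifting unit of analysis can be organized is given below. It assumes that a study contributes one effect size to the overall analysis and one effect size per moderator category to each moderator analysis, and that multiple effect sizes within a unit are combined by a simple mean. The averaging rule, the field names, and the records are assumptions for illustration, not a description of the procedure implemented in the software used here.

```python
from collections import defaultdict
from statistics import mean

def units_for_analysis(records, moderator=None):
    """Collapse effect sizes to one value per unit of analysis.

    With moderator=None the unit is the study; otherwise the unit is the
    combination of study and moderator category.
    """
    grouped = defaultdict(list)
    for rec in records:
        key = rec["study"] if moderator is None else (rec["study"], rec[moderator])
        grouped[key].append(rec["g"])
    # Assumption: effect sizes sharing a unit are averaged without weighting.
    return {key: mean(values) for key, values in grouped.items()}

# Hypothetical subgroup effect sizes from two studies.
records = [
    {"study": "A", "subject": "mathematics", "g": 0.6},
    {"study": "A", "subject": "reading", "g": 0.2},
    {"study": "B", "subject": "reading", "g": 0.3},
    {"study": "B", "subject": "reading", "g": 0.5},
]
overall_units = units_for_analysis(records)
subject_units = units_for_analysis(records, moderator="subject")
```

In this example the overall analysis has two units (one per study), whereas the subject-content analysis has three, because study A contributes to both the mathematics and the reading categories.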
Homogeneity tests and I² index
Homogeneity analyses were utilized to examine whether the mean values of the various effect sizes all estimated the same population effect size (Hedges, 1982; Rosenthal & Rubin, 1982). Fixed effects models were adopted for homogeneity tests in the present investigation. A fixed effects model assumes that the source of error for any variance in effect size comes from subject-level sampling (Hedges & Vevea, 1998).
The Q statistic was used for homogeneity analyses (Hedges & Olkin, 1985), and a significant Q statistic rejects homogeneity and suggests a heterogeneous condition. The I² index was also adopted to assess the extent of heterogeneity (Higgins & Thompson, 2002; Higgins, Thompson, Deeks, & Altman, 2003; Huedo-Medina, Sánchez-Meca, Marín-Martínez, & Botella, 2006).
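The quantities described in this section can be sketched as follows, assuming inverse-variance weights under a fixed effects model, Q computed as the weighted sum of squared deviations from the weighted mean, and the I-squared index computed as (Q - df)/Q expressed as a percentage and truncated at zero. The function name and the three effect sizes are hypothetical.

```python
def fixed_effect_summary(effects, variances):
    """Fixed-effect weighted mean, 95% CI, Q statistic and I-squared (%)."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    w_sum = sum(weights)
    mean = sum(w * d for w, d in zip(weights, effects)) / w_sum
    se = (1.0 / w_sum) ** 0.5  # standard error of the weighted mean
    # Q: weighted sum of squared deviations from the weighted mean.
    q = sum(w * (d - mean) ** 2 for w, d in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation beyond sampling error.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "mean": mean,
        "ci": (mean - 1.96 * se, mean + 1.96 * se),
        "q": q,
        "df": df,
        "i2": i2,
    }

# Hypothetical effect sizes with equal sampling variances.
summary = fixed_effect_summary([0.2, 0.5, 0.8], [0.04, 0.04, 0.04])
```

Q is compared with a chi-square distribution on k - 1 degrees of freedom; the I-squared value re-expresses the same information as a percentage that does not depend on the number of effect sizes.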
Moderator analyses
Homogeneity tests were adopted to examine whether the variance in the effect sizes within a particular set was accounted for by sampling error alone or by a coded feature of the effect sizes in the set. The Q statistic was partitioned into a between-group homogeneity statistic, QB, and a within-group homogeneity statistic, QW (Hedges & Olkin, 1985). A significant QB suggests that the grouping variable is a significant moderator of outcome, whereas a non-significant QW indicates that the studies can be grouped into homogeneous subgroups (Lipsey & Wilson, 2001).
In the present investigation, both Q statistics and the I² index were adopted to assess the extent of heterogeneity, as mentioned previously.
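The partition of Q into QB and QW can be sketched as follows, assuming the fixed-effect identity Q(total) = QB + QW, where QW is the sum of the Q statistics computed separately within each category. The function names, the category labels, and the effect sizes are hypothetical and serve only to illustrate the arithmetic.

```python
def q_statistic(pairs):
    """Q for a list of (effect size, variance) pairs under fixed effects."""
    weights = [1.0 / v for _, v in pairs]
    mean = sum(w * d for w, (d, _) in zip(weights, pairs)) / sum(weights)
    return sum(w * (d - mean) ** 2 for w, (d, _) in zip(weights, pairs))

def partition_q(groups):
    """Split total Q into between-group (QB) and within-group (QW) parts.

    groups maps a category label to a list of (effect size, variance) pairs.
    """
    all_pairs = [pair for pairs in groups.values() for pair in pairs]
    q_total = q_statistic(all_pairs)
    q_within = sum(q_statistic(pairs) for pairs in groups.values())
    q_between = q_total - q_within
    df_between = len(groups) - 1
    df_within = len(all_pairs) - len(groups)
    return q_between, df_between, q_within, df_within

# Hypothetical effect sizes grouped by dyad composition.
groups = {
    "same_sex": [(0.8, 0.04), (0.9, 0.04)],
    "mixed_sex": [(0.3, 0.04), (0.4, 0.04)],
}
qb, df_b, qw, df_w = partition_q(groups)
```

In this example QB (6.25 on 1 degree of freedom) exceeds the chi-square critical value of 3.84 at the 0.05 level, whereas QW (0.25 on 2 degrees of freedom) does not approach significance, which is the pattern expected of a moderator that sorts the effect sizes into homogeneous subgroups.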
Results
During the literature search process, an initial count revealed 16,149 articles. After excluding studies (N = 15,808) that looked irrelevant based on the title, abstract and sources (e.g., book chapters and conference papers), 341 articles were retained. Finally, only 16 articles that met the inclusion criteria were included in this meta-analysis.
Overall effect size and effect size adjustment
Overall effect size
The 16 studies included 46 effect sizes for tutors’ achievement because some studies reported subgroups. However, for independent samples, only 16 effect sizes for achievement were used.
The weighted mean effect size for the 16 independent samples included in the present investigation of achievement was significant (d = 0.43, p < 0.001; 95% confidence interval [CI] = 0.35–0.50).
Publication bias
Because only published articles were used in the present study, the effect size was likely overestimated because published materials tend to present significant findings (Lipsey & Wilson, 2001). To evaluate whether publication bias occurred, McDaniel, Rothstein, and Whetzel (2006) recommended the use of the defensible trim and fill method (Duval & Tweedie, 2000a, 2000b).
In the present investigation, the weighted mean effect size was 0.43 before trimming. After trimming, the weighted mean effect size was d = 0.38 (CI = 0.31–0.44). Thus, even after adjustment for publication bias, post-test scores for tutors’ achievement were greater in the treatment groups than in the comparison/control groups.
Moderator analyses
Based on the empirical model proposed in Leung’s (2015) meta-analytic study, the entire peer tutoring intervention process consists of three phases: Pre-intervention, intervention, and post-intervention. In the pre-intervention phase, peer tutoring is concerned with the selection and screening of participants and format of tutor training. In the intervention phase, the model is concerned with subject content and intervention format. In the post-intervention phase, the assessment of outcomes is the key focus.
Pre-intervention
Homogeneity analyses and mean effect size for parameters as moderators during pre-intervention for tutor achievement (k = 16).
Note: A significant QB indicates a significant moderator whereas a non-significant QW shows that the variable can be grouped into homogeneous subgroups. k = number of effect sizes.
These categories include unspecified groups, mixed group or subcategories which have df equal to 0. For this reason, the overall value of k for the test of this moderator variable is smaller than the total number of independent samples (i.e., 16). This also explains why the overall mean effect size for these variables is different from the mean of 0.43.
*p < 0.05; **p < 0.01; ***p < 0.001.
Homogeneity analyses and mean effect size for parameters as moderators during intervention for tutor achievement (k = 16).
Note: A significant QB indicates a significant moderator whereas a non-significant QW shows that the variable can be grouped into homogeneous subgroups. k = number of effect sizes.
These categories include unspecified groups, mixed group or subcategories which have df equal to 0. For this reason, the overall value of k for the test of this moderator variable is smaller than the total number of independent samples (i.e., 16). This also explains why the overall mean effect size for these variables is different from the mean of 0.43.
*p < 0.05; **p < 0.01; ***p < 0.001.
The frequency of tutor training was a significant moderator of achievement outcomes and showed high heterogeneity (QB = 15.33, p < 0.001, I² = 93%). Studies at or below the median (2 or fewer tutor training sessions per week) had greater effect sizes (d = 1.09) than did those above the median (d = 0.22) (see Table 2). The length of tutor training sessions was also a significant moderator of achievement outcomes and showed high heterogeneity (QB = 14.54, p < 0.001, I² = 93%). Studies at or below the median (40 minutes or fewer per session) had greater effect sizes (d = 1.24) than did those above the median (d = 0.26).
Intervention
When mathematics, reading and grouped other subjects (e.g., physical education, arts) were compared, subject content was a significant moderator of achievement outcomes (QB = 18.25, p < 0.001, I² = 89%). Mathematics had a greater effect size (d = 1.16) than grouped other subjects (d = 0.36) and reading (d = 0.41; see Table 2).
Studies that involved same-gender dyads were associated with significantly larger effect sizes (d = 0.85) than were those with mixed-gender dyads (d = 0.39), yielding high heterogeneity (QB = 9.62, p < 0.01, I² = 90%). In addition, studies that involved random assignment (experimental versus quasi-experimental design) were associated with significantly larger effect sizes (d = 0.69) than those without random assignment (d = 0.26), yielding high heterogeneity (QB = 29.50, p < 0.001, I² = 97%). Studies that involved structured tutoring were associated with significantly larger effect sizes (d = 0.80) than those that involved unstructured tutoring (d = 0.40), yielding high heterogeneity (QB = 7.61, p < 0.01, I² = 87%).
Regarding the type of peer tutoring, adoption of same-age non-reciprocal peer tutoring produced significantly larger effect sizes (d = 0.63) than cross-age peer tutoring (d = 0.34), yielding high heterogeneity (QB = 12.83, p < 0.001, I² = 92%).
The frequency of tutoring sessions per week was a significant moderator of achievement outcomes and indicated high heterogeneity (QB = 16.37, p < 0.001, I² = 94%). Studies above the median (greater than 2.5 sessions per week) had larger effect sizes (d = 1.03) than did those at or below the median (d = 0.40). Similarly, the length of each tutoring session was also a significant moderator of achievement outcomes and indicated high heterogeneity (QB = 9.02, p < 0.01, I² = 89%). Studies above the median (greater than 25 minutes per session) had larger effect sizes (d = 0.48) than did those at or below the median (d = 0.33).
Duration of tutoring (QB = 0.14, ns, I² = 0%) and the total dosage of tutoring (QB = 0.42, ns, I² = 0%) were both non-significant moderators of achievement outcomes.
Post-intervention
Homogeneity analyses and mean effect size for parameters as moderators during post-intervention for tutor achievement (k = 16).
Note: A significant QB indicates a significant moderator whereas a non-significant QW shows that the variable can be grouped into homogeneous subgroups. k = number of effect sizes.
These categories include unspecified groups, mixed group or subcategories which have df equal to 0. For this reason, the overall value of k for the test of this moderator variable is smaller than the total number of independent samples (i.e., 16). This also explains why the overall mean effect size for these variables is different from the mean of 0.43.
*p < 0.05; **p < 0.01; ***p < 0.001.
Discussion
The meta-analysis conducted in the present study advances our understanding of the effects of peer tutoring on tutors’ academic achievement by including studies that adopt same-age, non-reciprocal and cross-age peer tutoring. Moreover, the present study adds new understanding of how intervention features moderate the effects of peer tutoring, including features derived from role theory and equity theory.
Overall effect of peer tutoring on tutor achievement
The meta-analysis conducted in the present investigation provides evidence that peer tutoring has a positive effect on the academic achievement of tutors. After imputing the missing values by adopting the trim and fill method, the weighted mean effect size was d = 0.38 (confidence interval = 0.31 to 0.44). The results showed that the post-test scores were greater in the treatment groups than in the control groups.
Determinants of the effects of peer tutoring on tutors’ achievement
During pre-intervention, the selection of participants and format of tutor training were considered. The present investigation identified that tutees of low ability had larger effect sizes than those at the mixed ability levels. Cohen et al. (1982) found that there was a greater, but not significant, effect when using tutees with very low academic ability levels (unweighted ES = 0.42) compared with those with middle academic ability levels (unweighted ES = 0.33). Hence, the findings of the present study were consistent with previous studies to a certain extent. Moreover, consistent with Leung’s (2015) meta-analytic study, the results of the present study showed that tutors from secondary school had greater effect sizes than those from elementary school or college/university.
Regarding tutor training, consistent with Leung’s (2015) meta-analytic study, having fewer weekly tutor training sessions was more effective in enhancing tutors’ academic achievement than having frequent training sessions. However, unlike Leung’s (2015) meta-analytic study, which found that the length of a training session was not a significant moderator, the present study revealed that shorter training sessions had significantly greater effect sizes than longer sessions. The difference could be explained by the fact that the present study counted tutors’ achievement scores only, whereas tutors’ achievement scores were confounded with tutees’ achievement scores in the same-age, reciprocal tutoring studies included in Leung’s (2015) meta-analytic study.
Within this meta-analysis, the subject content and intervention format were considered. Mathematics was found to be more effective in enhancing tutors’ academic achievement than reading and other subjects. Cohen et al. (1982) also reported that studies involving mathematics produced larger effect sizes (unweighted ES = 0.62) than reading (unweighted ES = 0.21). Hence, the findings of the present study were consistent with previous studies.
Consistent with Leung’s (2015) meta-analytic study, structured peer tutoring was more effective than unstructured peer tutoring in enhancing tutors’ academic achievement.
However, unlike Leung’s (2015) meta-analytic study, which found that frequency of tutoring per week, length of tutoring session and random assignment were not significant moderators, the present study revealed that more weekly tutoring sessions had significantly greater effect sizes than fewer weekly sessions, longer tutoring sessions had significantly greater effect sizes than shorter ones, and random assignment produced significantly greater effects than non-random assignment. Again, the difference could be explained by the fact that the present study included tutors’ achievement scores only, whereas tutors’ achievement scores were confounded with tutees’ achievement scores in the same-age, reciprocal tutoring studies included in Leung’s (2015) meta-analytic study.
Regarding the theoretically derived parameters, it was hypothesized on the basis of role theory that same-sex pairs would be more effective in promoting tutors’ academic gain than mixed-sex pairs. Consistent with the prediction, the studies of same-gender dyads produced larger effect sizes than did those of mixed-gender dyads. This finding suggests a negative effect of gender stereotypes and gender roles, as predicted by role theory, and it agrees with previous research. For example, Rohrbeck et al. (2003) revealed that studies with same-gender dyads (weighted ES = 0.63) produced larger effect sizes than did those with mixed-gender dyads (weighted ES = 0.30). Topping and Whiteley (1993) reviewed 15 tutoring programmes to examine the influence of gender composition on the reading performance of tutees and tutors. The study indicated that same-sex pairs demonstrated greater reading achievement gain than mixed-sex pairs.
Based on equity theory, it was hypothesized that cross-age peer tutoring would be more effective in promoting tutors’ academic gain than same-age non-reciprocal peer tutoring. However, it was found that same-age, non-reciprocal peer tutoring produced significantly greater effect sizes than cross-age peer tutoring. Hence, equity theory was not supported.
During post-intervention, for the nature of the assessment instruments (unstandardized or standardized test), using unstandardized tests to control for author bias produced greater effect sizes than standardized tests. This finding is consistent with meta-analytic studies conducted by Cook et al. (1985) and Leung (2015), which found that unstandardized tests had greater effect sizes than standardized tests.
Implications
Best practices for peer tutoring and teacher training
This updated meta-analysis provides empirical evidence for educational trainers and practitioners to design and implement peer tutoring to optimize the effectiveness of promoting tutors’ academic achievement.
When considering the crucial parameters for optimizing the effectiveness of peer tutoring interventions during the pre-intervention stage, selecting tutees with low academic ability and tutors coming from secondary school may be more effective. Provision of fewer tutor training sessions per week and shorter tutor training time per session may be more appropriate.
During the intervention stage, choosing mathematics may be more effective than reading and other subjects. It also may be more effective to assign tutees and tutors randomly and group them in same-sex dyads. Providing structure for peer tutoring and adopting same-age non-reciprocal peer tutoring could also be more effective. Additionally, more weekly sessions and longer tutoring time for each session are recommended.
During the post-intervention stage, adoption of unstandardized tests would likely be more effective than standardized tests.
Validation for theory
By evaluating the gender composition of the tutoring dyads and the type of peer tutoring, the present investigation provides evidence supporting role theory, whereas equity theory was not supported. Hence, the present meta-analysis suggests that other theoretically derived moderators could also be evaluated in a similar manner in the future.
Limitations and future research
Because the methodologies adopted in the present meta-analysis are based on current knowledge in meta-analysis, this study provides promising directions for future research to adopt these techniques for assessing the effects on helpers in peer tutoring and other domains (e.g., motivation and non-academic outcomes) or other types of peer-involved programmes (e.g., counsellors and mediators in peer counselling and peer-mediated interventions).
As discussed in Leung’s (2015) meta-analytic study, the first major limitation of the present study was the insufficient data for computing the effect sizes due to the considerable amount of missing data in studies. Future research on peer tutoring should adopt appropriate methodology that can report sufficient data for the computation of effect sizes.
Second, the reliability of the results of some moderator analyses was reduced because they were based on a small number of studies. For example, only two studies involved tutors at the college/university level. Since small subgroups reduce the statistical power of the analyses, future research should obtain a larger number of studies for these subgroup variables.
Supplemental Material
Supplemental material for An updated meta-analysis on the effect of peer tutoring on tutors’ achievement by Kim Chau Leung in School Psychology International
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material is available for this article online.
