Abstract
Dependent group contingencies offer an efficient way to improve the behavior of an entire group of students, as the performance of only one or a few students needs to be monitored at a time. Prior literature reviews outlined the use of group contingency interventions with children in educational settings; however, these reviews did not exclusively examine dependent group contingencies or the varied conditions under which this intervention has been implemented. The purpose of this review was to classify the settings, populations, outcome measures, intervention components, and procedural parameters of dependent group contingencies across the research literature. We searched APA PsycINFO, ERIC, CINAHL, CINAHL Complete, Psychology and Behavioral Sciences Collection, Education Source, Academic Search Ultimate, and ProQuest for experimental studies published between 1970 and 2019, and we completed ancestral searches, using the exact terms “dependent group contingenc*” OR “dependent group-oriented contingenc*” in the title, abstract, or author-defined keywords. The results of our review are summarized and discussed in terms of directions for future research and implications for practice.
Group contingencies are reinforcement procedures in which a common consequence (i.e., reward) is contingent on the performance of one, a small number of, or all members of a group. Rather than focusing on individual behavior support, group contingencies provide a simple way to address the performance needs of an entire group of students (Albers & Greer, 1991). These interventions are classified into three types: independent, interdependent, and dependent (see Litow & Pumroy, 1975). In an independent group contingency, the same criterion is in place for all group members, and each member must individually meet that criterion to receive their own reward (e.g., each student with a grade >80% earns access to reward). In an interdependent group contingency, reward access is contingent on a criterion set at the group level rather than the individual level (e.g., group average >80%). Under a dependent group contingency (DGC), the same criterion is in place for all members of a group (e.g., the class), but group reward delivery is contingent on a single student or smaller subset of students meeting that criterion (e.g., single student >80%; subgroup average >80%).
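The three contingency types differ only in whose performance determines reward access. The following minimal Python sketch illustrates the reward rules; the student names, scores, and function names are hypothetical, and the 80% criterion simply mirrors the examples given above.

```python
from statistics import mean

# Hypothetical percent-correct scores for a three-student group.
scores = {"Ana": 92, "Ben": 70, "Cal": 85}


def independent(scores, criterion=80):
    """Each student earns their own reward only if their own score exceeds the criterion."""
    return {name: score > criterion for name, score in scores.items()}


def interdependent(scores, criterion=80):
    """The whole group earns the reward if the group average exceeds the criterion."""
    return mean(scores.values()) > criterion


def dependent(scores, targets, criterion=80):
    """The whole group earns the reward if the target student(s) average above the criterion."""
    return mean(scores[name] for name in targets) > criterion


print(independent(scores))                # per-student reward decisions
print(interdependent(scores))             # one decision based on the group average
print(dependent(scores, ["Ben"]))         # one decision based on a single target student
print(dependent(scores, ["Ana", "Cal"]))  # one decision based on a target subgroup
```

Under the dependent rule, only the target students' scores need to be checked, which is the monitoring efficiency described above.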
Unlike other group contingencies, DGCs require attention to only one or a few students, as opposed to the whole class—making them a feasible option for teachers in the classroom. DGCs may also promote class-wide cooperative behavior, even with untargeted students (D. A. Williamson et al., 1992). Other features of DGCs, such as how target students are selected (randomly or deliberately) and revealed (remaining anonymous, consistently identified, or revealed dependent upon success), can also be customized to enhance contextual fit to particular classroom settings. Given these benefits, DGCs may be an appealing strategy for educators to implement in their classrooms.
Despite the encouraging features of DGCs, elementary school teachers reported that they find DGCs to be unfair (Briesch et al., 2015). Researchers also raised concerns that DGCs could incite bullying (Heering & Wilder, 2006; Romeo, 1998), as target students might be identified as the person responsible for preventing the group from earning the reward. The variables leading to perceptions of unfairness and bullying require further investigation, but it is possible that examining particular arrangements of DGCs could be illuminating. Given the various ways that DGCs can be implemented, researchers and practitioners could benefit from a systematic review of the populations, settings, outcome measures, intervention components, and procedural variations often used in the research to better understand these variants.
Prior literature reviews examined all types of group contingencies with school-age children (Little et al., 2015; Maggin et al., 2012) and in school settings (Maggin et al., 2017). These reviews fill important gaps in our understanding of group contingencies, but there is much we do not know about DGCs in particular, including their use in other educational contexts (e.g., preschools, alternative schools), with school staff performance (e.g., increasing use of praise), in community-based applications, or in other settings where group behavior change is desired. Prior reviews also did not detail specific procedural elements related to how DGCs were applied in the research. DGCs can vary in ways such as the method for identifying target students, conditions under which students are revealed to the group, how potential reinforcers are identified, and the inclusion of DGCs as part of a treatment package. These details could have a substantial impact on the degree to which DGCs are effective and feasible to implement in varied contexts.
The purpose of this review was to provide a synthesis of the research examining DGCs. We built upon previous reviews by expanding our search to capture additional literature sources and years of publication in which DGCs have been investigated, and we coded for intervention characteristics not examined in prior reviews. We attempted to answer the following research questions:
Method
Article Identification, Screening, and Inclusion
We followed a process similar to prior research syntheses on group contingencies to identify and screen relevant articles (e.g., Little et al., 2015; Maggin et al., 2012). We first used EBSCOhost to search APA PsycINFO, CINAHL, CINAHL Complete, Education Source, ERIC, Psychology and Behavioral Sciences Collection, Academic Search Ultimate, and dissertations from ProQuest collections from 1970 to 2019. We searched for the exact terms “dependent group contingenc*” OR “dependent group-oriented contingenc*” contained in the title, abstract, or author-defined keywords. We then completed an identical electronic database search of the Journal of Behavioral Education between 1987 and 2018, followed by an ancestral review of three prior syntheses on group contingencies (Little et al., 2015; Maggin et al., 2012, 2017), in which studies in the reference lists that contained the above terms in the title were included for screening. These efforts returned 146 total citations. After Zotero citation management software (Roy Rosenzweig Center for History and New Media, 2021) automatically removed exact duplicates (n = 99), 47 entries remained.
The first author then manually screened titles and abstracts (n = 47) for inclusion or exclusion followed by a subsequent round of full-text screening (n = 33). Our inclusion criteria were intentionally broad to maximize the number of potential articles. To be included, studies must have experimentally evaluated the effects of a DGC (see Litow & Pumroy, 1975; Little et al., 2015) in a single-case design (SCD) or group experimental research design (see Maggin et al., 2012). We included published articles, as well as unpublished dissertations. Studies that evaluated a DGC in isolation or in comparison to other group contingencies were included if the graphs allowed analysis of each group contingency type. We excluded all nonexperimental articles. This included procedural descriptions (e.g., practice guides or book chapters; n = 5), survey articles (n = 1), and literature reviews/meta-analyses (n = 2). We also excluded exact duplicates not identified by Zotero (n = 4), dissertations that were later published (n = 4), and articles where the DGC could not be separated from other interventions/conditions (n = 3). We excluded a total of 19 articles from the 47 retrieved in our search, which resulted in the inclusion of 28 articles (see Supplemental Diagram).
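The screening counts reported above can be tallied as a quick consistency check. This is only an illustrative sketch in Python; the variable names and dictionary keys are our own shorthand, and the numbers are taken directly from the text.

```python
# Counts reported in the article identification and screening description.
retrieved = 146            # total citations returned by all searches
duplicates_removed = 99    # exact duplicates removed automatically by Zotero
screened = retrieved - duplicates_removed

# Exclusion reasons and counts reported during screening.
exclusions = {
    "procedural descriptions": 5,
    "survey articles": 1,
    "literature reviews/meta-analyses": 2,
    "duplicates not identified by Zotero": 4,
    "dissertations later published": 4,
    "DGC not separable from other conditions": 3,
}

excluded = sum(exclusions.values())
included = screened - excluded

print(f"screened={screened}, excluded={excluded}, included={included}")
```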
Coding Procedures and Intercoder Reliability
We developed a Google Form to extract, categorize, and evaluate the characteristics of the 28 DGC articles. The primary and secondary coders were doctoral students and board certified behavior analysts familiar with group contingency research. The primary coder (first author) trained the secondary coder (second author) by reviewing coding definitions, how to access articles, and the use of the coding form. The primary and secondary coders then scored a sample of studies (not included in the review) simultaneously to clarify the coding process and discuss any questions that arose. Coders then independently scored nonincluded studies until they reached 80% agreement for two consecutive studies. Both the primary and secondary coders scored 100% of identified articles independently. After all studies were independently evaluated by both coders and prior to data analysis, form responses were imported into a Microsoft Excel spreadsheet for intercoder reliability (ICR) scoring and data analysis.
To calculate a percentage of ICR, we divided the total number of agreements by the total number of possible agreements and then multiplied by 100. We scored an agreement if both coders’ entries were identical and a disagreement if the coding differed in any respect or if one coder’s entry omitted details reported by the other (e.g., Coder A reported goal setting, but Coder B reported both goal setting and feedback). ICR was calculated on initial responses and was not adjusted after reaching consensus. Coders agreed on 87% (range, 0%–100%) of items across all coding categories for the studies included. Notable exceptions in ICR occurred in the unit of data analysis (90% agreement across 33% of articles) and What Works Clearinghouse (WWC) evidence standards data path scoring (100% across a random sample of data paths >20%). After ICR scoring, the first and second authors met to reach a consensus on all disagreements. Consensus involved reviewing disagreements, discussing reasons for coding, reviewing the article, and reaching a determination on final coding. If consensus could not be reached, a third coder (a faculty advisor and the third author) was available to provide a final determination, but this was not required.
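The ICR calculation described above (agreements divided by total possible, multiplied by 100, with an agreement requiring identical entries) can be sketched in Python as follows. The function name and the sample entries are hypothetical and serve only to illustrate the scoring rule; they are not data from the review.

```python
def icr_percent(coder_a, coder_b):
    """Return percentage agreement between two coders' item-by-item entries.

    An item counts as an agreement only when both entries are identical;
    a partial match (one coder omitting a detail) is a disagreement.
    """
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must score the same number of items.")
    if not coder_a:
        raise ValueError("At least one item is required.")
    agreements = sum(a == b for a, b in zip(coder_a, coder_b))
    return agreements / len(coder_a) * 100


# Hypothetical entries for four coded items. On the first item, Coder A
# reported goal setting only, whereas Coder B reported goal setting and
# feedback, so that item is scored as a disagreement.
coder_a = [{"goal setting"}, {"random"}, {"anonymous"}, {"teacher"}]
coder_b = [{"goal setting", "feedback"}, {"random"}, {"anonymous"}, {"teacher"}]

print(icr_percent(coder_a, coder_b))  # 3 agreements out of 4 items
```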
Participant and setting characteristics
To report the diversity of DGC research participants, coders extracted participant age range or grade, disability, and race and ethnicity information reported by researchers. Coders identified the settings by country, geographic region (e.g., mid-western United States), locale classification (e.g., rural, suburban, urban), the general setting(s) where the procedure was employed (e.g., elementary school, university program, home), the specific educational setting if applicable (e.g., special education, general education, or physical education [PE]), and the academic subject or activity (e.g., reading, math, soccer).
Measurement and intervention characteristics
Coders categorized the target behaviors and dependent variable characteristics by the dimension of behavior (e.g., frequency, duration, rate) or system (e.g., partial interval), the topography of behavior (e.g., physical activity, mathematics accuracy), and details regarding social validity measurement. These coding characteristics are consistent with previous reviews (see Maggin et al., 2012). Coders also extracted the implementer of the DGC, the number of students in the target group, the method for determining target individuals (e.g., random or deliberate), how targets were revealed to the group (e.g., anonymous, identified, or conditional), how reinforcers were determined, the frequency of reinforcer assessment and delivery, session duration and frequency, and any treatment components co-occurring with the DGC (e.g., goal setting, vocal performance feedback, and written feedback).
Design characteristics
Coders scored the quality of SCD studies using the WWC SCD standards (Kratochwill et al., 2013). Coders provided a narrative description of baseline and treatment data and a dichotomous judgment (yes/no) of experimental control, which involved visual analysis of graphs to examine level, trend, variability, overlap, immediacy of effect, and consistency of data patterning. One rating was given for each individual data path within a panel for reversals and alternating treatment designs, and per data set and data path across panels for multiple baseline designs. Coders applied the WWC design standards (i.e., meets standards without reservations, with reservations, or does not meet) and evidence standards (i.e., strong, moderate, or no evidence) accordingly. We did not score combined or other SCDs using the WWC standards as explicit guidance was unavailable for these variations. As a result, coders only extracted a dichotomous judgment of experimental control and narrative descriptions of the structural characteristics of the graph. If a group design was used, coders reported if participants were randomly assigned to experimental conditions, presence of a control group, whether mean differences were found to be statistically significant, effect size measures and values, and confidence intervals. We did not apply design or evidence standards to the two group design studies. Coders also extracted interobserver agreement and treatment integrity data, including the percentage of sessions data were collected and the mean and range of scores. As our findings are discussed, it should be noted that articles falling into more than one category (i.e., more than one participant population studied, multiple behaviors of interest, inclusion of multiple training components) were recorded in multiple categories, influencing reported totals.
Results
Participants and Settings
Our findings show that DGCs were most frequently studied with elementary-age children (5–10 years old; 47%), followed by middle school–age students (11–13 years old; 31%; see Table 1). Few studies included high school–age students (14–18 years old; 19%), and no studies included adults. Fewer than half of the studies (43%) included participant racial or ethnic information. For studies that included this information (n = 12 studies), 39% of participants were described as White, 32% Black/African American, 8% Hispanic, 1.7% Asian American, 1.3% Native American, and 0.9% biracial. The remaining 17.2% of participants were coded as “racial/ethnic information not specified.” In articles that included participant disability categories (n = 12), individual counts of diagnoses show that intellectual/developmental disabilities, emotional disturbance, and learning disabilities each made up 18% of the diagnosis count (n = 4 per category). We also identified participants considered to be typically developing (n = 3), with traumatic brain injury (n = 2), attention-deficit/hyperactivity disorder (n = 2), speech impairment (n = 1), oppositional defiant disorder (n = 1), and other health impairment (n = 1; please refer to the table in the online supplemental materials). If authors did not state that participants were typically developing, neurotypical, and so on, coders labeled participant characteristics as “not reported/unspecified” (n = 16 studies).
DGC Study Summary Table.
Note. The above table summarizes the overall composition of DGC research from 1970 to 2019. C = competition; DGC = dependent group contingency; GS = goal setting; LS = level system; MM/RR = mystery motivator/random reinforcer; PC = punishment contingency; PF = vocal performance feedback; Voc. VP = vocal prompting; Vis. PF = visual performance feedback; SM = self-monitoring; TE = token economy; WF = written feedback.
All of the DGCs reviewed were applied in educational settings, most widely in public schools (n = 20 studies). Few investigations occurred in charter (public charter, n = 2; unspecified charter, n = 1), private (n = 1), or alternative schools (n = 1), Head Start programs (n = 1), or other educational settings (n = 3). Prior researchers applied DGCs most often in general education (n = 11; includes PE classes) and special education settings (n = 8) and during mathematics (n = 9), literacy (n = 8), PE (n = 4), science (n = 2), and noninstructional activities (n = 3).
Measurement and Target Behavior
Dependent variable measurement
The count of behavioral topographies included on-task/academic engagement (n = 6), problematic/disruptive behavior (n = 6), work accuracy (n = 4), work completion (n = 3), physical activity (n = 3), appropriate and inappropriate vocalizations (n = 3), destructive behavior (n = 2), social interactions (n = 2), reading fluency (n = 1), addition fluency (n = 1), and activity transitions (n = 1). These findings indicate that more emphasis was placed on increasing overall attending and decreasing problem behavior than on topographies related to a particular content area (e.g., math accuracy). We discuss some of the more common dependent variables in greater detail below.
Academic behavior
We judged that fewer than 25% of data sets provided reliable demonstrations of experimental control in studies that attempted to increase reading fluency (Alric et al., 2007), homework accuracy (Lynch et al., 2009), and spelling accuracy (Shapiro & Goldberg, 1986). Some studies that attempted to increase student academic engagement (e.g., Briesch et al., 2013; B. D. Williamson et al., 2009) or on-task behavior (e.g., Heering & Wilder, 2006) were also associated with ratings of experimental control for fewer than 33% of data sets, whereas Cariveau and Kodak (2017; 38% of data), Deshais et al. (2018b; 67% of data), and Ferneza et al. (2013; 100% of data) showed higher percentages of persuasive demonstrations. Although it is unclear why these particular demonstrations were more effective than others, it is notable that all three of the successful demonstrations used randomized conditional DGCs (i.e., target students selected at random and revealed only if the criterion was met) at least once per day.
Disruptive and destructive behavior
Past researchers used DGCs to reduce disruptive (e.g., Bulla & Frieder, 2018; Theodore et al., 2004) or problematic behaviors (Reitman et al., 2004), as well as specific topographies like talking aloud (Coleman, 1970), negative verbal statements (Hansen & Lignugaris/Kraft, 2005), and verbally disrespectful behavior (Jones et al., 2008). We found that these applications had higher judgments of experimental control than academic, cooperation, sportsmanship, and physical activity behavioral topographies. Two thirds or more (≥ 67%) of the data paths from Coleman (1970), Hansen and Lignugaris/Kraft (2005), Hartman and Gresham (2016), and Reitman et al. (2004) were judged to be conclusive experimental demonstrations.
Cooperation and sportsmanship-like behaviors
Prior researchers examined separate topographies of social behavior (n = 3) that can be largely categorized into cooperative (e.g., working together to generate estimations; D. A. Williamson et al., 1992) and sportsmanship-like classes (e.g., encouraging comments to teammates; Vidoni & Ward, 2006). D. A. Williamson et al. (1992) compared teacher reports of cooperative behavior of students in an independent group contingency and a DGC and found that students in the DGC group seemed more likely to display cooperative behavior, F(1, 34) = 4.03, p < .05. We also judged that in the intervention package used by Vidoni and Ward (2006), 100% of the data sets demonstrated experimental control in increasing prosocial behaviors of middle-school students. Speltz et al. (1982), however, did not use a true experimental design; therefore, we considered that these data did not demonstrate experimental control for positive social interactions.
Physical activity
A DGC package identical to that of Vidoni and Ward (2006) was used to target physical activity, with varied results. Our judgments reflect that Vidoni et al.’s (2014) data showed conclusive experimental demonstrations (100% of data paths with experimental control), whereas the data presented in Azevedo et al. (2016; 0% with experimental control) and Vidoni et al. (2012; 25% with experimental control) were ambiguous. Although mixed, these findings suggest that this intervention may have potential to increase physical activity.
Social validity measurement
Social validity was reported in over half (57%; n = 16) of the coded studies. Some studies interviewed teachers (n = 3) or students (n = 3) only, but the majority surveyed both (n = 10). Researchers occasionally used a pre-established measure, including the Intervention Rating Profile-15 (IRP-15; Martens et al., 1985; n = 4), Bray and Kehle Index (1996; n = 3), IRP (Witt & Martens, 1983; n = 2), Student Teacher Relationship Scale (Pianta, 2001; n = 1), and the Treatment Evaluation Inventory–Short Form (Kelley et al., 1989; n = 1), but many researchers generated their own measure (n = 8; see the online supplemental materials table). Overall, 94% of the reports portrayed the application and outcomes of DGCs positively, although we did not examine the degree to which these measures adequately captured the construct of social validity.
Intervention Features and Components
Preference assessment and reinforcement
Over half (57%; n = 16) of the articles described the process of how researchers identified potential reinforcers prior to DGC use. The most common formats were participant (n = 7) and teacher (n = 4) surveys or vocal reports. Two studies used a class-wide vote, and three studies used a formal preference assessment (PA; e.g., multiple-stimulus without replacement; DeLeon & Iwata, 1996). Eighty-eight percent of the studies that used a PA reported measuring preference only once at the beginning of the investigation. One study included assessment prior to the start of each day (Shapiro et al., 1986) and one prior to each intervention (B. D. Williamson et al., 2009). Interestingly, Shapiro et al. (1986) and B. D. Williamson et al. (2009) were not associated with more convincing demonstrations of experimental control than the other studies, as only 33% of data sets in each study were judged to have experimental control. Studies that only used a survey once at the beginning (Deshais et al., 2018a; Ferneza et al., 2013; Gresham & Gresham, 1982; Hansen & Lignugaris/Kraft, 2005; Hartman & Gresham, 2016; Vidoni et al., 2014) were judged by coders to have experimental control in 67% or more of the evaluated data paths.
When coding for frequency and timing of reinforcement delivery, six studies did not provide sufficient information and could not be coded, leaving 22 articles. Nine of these articles issued a reinforcer immediately after the criterion was met, five delivered reinforcers at the end of the school day, one at the middle and end of the day, and another on the following day. The remaining studies described delayed reinforcement at the end of the week or beginning of the following week (n = 2), or upon the conclusion of the study (n = 4; see the online supplemental materials table). Of the studies that described delivering reinforcers immediately and frequently (n = 8), Coleman (1970) and Vidoni and Ward (2006) were the only two that we judged to have experimental control in 33% or more of data paths. In all but two studies (Shapiro et al., 1986; Theodore et al., 2004) that issued reinforcers at the end of the day (Deshais et al., 2018b; Gresham & Gresham, 1982; Hansen & Lignugaris/Kraft, 2005), we judged 67% or more of the data to have experimental control.
DGC Dosage and Features
Intervention length and frequency
Session length and intervention frequency were codable for all but one article (Ralston, 2012), and the dosage of intervention (how long and frequently treatment was conducted over time) was codable in 57% of articles (see Table 1). The remaining articles provided session duration information but did not include the details needed to determine how often treatments were delivered per day, week, or month. Of the articles that reported a codable session rate, 47% delivered the DGC once daily and 29% more than once per day; one study delivered it 3 times per week (Vidoni & Ward, 2006) and another 4 times per week (Cariveau & Kodak, 2017). Another two studies described sessions that occurred at least once per week (Azevedo et al., 2016; D. A. Williamson et al., 1992).
Target group size, selection, and reveal
Research on DGCs has primarily used random student selection (86%) with target groups that consisted of only one student (79%) who generally remained anonymous (50%; see Table 1). We identified five studies that used deliberate student selection. Another five studies reported target groups of two or three students, and a final study used four to five students (Heering & Wilder, 2006). We were unable to identify target group identification in four studies (Alric et al., 2007; Azevedo et al., 2016; Briesch et al., 2015; D. A. Williamson et al., 1992), and one study evaluated both anonymous and identified variations (Speltz et al., 1982). Variations that used groups of three or more students (n = 4), identified students (n = 6), or revealed students conditionally (n = 5) have not been extensively studied. An interesting finding is that in half (n = 3) of identified DGCs (i.e., target students were known to the class; n = 6), 100% of data paths were judged to have experimental control.
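To make the selection and reveal options concrete, the following Python sketch shows one way a randomized conditional DGC could be scored for a single session: one target student is drawn at random, and the student's identity is revealed to the group only if the criterion is met. The function name, student names, scores, and the 80% criterion are hypothetical illustrations, not procedures taken from any reviewed study.

```python
import random


def randomized_conditional_dgc(scores, criterion=80, rng=None):
    """Score one session of a randomized conditional DGC.

    Returns (group_rewarded, revealed_target). The target is revealed only
    when the criterion is met; otherwise the target stays anonymous (None).
    """
    rng = rng or random.Random()
    target = rng.choice(sorted(scores))        # random selection of one target student
    group_rewarded = scores[target] > criterion
    revealed_target = target if group_rewarded else None
    return group_rewarded, revealed_target


# Hypothetical session scores for a four-student class.
session_scores = {"Ana": 92, "Ben": 70, "Cal": 85, "Dee": 60}
rewarded, revealed = randomized_conditional_dgc(session_scores)
print(rewarded, revealed)
```

An anonymous variation would always return `None` for the revealed target, and an identified variation would always return the target's name, regardless of the outcome.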
Intervention implementation and components
DGCs have most often been studied using teachers as implementers (n = 24), but some studies used experimenters to carry out the intervention (n = 4). Alric et al. (2007) was the only study that involved the classroom aide in addition to the teacher. While using teachers as implementers has remained relatively consistent, the components accompanying DGCs have varied considerably (see Table 1). In the reviewed articles, three studies did not provide enough information for us to code the included treatment components (Coleman, 1970; Deshais et al., 2018a; B. D. Williamson et al., 2009). Of the remaining 25 articles, goal setting and vocal performance feedback were each used in 56% of DGC intervention packages. Mystery motivators and/or random reinforcers were found to occur in 48% (n = 12) of the reviewed treatment packages, followed by visual performance feedback (n = 9), punishment contingencies (n = 8), vocal prompting (n = 6), token economies (n = 4), self-monitoring (n = 3), written feedback (n = 1), level system (n = 1), and competition (n = 1).
An equal number of studies (n = 14) reported using some form of vocal performance feedback (e.g., behavior-specific praise or corrective feedback) and goal setting, which could be naturally occurring and effective components of DGCs. However, it was common for articles to provide incomplete procedural descriptions which prevented us from extracting all the potential components researchers may have used. Therefore, it is uncertain whether the reviewed studies included goal setting, vocal performance feedback, or other components unbeknownst to us and possibly other readers.
Experimental Quality Indicators and Control
Measurement and design quality
In terms of the level of analysis presented in graphs, 50% of studies presented individual-level data, 39.3% group-level data, and 10.7% of studies included both individual- and group-level data (see Table 1). Seventy-one percent of the studies included reliability measurement, and treatment integrity data were reported in less than half (46%) of the studies. SCD made up all but two of the reviewed studies (i.e., Gross et al., 2016; D. A. Williamson et al., 1992; 7%), with reversal/withdrawal designs representing 36% of SCD, alternating treatments/multi-element designs 21%, multiple baseline designs 18%, combined designs 11%, and other SCD variations making up less than 8%. Forty-seven percent of all reviewed data paths met the WWC SCD standards “without reservations,” 21% met standards “with reservations,” and 32% “did not meet” design standards. Although the classifications by design type were relatively equal, 60% of multiple baseline designs were classified as “did not meet” standards. This was primarily because graphs included five or fewer phases; only two of five studies included three or more panels staggered across time. Also, we rated 87% of alternating treatment design data series as “meeting without reservations,” but this is likely due to less restrictive guidelines on this type of design. We applied the evidence standards for each individual data set within a study where the design standards could be applied, resulting in 21.2% of the codable individual data sets rated as “no evidence.” The remaining data offered “moderate evidence” in 35.3% of data sets and “strong evidence” in 43.5% of data sets.
Experimental control
We judged that 38% of the total SCD data paths demonstrated experimental control. This included 48.7% of reversal/withdrawals, 20.8% of alternating treatments, and 20% of multiple baseline data sets. A common error across reversal and multiple baseline data sets was a failure to include the minimum number of phases required to demonstrate an effect. Combined designs were associated with the most ratings of experimental control at 89%. It is important to note that when an alternating treatment design produced a general but undifferentiated effect across conditions (e.g., desirable behavior changed indiscriminately across conditions), we determined that the study did not demonstrate experimental control, which may affect the interpretation of our results. The most that could be said about these cases, as well as other designs producing similar patterns (e.g., reversal design without reversals in behavior), is that a general effect was seen across all conditions that happened to coincide with the onset of the experimental phase; whether the interventions themselves were, in fact, responsible for the change could not be determined.
Discussion
In this review of the DGC literature from 1970 to 2019, we coded for the settings, populations, outcome measures, intervention components, and procedural variations of DGCs. Results of our review illuminate implications for practice and directions for future research. In our recommendations below, we emphasize the importance of contextual fit and combining DGCs with other strategies identified in our review to increase the likelihood of socially valid behavior change in the classroom. Before offering these recommendations, there are limitations of the review worth noting.
The use of a dichotomous judgment of experimental control presents some challenges. As noted earlier, in some studies the design itself prevented the opportunity to demonstrate experimental control. Therefore, experimental control coding reflects our opinion on the quality of research demonstrations, not necessarily on the effectiveness of DGCs. Using an effect size metric (e.g., Tau-U; Parker et al., 2011) in future reviews can improve reader confidence in experimental effects, facilitate the inclusion of review findings in future meta-analyses, and allow for publication bias calculations (Gage et al., 2017). Also, considering that we applied the prior version of the WWC standards in this review, which possess their own limitations, future researchers should consider using the updated 4.1 version of the WWC standards (What Works Clearinghouse, 2020), using alternative research evaluation tools, and employing established group design evaluation criteria to improve study result analysis and reporting. These suggestions would likely improve reader confidence in the interpretation of intervention effectiveness and study findings.
The terms used in the search process, “dependent group contingenc*” OR “dependent group-oriented contingenc*,” only returned studies that included “dependent” in the title, keywords, or abstract. Studies that evaluated the use of group contingencies but only labeled DGCs in the manuscript body and not in the title, keywords, or abstract would not be returned using the terms in our search. We attempted to enhance the identification of unpublished data to limit publication bias (Gage et al., 2017) by conducting ancestral searches of past reviews and including dissertation research, but additional search strategies and terms could be included to identify additional relevant research. Future reviews could conduct reverse citation searches (i.e., “cited by”) of previously reviewed DGC studies and interview expert group contingency researchers to locate additional unpublished data. Future reviewers could also outline and follow established literature review plans and procedures, such as Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) and QUOROM (Moher et al., 2009), to enhance reader confidence in the literature search strategies. Although we replicated and extended prior review procedures, including these plans and detailed descriptions in review manuscripts would be beneficial.
Recommendations for Practice
Our findings might give pause to implementers, in that we rated few studies as demonstrating experimental control. Since we did not focus on the presence of a positive clinical effect, our results should not be interpreted as speaking to the effectiveness of DGCs. This intervention has been documented by prior research and reviews as an effective intervention (Little et al., 2015), and the potential benefit of DGCs in schools should not be underestimated. Like other behavioral interventions, DGCs should be embedded within a continuum of support, incorporate data-based decision-making (both fidelity and student outcome data), and be sensitive to contextual factors to better achieve desirable student outcomes (see Horner et al., 2014; Sugai & Horner, 2009). Results of our review suggest that when coupled with less intrusive Tier 1 strategies (e.g., performance feedback for behavioral expectations), DGCs are likely to enhance the effectiveness of class-wide Tier 1 support. Evidence for this recommendation is reflected in the DGC studies that included Tier 1 strategies such as goal setting (n = 14), vocal performance feedback (n = 14), visual performance feedback (n = 9), and vocal prompting (n = 6). Given the effectiveness of DGCs paired with these strategies, practitioners might consider including them when implementing DGCs.
Student reinforcer surveys (n = 7) and end-of-day reinforcer distribution (n = 6) were the most frequently identified methods of reinforcer assessment and delivery, and both were included in well-designed studies demonstrating experimental control. As such, these variations of DGCs appear to be effective and require less response effort than formal PAs and frequent reinforcer delivery. Students, for example, could generate and rank reward options in survey format (e.g., Hansen & Lignugaris/Kraft, 2005), and then teachers could randomly select rewards (e.g., Ferneza et al., 2013) from that pool to deliver at the end of the day (e.g., Hartman & Gresham, 2016). Incorporating input on reward type and delivery format could inform DGC design to align with the values and preferences of those in the intervention context (e.g., implementers, students, resource providers) and improve outcomes.
When designing a DGC, implementers must carefully consider the criterion for accessing the reward and the target size (individual student or subgroup). When setting the criterion, it is important to collect baseline data to determine the current student performance, compare those data to the terminal level of desired performance (i.e., the long-term goal), and set the DGC criterion at a level that is feasible to attain. Like other behavior support strategies, it is essential that students contact sufficient reinforcement to increase and maintain desired behavior. With success, implementers can systematically adjust the criterion until performance reaches the desired level. In addition to the criterion itself, how the criterion is applied to the target (subgroup or individual student) must also be considered. When the criterion is applied to a subgroup, there are a variety of ways in which the contingency can be structured, such as an average level of performance (e.g., average quiz score >80% for selected students), the number or percentage of students in the subgroup meeting criteria (e.g., 75% of students in the subgroup receiving >80%), or for each student in the subgroup (e.g., each student score is >80%). For an individual student, the criterion is applied only to their performance.
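The three ways of applying a criterion to a subgroup can give different reward decisions for the same set of scores. The following minimal Python sketch illustrates this under stated assumptions: the function names, the 80% cutoff, the 75% share, and the example scores are all hypothetical and are not drawn from any reviewed study.

```python
from statistics import mean

# Hypothetical illustration: three ways to apply a DGC criterion to a subgroup.
# The cutoff (80) and required share (0.75) mirror the examples in the text.

def subgroup_average_met(scores, cutoff=80.0):
    """Reward is earned if the subgroup's average score exceeds the cutoff."""
    return mean(scores) > cutoff

def share_of_students_met(scores, cutoff=80.0, required_share=0.75):
    """Reward is earned if enough subgroup members individually exceed the cutoff."""
    passing = sum(1 for s in scores if s > cutoff)
    return passing / len(scores) >= required_share

def every_student_met(scores, cutoff=80.0):
    """Reward is earned only if each subgroup member exceeds the cutoff."""
    return all(s > cutoff for s in scores)

# Made-up quiz scores for a four-student target subgroup.
subgroup_scores = [95, 85, 82, 70]

average_rule = subgroup_average_met(subgroup_scores)   # average is 83
share_rule = share_of_students_met(subgroup_scores)    # 3 of 4 students pass
each_rule = every_student_met(subgroup_scores)         # one student scored 70
```

With these made-up scores, the average and percentage rules would deliver the reward, whereas the each-student rule would not, which is why the choice of rule deserves attention alongside the criterion level itself.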
Another consideration is the method of selecting the target student(s). Randomly selecting students can minimize the inadvertent selection of a disproportionate sample of high- or low-performing students (Ferneza et al., 2013). Randomization has been the most widely studied (n = 23) method for target student(s) identification, and it can be easily implemented in classrooms by using an online random number generator or simply drawing student names. Implementers should be sensitive to the possibility that bullying could occur if target students are known to the group and rewards are not earned (Romeo, 1998). This risk is diminished, although not entirely avoided, in the anonymous variations of DGCs where target student(s) are not revealed, but teachers might also consider revealing the target student(s) only when the goal is met (i.e., conditional reveal; e.g., Deshais et al., 2018a). Using a conditional reveal in a DGC conceivably mitigates bullying and also allows target student(s) to contact positive peer interaction when goals are achieved. Those designing DGCs should consider asking both implementers and recipients to provide input on these methods. This input would help designers determine if and how they might reveal subgroup members in order to improve contextual fit.
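As one way to picture random selection combined with a conditional reveal, the sketch below draws target students without replacement and names them only when the goal is met. It is an illustrative assumption rather than a procedure reported in the reviewed studies; the roster, function names, and announcement wording are hypothetical.

```python
import random

def pick_targets(roster, k=1, rng=None):
    """Randomly draw k distinct target students from the class roster."""
    if rng is None:
        rng = random.Random()
    return rng.sample(roster, k)

def announce(targets, goal_met):
    """Conditional reveal: name the target student(s) only if the goal was met."""
    if goal_met:
        return "Goal met by: " + ", ".join(targets)
    return "Goal not met today."

# Hypothetical roster; names are placeholders.
roster = ["Student A", "Student B", "Student C", "Student D", "Student E"]
targets = pick_targets(roster, k=2)

message_if_met = announce(targets, goal_met=True)
message_if_not_met = announce(targets, goal_met=False)
```

Drawing a fresh sample each session keeps the targets unpredictable, and the unmet-goal message never includes student names.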
Recommendations for Research
All of the studies we reviewed were incomplete or ambiguous in some fashion regarding the methodological description, and this inhibited our ability to extract one or multiple categories of data. In general, the participant skillsets, diagnoses, race, and ethnicity were not well described despite this information being essential in understanding disparities in the research and bounds of intervention use. Similarly, procedural descriptions often provided insufficient detail to allow for replication. For instance, in many studies where researchers used DGCs but goal setting was not described (n = 14), it is uncertain how students were informed of the criteria for reinforcement. This raises questions about how prior researchers introduced and taught the DGC to participants. For researchers to effectively replicate studies and practitioners to implement interventions as intended, it is important that authors provide comprehensive descriptions of their procedures.
In our review, past researchers used DGCs in combination with other interventions in all of the studies that could be coded (n = 25). Although treatment packages may enhance intervention outcomes, they do not aid in the understanding of DGC effectiveness when used in isolation. Complex treatment packages presumably consume limited resources in schools, so it seems important that future research systematically evaluates the components commonly used with DGCs to understand which components are integral for DGC success. This is particularly important to determine the extent to which DGCs may be modified to the needs of a particular context and if they in fact save time for teachers, as has been proposed by other researchers (Albers & Greer, 1991).
Further investigations on how target student(s) are selected and revealed pose an interesting line of research. Little is known about how deliberate or static selection influences student performance or when target group sizes begin to create a situation where DGCs function as interdependent group contingencies. Researchers might investigate the extent to which revealed DGCs (i.e., target student[s] are known to the class) encourage negative social interactions (Heering & Wilder, 2006) and if revealing students conditionally minimizes this undesirable side effect while providing opportunities for positive peer interaction. Considering that revealed DGCs (Coleman, 1970; Gresham & Gresham, 1982; Reitman et al., 2004) were coded with higher judgments of experimental control and that this variation has been studied the least (17.8% of total studies), there is much to learn about the desirable and undesirable effects of this DGC arrangement.
Another practical direction for future research is investigating the effects of DGCs in other educational settings (e.g., afterschool programs, alternative schools) or during extracurricular activities (e.g., sports, student clubs) to support prosocial behavior among students. We found few (n = 3) studies that sought to improve student cooperation, despite the proclaimed benefits of improving collaboration. Applying DGCs where increased cooperation is desirable, such as in group competitions, small group lessons, or team sports, affords a unique application of DGCs in schools. Additional research on the Fair Play Game (Vidoni & Ward, 2006), a DGC treatment package, might also be a worthwhile endeavor as this intervention was found to be the most consistently described and replicated DGC treatment package used in PE classes (n = 4).
Examining how DGCs could be used to promote staff performance is another interesting frontier for DGC research in school settings. In our review, we found no research targeting adult behavior, but it is feasible that DGCs could be used with school staff to support their implementation of behavior support (e.g., increasing student acknowledgment). Applying DGCs in this new domain offers exciting opportunities to advance our understanding of this strategy which could have far-reaching implications for student success.
A final set of directions for future research is to evaluate the factors that influence the social validity of DGCs for both implementers and students and how DGCs can be adapted to improve contextual fit. Given that all studies with social validity measures used some form of survey or interview after DGC delivery, more robust methods to assess social validity and the factors that might foster implementer and recipient perceptions of unfairness (Briesch et al., 2015) are warranted. Investigators could design studies to assess social validity with teachers and students prior to, throughout, and following implementation to see whether perceptions shift after experiencing a DGC. This could be completed while systematically manipulating particular intervention components and arrangements.
Another suggestion is to train teachers to use all group contingency variations and allow students or teachers to select which one to apply (e.g., Hanley et al., 1997). Evaluating continued intervention use would provide a more reliable indicator of preference and social validity compared with vocal or written reports (Kennedy, 2002). Studies in these areas could offer valuable information on aspects of DGCs that are socially valid/invalid (Schwartz & Baer, 1991) and how DGCs could be modified to improve social validity and contextual fit. These areas of future research provide fascinating opportunities to enhance our understanding of DGCs so that all students can be supported to achieve their maximal educational potential in ways that everyone enjoys.
Supplemental Material
Supplemental material, sj-docx-2-pbi-10.1177_10983007211054519 for A Systematic Review of Dependent Group Contingencies (1970–2019) by Scott V. Page, Dylan M. Zimmerman and Sarah E. Pinkelman in Journal of Positive Behavior Interventions
Supplemental material, sj-tiff-1-pbi-10.1177_10983007211054519 for A Systematic Review of Dependent Group Contingencies (1970–2019) by Scott V. Page, Dylan M. Zimmerman and Sarah E. Pinkelman in Journal of Positive Behavior Interventions
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
Development of this paper was supported by the Office of Special Education Programs U.S. Department of Education (H325D170080). Opinions expressed herein are those of the authors and do not necessarily reflect the position of the U.S. Department of Education, and such endorsements should not be inferred.
Supplemental Material
Supplemental material for this article is available on the Journal of Positive Behavior Interventions website with the online version of this article.
