Abstract
This study investigated the effects of the Good Behavior Game (GBG) on classwide off-task behavior in two ninth-grade basic algebra resource classes. Ten students with a variety of disabilities, in two classrooms, and their special education resource teacher participated in this study. A reversal design was employed in which the special education teacher implemented GBG, and GBG was compared with typical practice (algebra readiness instruction). Results showed that classwide off-task behavior decreased in the GBG conditions compared to the baseline and reversal conditions. Fidelity measures indicated that the teacher implemented GBG with fidelity. Students and the teacher rated GBG favorably. Overall findings support the use of GBG for reducing classwide off-task behavior. Implications for practice and future research directions are presented.
In today’s educational environment of accountability, schools face constant pressure to increase the academic performance of all students. As a mandate of the No Child Left Behind Act of 2001, schools must apply high academic standards and meet annual performance goals for all students, including those with disabilities. In addition, No Child Left Behind requires high schools to report their annual graduation rates and to demonstrate improved academic performance of their students. High school teachers must pay particular attention to students, including students with disabilities, who may be at risk for poor performance and school dropout.
Ninth grade is a critical year concerning school completion (Allensworth & Easton, 2007; Neild, Stoner-Eby, & Furstenberg, 2008) for all students, including students with disabilities, who drop out at higher rates than other student groups (Office of Special Education Programs, 2006). Researchers (e.g., Allensworth & Easton, 2007) have identified various predictors of dropout, including failure in English and mathematics classes, misbehavior, and poor attendance. Ninth-grade English and mathematics teachers must help all of their students learn and earn course credit for graduation. This is a daunting challenge, as students with disabilities have academic difficulties, often lack the prerequisite skills to successfully complete academic tasks (Wagner, Newman, Cameto, Levine, & Garza, 2006), and may attempt to avoid challenging assignments (Gunter, Denny, Jack, Shore, & Nelson, 1993), such as by engaging in off-task behavior.
According to a Public Agenda report (2004), teachers in secondary schools report that disruptive behavior is a significant concern. In fact, up to 40% of teachers reported spending more time managing disruptive behavior than teaching course content. Teacher concerns include noncompliance and disruptive behavior (i.e., off-task behavior).
Classwide Off-Task Behavior
When individual students or groups of students become off task in the classroom, they are more likely to demonstrate disruptive behaviors (Harris, 2008; Sutherland, Wehby, & Yoder, 2002), which are likely to affect all students’ attention and learning (Sutherland et al., 2002). Researchers studying on- and off-task behavior, or academically engaged time, report that, on average, typically developing students are on task for 75% to 80% of the time and thereby off task for 20% to 25% of the time (Walker & Severson, 1992). Gettinger and Walter (2012) reported that students might spend 45% to 50% of their time on task, or 50% to 55% of their time off task. This research (Gettinger & Walter, 2012; Walker & Severson, 1992) concerns the percentage of time that individual students spend off task. Few studies have examined classroom levels of on- or off-task behavior, although Richards, Heathfield, and Jenson (2010) recently conducted a study with classwide on-task behavior as the variable of interest. These researchers recorded the percentage of students on task. Results suggested that under the baseline condition, at any given time, 50% to 87% of students were on task, or 13% to 50% of students were off task. High levels of off-task behavior can be particularly problematic as it may adversely affect teacher efforts to provide instruction (Ratcliff et al., 2010). This highlights the need for teacher knowledge and fluency in effective classroom management strategies.
Evidence-Based Practices for Classroom Management
Effective classroom management practices may help to guard against off-task behavior. Successful classroom managers maximize structure and predictability; post, teach, review, monitor, and reinforce expectations; engage students in observable ways; and use a continuum of strategies to acknowledge appropriate behavior and respond to inappropriate behavior (Simonsen, Fairbanks, Briesch, Myers, & Sugai, 2008). Classrooms that are chaotic, less structured, and unpredictable are likely to waste instructional time (Simonsen et al., 2008). In the absence of effective strategies, disruptive behaviors can reduce time for teaching and learning. Such a loss of instructional time could translate to reductions in academic performance (Cameron, Connor, Morrison, & Jewkes, 2008).
On the other hand, classrooms with engaged students, structure, and predictability are likely to use instructional time efficiently and effectively (Simonsen et al., 2008). Therefore, strategies must address classwide and individual student behavior. Classwide on- or off-task behavior is an important variable for study because techniques targeting the entire class may be easier for teachers to implement, particularly at the high school level, which tends to rely on classwide instruction. The use of group contingencies is one such evidence-based approach to promote appropriate behavior among many students or an entire class.
Group Contingencies
Group contingencies are systems in which reinforcement for the group or individuals within the group is contingent on demonstrated behavior. In a meta-analysis of 99 studies, Stage and Quiroz (1997) concluded that group contingencies decreased disruptive behavior, finding a strong effect (ES = −1.02, SD = 0.63). Teachers may find group contingencies preferable because they are less demanding of time and effort than individualized contingencies and interventions and can address the behavior of all students in a classroom. Effective practitioners must collect and analyze data to determine whether or not a treatment is effective. Direct observation data are often used to determine effect (Fiske & Delmolino, 2012). However, it is impractical and inefficient to expect teachers to collect data on every student’s behavior in the classroom. When challenging behavior manifests in similar ways across several students, a group-oriented intervention such as a group contingency may be particularly useful. To measure the effect of a group contingency, a teacher might randomly select particular students for observation or collect classwide data with the class considered as a single entity, perhaps by using discontinuous recording methods such as momentary time sampling (MTS). Research has shown MTS to be a valid data recording method for evaluating intervention effectiveness (Devine, Rapp, Testa, Henrickson, & Schnerch, 2011; Rapp, Colby-Dirksen, Michalski, Carroll, & Lindenberg, 2008; Tiger et al., 2013) when using group contingencies.
Another important feature of group contingencies is that they may be uniquely relevant to youth in secondary settings because older students may be less likely to receive positive reinforcement than students in earlier grades. For example, in one study, less than 50% of high school teachers supported the use of strategies to acknowledge appropriate student behavior (Flannery, Sugai, & Anderson, 2009). In addition, teachers in secondary settings tend to use techniques that might be considered punitive (Eccles, 2004), rather than proactive. The Good Behavior Game (GBG; Barrish, Saunders, & Wolf, 1969) is a proactive group contingency that has been associated with improved behavior and increased student engagement (Embry, 2002).
GBG
The GBG has been recognized as a successful classroom management strategy (Embry, 2002; Lannie & McCurdy, 2007; Tingstrom, Sterling-Turner, & Wilczynski, 2006). The GBG’s procedures include identifying target behaviors, posting rules, identifying reinforcers, dividing classes into at least two equal teams, identifying rule violators and stating their infractions or identifying examples of behavior that meet expectations, debiting the offending team for infractions or awarding points for achievements, and presenting daily and weekly prizes to the team with the fewest infractions or most points. Some researchers have implemented a criterion for winning—that is, the winning team could not receive over a certain number of infractions (e.g., 5; Tingstrom et al., 2006). The GBG allows teachers to (a) acknowledge appropriate behavior, (b) teach classroom rules and expectations, (c) provide feedback about inappropriate behavior, (d) engage in response cost practices, and (e) provide reinforcement.
In their initial empirical evaluation, Barrish et al. (1969) successfully used the GBG to decrease out-of-seat and talking-out behaviors of fourth-grade students during mathematics and reading instruction. Other past applications of GBG suggested strong effects on impulsive, aggressive, disruptive, shy, and antisocial behavior, as well as substance use and abuse. The GBG primarily has been used to improve observable classroom behaviors with elementary-school-aged youth (Tingstrom et al., 2006). To date, most GBG studies have explored its use with students in first through fifth grades. Recent research replicated GBG use with kindergarteners (Donaldson, Vollmer, Krous, Downs, & Berard, 2011). An earlier study (Phillips & Christie, 1986) explored the use of the GBG to affect time off task in middle school students. Only two studies (Kleinman & Saigh, 2011; Salend, Reynolds, & Coyle, 1989) have used GBG with high school students to improve classroom behavior.
Kleinman and Saigh (2011) implemented GBG with typically developing students in a ninth-grade history class and found reduced verbal disruptions, aggression, and seat leaving. Only one study used GBG with high-school-aged youth with disabilities (Salend et al., 1989). That study, conducted with students with emotional disturbance at a residential school, found that GBG sharply decreased swearing and negative comments. No studies have been identified that used GBG with high school students with disabilities in a traditional public school setting. No identified studies have examined the effects of GBG on the levels of classwide on- or off-task behavior. The present study fills a gap in the literature by considering high school students with disabilities and classwide off-task behavior.
Purpose and Research Questions
The purpose of this study was to examine the effects of the GBG on the classwide off-task behavior of students with high-incidence disabilities at the high school level. The following questions guided this study:
Method
Setting
This study took place in a public high school in a suburban school district in Central Texas. The Texas Education Agency rated the district as “academically acceptable” and the participating high school as “recognized.” At the time of the study, the district served more than 10,000 students. Approximately 87% of the district’s students were considered to be economically disadvantaged; approximately 82% were Hispanic, 10% African American, and 6% Caucasian. The school’s enrollment was approximately 2,200 students and reflected the overall diversity of the school district. According to reports from the Texas Education Agency, approximately 63% of the school’s students were considered to be at risk and 12% received special education services. The resource room was in a typical high school classroom that contained 28 to 32 single desks set up in single-file rows that faced the overhead projector screen and whiteboard for whole-class instruction. Each class period lasted approximately 54 min.
Participants
Teacher
A female special education teacher of Caucasian descent participated in this study. The teacher had a master’s degree plus 30 hr of college credit in biology. She was alternatively certified in mathematics for Grades 8 through 12, in special education for early childhood through Grade 12, and in science for Grades 8 through 12. She served as a mathematics coach for her professional learning community and reported receiving prior training in schoolwide positive behavior supports, functional behavioral assessment, strategies for giving behavioral feedback to students, and behavior intervention plans. The teacher taught both resource classes, using the same curriculum. She had 3 years of teaching experience, including 2 years in ninth grade.
Students
Students in two ninth-grade algebra resource classes (Class 1 and Class 2) participated in the study. The average age of the students was 15.1. All students in the classes had disabilities and all demonstrated low mathematics performance, which made them eligible for resource program services in basic algebra. The majority of these students were identified with a specific learning disability (LD). Some students were identified with intellectual disabilities, and others were identified with other health impairments, mostly attention deficit hyperactivity disorder. All students in the classes received additional support from a study skills class, and several students participated in a reading intervention. Three students accessed an on-site behavior support program. The majority of the students were male, and most were Hispanic. All students were fluent English speakers, and instruction was provided in English. Approximately 90% of the participating students were eligible for free or reduced-price lunch.
Design
An ABAB reversal design with a follow-up phase was used to test the effects of GBG implementation on classwide off-task behavior of two high school basic algebra resource classes. In the intervention phase, the GBG was paired with typical instruction (GBG). The reversal phase included removal of the GBG to determine whether the GBG had an effect on classwide off-task behavior. To examine the maintenance of effects on classwide off-task behavior, follow-up observations were conducted two weeks after the conclusion of the final GBG phase.
Measurement
Classwide off-task behavior
The percentage of intervals in which the class was off task was selected as the dependent variable to demonstrate experimental control because during earlier observation sessions and in teacher-researcher discussions, (a) the majority of students displayed severe off-task and disruptive behaviors in both classes and (b) student behavior resulted in a chaotic environment that prevented access to academic instruction. For example, during one preobservation period in Class 1, the entire class (100% of students) was off task for 35 continuous minutes, with little to no learning occurring, despite teacher efforts. During other preobservation sessions, the majority of students were off task for multiple periods of time.
Using a 1-min momentary time sampling procedure, classwide off-task behavior was coded when two thirds or more of the class was observed to be off task. Off-task behavior was defined as time not attending to classroom instruction or completing teacher-assigned tasks (Lane et al., 2009). Two thirds was used as a metric for classwide off-task behavior because (a) it was unlikely that 100% of the class would be off task at the same moment; (b) in classes of 6 (Class 1) and 11 students (Class 2), respectively, only one or two students were typically on task and not engaging in behaviors that disrupted instruction; and (c) at least one study (Richards et al., 2010) found that 13% to 50% of a class might be off task at any given time. The intent of the GBG implementation here was to reduce classwide off-task behavior.
The observers used an MTS procedure with a 1-min interval size. Observers collected MTS data throughout the class period rather than during a brief observation period, an approach that research has determined to be valid (Devine et al., 2011; Rapp et al., 2008; Tiger et al., 2013). At the end of each 1-min interval, a prerecorded cue sounded through the observer’s earphone, and the observer then had 5 s to count the number of students who were off task. If two thirds or more of the students (e.g., 6 out of 9) were off task at the cue, the class as a whole was recorded as off task, because the class was treated as a single entity. This procedure was repeated every minute across the class period (approximately 50 min); thus, each data point in Figure 1 represents the percentage of 1-min intervals in which the class was coded as off task during one class period for Class 1 or Class 2.
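The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the observers' actual tooling: the function names and the sample counts are hypothetical, and the threshold is assumed to be "two thirds or more" so that the 6-out-of-9 example counts as off task.

```python
from fractions import Fraction

# Assumption: the class is coded off task when two thirds or more of the
# students are off task at the cue (so 6 of 9 qualifies).
THRESHOLD = Fraction(2, 3)


def class_off_task(off_task_count, class_size, threshold=THRESHOLD):
    """Return True if the class counts as off task at one momentary sample."""
    if class_size <= 0:
        raise ValueError("class_size must be positive")
    # Fraction avoids floating-point error at the exact two-thirds boundary.
    return Fraction(off_task_count, class_size) >= threshold


def percent_intervals_off_task(off_task_counts, class_size):
    """Percentage of sampled intervals in which the class was coded off task."""
    if not off_task_counts:
        raise ValueError("at least one interval is required")
    coded = [class_off_task(n, class_size) for n in off_task_counts]
    return 100 * sum(coded) / len(coded)


# Hypothetical session: ten 1-min samples in a class of 9 students.
counts = [6, 7, 2, 1, 0, 5, 6, 3, 2, 1]
print(percent_intervals_off_task(counts, 9))  # 30.0
```

Each session yields one such percentage, which corresponds to one plotted data point per class.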

Figure 1. Percentage of 1-min intervals with off-task behavior across Baseline, Good Behavior Game (GBG), Reversal, and Follow-Up phases for Class 1 (upper panel) and Class 2 (lower panel).
Two researchers were initially trained in operational definitions and observation procedures. Researchers then conducted two 20-min observations to establish initial interobserver agreement (IOA). At the conclusion of each initial IOA observation, researchers discussed all disagreements and, when appropriate, discussed operational definitions of observation constructs.
IOA for the dependent variable was calculated during 30.8% of all observations. IOA was assessed during 33.3% of observations of Class 1 (n = 6) and 28.5% of observations of Class 2 (n = 6). IOA was calculated by dividing the number of agreements by the sum of agreements and disagreements. Overall, IOA was 93% for classwide off-task behavior.
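As a minimal sketch of the agreement formula just described (agreements divided by agreements plus disagreements), the following assumes that agreement is scored interval by interval on each observer's off-task/on-task code. The function name and the sample records are hypothetical.

```python
def interobserver_agreement(observer_a, observer_b):
    """Percentage agreement between two observers' interval-by-interval codes.

    Assumption: each list holds one code per interval (True = class coded
    off task), and an agreement is an interval where both codes match.
    """
    if len(observer_a) != len(observer_b):
        raise ValueError("both observers must code the same intervals")
    if not observer_a:
        raise ValueError("at least one interval is required")
    agreements = sum(a == b for a, b in zip(observer_a, observer_b))
    disagreements = len(observer_a) - agreements
    return 100 * agreements / (agreements + disagreements)


# Hypothetical 10-interval records that differ on one interval.
a = [True, True, False, False, True, False, False, True, False, False]
b = [True, True, False, False, True, False, True, True, False, False]
print(interobserver_agreement(a, b))  # 90.0
```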
Social validity
Student social validity was assessed using a modified version of the Children’s Social Validity Interview (Lane, 1997) to solicit responses concerning the utility of GBG and GBG+. Eight participating students rated the acceptability of the intervention on two 12-item questionnaires, consisting of yes/no questions, open-ended questions, and questions based on a 3-point Likert-type scale (1 = none at all; 2 = a little; 3 = a lot). Example items include (a) “Did you like participating in [GBG or GBG+]?” (yes/no), (b) “What did you like best?” (open ended), (c) “Did you like earning prizes?” (3-point Likert-type scale), and (d) “Did you learn new skills you will use in the classroom?” (3-point Likert-type scale). Two participating students did not complete a social validity measure due to school absence or time conflicts.
The teacher completed a modified version of the Intervention Rating Profile-15 (Witt & Elliott, 1985), which was used to assess her ratings of the social validity of the GBG and GBG+ interventions. The questionnaire consisted of 15 items based on a 6-point Likert-type scale. Example items include (a) This would be acceptable to help with class engagement, (b) This method should prove effective, (c) This method would be appropriate for various types of classes, and (d) This method was a good way to handle the problem described.
Procedures
Baseline
The baseline condition consisted of typical instruction in algebra readiness content. Observations prior to the study revealed students wandering around the classroom, interrupting student-teacher conversations, bullying and teasing peers, and ignoring teacher directions and instructions. These behaviors were defined as off-task behaviors. High rates of disruptive and off-task behavior were characteristic of both classes.
Typical teacher responses to problem behavior included repeatedly asking individual students questions during whole-group instruction to engage them in the lesson and providing one-on-one assistance with the algebra content. Behavioral feedback occurred infrequently. In fact, only a small proportion of students in either of the classes received any behavioral feedback. In many instances, the teacher ignored problem behavior and continued to provide instruction, despite little student participation. The teacher tended to direct opportunities to respond toward students who primarily displayed compliant behavior in the classroom. During baseline, it was common for students to earn their participation points for the whole-class period by answering one teacher-provided question, despite being off task or disruptive for the majority of the class period.
Preparation for GBG Intervention
From observations prior to and during baseline, the research team learned that the teacher had limited classroom management and that her expectations were not well defined or used. For 1 week following baseline, the research team met with the teacher for training in order to prepare for GBG implementation. Each GBG training session for the teacher lasted approximately 30 min per day. The teacher worked with the researchers to redevelop classroom expectations so that they could be used during GBG. The researchers and teacher discussed how the teacher should explain the classroom expectations by talking with students about what each looked like and sounded like in her classroom. At the end of the week, the researchers presented the GBG to the teacher and explained its procedures. At this time, the researchers also modeled the use of the GBG for the teacher. Prior to beginning to play the GBG, the teacher worked with students to discuss and directly teach the classroom expectations.
GBG Phase
After discussing and teaching the classroom expectations, the teacher explained that the class would have a competition (the GBG) in teams to help meet the expectations. The teacher told students that she would look for students who were meeting expectations and that when she observed a student not meeting expectations, she would record a “foul” for that student’s team. The class was then divided into teams of three in Class 1 and teams of four in Class 2, which were named by the respective team members. Students were then required to sit with their teams, but the precise location was left to the students’ discretion. The teacher also explained that the winner would be the team with the fewest fouls each day as long as the number of fouls was below a certain level. The level of allowed fouls would remain unknown to the students each day until the end of the class period. The criterion requirement, which was consistent with one of the other two high school studies (Salend et al., 1989), was put into place because the teacher and the research team believed that, as high school students with disabilities and challenging behavior, the students might perform only the minimum work required to earn the daily prize. Without the criterion, if one team were having a difficult time with their behaviors, the other team might not make an effort to do their best and instead put in minimal effort. With the mystery criterion requirement, students had to do their best to attempt to meet the criterion. Winners were declared at the end of the competition for the period. The students were told that the daily winner would earn a prize and a token to contribute toward a collective group prize. An open-ended preference assessment was also conducted at this point to determine what students were interested in earning. Students were provided with a sheet of lined paper and asked to indicate items or opportunities that they were interested in earning for meeting classroom expectations.
Students were instructed to be reasonable with their list, meaning that they should not indicate activities or items that were prohibited by school policy. The teacher then examined these lists in order to provide daily prizes for the winning team or teams. In an effort to secure student buy-in and to prevent students from becoming discouraged, more than one winner was possible during any class period (e.g., in the case of a tie). The preference assessment was also used to determine the reward that would be earned when the class filled their class container (see below).
Each time the GBG was implemented, the teacher announced “game on,” wrote team names on the board, and directed students to sit with their assigned teams. The teacher then reviewed the GBG procedures and class expectations. After reminding students that they were trying to ensure that they did not exceed the mystery number of fouls for winning, academic instruction began. During instruction, the teacher gave fouls for behaviors that violated class expectations. When 5 min remained in the class period, the teacher announced that she was “pausing” the game to count each team’s fouls. The mystery criterion for winning was announced and the winners identified. Winners won a reward at the end of class, based on the preference assessment (e.g., a piece of candy, school supplies, a coupon for a free homework), and they also earned a token to store in a large class container (i.e., a gallon-size jar), which was used to track progress toward earning a larger class reward. The daily prize was provided to winners as they exited the classroom, whereupon they also added their token to the large class container. This procedure required approximately 2 min upon students’ exit.
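The end-of-period decision rule can be illustrated with a short Python sketch. It rests on two assumptions that the description leaves open: a team meets the criterion when its fouls do not exceed the mystery number, and every team tied for the fewest fouls wins. The function name, team names, and foul counts are hypothetical.

```python
def daily_winners(fouls_by_team, mystery_criterion):
    """Return the winning team names for one class period.

    Assumptions: the team(s) with the fewest fouls win, ties produce
    multiple winners, and nobody wins if the fewest-fouls count exceeds
    the mystery criterion.
    """
    if not fouls_by_team:
        raise ValueError("at least one team is required")
    fewest = min(fouls_by_team.values())
    if fewest > mystery_criterion:
        return []
    return sorted(team for team, fouls in fouls_by_team.items() if fouls == fewest)


# Hypothetical tallies counted when the game is paused.
print(daily_winners({"Team A": 2, "Team B": 4}, mystery_criterion=3))  # ['Team A']
print(daily_winners({"Team A": 2, "Team B": 2}, mystery_criterion=3))  # ['Team A', 'Team B']
print(daily_winners({"Team A": 5, "Team B": 6}, mystery_criterion=3))  # []
```

Keeping the criterion hidden until the tally means a team cannot stop trying once it is merely ahead of the other team, which is the rationale given for the mystery criterion.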
Reversal Phase
After establishing that the GBG was associated with a change in level and trend compared with baseline, the GBG was withdrawn. This reversal phase allowed the researchers to again demonstrate that behavior change was experimentally associated with GBG implementation. During this phase, the GBG was withdrawn and the teacher was told to do what she might normally do in her classroom. In this phase, the teacher used the same strategies observed in baseline including repeatedly asking questions of individual students during whole-group instruction in an effort to engage them in the algebra content. Again, behavioral feedback occurred infrequently. Again, students were able to earn their participation points for the whole-class period by answering one teacher-provided question at the end of the period. This occurred despite being off task or disruptive for the majority of the class period.
Follow-Up Phase
Five weeks after the last GBG phase, observations for maintenance of the GBG procedures were conducted. The teacher was not informed about what the researchers would observe during these follow-up visits. During this phase, the researchers were interested in (a) the extent to which the teacher had maintained GBG procedures after researcher involvement had ended and (b) what the levels of classwide off-task behavior were at that time.
Fidelity of Implementation
Fidelity of implementation was calculated for each intervention phase and the reversal condition. Reversal conditions were measured to confirm the absence of intervention components. Fidelity for GBG was measured using a researcher-developed checklist adapted from Lannie and McCurdy (2007). The checklist was composed of GBG procedures, such as the teacher (a) posting the recording sheet, (b) announcing the beginning of the game, (c) reviewing rules/expectations with students, (d) providing reminders about the criterion, (e) identifying and recording fouls, (f) counting total fouls, (g) announcing the day’s criterion and the winning team(s), (h) providing winning team(s) with a reward, and (i) reviewing performance toward a cumulative reward. Across GBG conditions, mean treatment fidelity was 89.28% for Class 1 and 86.09% for Class 2. Interobserver agreement of fidelity was measured during 21% of the fidelity observations (intervention phases and reversals), with researchers having 94.80% agreement. In addition, the classroom teacher self-assessed fidelity of game procedures during 81.25% of the intervention sessions and provided researchers with a completed fidelity checklist for scoring purposes. Teacher self-assessment of fidelity had means of 88.61% for both classes. When fidelity was scored by the researchers and self-assessed by the teacher, agreement between the teacher and researchers was 85.47%.
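The text does not state how a session's fidelity percentage was scored. One common approach, shown here purely as an assumption, is the share of checklist steps observed in a session. The step labels paraphrase items (a) through (i) above, and the True/False marks are hypothetical.

```python
def fidelity_percent(steps_completed):
    """Percentage of checklist steps marked as completed (True).

    Assumption: every step is weighted equally; the source does not
    specify the scoring rule.
    """
    if not steps_completed:
        raise ValueError("checklist must contain at least one step")
    return 100 * sum(steps_completed.values()) / len(steps_completed)


# Hypothetical session in which one of the nine steps was missed.
checklist = {
    "posted recording sheet": True,
    "announced beginning of game": True,
    "reviewed rules/expectations": True,
    "reminded students about criterion": False,
    "identified and recorded fouls": True,
    "counted total fouls": True,
    "announced criterion and winning team(s)": True,
    "provided reward to winning team(s)": True,
    "reviewed progress toward cumulative reward": True,
}
print(round(fidelity_percent(checklist), 2))  # 88.89
```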
Data Analysis
Classwide off-task behavior
The percentage of classwide off-task behavior was graphed after each observation session. Visual inspection was used to analyze changes in level and trend as well as to make decisions regarding phase changes (Kazdin, 2011). Visual inspection is a common analysis approach used in single-subject research in order to observe an experimental effect after treatment begins, and to demonstrate experimental control through replication across different phases (Horner et al., 2005; Kennedy, 2005). In addition to visual analysis, the percentage of nonoverlapping data (PND) was calculated as a measure of magnitude of the effect (Scruggs, Mastropieri, & Casto, 1987) on classwide off-task behavior. To determine PND, the number of treatment data points lower than the lowest baseline data point was divided by the total number of treatment points and then multiplied by 100. It was expected that the independent variable would facilitate a decrease in the dependent variable; thus, the lowest baseline data point was used as a reference point. Use of PND to evaluate the effectiveness of interventions that use a single-subject design is a common practice. Little or no overlap between baseline and intervention is considered evidence of a treatment effect (Kratochwill et al., 2010). Interventions with PND greater than 90% are considered highly effective, those with PND between 70% and 90% are considered moderately effective, those with PND between 50% and 70% are considered mildly effective, and those with PND less than 50% are considered ineffective (Mastropieri, Scruggs, Bakken, & Whedon, 1996).
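The PND computation and the effectiveness bands described above can be sketched as follows. The function names and data are hypothetical, and the treatment of values that fall exactly on a band boundary (70% or 50%) is an assumption, since the cited ranges do not say which band includes the endpoint.

```python
def pnd_for_decrease(baseline, treatment):
    """PND when the intervention is expected to lower the behavior.

    Counts treatment points strictly below the lowest baseline point,
    divides by the number of treatment points, and multiplies by 100.
    """
    if not baseline or not treatment:
        raise ValueError("both phases need at least one data point")
    lowest_baseline = min(baseline)
    below = sum(1 for value in treatment if value < lowest_baseline)
    return 100 * below / len(treatment)


def classify_pnd(pnd):
    """Map a PND value to the effectiveness labels cited in the text.

    Assumption: boundary values of exactly 70 and 50 fall in the higher band.
    """
    if pnd > 90:
        return "highly effective"
    if pnd >= 70:
        return "moderately effective"
    if pnd >= 50:
        return "mildly effective"
    return "ineffective"


# Hypothetical percentages of intervals off task (not the study's data).
baseline = [40, 50, 60]
treatment = [0, 5, 10, 45, 20]
pnd = pnd_for_decrease(baseline, treatment)
print(pnd, classify_pnd(pnd))  # 80.0 moderately effective
```

Because PND depends on a single extreme baseline point, one unusually low baseline session can pull the statistic down sharply, which is why it is reported alongside visual analysis rather than in place of it.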
Social validity
Social validity data were evaluated with descriptive statistics (means and ranges) for the GBG. Social validity was analyzed for each item as well as total social validity. Both teacher and student responses were evaluated.
Results
The GBG was implemented and withdrawn over several weeks in two basic algebra resource classes in order to reduce the overall high level of classwide off-task behavior. The following sections contain the findings related to the use of the GBG in two high school basic algebra resource classrooms.
Effect of the GBG on Classwide Off-Task Behavior
Figure 1 shows the effects of GBG on classwide off-task behavior across all phases: Baseline 1, GBG (first GBG implementation), reversal, GBG (second implementation of GBG), and follow-up/maintenance for each class.
Class 1
During baseline for Class 1, classwide off-task behavior was quite high, ranging from 36% to 61.11% of 1-min intervals with an overall increasing trend. Implementation of the GBG resulted in a decrease in classwide off-task behavior from 55.1% of intervals to 3.33%, a difference of 51.77 percentage points. In the first GBG phase for Class 1, classwide off-task behavior ranged from 0% to 13.04%, with three data points at 0%, one at 3.33%, and another at 13.04%. Clear separation between baseline and the first GBG phase was evident. In the next phase, reversal of the GBG was associated with an increase in classwide off-task behavior, with a change from 0% to 43.18%. In the reversal phase, classwide off-task behavior ranged from 43.18% to 69.38% with a slightly increasing trend across four data points. The reintroduction of the GBG was associated with an immediate decrease in classwide off-task behavior from 69.38% (reversal) to 0% at GBG reintroduction. In the second GBG phase, classwide off-task behavior ranged from 0% of intervals to 48.57% of intervals. Clear separation was observed between reversal and the second GBG phase. During the follow-up phase in Class 1, classwide off-task behavior was observed to be approximately 30% with a range from 27.9% to 32.65%.
Calculation of PND on classwide off-task behavior for Class 1 was found to be 90% for the GBG intervention phase and 100% for the maintenance phase. Thus, 90% and 100% of data points in these phases did not overlap with baseline.
Class 2
During baseline for Class 2, classwide off-task behavior ranged from 27.9% to 67.34% of 1-min intervals, with a slightly increasing trend. Upon implementation of the GBG, classwide off-task behavior dropped from 59.09% of intervals (in baseline) to 6.38% of intervals (in GBG), a decrease of 52.71 percentage points. During the first GBG phase, classwide off-task behavior ranged from 6.38% to 23.4% of intervals, with most observations at approximately 10%. Between the last data point in the first GBG phase and the first data point in the reversal phase, classwide off-task behavior rose from 12% to 28.3%. During the reversal phase, classwide off-task behavior ranged from 27.27% to 84% of intervals, with an increasing trend. Upon the next implementation of the GBG, classwide off-task behavior again decreased, from 84% in the reversal phase to 7.69% in the second GBG phase, a decrease of more than 70 percentage points. During this GBG phase, classwide off-task behavior remained somewhat below baseline levels but showed some variability, with a range from 7.69% to 38%. Finally, classwide off-task behavior increased from 16.66% during GBG to 39.96% of intervals for the first observation in the maintenance phase. Classwide off-task behavior during maintenance ranged from 27.27% to 63.16% of intervals.
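The phase-change figures reported for Class 2 can be checked with simple arithmetic. The sketch below separates the change in percentage points (the difference between two percentages of intervals) from the relative change (that difference as a share of the starting level). The function name `level_change` is hypothetical, and the relative figures are derived here for illustration; they are not values reported in the article.

```python
def level_change(last_before, first_after):
    """Return (percentage-point drop, relative drop in %) between two phases."""
    if last_before == 0:
        raise ValueError("relative change is undefined for a starting level of 0")
    points = last_before - first_after
    relative = 100.0 * points / last_before
    return round(points, 2), round(relative, 1)


# Class 2, baseline to first GBG phase: 59.09% to 6.38% of intervals.
print(level_change(59.09, 6.38))  # (52.71, 89.2)

# Class 2, reversal to second GBG phase: 84% to 7.69% of intervals.
print(level_change(84.0, 7.69))   # (76.31, 90.8)
```

Both readings agree with the text: the second drop exceeds 70 whether it is expressed in percentage points or as a relative decrease.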
Calculation of PND between baseline and GBG phases for Class 2 was found to be 90%. Thus, 90% of the intervention data points did not overlap with baseline. For the maintenance phase, PND was 33.33% as all but one maintenance data point overlapped with baseline.
Use of the GBG During Follow-Up
When the researchers returned to conduct follow-up observations to determine (a) whether the GBG procedures had been maintained and (b) whether the effects on classwide off-task behavior had been maintained, two interesting findings emerged. In both classes, the teacher had discontinued the use of GBG procedures. In Class 1, classwide off-task behavior largely remained below baseline levels but had returned to a level higher than that observed during the GBG intervention phases. In Class 2, classwide off-task behavior had largely returned to the level observed during baseline.
Social Validity
On a set of questions that employed a 3-point Likert-type scale, mean student responses ranged from 1.875 to 2.625 (M = 2.14), suggesting a somewhat positive opinion of the GBG. On questions with yes/no answers, the percentage of students answering yes ranged from 50% to 87.5% (M = 70.83%), also suggesting a somewhat positive opinion of GBG. Multiple students indicated that their favorite part of the GBG was the use of teams. For the teacher measure, the mean response concerning the GBG was 5.13 out of 6, suggesting a positive view.
Discussion
The purpose of the present study was to evaluate (a) the effects of GBG on classwide off-task behavior, (b) the extent to which the teacher continued to implement the GBG procedures without researcher involvement and the effect on classwide off-task behavior, and (c) the social validity for the use of the GBG with high-school-aged students. The findings are discussed according to the research questions. We also offer our insights about the limitations of the study and recommendations for future research. Finally, we provide implications for practice concerning the use of GBG in the high school setting.
Reductions in Classwide Off-Task Behavior
Overall, the GBG resulted in reduced classwide off-task behavior. Under baseline conditions, the class spent significant amounts of time engaged in off-task behavior, and during GBG reversal phases, classwide off-task behavior largely returned to baseline levels. It is particularly interesting that the final data point in the reversal phase of Class 2 showed a very high percentage of classwide off-task behavior. This data point was also the farthest away from the previous implementation of the GBG intervention. In addition, during this reversal phase, students in both classrooms asked why they were not playing GBG. Increasing trends across baseline phases suggest that when GBG was not in place and as time continued without exposure to GBG, the classes experienced higher percentages of classwide off-task behavior. These findings are particularly striking, given that each observation represented an entire class period rather than a brief interval of time (e.g., 10 min). The behavioral improvement found in this study is consistent with results from previous implementations of the GBG at the high school level (e.g., Kleinman & Saigh, 2011; Salend et al., 1989) and is further evidence of the need for teachers to implement appropriate and consistent classroom management procedures (Simonsen et al., 2008; Witt, VanDerHeyden, & Gilbertson, 2004) such as the GBG.
Anecdotally, in the reversal phase, when students asked why they were not playing the game, they also asked about whether they could win prizes and about the value of the mystery number (referring to the mystery criterion). When the teacher responded that they could not win prizes and/or that there was not a mystery number because they were not having the competition, the students asked when they could have the competition again. In addition to the formal social validity interviews, these exchanges also spoke to the students’ acceptance of the GBG procedures.
In the reversal phase for Class 1, students initially presented with less classwide off-task behavior than in the baseline condition. Because students had already been exposed to GBG in the first GBG phase, they were well aware of classroom expectations and had already received reinforcement for meeting them. Thus, behavior that met expectations may have generalized to some degree, which in practice is a very desirable and instructionally important outcome. In addition, the teacher had become more comfortable with providing feedback to all of her students, not just to students she believed would respond positively, as was seen in earlier observations. Thus, there may have been some generalization of teacher and student behaviors from changes to the classroom ecology and from the provision of researcher feedback and consultation. Examinations of graphs from other GBG studies at the high school level reveal similar results (e.g., Kleinman & Saigh, 2011; Salend et al., 1989).
Classwide off-task behavior in the baseline and reversal phases often exceeded 50% of intervals. Most researchers (e.g., Gettinger & Walter, 2012) and practitioners would likely agree that such high levels of classwide off-task behavior reduce instructional time (Sutherland et al., 2002) and thereby take a serious toll on overall acquisition of and fluency with instructed content and skills (Codding & Smyth, 2008) and on student achievement (Reinke, Lewis-Palmer, & Merrell, 2008). Considering the GBG’s reductive effects on classwide off-task behavior, it is likely that students have more access to academic instruction under GBG conditions. For high school students, this result may mean increased likelihood of passing examinations required for graduation.
Interestingly, even though fidelity ratings for the GBG were consistently above 80%, the intervention appeared to be more effective (i.e., greater reductions in classwide off-task behavior) when fidelity was rated at 95% or more. Lower fidelity percentages were most often associated with a failure to use fouls for behavior that did not meet expectations. Had this standard been adhered to more closely, GBG implementation might have resulted in greater reductions in classwide off-task behavior, as consistency in behavior management is critical (Gable, Hester, Rock, & Hughes, 2009; Simonsen et al., 2008).
Follow-Up
Prior to intervention, the teacher expressed frustration concerning student behavior and reported needing help to facilitate learning in her classroom. The researchers’ observations of how the students sought attention from their peers through their disruptive behavior as well as the teacher’s approach to managing such behavior led the researchers to suggest the GBG. The effects of the GBG were particularly salient in Class 1, and teacher-rated social validity was quite high; thus, it was surprising that upon return after a period of several weeks, the GBG procedures were no longer being used. The teacher may have eliminated the GBG from use because she was not able to collect data about its effectiveness without the researchers’ presence. However, the practical significance of the decrease in off-task behavior should have functioned to maintain the teacher’s use of the GBG procedures. Interestingly, during maintenance, in Class 1 the level of classwide off-task behavior had not returned to the same level observed during baseline but was at a level somewhat above that which was observed during the last GBG phase.
Social Validity of the GBG for High School
Although rarely researched in high school settings, GBG may be especially relevant for these students, as the social nature of these environments and social preferences of adolescents (Patrick, Ryan, & Kaplan, 2007; Ryan & Patrick, 2001) fit well with the interdependent contingency of GBG. The present study supports GBG’s relevance for high school students: During the reversal phases, students requested to work with their teams and questioned why they were not playing the game. During this developmental period, adolescents strongly seek peer approval (Scott & Steinberg, 2008). Playing the GBG, where rewards are contingent on behavior, creates a system in which peer approval promotes positive behavior. Overall, social validity interviews revealed very positive responses for the GBG according to the students and their teacher.
Limitations
This study has a few limitations. First, GBG has a specific set of defined procedures for teachers to follow, meaning that GBG changes both student behavior and, in most cases, teacher behavior. In our study, fidelity was reduced at times because the teacher failed to give fouls for inappropriate behavior. However, fidelity scores were always greater than 85%, and student behavior changed dramatically, even with teacher fidelity less than 100%. In addition, this study took place in a resource classroom with high school students with disabilities, which may limit its external validity for classrooms that include students with disabilities who are not instructed in a resource or self-contained setting. However, results are consistent with previous studies showing that GBG implementation is related to behavioral improvement, in this case reductions in classwide off-task behavior in an algebra classroom.
Future Research
Previous research has not widely explored the use of GBG with high-school-aged youth. Only two other studies (Kleinman & Saigh, 2011; Salend et al., 1989) have addressed the effect of GBG on classroom behavior with this age group. Although GBG reduced classwide off-task behavior in the present study, which is an important outcome, future research should examine the use of GBG to increase on-task behavior. Finally, with the focus on autonomy in adolescence, interventions that use GBG and that incorporate elements of self-monitoring should be considered for secondary settings.
Implications for Practice
This study offers several implications for practice. First, teachers must be able to minimize problem behavior while keeping the instructional environment intact, rather than requesting removal of or avoiding interactions with disruptive students. Baseline findings in the present study corroborate previous research (Gunter & Coutinho, 1997; Wehby, Symons, Canale, & Go, 1998), which suggested that teachers tend to interact more frequently with students who are the most compliant and inconsistently interact with or ignore students with problem behavior. Yet, these students often need the most support and remediation, given their low achievement (Wagner et al., 2006). GBG limits such teacher avoidance by requiring teachers to identify behaviors that violate class expectations, assign fouls to teams, and provide corrective feedback to increase chances for students to use more appropriate behavior on subsequent opportunities.
This study suggests that GBG may be useful for secondary settings, as it had a clear effect on classwide off-task behavior. When teachers reduce off-task behavior, they can teach with greater breadth and depth, which is encouraging, given that all schools are trying to improve their test scores and graduation rates to comply with No Child Left Behind. Results of this study, combined with other high-school-level studies (Kleinman & Saigh, 2011; Salend et al., 1989), also suggest that GBG may assist secondary teachers’ management of prevalent but less serious problem behavior, such as noncompliance and overall classroom disruption, as referred to in a Public Agenda report (2004).
Anecdotally, conversations about implementing GBG at the secondary level often turn to student buy-in and whether students would see GBG as elementary. The present study provides promising evidence that students in secondary settings may be motivated to engage in GBG and improve their classroom behavior if reinforcement is appropriate and preferred. In fact, the students in the present study not only bought in to GBG, but also requested to play the game again or to work with their teams during reversal phases and follow-up. A system in a game-like format might also give teachers who feel less confident and prepared to manage behavior (Public Agenda, 2004) a framework to do so in a preventive, rather than reactive, manner, which will help them to maximize instructional time and foster student learning.
Footnotes
Declaration of Conflicting Interest
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) declared receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by the Texas Education Agency Grant 126600287110004.
