Abstract
This article features two studies examining the impact of a Tier 2 intervention, Recognize. Relax. Record. (RRR), designed to support elementary students with internalizing behaviors by helping them manage anxious feelings and increase academic engagement. We collaborated with teachers to use screening and attendance data to identify students in a Midwestern U.S. elementary school who might benefit from RRR. Results of a multiple-baseline design across participants study with five students provided modest evidence of a functional relation between introduction of RRR Instruction and decreases in variability of students’ engagement during academic instruction. These shifts were mostly small in magnitude and require replication and further inquiry before results can be generalized.
Treatment integrity data suggested high levels of implementation fidelity. Overall, social validity ratings were positive for most participants; however, some variability in ratings across students—as well as differences in adults’ pre- to post-intervention ratings—suggested the intervention was more acceptable for supporting some students than others. Results of an A–B nonexperimental study provided descriptive data suggesting potential benefits of implementing RRR instruction in a small-group format. We discuss the findings of this initial test of RRR, noting limitations and directions for future inquiry.
In the wake of the COVID-19 pandemic, there is rising concern for the well-being of school-age youth, with calls to address these concerns and priorities heard from policymakers, researchers, and media (National Center for Education Statistics [NCES], 2023; Richtel, 2023). This situation predates the pandemic, with evidence suggesting an increasing trend in social, behavioral, and emotional challenges for youth. For example, Lebrun-Harris et al. (2022) found a significant increase in diagnosed well-being concerns among children, including a 29% increase in anxiety and a 10% increase in behavioral/conduct problems from 2016 to 2020. The impact of the pandemic may have worsened this trend (Bera et al., 2022; NCES, 2023). It has long been true that schools act as the most common entry point for youth to receive support for their emotional well-being (Duong et al., 2021; Farmer et al., 2003). With schools closed or functioning in nontraditional ways due to COVID-19, students may have not only been negatively affected directly by school closures, social isolation, and loss of loved ones, among other pandemic impacts, but they may also have lost opportunities to access services in the setting where a range of supports—academic, behavioral, and social and emotional well-being—are most routinely provided (Lane et al., 2021; Minkos & Gelbar, 2020).
Given these circumstances, educators need research-based strategies to encourage students’ academic, behavioral, and social growth, with an emphasis on instilling skills related to self-agency and self-determination. This priority was articulated by the Institute of Education Sciences (IES) in 2022, with funding opportunities created for research and program development aimed at pandemic recovery by accelerating academic learning alongside social, emotional, and behavioral skill development for students with and at risk for disabilities. This aim of creating school structures to support students across learning domains in a purposeful and proactive manner overlaps with recent expanding interest in integrated tiered systems (Gandhi et al., 2023; Lane et al., 2020). Schools implementing these systems create a continuum of effective, research-based supports integrating multiple domains (e.g., academic, behavioral, social, and emotional well-being) and taking place across three tiers: Tier 1 (i.e., primary prevention for all students), Tier 2 (i.e., for students with moderate risk), and Tier 3 (i.e., for students with intensive needs; I-MTSS Research Network, 2023; Lane et al., 2020).
Within these systems, a need exists to expand the capacity of school personnel to implement feasible and effective interventions for students with and at risk for emotional and behavioral disorders (EBD; Walker et al., 2014), as these students often demonstrate externalizing (e.g., noncompliant, aggressive) and internalizing behaviors (e.g., shy, anxious, socially withdrawn), which present challenges to their healthy development (Lane et al., 2019; McIntosh et al., 2014). Indeed, in a survey investigating the professional learning needs of educators working in schools implementing Comprehensive, Integrated, Three-tiered (Ci3T) models of prevention (Lane et al., 2009, 2020), a type of integrated tiered system, strategies to support students with internalizing behaviors were rated as the second highest need (Common et al., 2021). Students demonstrating internalizing behaviors represent a group at risk for experiencing deleterious effects following the pandemic and from ongoing crises related to student well-being. To address this need, in the present study, we describe a preliminary test of Recognize. Relax. Record. (RRR; Ci3T Project ENHANCE Research Team, 2021), a school-based Tier 2 intervention designed to support students experiencing anxious thoughts and feelings at school.
Internalizing Behaviors and School-Based Interventions
Two major classes of behavioral challenges in childhood are externalizing and internalizing behaviors (Achenbach, 2001; Walker et al., 2014). Externalizing behaviors are directed outwardly into the social environment and can include behaviors such as noncompliance, impulsiveness, and aggression. In contrast, internalizing behaviors are typically directed inward toward the self. Examples include anxious feelings, social withdrawal, and somatic complaints. Externalizing and internalizing problems may exist alone or in combination, as moderate-to-high levels of comorbidity have been observed (Willner et al., 2016). Distinctions in the manifestation of externalizing and internalizing behaviors in social environments such as schools have resulted in a historical imbalance in responses to these challenges. Namely, students with internalizing behaviors are much less likely to be referred for support than students demonstrating externalizing behaviors (Bradshaw et al., 2008). Also, students with co-occurring externalizing and internalizing challenges may be referred for support related to disruptive behaviors, potentially overlooking covert internalizing problems. Although students with internalizing behaviors have historically been under-referred, evidence indicates these concerns may carry serious implications for long-term outcomes, particularly if left unaddressed. For example, students with internalizing behaviors are at risk for poor academic achievement (Gresham et al., 1999; Mychailyszyn et al., 2010), problems with peer relationships (Henricsson & Rydell, 2006), and the development of more serious mental health issues such as clinical depression or anxiety disorders (Merrell & Gueldner, 2010).
Of disorders associated with internalizing behaviors, anxiety is one of the most common among school-age youth. A meta-analysis of anxiety prevalence during the COVID-19 pandemic estimated 31% of children and adolescents experienced anxiety symptoms (Deng et al., 2023). Results were similar to pre-pandemic estimates. Merikangas et al. (2010) found 32% of children experienced an anxiety disorder at some point during adolescence, with 8.3% experiencing severe impairment. Anxiety-related disorders typically involve a combination of physiological responses (e.g., pounding heart, sweating), cognitive and mood problems (e.g., difficulty concentrating, irritability), and avoidant behaviors (e.g., behaviors incompatible with participation in contexts associated with fears; American Psychiatric Association, 2022; Gresham & Kern, 2004). Collectively, evidence indicates an urgent and ongoing need to empower teachers to detect and support students experiencing anxious feelings so these students can engage productively during instruction and enjoy more comfortable interactions with peers and adults.
Fortunately, tiered systems of supports such as Ci3T offer an ideal structure for schools to use behavioral screening tools to detect and assist students who may be at risk for such problems. The intent of systematic screening is not to diagnose disorders but to proactively detect patterns of behaviors that may pose a challenge to students’ academic, behavioral, and social and emotional development. In doing so, such tools allow educators to be more proactive in identifying students’ needs at the first sign of concern, with the hope of preventing symptoms of anxiety from worsening and improving short- and long-term outcomes.
Just as referrals for externalizing behaviors have historically outpaced those for internalizing behaviors, the development of school-based interventions specifically targeting externalizing behaviors has also been more prominent than the development of interventions addressing internalizing behaviors (Mitchell et al., 2021). Yet, this trend has shifted in recent years with more attention paid to potential interventions for students demonstrating anxiety, social withdrawal, and other challenges associated with internalizing behaviors. Examples include modifications of interventions like Check-in/Check-out to include peer mentors (e.g., Bernard et al., 2025; Dart et al., 2015) and the addition of cognitive-behavioral small-group instruction (e.g., Resilience Education Program; Kilgus & Eklund, 2017). These efforts represent important advancements for school-based interventions for internalizing behaviors by utilizing two major classes of treatments for anxiety, behavioral and cognitive-behavioral (Gresham & Kern, 2004), in packages implemented by school personnel to support students with moderate levels of internalizing behaviors (e.g., Tier 2). Behavioral strategies provide modeling and adjustment of antecedents and consequences (e.g., positive peer role models, increased access to positive social reinforcement), whereas cognitive-behavioral interventions include instruction on recognizing symptoms associated with anxiety, use of adaptive strategies to manage symptoms, and self-evaluation and reinforcement of successful application of adaptive strategies (Gresham & Kern, 2004). As the evidence base for school-based, Tier 2 interventions develops (Kilgus et al., 2015), there exists an imperative to continue testing a variety of intervention approaches to identify what works, for whom, and under what conditions (Lane et al., 2006).
RRR Intervention Development
Recognize. Relax. Record. is a Tier 2 intervention for students experiencing anxious thoughts and feelings at school. RRR was developed as part of an IES-funded project, Project ENHANCE (Lane, 2019–2024), which was conducted to identify and build professional learning to assist educators implementing Ci3T. An early phase of this project consisted of surveying educators to identify desired professional learning topics (Common et al., 2021). From a list of 20 topics, educators most desired de-escalation techniques and strategies for internalizing behaviors. Respondents indicated strategies for internalizing behavior as the least implemented strategy of the most desired topics, suggesting not just a desire to learn this content but also a strong need to do so. These results led to the creation of RRR to meet this need.
RRR was developed by the Ci3T Project ENHANCE Research Team, with the goal of creating a Tier 2 intervention package (a) composed of research-based, effective practices and (b) feasible for a classroom teacher to implement, with potential additional support provided by other school personnel (e.g., counselors), if needed. Regarding selection of intervention components, the authors collaborated using their expertise in special education, applied behavior analysis, and school psychology and by reviewing available research summaries on school-based interventions for students with internalizing concerns (e.g., Gresham & Kern, 2004; Lembke, 2012). We also received input on proposed components from an expert advisory board with research expertise specific to students with internalizing behaviors. From these collaborations, we created an intervention consisting of cognitive-behavioral instruction and self-monitoring components. The cognitive-behavioral component consists of small-group or individual instruction during which students learn to recognize and evaluate anxious thoughts and feelings (e.g., Is this thought/feeling helpful or unhelpful?), use relaxation strategies when experiencing anxious thoughts and feelings (e.g., breathing techniques), and apply self-monitoring to assist them in generalizing these skills to settings in which they experience anxious thoughts and feelings. Self-monitoring consists of teaching an individual to observe and record specific behaviors (Lane, Menzies, et al., 2011).
We added the self-monitoring component based on recommendations from the literature for supporting students with internalizing behaviors (e.g., Lembke, 2012), the existing research base for the practice for elementary-age students with EBD (Mooney et al., 2005), and because self-monitoring involves integrating antecedent (e.g., self-monitoring form as a permanent product to remind students to use strategies) and consequence-based components (e.g., accessing positive attention from teachers for using strategies).
Once conceptualized, the RRR intervention was built into an Enhancing Ci3T Module (see description of modules in Buckman et al., 2024). The RRR module consisted of instruction on understanding the needs of students with internalizing behavior and a step-by-step process for implementation of RRR (see Method). Instruction is provided by an eBook and interactive resource (i.e., web-based multimedia version of eBook) and supported by a full complement of materials to implement RRR, including lesson plans and forms to assess student progress using Direct Behavior Rating (DBR; Chafouleas et al., 2009a), social validity, and treatment integrity. After drafting materials and gaining preliminary input from our advisory board (i.e., content experts), Ci3T Leadership Team members (i.e., school leaders guiding Ci3T implementation), and role-specific users (e.g., school counselors), we received funding to revise and test the RRR intervention as part of IES’s efforts to accelerate pandemic recovery (IES, 2021).
Purpose
In this article, we present two preliminary studies of RRR, an intervention aimed at addressing the need identified by educators to support students with internalizing behaviors at school (Common et al., 2021). We conducted these studies as part of Project ENGAGE (Lane, 2022–2026), an IES project funded through the Research to Accelerate Pandemic Recovery in Special Education program (IES, 2021). Across these studies, we asked the following research questions: When implemented with intensive university support:
Initially, these studies were intended to be two multiple-baseline across participants design studies, taking place in a fourth- and fifth-grade class, respectively (see original pre-registered study at Open Science Framework; Lane et al., 2022). During baseline, we observed ceiling effects on the primary dependent variable (DBR for academic engagement) for three of four participating fourth-grade students. Prior to initiating any phase changes, we elected to consolidate all fifth-grade students and the remaining fourth-grade student into a single multiple-baseline design study (Study 1) and to enter the three fourth-grade students with ceiling effects into an A–B nonexperimental study (Study 2) to gather initial information about practicalities and impact of conducting RRR as a small-group intervention, rather than as a 1:1 intervention as initially planned. We made this decision to allow for at least one experimental test of RRR while still providing the intervention and reporting outcomes for all students who met initial intervention entry criteria. The latter was deemed necessary because (a) we committed to instructing all consenting students who met entry criteria, (b) descriptive, rating scale data for these students indicated they could benefit from the intervention (see Study 2: Method), and (c) we sought to learn as much as possible during this preliminary test, including how students might progress even if their initial internalizing challenges were not evident in academic engagement data. We first present the Method and Results for the multiple-baseline design across participants (Study 1), followed by reporting for the A–B design study (Study 2).
Study 1: Method
Participants and Setting
Participants in Study 1 were five students: two fifth-grade male students (Brayden and Leo; all names are pseudonyms), two fifth-grade female students (Belle and Allison), and one fourth-grade female student (Isabel). All five students were White—one was also Hispanic—and ranged in age from 9 to 11 years (see Table 1). Two students received special education services as determined by a multi-disciplinary team, one for a specific learning disability and another for autism spectrum disorder. All students attended one of two participating teachers’ classes: Ms. Driscoll (fifth-grade) and Ms. Avery (fourth-grade). Both teachers were female and had bachelor’s degrees. Ms. Driscoll had 21 years of teaching experience. Ms. Avery had 10 years of experience in schools but was in her first year as a classroom teacher (see Supplemental Table S1).
Table 1. Characteristics of Participants in Testing of Tier 2 Intervention Recognize. Relax. Record.
Note. Gr. = Grade; Abs. = Absences at the time of fall administration of Student Risk Screening Scale—Internalizing and Externalizing (SRSS-IE; Drummond, 1994; Lane & Menzies, 2009). a. Age reported at fall (pre-intervention) timepoint. b. Isabel received special education for specific learning disability. c. Leo received special education for autism spectrum disorder. All students were White, and one student (Isabel) was Hispanic.
All students attended a public elementary school with a diverse student body in a mid-size, Midwestern city in the United States (see Supplemental Table S1). The school was in its fourth year implementing a Ci3T model of prevention as part of district-wide implementation. Ci3T is an integrated tiered system including primary (Tier 1) prevention for academics, behavior (i.e., positive behavioral interventions and supports; PBIS), and social and emotional well-being (i.e., Second Step; Committee for Children, 1992). Systematic screening for academics and behavior occurred three times per year (fall, winter, and spring) using Measures of Academic Progress (MAP) assessments (Northwest Evaluation Association [NWEA], 2009) and the Student Risk Screening Scale for Internalizing and Externalizing behaviors (SRSS-IE; Drummond, 1994; Lane & Menzies, 2009). The school’s Ci3T Leadership Team facilitated collection of treatment integrity and social validity data of Tier 1 Ci3T implementation twice per year (fall, spring) as part of regular school practices. In brief, Ci3T Treatment Integrity: Teacher Self-Report (Lane, 2009a) scores indicated acceptable levels overall for procedures for teaching, reinforcing, and monitoring (fall: M = 77.23%, SD = 12.85; spring: M = 82.74%, SD = 12.89), and direct observation ratings from outside observers (Lane, 2009b) suggested higher levels of implementation (fall: M = 88.38%, SD = 11.83; spring: M = 90.17%, SD = 11.31). Fall and spring administrations of the SW-PBIS Tiered Fidelity Inventory (TFI; Algozzine et al., 2014) Tier 1 protocol each produced a score of 86.67%, indicating implementation of Tier 1 with fidelity. See the description of baseline for additional details regarding Tier 1 implementation in participating teachers’ classrooms and Supplemental Table S2 for full reporting of schoolwide treatment integrity and social validity measures.
Participant Selection Procedures
Consenting and Screening
After securing Institutional Review Board (IRB) and district approval, we shared information about the project with all elementary school principals and collaborated with district leaders to review SRSS-IE data from the previous school year. We used these data to identify schools with relatively higher proportions of students with internalizing behaviors. From this list, we contacted one school principal, who accepted the invitation to allow us to invite fourth- and fifth-grade teachers to participate. Next, we held teacher consenting meetings following the fall SRSS-IE screening window. Meetings consisted of a description of the intervention and research procedures. Five of six invited teachers consented. With these teachers, we analyzed data collected as part of regular school practices to detect fourth- and fifth-grade students with internalizing behaviors and regular school attendance for possible participation in this Tier 2 intervention. Students were eligible to participate if they (a) scored in the moderate- or high-risk range on the SRSS-IE internalizing subscale (SRSS-I5) in fall and (b) had < 5 absences during the first 6 weeks of school. We describe screening measures and criteria below. We note some participants demonstrated high risk for internalizing behaviors. This was due to (a) relatively low numbers of students with moderate risk for internalizing behaviors in participating classrooms as compared to students with high risk and (b) teachers’ request to support students with higher levels of internalizing behaviors.
Student Risk Screening Scale—Internalizing and Externalizing
The SRSS-IE is a free-access, teacher-completed screening tool designed to detect K–12 students with internalizing and externalizing behavior patterns in fall, winter, and spring administrations. Classroom teachers independently rate each student on 12 items using a 4-point rating scale: never = 0, occasionally = 1, sometimes = 2, and frequently = 3. The SRSS-IE has two subscale scores. The externalizing subscale (SRSS-E7) includes seven items: (a) steal; (b) lie, cheat, sneak; (c) behavior problem; (d) peer rejection; (e) low academic achievement; (f) negative attitude; and (g) aggressive behavior. The internalizing subscale (SRSS-I5) includes five items: (a) emotionally flat; (b) shy, withdrawn; (c) sad, depressed; (d) anxious; and (e) lonely. Risk categories were established using the Teacher Report Form (TRF; Achenbach, 2001) scores, yielding the following: externalizing = low (0–3), moderate (4–8), and high (9–21); internalizing = low (0–1), moderate (2–3), and high (4–15; Lane et al., 2015). Predictive validity studies indicate fall risk scores predict year-end performance, with students at higher levels of risk more apt to have lower oral reading fluency, lower reading scores, more nurse visits, and more days in in-school suspension relative to students at low risk (Lane et al., 2019).
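To make the scoring arithmetic above concrete, the following is a minimal sketch of how subscale totals and risk categories could be computed from item ratings. The cut scores and item counts come from the description above; the function and variable names are hypothetical illustrations and are not part of the SRSS-IE materials.

```python
from typing import Sequence

# Cut scores as reported in the text (Lane et al., 2015): (low, high, label).
EXTERNALIZING_BANDS = ((0, 3, "low"), (4, 8, "moderate"), (9, 21, "high"))
INTERNALIZING_BANDS = ((0, 1, "low"), (2, 3, "moderate"), (4, 15, "high"))


def subscale_total(ratings: Sequence[int], n_items: int) -> int:
    """Sum item ratings after checking item count and the 0-3 rating scale."""
    if len(ratings) != n_items:
        raise ValueError(f"expected {n_items} ratings, got {len(ratings)}")
    if any(r not in (0, 1, 2, 3) for r in ratings):
        raise ValueError("each rating must be 0, 1, 2, or 3")
    return sum(ratings)


def risk_category(total: int, bands) -> str:
    """Return the risk label whose score band contains the subscale total."""
    for low, high, label in bands:
        if low <= total <= high:
            return label
    raise ValueError(f"total {total} is outside the defined bands")


def score_srss_ie(externalizing: Sequence[int], internalizing: Sequence[int]) -> dict:
    """Score both subscales: seven externalizing items, five internalizing items."""
    e_total = subscale_total(externalizing, 7)
    i_total = subscale_total(internalizing, 5)
    return {
        "SRSS-E7": (e_total, risk_category(e_total, EXTERNALIZING_BANDS)),
        "SRSS-I5": (i_total, risk_category(i_total, INTERNALIZING_BANDS)),
    }


# Hypothetical ratings for illustration only (not study data).
example = score_srss_ie([0, 1, 0, 0, 2, 0, 0], [0, 2, 1, 1, 0])
```

In this made-up example, the externalizing total of 3 falls in the low-risk band and the internalizing total of 4 falls in the high-risk band.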
Attendance
We established an entry criterion of students missing fewer than 5 days within the first 6 weeks of school. We included an attendance requirement to ensure students would be likely to access the intervention, which involved time with the interventionist to learn the relaxation strategies as well as opportunities to apply what they learned during instruction.
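The two entry criteria described in this section can be expressed as a simple decision rule. The sketch below is illustrative only: the record structure, field names, and student identifiers are hypothetical, and the thresholds are taken from the text (an SRSS-I5 total of 2 or higher corresponds to moderate or high risk; fewer than 5 absences in the first 6 weeks).

```python
from dataclasses import dataclass

# Thresholds taken from the text.
SRSS_I5_MODERATE_MIN = 2     # moderate risk begins at a total of 2
ABSENCE_LIMIT_EXCLUSIVE = 5  # eligible students had fewer than 5 absences


@dataclass(frozen=True)
class ScreeningRecord:
    """Hypothetical container for one student's fall screening data."""
    student_id: str
    srss_i5_total: int
    absences_first_6_weeks: int


def is_eligible(record: ScreeningRecord) -> bool:
    """Apply both entry criteria: internalizing risk and adequate attendance."""
    at_risk = record.srss_i5_total >= SRSS_I5_MODERATE_MIN
    attending = record.absences_first_6_weeks < ABSENCE_LIMIT_EXCLUSIVE
    return at_risk and attending


# Made-up records for illustration only (not study data).
roster = [
    ScreeningRecord("S01", srss_i5_total=5, absences_first_6_weeks=1),
    ScreeningRecord("S02", srss_i5_total=1, absences_first_6_weeks=0),
    ScreeningRecord("S03", srss_i5_total=3, absences_first_6_weeks=5),
]
eligible_ids = [r.student_id for r in roster if is_eligible(r)]
```

Here only the first hypothetical student meets both criteria; the second falls in the low-risk band, and the third has too many absences.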
Students Identified
Two of the five consenting teachers had enough students meeting inclusion criteria (i.e., students with internalizing behaviors and adequate attendance) to proceed (N = 4, fifth-grade teacher; N = 4, fourth-grade teacher). We invited the parents and guardians of the eight students across both classrooms to allow their child to participate. After receiving parental/guardian consent, we met with students to describe the project and request assent. All eight students provided assent.
Next, we reviewed academic screening data collected as part of regular school practices (MAP; NWEA, 2009) to learn more about students’ academic capabilities. We also collected descriptive information, as part of approved research procedures, from teachers, family members (e.g., parents or guardians), and students. Descriptive data sources included behavioral rating scales (Achenbach System of Empirically Based Assessment [ASEBA] rating scales; Social Skills Improvement System–Rating Scales [SSiS-RS]) and social validity measures (Intervention Rating Profile-15 [IRP-15]; Children’s Intervention Rating Profile [CIRP]; Witt & Elliott, 1985) as described below. These measures were intended to be completed at pre-, post-, and follow-up timepoints to describe shifts in screening and assessment performance. However, follow-up data collection was truncated due to time limitations at the end of the school year.
Three students showed ceiling effects on academic engagement during baseline (see Study 2: Results), even though their SRSS-IE scores met entry criteria and their teachers reported clinical levels of internalizing behaviors on the TRF. As such, we did not recommend that these students continue in this multiple-baseline design study. Instead, we honored our commitment to students and families to provide the intervention by conducting a separate study shifting procedures from 1:1 instruction to small-group instruction (Study 2).
In sum, of the eight students meeting criteria, data are reported for a sample of (a) five students in Study 1 (multiple-baseline design, testing 1:1 instructional approach; Belle, Brayden, Allison, Isabel, & Leo) and (b) three students in Study 2 (A–B nonexperimental, exploring a small-group instructional approach; Ava, Arlo, & Kevin).
General Instructional Procedures
Students began with baseline, followed by RRR Instruction and RRR In-Class intervention phases (see Intervention Description). Beginning at baseline, research personnel completed DBR to assess students’ academic engagement during a time selected by the general education teacher. Both teachers selected math instruction. Research personnel continued to collect DBR data throughout all phases of the intervention. RRR Instruction, the first intervention phase, consisted of six to eight lessons taught by research personnel outside the classroom. Students participated in lessons three to four times per week for ≈ 20–30 min each session, with the full sequence of lessons lasting 3–4 weeks. After completing RRR Instruction, students began the RRR In-Class phase, which involved students using a self-monitoring checklist with their teachers’ support. Throughout intervention phases, research personnel assessed treatment integrity of intervention procedures.
Intervention Description
Recognize. Relax. Record. is a Tier 2 intervention designed to take place over three phases: baseline, RRR Instruction, and RRR In-Class. The full intervention is detailed in the Enhancing Ci3T Module entitled Recognize. Relax. Record. An intervention package for students struggling with anxious feelings (Ci3T Project ENHANCE Research Team, 2021).
Baseline
During baseline, teachers began by selecting a time during the school day when participating students experienced difficulty related to internalizing concerns (e.g., low academic engagement, withdrawal). The selected period was the instructional time when progress monitoring would occur, using DBR to assess the target behavior(s). This time also later served as the self-monitoring period (see RRR In-Class) and thus was held constant through all intervention phases. Other than beginning DBR data collection during baseline, no other changes took place in the instructional environment. Both teachers selected math as the period for DBR data collection. We note the design of RRR involves teachers collecting DBR data. Yet, in this initial study we utilized research personnel as data collectors. Research personnel collected DBR data in the back of the classroom from an unobtrusive position where students were visible. See Table 2 for interobserver agreement (IOA) and Table 3 for mean observation lengths.
Table 2. Interobserver Agreement.
Note. IOA = interobserver agreement; Obs. = observations; RRR = Recognize. Relax. Record. intervention. All IOA describes agreement between ratings of research personnel-completed academic engagement Direct Behavior Rating.
Table 3, Panel A: Study 1 Treatment Integrity and Recognize. Relax. Record. intervention (RRR) In-Class Dosage.
Note. Count of days represented calendar days during which research activities related to a given phase took place. Some counts may overlap (e.g., an RRR Booster may overlap with an RRR In-class observation). Students received up to two booster sessions if data suggested a need for additional instruction. See Supplemental Tables S6 and S7 for information about treatment integrity IOA.
During baseline, participating teachers provided typical classroom math instruction. Math instruction consisted of whole-group activities (e.g., teacher-led instruction), although routines varied and included some small-group and independent activities (e.g., partner-based daily math practice, computer-based math programs). Teachers continued implementing Tier 1 Ci3T (see Supplemental Table S3). Noted features of Ci3T Tier 1 implementation present in both classes included four schoolwide expectations posted and visible throughout the room, implementation of low-intensity strategies to promote engagement (e.g., active supervision, instructional choice), provision of behavior-specific praise, and use of schoolwide universal reinforcers (i.e., tickets). Students had no access to any RRR intervention components during baseline.
RRR Instruction
During the RRR Instruction phase, we continued collecting DBR data during math instruction (see Table 2 for IOA and Table 3 for mean observation lengths), and math instruction continued in a manner similar to the baseline phase. Students received RRR Instruction one at a time during an intervention block selected by the teachers. Instruction occurred 1:1 in one of two available settings, a conference room and a sensory room. RRR Instruction includes up to seven lessons, which are intended to last between 20 and 30 min. Each lesson is scripted but also allows for flexibility if the interventionist needs to adjust the pace, provide more examples, or integrate additional elements to promote engagement or increase contextual fit. RRR developers created lesson plans for teachers as the intended intervention agent. Yet, in this first test of RRR, research personnel served as intervention agents, with the intent of gaining preliminary insight into implementation before testing with teachers in subsequent studies.
The first two RRR lessons cover the recognize component, in which students learn to recognize anxious thoughts and feelings, consider whether those anxious thoughts and feelings are helpful or unhelpful (e.g., some anxious thoughts are helpful because they may help us get out of danger), and identify situations in which they have experienced anxious thoughts or feelings. The next four lessons provide instruction on specific relaxation strategies: breathing techniques, guided imagery, progressive muscle relaxation, and positive self-talk. Developers selected these strategies for inclusion, as they are common components of existing interventions for children and adults experiencing anxiety (Chorpita & Daleiden, 2009). Students choose at least two of these strategies, allowing instructional choice to support student motivation and interventionists’ discretion for how many lessons to teach (at least two and up to four). Each Relax strategy has a corresponding icon and a step card breaking down how to use the strategy. Cards are printed on 4×6 cardstock and can be connected using a metal ring. Students receive one copy of the cards, plus one copy to take home, after completing the Relax Instruction phase. The final lesson teaches vocabulary around self-monitoring and provides the teacher (or other interventionist) with an opportunity to instruct, model, and provide feedback on the use of the self-monitoring form that will be used in the subsequent step.
Interventionists used the scripted RRR lesson plans to teach the six lessons and completed treatment integrity checklists for each lesson. A second member of the research team conducted intermittent observations and completed a second treatment integrity checklist to assess the reliability of treatment integrity reporting. Students completed lessons over the course of 6–8 days (Mdn = 7; M = 6.80). Total instructional dosage ranged from 135 min for Belle to 186 min for Isabel (M = 160.00; SD = 23.27). This dosage includes booster sessions, which we provided to students based on responsiveness to the intervention and indications they had not fully mastered RRR instructional content. See Table 3 for full dosage information. Students who were not in this condition did not have access to RRR Instructional content.
RRR In-Class
After each student completed the RRR Instruction phase, they began using self-monitoring in the classroom (referred to as RRR In-Class). In the present study, classroom teachers implemented all self-monitoring procedures. Self-monitoring took place during the same time period as DBR measurement (i.e., math instruction). Instruction remained consistent with previous phases, except for teacher and student behaviors related to self-monitoring. During this phase, teachers provided students with the self-monitoring form and prompted the student to complete it at predetermined intervals in the classroom during the time period (math instruction) selected by the teacher during baseline (i.e., the same period in which DBR was collected). The self-monitoring form prompts the student to use skills taught during RRR Instruction: self-monitoring academic engagement, the presence of anxious thoughts and feelings, and the use of relaxation strategies when needed. The self-monitoring period is divided into equal intervals, with students completing the self-monitoring form at the end of each interval. In the present study, the self-monitoring period was 40 min, comprising four 10-min intervals. The teacher used an audio track that plays a “ding” at the end of each interval, prompting students to complete the self-monitoring form. At each “ding,” the student completed the self-monitoring form, producing one rating for anxious feelings on a DBR scale, one rating for academic engagement on a DBR scale, and an indication of which (if any) strategies they used during that interval. Teachers checked in with students at the conclusion of each interval to confirm completion of the self-monitoring form. At the end of the 40-min period, teachers checked in again to provide input (e.g., feedback and/or positive reinforcement) and pick up the form. Only students in the RRR In-Class phase had access to the self-monitoring form and teacher prompts/feedback. Students in other phases could hear the “ding” signifying the end of a self-monitoring interval. Yet, without the requisite instruction, materials, and prompts, this was unlikely to constitute access to active ingredients of the intervention.
Throughout the period, the teacher provided feedback and reinforcement (e.g., behavior-specific praise, PBIS ticket) to students for using their self-monitoring form and for engaging in target behaviors. We continued to collect DBR during the RRR In-Class phase. See Table 2 for IOA and Table 3 for mean observation lengths. In addition, teachers completed daily treatment integrity forms for self-monitoring procedures, and research personnel completed intermittent reliability treatment integrity observations.
RRR Teacher and Research Personnel Training
Following consenting and participant selection procedures, we met with participating teachers over a period of 2 days to provide training on all intervention procedures. The first training (60 min) included the rationale for RRR, a summary of the research on RRR components (e.g., self-monitoring), an explanation of potential benefits and challenges, a step-by-step process for implementing RRR (including steps implemented by research personnel, to ensure teachers were informed and could support generalization), and an overview of the data necessary to monitor RRR (treatment integrity, social validity, student outcomes). The second training (45 min) included discussion of operational definitions of academic engagement (the DBR target behavior), and scheduling when (a) DBR would be collected by research personnel and (b) students would receive RRR instruction. Day 2 concluded with 15 min to complete pre-intervention rating scales (TRF, SSiS-RS: Teacher [SSiS-RS: T]) and social validity (IRP-15). Each training included a verbal check for understanding and opportunities for participants to ask questions. Teachers also received an additional training session prior to the RRR In-Class phase, as they were the primary implementers of self-monitoring procedures. Research personnel met briefly with teachers (≈ 15 min) to review self-monitoring procedures and the corresponding treatment integrity checklist. Teachers had the opportunity to ask questions and then completed a verbal check for understanding. Research team members provided corrections and clarifications as needed.
Members of the research team also received training to support their role as the primary implementers of the RRR Instruction phase. RRR Instruction was led by two research project members: one held a master’s degree in special education, while the other held a master’s degree in counseling psychology. These research project members received training (≈ 120 min) from the first and second authors, who were both part of the team of RRR developers. Training involved presenting and modeling lesson content, the opportunity to ask questions, a written check for understanding (passed with 90% accuracy by each interventionist), and receipt of feedback to clarify any misunderstanding. In addition, interventionists (a) observed one of the intervention developers (first author) provide RRR Instruction during Study 2 (which was conducted chronologically first), which provided additional opportunities for modeling and questions, and (b) were observed by the first author during initial lessons, which provided the opportunity for feedback.
Implementation Integrity
RRR Lesson Treatment Integrity Checklist
Research staff assessed implementation of the RRR Instruction lessons using the 14-item RRR Lesson Treatment Integrity Checklist. Nine items involved rating interventionist behaviors (e.g., Did I define or review key vocabulary?), and five items related to student participation (e.g., Did the student participate in discussion?) which were rated for each student present. All items were rated on a 4-point scale (0 = not implemented, 1 = partially implemented, 2 = mostly implemented, 3 = fully implemented). For each lesson, interventionist-level and student-level treatment integrity scores could range from 0 to 100%. Interventionists completed one form per lesson, with a second researcher conducting reliability observations. See the Results section for treatment integrity levels and reliability.
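The conversion from item ratings to a percentage score can be illustrated with a minimal sketch. It assumes the percentage is points earned divided by points possible (the text does not state the formula), and the ratings shown are hypothetical rather than study data.

```python
from typing import Sequence

MAX_ITEM_SCORE = 3  # 0 = not implemented ... 3 = fully implemented


def integrity_percent(item_ratings: Sequence[int]) -> float:
    """Return the percentage of possible points earned on a checklist.

    Assumption: percentage = points earned / points possible * 100.
    """
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    if any(r < 0 or r > MAX_ITEM_SCORE for r in item_ratings):
        raise ValueError("ratings must fall between 0 and 3")
    return 100.0 * sum(item_ratings) / (MAX_ITEM_SCORE * len(item_ratings))


# Hypothetical ratings for one lesson (not study data).
interventionist_items = [3, 3, 3, 2, 3, 3, 3, 3, 3]  # nine interventionist items
student_items = [3, 3, 2, 3, 3]                      # five student items

interventionist_score = integrity_percent(interventionist_items)
student_score = integrity_percent(student_items)
print(f"Interventionist: {interventionist_score:.2f}%  Student: {student_score:.2f}%")
```

Scoring the two item sets separately mirrors the distinction between interventionist-level and student-level scores described above.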
RRR Self-Monitoring Treatment Integrity Checklist
Research staff and participating teachers assessed implementation of the RRR In-Class process using the eight-item RRR Self-Monitoring Treatment Integrity Checklist (e.g., Teacher or other adult checked in with student at the conclusion of each interval to confirm they filled out self-monitoring form). To allow for the possibility of teacher implementation fidelity varying across students in the same classroom, each item was rated for each participating student. Items were rated on a 4-point scale (0 = not implemented, 1 = partially implemented, 2 = mostly implemented, 3 = fully implemented). For each day of self-monitoring implementation, teacher-level and student-level treatment integrity scores could range from 0 to 100%. Teachers completed forms daily, with researchers conducting reliability observations. See the Results section for fidelity levels and reliability.
Descriptive Measures
We gathered descriptive data according to approved research protocols, including some data from regular school practices (e.g., academic screening), as well as other study-specific measures such as multi-informant behavioral rating scales.
MAP
The MAP (NWEA, 2009) is a computer-adaptive test used as a universal academic screening tool administered in fall, winter, and spring to assess students’ achievement in math and reading. We report Rasch Unit (RIT) scores, national percentiles, and categorical achievement levels. Scores have demonstrated evidence of convergent validity with other common academic screening measures. For example, January and Ardoin (2015) found strong correlations (r = .83) between reading fluency curriculum-based measures and MAP RIT scores for students in Grades 2–5. The district collected MAP data as part of regular school practices.
ASEBA (Achenbach, 2001)
We utilized the TRF and Child Behavior Checklist (CBCL) from the ASEBA system (Achenbach, 2001). The TRF is a teacher-completed, norm-referenced measure to assess students’ levels of problem behavior, academic performance, and adaptive functioning. Teachers rate students on 113 items using a 3-point scale (0 = not true, 1 = somewhat or sometimes true, 2 = very true or often true) and fill-in-the-blank questions, requiring 15–20 min per student. The measure uses T-scores categorized by normal range (< 65), borderline clinical range (65–69), and clinical range (significant concern;
The CBCL is a parent-completed, norm-referenced measure to assess students’ physical problems, concerns, and strengths. Parents rate students on 113 items using the same three-point scale used on the TRF, requiring 15–20 min to complete. T-scores yield categories according to the same scale as the TRF. Internal consistency across subscale estimates for Cronbach’s α ranges from 0.63 to 0.97. We focused on the CBCL broadband internalizing score (α = 0.90), as well as the anxious/depressed (α = 0.84) and withdrawn/depressed subscale scores (α = 0.80).
SSiS-RS (Gresham & Elliott, 2008)
The SSiS-RS is a nationally norm-referenced suite of tools for students ages 3–18 years to assess social skills, problem behaviors, and academic competence from teacher (SSiS-RS: T), parent (SSiS-RS: P), and student (SSiS-RS: S) perspectives. Social skills constructs include communication, cooperation, assertion, responsibility, empathy, engagement, and self-control; competing problem behaviors include externalizing, bullying, hyperactivity/inattention, internalizing, and autism spectrum; academic competence includes reading achievement, math achievement, and motivation to learn (Gresham & Elliott, 2008). Alpha coefficients indicate acceptable internal consistency across subscales and raters, teacher, parent, and student, respectively: (a) social skills α = .97, .95, .94; (b) problem behavior α = .95, .94, .91; and (c) academic competence α = .97 (rated by teacher only). We focused on engagement (α = .84, .83, .74) and internalizing subscales (α = .83, .82, .82).
Outcome Measures: Direct Behavior Rating
Academic Engagement DBR: Research Personnel-Completed
The primary dependent variable in this study was academic engagement, as assessed using a single-item DBR completed by research personnel during classroom observations. We used engagement as the primary dependent variable because it represents a replacement behavior under students’ direct control and is conceptually linked to the intervention’s goal of helping students manage anxious feelings to remain engaged during instruction. Although managing symptoms of anxiety (e.g., anxious thoughts and feelings) is a central outcome targeted by RRR, several considerations led us to prioritize engagement for this initial test. First, anxiety symptoms are often covert and not directly observable (Gresham & Kern, 2004). Second, requiring students to self-monitor anxious feelings prior to receiving instruction on recognizing and labeling such feelings would likely yield unreliable data (e.g., in the absence of instruction about identifying such feelings). Most critically, we judged it could pose ethical risks to ask students to focus on anxious feelings in the absence of adaptive strategies for managing them—particularly for students in the third tier (i.e., leg). Thus, engagement was used as an observable behavioral proxy for successful application of the cognitive–behavioral and self-regulation strategies taught in RRR.
Single-item DBRs involve selecting and operationally defining a student’s behavior of interest (i.e., target behavior). The target behavior is observed during a period of time when it most frequently occurs and is rated by an observer on an 11-point Likert-type scale (0 = not at all to 10 = all of the time) immediately following the end of the rating period (Chafouleas et al., 2009b). Prior studies reported strong correlations between DBR and systematic direct observation of academic engagement (r = .81; Chafouleas et al., 2012). In this study, academic engagement was defined as actively or passively participating during classroom activities. Examples included looking toward educator (or contributing peer) or instructional materials (e.g., textbook, handout), looking away from speaker (e.g., educator or contributing peer) and/or materials for a duration of < 10 s, and appropriately asking the teacher for help. Nonexamples included engaging in any activity other than the assigned/scheduled task, such as disruption (e.g., inappropriate audible vocalizations, stomping feet) and behaviors incompatible with a designated task (e.g., walking around the classroom, looking away from the speaker or materials more than 10 s, head down on desk, reading unapproved materials).
Across classrooms, DBR observations lasted 40 min and took place during math instruction. Prior to completing DBR, all research personnel reviewed the operational definition of academic engagement and completed the DBR Training Module published by the University of Connecticut (n.d.), which involved rating video models of academic engagement and completing three calibration ratings in participating classrooms. Then, research personnel conducted calibration observations in participating teachers’ classrooms by completing DBR ratings of participating students during representative instructional activities (i.e., whole-group math and reading instruction). Calibration observations provided opportunities for researchers to practice using the DBR form and ask questions about the behavioral definition. Agreements between the first author and other personnel during calibration were 92.31% and 80.77% across 26 student-level DBR ratings. We collaborated with one of the developers of DBR to establish an IOA criterion of ±1 across raters (S. Chafouleas, personal communication, September 28, 2022). We report IOA by student and phase in Table 2. IOA ranged as follows: baseline: 70%–93.75%; RRR Instruction: 66.67%–100%; and RRR In-Class: 80%–100%.
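A minimal sketch of how agreement under a ±1 criterion might be computed follows. It assumes that two ratings count as an agreement when they differ by no more than one scale point and that IOA is the percentage of paired ratings in agreement; the ratings are hypothetical and were chosen only so that 24 of 26 pairs agree.

```python
from typing import Sequence


def ioa_within_tolerance(rater_a: Sequence[int],
                         rater_b: Sequence[int],
                         tolerance: int = 1) -> float:
    """Percentage of paired DBR ratings that differ by <= tolerance points."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must supply the same non-zero number of ratings")
    agreements = sum(abs(a - b) <= tolerance for a, b in zip(rater_a, rater_b))
    return 100.0 * agreements / len(rater_a)


# Hypothetical calibration data: 26 paired ratings on the 0-10 DBR scale.
primary_rater = [7] * 26
secondary_rater = [7] * 20 + [8] * 4 + [10] * 2  # last two pairs differ by 3 points

calibration_ioa = ioa_within_tolerance(primary_rater, secondary_rater)
print(f"IOA = {calibration_ioa:.2f}%")
```

With 26 paired ratings, 24 agreements correspond to 92.31% and 21 agreements to 80.77%, which matches the calibration values reported above under this assumed definition.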
Academic Engagement DBR: Student Self-Report (Engagement DBR-Student)
Students completed a self-monitoring form during RRR In-Class, which included rating their academic engagement using an 11-point DBR scale identical to the one used for the research personnel-completed DBR. Students completed four ratings at 10-min intervals, as well as an overall rating encompassing the full 40-min period. During RRR Instruction, students received training on the completion of this measure, including the definition of academic engagement and the use of the self-monitoring form. Instruction on the definition of academic engagement corresponded to the operational definition used by research personnel, including examples and nonexamples.
Anxious-Feelings DBR: Student Self-Report (Anxious DBR-SSR)
In addition to the academic engagement DBR, the student self-monitoring form also included a DBR for students to rate their level of anxious thoughts and feelings. Procedures mirrored those described for academic engagement. Students received training on how to define and observe anxious feelings during RRR Instruction (e.g., what does it feel like in the body, what types of behaviors might be displayed).
Social Validity Measures
We assessed social validity from teacher, family, and student perspectives during baseline and post-intervention phases. Teachers and families completed the IRP-15 (Witt & Elliott, 1985) to obtain their views of the social significance of intervention goals, acceptability of the procedures, and importance of intervention outcomes. Students completed the CIRP (Witt & Elliott, 1985). We compared baseline and post-intervention scores to determine the degree to which social validity ratings shifted, with the goal of understanding how expectations regarding goals, procedures, and intended outcomes were met, exceeded, or unfulfilled (Lane & Beebe-Frankenberger, 2004; Lane, Harris, et al., 2011). We also conducted brief social validity interviews with teachers and students to further explore their experiences with RRR.
IRP-15 (Witt & Elliott, 1985)
The IRP-15 assesses teacher and family members’ views of intervention acceptability, requiring 10–15 min to complete. Adults rate 15 statements about procedures and outcomes (e.g., I liked the procedures used in this intervention) on a 6-point Likert-type scale ranging from 1 (strongly disagree) to 6 (strongly agree). Items are summed, with total scores ranging from 15 to 90 and higher total scores indicating greater levels of acceptability. Previous studies using the IRP-15 have shown internal consistency estimates from α = .88 to .98 (Lane, Harris, et al., 2011). To facilitate readers’ understanding, we computed percentage scores for the IRP-15.
CIRP (Witt & Elliott, 1985)
The CIRP measures students’ views of intervention acceptability, rating seven items on a 6-point Likert-type scale ranging from 1 (I do not agree) to 6 (I agree). After reflecting negatively worded items, summed scores range from 7 to 42, with high scores suggesting high acceptability. We modified the wording of CIRP as in previous studies (e.g., Lane, 1999) to increase readability. Previous studies using the CIRP have shown internal consistency estimates from α = .75 to .89 (Lane, Harris, et al., 2011). To facilitate readers’ understanding, we computed percentage scores for the CIRP, after reflecting negatively worded items.
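The scoring steps for the IRP-15 and CIRP can be sketched as follows. Two assumptions are made that the text does not confirm: the percentage score is the summed score divided by the maximum possible score, and "reflecting" a negatively worded item on a 1-6 scale means replacing a rating r with 7 - r. The item positions and ratings are hypothetical.

```python
from typing import Iterable, Sequence

SCALE_MIN, SCALE_MAX = 1, 6  # 6-point Likert-type scale used by both measures


def summed_score(ratings: Sequence[int],
                 negatively_worded: Iterable[int] = ()) -> int:
    """Sum item ratings after reflecting negatively worded items (r -> 7 - r)."""
    if any(r < SCALE_MIN or r > SCALE_MAX for r in ratings):
        raise ValueError("ratings must fall between 1 and 6")
    reflect = set(negatively_worded)  # zero-based item positions (hypothetical)
    return sum((SCALE_MIN + SCALE_MAX - r) if i in reflect else r
               for i, r in enumerate(ratings))


def percent_of_maximum(total: int, n_items: int) -> float:
    """Assumed percentage metric: summed score as a share of the maximum."""
    return 100.0 * total / (SCALE_MAX * n_items)


# Hypothetical IRP-15 responses: 15 items, none reflected.
irp15_total = summed_score([5] * 15)
irp15_percent = percent_of_maximum(irp15_total, 15)

# Hypothetical CIRP responses: 7 items, items at positions 2 and 5 reflected.
cirp_total = summed_score([6, 6, 1, 6, 5, 2, 6], negatively_worded=[2, 5])
cirp_percent = percent_of_maximum(cirp_total, 7)

print(irp15_total, round(irp15_percent, 2), cirp_total, round(cirp_percent, 2))
```

Under these assumptions, totals stay within the ranges given above (15 to 90 for the IRP-15 and 7 to 42 for the CIRP).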
Social Validity Interviews
We conducted individual, semi-structured interviews with participating teachers at the end of the study to gain insights into acceptability, usability, and feasibility of RRR. One research team member posed questions to the educator while another annotated the conversation. Prior to beginning the interview, we presented and described the daily visual analysis graphs of the student’s self-monitoring and DBR data from the entirety of the intervention. Sample questions included “Would you share more about what you liked about the intervention?” and “What recommendations do you have for the intervention overall, including selection of students, timing, interval recording, audio prompt, and self-monitoring procedures?”
Experimental Design and Analysis
We utilized a multiple-baseline across participants design to examine the impact of RRR on student engagement. This design followed the three phases of the RRR intervention: baseline (A), RRR Instruction (B), and RRR In-Class (C), as described previously. We conducted visual analysis formatively to guide phase-change decisions and summatively to evaluate whether functional relations were demonstrated.
Formative visual analysis occurred daily during the baseline phase to assess data stability and determine when to transition from baseline to RRR Instruction. Research personnel examined variability, level, and trend. When academic engagement showed reduced variability or stabilized for approximately three consecutive sessions, researchers coordinated with teachers to begin the RRR Instruction phase. The transition from RRR Instruction to RRR In-Class occurred following completion of instructional lessons, rather than based on student data patterns.
We conducted summative visual analysis after collecting all data to evaluate whether the introduction of each intervention component produced a functional relation with changes in academic engagement. Following single-case design standards (e.g., Kratochwill et al., 2013), we examined changes in variability, level, trend, immediacy of effect, overlapping of data paths, and consistency of effects across cases. First, we assessed whether the introduction of RRR Instruction (B)—the acquisition phase—produced a functional relation with increases in academic engagement. To meet criteria for a functional relation in a multiple-baseline design, visual evidence would need to show one clear demonstration of effect and at least two replications across participants. We then conducted a similar analysis for the transition to RRR In-Class (C), which served as the performance phase during which students self-monitored their engagement. Evaluation of potential B–C changes likewise required one demonstration and two replications across cases to meet standards for establishing a functional relation. To assist with visual analysis, we examined descriptive statistics for each phase (M, SD; Slope, Syx; see Table 4).
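As an illustration of the phase-level descriptive statistics that accompany visual analysis, the sketch below computes M, SD, an ordinary least-squares slope across sessions, and Syx for one phase. It assumes Syx denotes the standard error of estimate around the within-phase trend line and that sessions are numbered 1, 2, 3, and so on; the ratings are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Dict, Sequence


def phase_descriptives(ratings: Sequence[float]) -> Dict[str, float]:
    """Return M, SD, OLS slope, and standard error of estimate for one phase."""
    n = len(ratings)
    if n < 3:
        raise ValueError("at least three sessions are needed to estimate Syx")
    sessions = list(range(1, n + 1))  # assumed equally spaced session numbers
    x_bar, y_bar = mean(sessions), mean(ratings)
    sxx = sum((x - x_bar) ** 2 for x in sessions)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sessions, ratings))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares around the fitted trend line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(sessions, ratings))
    return {
        "M": y_bar,
        "SD": stdev(ratings),            # sample SD (n - 1 denominator)
        "slope": slope,
        "Syx": math.sqrt(sse / (n - 2)),
    }


# Hypothetical DBR ratings for a five-session phase (not study data).
example_phase = phase_descriptives([3, 5, 4, 6, 5])
print({name: round(value, 2) for name, value in example_phase.items()})
```

Comparing SD and Syx across phases gives one way to quantify the changes in variability that visual analysis examines.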
Table 4. Student Outcomes in Testing of Tier 2 Intervention Recognize. Relax. Record. (RRR).
To complement visual analysis and provide a quantitative measure of magnitude, we computed between-case and within-case effect sizes using the scdhlm and SingleCaseES R packages, respectively (Pustejovsky et al., 2023; Pustejovsky et al., 2024). For within-case effect sizes, we analyzed raw, tabular data to compute log response ratio–increasing (LRRi; Pustejovsky, 2018) effect sizes for academic engagement. Because our multiple-baseline design included two potential demonstrations of effect (A–B and B–C), we calculated LRRi values for each phase contrast. To facilitate interpretation, we report A–B and B–C effects in separate tables (see Tables 5 and 6). For students’ self-reported anxious feelings, we conceptualized improvement as a decrease or reduction, so we computed log response ratio–decreasing (LRRd) for the B–C contrast. These values are reported descriptively because anxious-feelings ratings served as a secondary outcome rather than a primary dependent variable. To aid interpretation of all LRR estimates, we also reported the corresponding percent change for each contrast (A–B, B–C), providing an intuitive metric alongside conventional benchmarks (Swan & Pustejovsky, 2018). For instance, a 50% change in academic engagement represents a half increase from baseline, 100% indicates academic engagement doubled, and 200% reflects a tripling of the original level.
Table 5. Academic Engagement Effect Sizes From Baseline to RRR Instruction (A:B).
Note. RRR = Recognize. Relax. Record.; DV = Dependent Variable; AE = Academic Engagement; LRRi = Log Response Ratio-increasing.
Indicates statistically significant effect sizes; BC-SMD effect size = 0.08 [−0.45, 0.61].
Table 6. Academic Engagement Effect Sizes From RRR Instruction to RRR In-Class (B:C).
Note. RRR = Recognize. Relax. Record.; DV = Dependent Variable; AE = Academic Engagement; LRRi = Log Response Ratio-increasing; BC-SMD effect size = 0.22 [−0.19, 0.64].
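The percent-change interpretation of a log response ratio can be sketched with the standard identity percent change = 100 × (exp(LRR) − 1). The helper names below are illustrative, and the simple ratio of phase means is a simplification: estimates produced by the SingleCaseES package may include adjustments that this sketch does not reproduce.

```python
import math


def simple_lrr(mean_first_phase: float, mean_second_phase: float) -> float:
    """Unadjusted log response ratio: ln(second-phase mean / first-phase mean)."""
    if mean_first_phase <= 0 or mean_second_phase <= 0:
        raise ValueError("phase means must be positive to take a log ratio")
    return math.log(mean_second_phase / mean_first_phase)


def percent_change(lrr: float) -> float:
    """Convert a log response ratio to percent change from the earlier phase."""
    return 100.0 * (math.exp(lrr) - 1.0)


# The benchmarks described in the text: doubling and tripling.
doubling = percent_change(simple_lrr(4.0, 8.0))    # hypothetical means
tripling = percent_change(simple_lrr(3.0, 9.0))    # hypothetical means

# Converting a reported, rounded estimate (LRRi = 0.49) back to percent change.
from_reported_lrr = percent_change(0.49)

print(round(doubling, 1), round(tripling, 1), round(from_reported_lrr, 1))
```

Because the conversion here starts from an LRR rounded to two decimals, the result only approximates a percent change computed from the unrounded estimate.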
For between-case effect sizes, we calculated the between-case standardized mean difference (BC-SMD) with a hierarchical linear model (HLM) approach, using the scdhlm web application fit with restricted maximum likelihood (Pustejovsky et al., 2023; Valentine et al., 2016). We followed the recommendations of Valentine et al. (2016), adjusting for autocorrelation across repeated measures, with fixed and random effects to account for participant variability. Consistent with Shadish et al. (2016), BC-SMD effects were descriptively interpreted using the following guidelines: small (0.37–0.98), medium (0.98–1.87), and large (> 1.87).
Treatment Integrity and Social Validity
We analyzed treatment integrity and social validity data using descriptive procedures (e.g., computing means, standard deviations). We also computed effect sizes between pre- and post-intervention social validity scores for teachers, parents, and students, using Hedges’ g with a pooled standard deviation in the denominator.
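A minimal sketch of Hedges’ g with a pooled standard deviation follows. It uses the common small-sample correction factor 1 − 3/(4·df − 1) and treats the pre- and post-intervention scores as two sets of observations pooled in the usual way; whether the analysis used exactly this variant is an assumption, and the scores are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def hedges_g(pre: Sequence[float], post: Sequence[float]) -> float:
    """Standardized mean difference (post - pre) with small-sample correction."""
    n_pre, n_post = len(pre), len(post)
    if n_pre < 2 or n_post < 2:
        raise ValueError("each set of scores needs at least two observations")
    df = n_pre + n_post - 2
    pooled_variance = ((n_pre - 1) * stdev(pre) ** 2 +
                       (n_post - 1) * stdev(post) ** 2) / df
    if pooled_variance == 0:
        raise ValueError("pooled standard deviation is zero")
    cohens_d = (mean(post) - mean(pre)) / math.sqrt(pooled_variance)
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)  # approximate bias correction
    return cohens_d * correction


# Hypothetical percentage scores on a social validity measure (not study data).
pre_scores = [70, 75, 80, 85, 90]
post_scores = [80, 85, 90, 95, 100]

g = hedges_g(pre_scores, post_scores)
print(f"Hedges' g = {g:.2f}")
```

A positive g under this sign convention means post-intervention ratings exceeded pre-intervention ratings.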
Study 1: Results
Student Outcomes
Belle
During baseline, Belle’s academic engagement was variable (M = 3.86; SD = 2.38), generally increasing during the first six sessions, followed by a downward trend in the last six observations (see Figure 1). With the introduction of RRR Instruction, the mean level increased and variability decreased (M = 6.33; SD = 1.86; see Table 4). Yet, marked within-phase variability in both baseline and RRR Instruction, as well as highly overlapping data ranges, suggests a modest demonstration of effect. A similar pattern continued during RRR In-Class, during which Belle self-monitored her academic engagement (M = 6.38; SD = 1.83). Belle demonstrated higher and less variable levels of academic engagement in both treatment phases relative to baseline (see Table 4). The effect-size estimate indicated a 63.15% increase in academic engagement from baseline to RRR Instruction, which was statistically significant (LRRi = 0.49 [0.09, 0.89]; see Table 5). The percentage change from RRR Instruction to RRR In-Class was not significant at 0.22% (LRRi = 0.00 [−0.26, 0.27]; see Table 6).

Figure 1. Study 1: Academic Engagement for the Five Student Participants.
In terms of student self-reported outcomes, Belle reported no anxious feelings throughout RRR Instruction and RRR In-Class phases. Thus, there was no opportunity to compute an effect size for this outcome. Belle self-reported high and stable levels of engagement (M = 8.29; SD = 0.90) during RRR In-Class, which were higher than observer-reported levels (M = 6.38; SD = 1.83).
Brayden
During baseline, Brayden demonstrated high levels of academic engagement, with moderate variability early in the phase (M = 8.17; SD = 1.01) and a decreasing level of engagement during the last seven sessions (counter-therapeutic trend). With the introduction of RRR Instruction, there was minimal change in level but a reduction in variability (M = 7.67; SD = 0.52). During the RRR In-Class phase, academic engagement remained high and stable (M = 8.15; SD = 0.49). Due to a relatively high level of engagement during baseline, there was no opportunity to demonstrate an effect on academic engagement. The percentage change from baseline to RRR Instruction was not significant at −6.12% (LRRi = −0.06 [−0.14, 0.01]). The percentage change from RRR Instruction to RRR In-Class was also not significant at 6.27% (LRRi = 0.06 [0.00, 0.12]). Although changes in level were minimal, the reduction in variability across phases, achieved while maintaining high engagement, is a positive finding that may indicate improved stability in Brayden’s classroom performance.
In terms of student self-reported outcomes, Brayden reported very few anxious feelings throughout RRR Instruction (M = 0.40, SD = 0.55) and RRR In-Class (M = 0.21, SD = 0.71); therefore, there was insufficient data to calculate an effect size for this outcome. Brayden reported a relatively high level of engagement (M = 7.63; SD = 1.07) during RRR In-Class, although the level was lower than observer-reported academic engagement (M = 8.15; SD = 0.49).
Allison
During baseline, Allison demonstrated high levels of engagement with low variability (M = 8.65; SD = 0.56). Upon beginning the RRR Instruction phase, there was a slight decrease in level and increase in variability (M = 7.38; SD = 1.85). The overall mean during RRR Instruction was impacted by a single outlying data point. During the RRR In-Class phase, engagement returned to near baseline levels and stabilized (M = 8.20; SD = 0.70). Due to the high level of engagement during baseline, there was no opportunity to demonstrate a replication of effect on academic engagement. The percentage change from baseline to RRR Instruction was not significant at −14.45% (LRRi = −0.16 [−0.33, 0.02]), nor was the percentage change from RRR Instruction to RRR In-Class (10.77%; LRRi = 0.10 [−0.08, 0.28]).
In terms of student-reported outcomes, Allison reported moderate levels of anxious feelings during the RRR Instruction phase (M = 4.43, SD = 1.51) with a slight upward trend (0.16); self-reported data from the RRR In-Class phase indicated reduced levels of anxious feelings (M = 3.25, SD = 2.17) and a very slight decelerating trend (−0.05). The percentage change from RRR Instruction to RRR In-Class was not significant at −26.40% (LRRd = −0.31 [−0.69, 0.08]; see Table 7). Allison’s reported levels of academic engagement (M = 8.10; SD = 1.71) were highly consistent with observer-reported levels of academic engagement (M = 8.20; SD = 0.70) during RRR In-Class, although Allison’s ratings were more variable.
Table 7. Student-Reported Anxious Feelings Effect Sizes From RRR Instruction to RRR In-Class (B:C).
Note. DV = Dependent Variable; LRRd = Log Response Ratio-decreasing; AF = Student-Reported Anxious Feelings. – indicates insufficient data within or across phases to calculate effect size.
Isabel
During baseline, Isabel demonstrated high levels of engagement with moderate variability (M = 8.48; SD = 1.00), which continued during RRR Instruction (M = 8.67; SD = 0.82) and RRR In-Class (M = 8.80; SD = 0.79), with a slight reduction in variability observed over the course of the intervention. Due to the high level of engagement during baseline, there was no opportunity to demonstrate an effect on academic engagement. The percentage change from baseline to RRR Instruction was not significant at 2.17% (LRRi = 0.02 [−0.06, 0.11]), nor was the percentage change from RRR Instruction to RRR In-Class (1.50%; LRRi = 0.01 [−0.08, 0.11]). Although minimal changes occurred for level of engagement, reduction of variability while maintaining high engagement is a positive finding.
In terms of student-reported outcomes, Isabel reported low levels of anxious feelings, although with a high degree of variability, during the RRR Instruction phase (M = 1.00, SD = 2.54) with a decelerating trend (−0.45), which continued during the RRR In-Class phase (M = 1.11; SD = 1.45) with a flat slope (0.03) and reduced variability. The percentage change from RRR Instruction to RRR In-Class was not significant at −25.89% (LRRd = −0.30 [−2.44, 1.84]). Isabel reported a relatively high level of engagement (M = 9.56; SD = 0.73) during RRR In-Class, slightly higher than observer-reported engagement (M = 8.80; SD = 0.79).
Leo
During baseline, Leo demonstrated relatively low levels of engagement with a high degree of variability (M = 4.90; SD = 2.21) but no consistent trend. The introduction of RRR Instruction yielded an increase in engagement (M = 6.60; SD = 1.34) and a large decrease in variability with a minor upward trend (slope = 0.09). Increased levels of engagement were sustained during RRR In-Class, with an associated reduction in variability (M = 6.00; SD = 1.10) and a slight downward trend (−0.05). The percentage change from baseline to RRR Instruction was significant at 34.72% (LRRi = 0.30 [0.06, 0.54]). This change remained stable in the transition from RRR Instruction to RRR In-Class, with a nonsignificant percentage change (−9.21%; LRRi = −0.10 [−0.33, 0.13]).
In terms of student-reported outcomes, Leo reported no anxious feelings during the RRR Instruction phase (M = 0.00, SD = 0.00). During RRR In-Class, he reported anxious feelings at a low level (M = 2.00, SD = 2.74) and a high, stable level of academic engagement (M = 9.00, SD = 1.73, slope = 0.05). Student-reported academic engagement was substantially higher than observer-reported academic engagement (M = 6.00; SD = 1.10) during RRR In-Class.
Summary
To demonstrate a functional relation between the introduction of RRR Instruction (i.e., acquisition phase) and changes in students’ academic engagement (e.g., variability, level, trend), data would have to indicate one demonstration and at least two replications. In the current multiple-baseline design study, a clear functional relation was not established between the introduction of RRR Instruction and levels of academic engagement. These results are consistent with the nominal BC-SMD effect sizes reported across phase changes A–B (BC-SMD = 0.08) and B–C (BC-SMD = 0.22), respectively. Yet, there was modest evidence of a functional relation between the introduction of RRR Instruction and decreased variability in students’ academic engagement. Belle, Brayden, Isabel, and Leo demonstrated decreases in variability between baseline and RRR Instruction phases, all of which were at a minimum sustained (and in some cases further decreased) into RRR In-Class phases. Yet, it is important to note these shifts were mostly small-magnitude changes evident in variability metrics (e.g., SD, Syx) and in need of replication and future inquiry before generalizing results.
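For readers less familiar with the two variability metrics named above, the following sketch shows one way they might be computed for a single phase: SD as the sample standard deviation about the phase mean, and Syx as the standard error of estimate about an ordinary least-squares trend line fit across sessions. The ratings below are hypothetical and the function name is illustrative; this is not the study's analysis code.

```python
import math
import statistics


def variability_metrics(ratings):
    """Return SD about the mean, OLS slope, and Syx about the trend line for one phase."""
    n = len(ratings)
    if n < 3:
        raise ValueError("Need at least three data points to estimate Syx.")
    sessions = list(range(1, n + 1))  # session index used as the predictor
    mean_x = statistics.mean(sessions)
    mean_y = statistics.mean(ratings)
    sxx = sum((x - mean_x) ** 2 for x in sessions)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sessions, ratings))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares about the fitted trend line
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(sessions, ratings))
    syx = math.sqrt(ss_res / (n - 2))
    return {"sd": statistics.stdev(ratings), "slope": slope, "syx": syx}


# Hypothetical 0-10 engagement ratings, not data from the study
baseline = [3, 7, 4, 8, 2, 6]
instruction = [6, 7, 6, 8, 7, 6]

for phase, data in (("Baseline", baseline), ("RRR Instruction", instruction)):
    m = variability_metrics(data)
    print(f"{phase}: SD = {m['sd']:.2f}, slope = {m['slope']:.2f}, Syx = {m['syx']:.2f}")
```

Reporting both metrics is useful because SD can be inflated by a steady trend within a phase, whereas Syx reflects scatter around that trend.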
We note students did not self-report anxious feelings using DBR during baseline phases, as DBR was part of the intervention package. As such, the design was not intended to determine a functional relation between the introduction of RRR Instruction and self-reported anxious feelings from the baseline to RRR Instruction phases. Yet, Brayden and Allison reported a decreased level of anxious feelings between RRR Instruction and RRR In-Class phases, although neither of these resulted in a statistically significant LRRd effect size.
Treatment Integrity
RRR Instruction
Mean RRR Instruction treatment integrity scores related to interventionist procedures, as self-reported by interventionists, ranged from 96.83% (SD = 2.97) to 100% (SD = 0.00) across all five students (see Table 3). The overall mean treatment integrity score was 98.15% (SD = 2.74). Ratings completed by secondary observers confirmed high implementation of interventionist procedures (M = 98.04%; SD = 2.91; see Supplemental Table S5) with an overall IOA of 95.28% and item-level agreement ranging from 81.25% to 100% (see Supplemental Table S8). Booster sessions followed similar patterns for level of interventionist implementation, with interventionists rating treatment integrity between 93.06% and 100% and secondary observers rating all booster sessions at 100%.
Mean RRR Instruction treatment integrity scores related to student participation, as reported by interventionists, indicated lower treatment integrity scores for Belle (M = 77.78%, SD = 10.04) compared to the remaining students (range of 92.71% [SD = 5.77]–100% [SD = 0.00]; see Table 3). The overall mean student participation score was 94.61% (SD = 9.17), which was confirmed by levels reported by secondary observers (M = 94.74%; SD = 9.58; see Supplemental Table S6). Ratings across primary and secondary observers had an overall mean IOA of 86.05%, with item-level agreement ranging from 64.71% to 100% (see Supplemental Table S8). The lowest agreement (64.71%) was observed for the item “Did the student make connections to their lives?” which represented the generalization section of the lesson. On this item, secondary observers rated participation moderately higher (M = 2.72, SD = 0.57) than scores reported by interventionists (M = 2.39; SD = 0.85). Booster sessions followed similar patterns for level of student participation, with interventionists rating student participation between 55.33% and 100% and secondary observers rating all booster sessions at 100%. The low participation score was from a single booster lesson taught to Belle. There was no secondary observer for that lesson, which accounts for the range restriction of secondary observers’ booster session scores.
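As an illustration of how an item-level agreement percentage of this kind can arise, the sketch below computes point-by-point agreement between two raters: the number of sessions on which both gave the same rating, divided by the number of jointly rated sessions. The agreement method and the 0–3 ratings are assumptions made for illustration; the section does not specify the exact IOA formula, and the data are hypothetical.

```python
def percent_agreement(primary, secondary):
    """Point-by-point agreement: share of paired ratings that match exactly."""
    if len(primary) != len(secondary) or not primary:
        raise ValueError("Both raters need the same non-zero number of ratings.")
    agreements = sum(p == s for p, s in zip(primary, secondary))
    return 100.0 * agreements / len(primary)


# Hypothetical ratings for one checklist item across 17 jointly observed sessions
interventionist = [3, 3, 2, 3, 1, 3, 2, 3, 3, 2, 3, 3, 1, 3, 2, 3, 3]
observer = [3, 3, 3, 3, 2, 3, 3, 3, 3, 3, 3, 3, 2, 3, 3, 3, 3]

print(f"Item-level agreement: {percent_agreement(interventionist, observer):.2f}%")
```

Exact-match agreement is strict for multi-point items: two raters who differ by a single scale point on a judgment-heavy item count as a full disagreement, which may partly explain lower agreement on items such as the generalization question.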
RRR In-Class
Mean RRR In-Class treatment integrity scores as rated by teachers ranged from 73.75% (SD = 24.14) to 90.87% (SD = 9.65) across all students, with an overall M = 88.26% (SD = 14.21; see Table 3). Secondary raters observed 37.66% of sessions (range: 23.81%–83.33% across students). Overall, secondary observers’ ratings suggested slightly higher levels of treatment integrity, ranging from 82.29% (SD = 12.44) to 94.64% (SD = 4.64) across students, with an overall M = 90.66% (SD = 8.38; see Supplemental Table S7). Nevertheless, there was relatively high IOA across raters, with an overall IOA of 90.95% (range = 81.25%–95.00% across students; see Supplemental Table S9).
Summary
Collectively, treatment integrity scores across students and phases suggest (a) research personnel implemented RRR Instruction as designed and (b) teachers implemented RRR In-Class self-monitoring activities as planned.
Social Validity
Teachers’ social validity ratings indicated they viewed the goals, procedures, and outcomes of RRR as generally acceptable prior to and following the intervention, with a nominal decrease over time (see Table 8). Ms. Driscoll rated the intervention favorably at pre-intervention across her students, although slightly lower for Belle (73; 81.11%) and notably lower for Leo (67; 74.44%). Her ratings were stable at post-intervention across students, indicating RRR largely met her expectations, although also demonstrating persistent differences related to individual students’ needs. For example, at both timepoints her written comments on the social validity form expressed concern about Leo’s self-awareness skills as a potential barrier to the success of the self-monitoring component. At post-intervention, though, her comments noted self-monitoring had provided a helpful prompt for on-task behaviors. In her post-intervention social validity interview, Ms. Driscoll shared the procedure of providing feedback at set intervals during self-monitoring (RRR In-Class) felt somewhat artificial, and she had some concerns that students with internalizing challenges were likely to interpret feedback as corrective. However, she also pointed out these students needed frequent reassurances that they are on track. Overall, she indicated it was an “easy intervention to manage in the classroom.”
Social Validity for Studies 1 and 2.
Note. Teacher social validity was assessed using the Intervention Rating Profile-15 (IRP-15; Witt & Elliott, 1985); possible scores range from 15 to 90. Parent social validity was assessed using a modified IRP-15; possible scores range from 15 to 90. Student social validity was assessed using the Child Intervention Rating Profile (CIRP; Witt & Elliott, 1985); possible scores range from 7 to 42.
Ms. Avery’s social validity rating for Isabel at pre-intervention was high (84; 93.33%) but showed a slight decrease following intervention (76; 84.44%). Nevertheless, her post-intervention rating suggested she still found RRR acceptable for this student’s needs. Her comments on the social validity rating indicated she had “initial concerns, however with time turned out to be exactly what she needed [sic]. Very impressed.” In her social validity interview, she also shared reflections on this student’s needs and developmental levels (e.g., struggles with multitasking, which the teacher associated with self-monitoring), although she felt the intervention provided structure and tools that were overall positive influences for the student (see Study 2 for more findings from this teacher’s social validity interview).
In terms of parent perceptions of social validity, prior to intervention, parents rated RRR as generally acceptable. Overall, there was a greater decline in parent ratings from pre- to post-intervention than was observed with teacher ratings (M = 80.22% at pre, 72.50% at post), although most respondents still indicated moderate acceptability of intervention procedures. The most notable change in ratings was for Leo (pre: 76 [84.44%], post: 58 [64.44%]).
Four of five students (Belle, Allison, Isabel, and Leo) indicated favorable ratings at pre and post, suggesting RRR largely met or exceeded students’ expectations. Isabel’s ratings showed the largest difference across timepoints (pre: 35 [83.33%], post: 42 [100%]). The exception was Brayden, who rated social validity lower at pre- and post-intervention, indicating some disagreement with the statement “I think I will like being in this program” at both timepoints. After collecting data during the pre-intervention timepoint, we reminded the student that participation was voluntary, and he indicated he would still like to try the program despite his rating.
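The percentages shown in brackets beside raw social validity scores are consistent with dividing each raw score by the maximum possible score on its scale (90 for the IRP-15, 42 for the CIRP). The short sketch below reproduces several of the reported values; the function and variable names are illustrative only.

```python
IRP15_MAX = 90  # maximum possible IRP-15 score
CIRP_MAX = 42   # maximum possible CIRP score


def percent_of_maximum(raw_score: int, max_score: int) -> float:
    """Express a raw rating-scale total as a percentage of the maximum possible score."""
    return round(100.0 * raw_score / max_score, 2)


# (description, raw score, scale maximum) taken from values reported in the text
reported = [
    ("Ms. Driscoll, Belle, pre (IRP-15)", 73, IRP15_MAX),
    ("Ms. Driscoll, Leo, pre (IRP-15)", 67, IRP15_MAX),
    ("Parent, Leo, post (modified IRP-15)", 58, IRP15_MAX),
    ("Isabel, pre (CIRP)", 35, CIRP_MAX),
]

for description, raw, maximum in reported:
    print(f"{description}: {raw} -> {percent_of_maximum(raw, maximum):.2f}%")
```

Note that because both scales have a nonzero minimum (15 and 7, respectively), a percentage computed this way cannot fall below about 16.67%.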
In sum, social validity reflected general acceptability across perspectives, although there was notable variability from student to student, particularly from parent perspectives.
Study 2: Method
Participants and Setting
Participants in Study 2 included three fourth-grade students from Ms. Avery’s class (the same fourth-grade teacher and school as in Study 1). Two students, Arlo and Kevin, were male, and one, Ava, was female. All students were White. None were receiving special education services (see Table 1).
Measures and Procedures
Measures and procedures were identical to Study 1. Study 2 took place chronologically first. The first author taught RRR Instruction lessons to students in a small-group format in the conference room described in Study 1.
Design and Analysis
We used a nonexperimental A–B design to gather descriptive information on implementing RRR in a small-group setting. We analyzed the one phase change to examine the impact of RRR on student engagement, examining stability, level, and trend along with effect-size calculations (see descriptions in Study 1). Similar to Study 1, we analyzed treatment integrity and social validity data using descriptive procedures (e.g., computing means and standard deviations) and effect sizes. For this nonexperimental A–B design, we used the BC-SMD for multiple-baseline designs descriptively.
Study 2: Results
Student Outcomes
All students demonstrated high academic engagement during baseline, with mean levels ranging from 8.67 (SD = 1.22) to 9.00 (SD = 0.97; see Table 4). Data for Ava showed high variability, whereas data for Arlo and Kevin were more stable (see Figure 2). With the introduction of RRR Instruction, there were nominal changes in level or variability for Ava. Arlo demonstrated a slight increase in level of academic engagement (8.33% increase) and reduction in variability (M = 9.75, SD = 0.50), yielding a statistically significant LRRi effect-size estimate (0.08 [0.01, 0.15]). Kevin showed slightly reduced variability with the introduction of RRR Instruction, with a small increase in academic engagement from baseline (M = 8.75, SD = 0.97 in baseline; M = 9.50, SD = 0.58 in RRR Instruction), although this change did not produce a significant LRRi effect size (see Supplemental Table S10). With the introduction of RRR In-Class, all students continued demonstrating high levels of academic engagement, although there were small decreases in level for Arlo and Kevin and slightly higher variability across all three students. These results are consistent with the nominal BC-SMD effect sizes reported across phase changes A–B (BC-SMD = 0.31) and B–C (BC-SMD = −0.005), respectively. We interpret these effect sizes descriptively, as Study 2 was a nonexperimental A–B design.

Study 2: Academic Engagement for the Three Student Participants.
Regarding student-reported anxious feelings, Ava reported moderate levels of anxious feelings during the RRR Instruction phase (M = 5.00, SD = 1.00), which decreased to near zero during RRR In-Class (M = 0.31, SD = 0.85), except for one data point late in the study. Arlo reported no anxious feelings during RRR Instruction or RRR In-Class. Kevin’s reports were variable, characterized by intermittent spikes of anxious feelings throughout RRR Instruction and RRR In-Class phases. On average, this student reported a decrease in anxious feelings from the RRR Instruction phase (M = 3.00, SD = 4.36) to RRR In-Class (M = 1.18, SD = 2.96).
In terms of student-reported academic engagement, Ava began the RRR In-Class phase by reporting high levels of academic engagement. Yet, her ratings indicated a clear decelerating trend beginning midway through this phase. This trend was not reflected in teacher-reported engagement. In contrast, Arlo and Kevin both reported high and stable engagement, with students’ ratings demonstrating a high level of correspondence with teacher-reported engagement.
Treatment Integrity
RRR Instruction
In Study 2, overall RRR Instruction treatment integrity scores related to interventionist procedures, as self-reported by the interventionist, were 98.41% (see Table 3). Ratings completed by secondary observers confirmed high implementation of interventionist procedures with an overall IOA of 93.94% and item-level agreement ranging from 83.33% to 100% (see Supplemental Table S8). Booster sessions followed similar patterns for the level of interventionists’ implementation, with interventionists rating treatment integrity at 94.07% and secondary observers at 100% (see Table 4).
Mean RRR Instruction treatment integrity scores related to student participation, as reported by interventionists, indicated lower treatment integrity scores for Ava (M = 79.84%, SD = 19.94) compared to the remaining students (range of 96.19% [SD = 7.56]–97.14% [SD = 5.25]; see Table 3). Ratings across primary and secondary observers had an overall IOA of 91.25%, with item-level agreement ranging from 73.33% to 100%. Student participation in booster sessions was rated at 100% for all students by the interventionist, and secondary observers rated booster session student participation at 96.67% for all students.
RRR In-Class
In Study 2, RRR In-Class treatment integrity scores as rated by the teacher ranged from 74.34% (SD = 18.85) to 77.08% (SD = 16.42) across all students, with an overall M = 76.11% (SD = 17.17; see Table 3). Overall, secondary observers’ ratings suggested slightly higher levels of treatment integrity, ranging from 83.33% (SD = 14.19) to 86.11% (SD = 14.35) across students, with an overall M = 84.72% (SD = 13.18; see Supplemental Table S7). Nevertheless, there was moderate IOA across raters, with an overall IOA of 84.03% (range = 81.25%–85.42% across students; see Supplemental Table S9).
Social Validity
Teacher ratings on the IRP-15 at pre-intervention indicated high acceptability. Post-intervention ratings declined slightly for Ava, although they still indicated a high degree of acceptability. Ratings for Kevin held constant at a similar level of acceptability (see Table 8). Ratings for Arlo indicated a different pattern, beginning with a high rating and declining following the intervention (pre: 84 [93.33%]; post: 59 [65.56%]). Written comments on this form indicated concern that the student may have struggled with self-awareness skills needed to accurately self-monitor. In her post-intervention social validity interview, this teacher expressed broad enthusiasm for RRR, particularly the component related to teaching relaxation strategies. She suggested creating visuals to post in the class to remind students to use those strategies. She also indicated she would like to see more parental involvement in the intervention.
Parents’ pre-intervention social validity ratings indicated high acceptability, with ratings remaining fairly constant for Ava and Kevin. Ava’s parents wrote on the post-intervention social validity form, “She [student] expressed positive thoughts when telling me about having a session.” Ratings for Arlo showed a decline similar to that in the teacher’s ratings (pre: 79 [87.78%]; post: 57 [63.33%]), with parents indicating disagreement with the item “This intervention proved effective in supporting my child’s needs.” No written comments were provided to explain this decline in social validity.
From the students’ perspective, all students in Study 2 rated RRR as highly acceptable across pre- and post-intervention timepoints. In sum, ratings across multiple perspectives largely indicated acceptability of RRR, particularly from the students. Yet, data indicated variability across students in how teachers and parents perceived intervention acceptability and outcomes.
Discussion
Nationally, educators, families, policymakers, and researchers have prioritized concerns for the well-being of school-age youth as they navigate through the aftermath of the COVID-19 pandemic (NCES, 2023; Richtel, 2023). We conducted this study, funded by IES (IES, 2021; Lane, 2022–2026), to explore how to best support elementary-age students in maximizing their academic engagement by empowering these students to learn how to manage their anxious feelings. Recognizing teachers had an immediate need for research-based interventions to promote fourth- through fifth-grade students’ academic, behavioral, and social and emotional well-being, we partnered with a district implementing Ci3T district-wide (K–12) to conduct an initial test of RRR, a Tier 2 intervention developed as part of Project ENGAGE (Lane, 2022–2026).
Impact for Students
Findings from Study 1, a multiple-baseline across participants single-case design, provide preliminary modest evidence of a functional relation between the introduction of RRR Instruction and changes in the variability of students’ academic engagement, noting these shifts were mostly small-magnitude changes evident in variability metrics (e.g., SD, Syx) and in need of replication and future inquiry before generalizing results. Four of five students (Belle, Brayden, Isabel, and Leo) demonstrated reductions in variability between the baseline and RRR Instruction phase, which was sustained—and in some cases variability further decreased—in the RRR In-Class phase when students applied newly learned skills using self-monitoring. In addition, two students (Belle and Leo) demonstrated positive shifts in their mean levels of engagement with the introduction of RRR Instruction, with LRRi effect sizes indicating moderate improvements (63.15% and 34.72% increases, respectively). Ceiling effects during baseline for Brayden, Allison, and Isabel eliminated the possibility of establishing a functional relation between RRR Instruction and changes in students’ mean levels of academic engagement at the onset. Although analysis of slope and level of behavior are the most commonly described outcomes in single-case design inquiry, variability is an important and oft-overlooked priority (Ledford & Gast, 2024).
Although the reductions in variability observed in Study 1 were slight, these modest shifts may hold practical value to teachers and students alike. Specifically, unpredictable behavior patterns can pose challenges for teachers when planning and designing instruction (Lane, Menzies, Smith-Menzies, & Lane, 2023). Similarly, when a student’s engagement oscillates, it can impede their ability to engage in learning activities and can interfere with other students’ learning when distractions occur (Walker et al., 2014). Yet, given the modest evidence of a functional relation coupled with the small-magnitude changes evident in variability metrics (e.g., variance about the mean, and variability about the slope lines), we do not draw definitive conclusions and encourage replication and future inquiry before generalizing results.
We note the main variable of interest in this pre-registered study was academic engagement. Although the RRR intervention is grounded in cognitive–behavioral principles targeting anxious thoughts and feelings, this initial test focused on academic engagement as a measurable, observable proxy for students’ capacity to regulate anxiety and sustain participation in instruction. We included student-reported anxious feelings as an important additional outcome to consider. Yet, for practical and ethical reasons, we did not ask students to self-monitor anxious feelings during baseline phases, which would have been necessary for experimental control in the multiple-baseline design. Several factors influenced this decision. In terms of ethical considerations, we determined it could be counter-productive (and potentially harmful) to ask students to self-monitor anxious feelings prior to learning any strategies to manage these emotions. Relatedly, asking students to self-monitor anxious feelings during the baseline phase of a multiple-baseline design would entail students potentially reporting anxious feelings without empowering educators to respond to these reports (i.e., to maintain consistent baseline conditions) and to maintain this status for an extended time required for the multiple-baseline design (e.g., students in the third leg of the design). There were practical and design considerations as well. For example, students would not have access to instruction on recognizing and labeling anxious feelings until the intervention began (i.e., lessons during RRR Instruction), which would have produced ratings of questionable accuracy. Finally, the act of observing one’s own behavior can change behavior (Briesch & Chafouleas, 2009); so, including a self-monitoring procedure before starting the intervention phases (e.g., using self-reported DBRs of anxious feelings) may have prevented establishment of a true baseline. 
As such, our design was not intended to determine a functional relation between the introduction of RRR Instruction and a reduction in student self-reported anxious feelings as measured by DBR data.
Although these comparisons were nonexperimental, DBR data indicated Brayden and Allison reported a decreased level of anxious feelings between RRR Instruction and RRR In-Class phases; neither change resulted in a statistically significant LRRd effect size. Surprisingly, some students (Belle and Brayden) reported few or no anxious feelings through the course of the intervention. Given that internalizing behavior patterns are not always related to anxiousness (e.g., social withdrawal), it may be that these students’ specific needs were not well-aligned to the focus of this intervention. Indeed, teacher-reported descriptive data for Belle indicated concerns for withdrawn/depressed rather than anxious/depressed (see Supplemental Table S4). Nevertheless, Belle showed a positive outcome for increased academic engagement. Future inquiry may study whether students are more likely to respond to RRR depending on specific needs related to anxiousness, or whether this approach may be suitable for groups of students with heterogeneous internalizing behaviors.
In response to an additional ethical consideration, we amended our pre-registered plan of conducting two multiple-baseline studies to instead conduct an A–B study with three fourth-grade students demonstrating ceiling effects for academic engagement during baseline (Study 2). We did so to honor our commitment to their teacher, family members, and the students themselves to access the RRR instruction even though baseline data prevented the potential for identifying a functional relation before any phase changes occurred. In addition, we note descriptive data from baseline indicated clinical concerns on the internalizing behavior broadband scale of the TRF for all three students and the anxious/depressed subscale of the TRF for two of three students (see Supplemental Table S4). Outcomes for Study 2 yielded some information similar to that of Study 1. Namely, we observed small reductions in variability of academic engagement for two students (Arlo and Kevin) and a small increase in level of academic engagement for Arlo. We also observed a moderate decrease in anxious feelings for Ava over the course of the study. Although results from Study 2 are not experimental, they provide additional descriptive evidence that RRR has potential for impacting academic engagement, including reducing variability of performance.
Finally, regarding student outcomes, across Studies 1 and 2 we observed some promising descriptive results in addition to the previously described results from single-case design inquiry. Namely, one student from each study made clinically meaningful change on anxious/depressed subscale scores of the TRF (see Supplemental Table S4), meaning they moved across normative bands. Specifically, Isabel from Study 1 moved from a score indicating clinical concern to a score indicating borderline concern, and Ava from Study 2 moved from a score in the clinical concern range to a score in the normative range. In addition, no students demonstrated clinically meaningful change in a counter-therapeutic direction on teacher-completed measures (although we note Leo in Study 1 showed a counter-therapeutic change on the parent measure, described subsequently). We interpret changes in descriptive measures with caution in terms of their relation to the RRR intervention, given this research design does not offer experimental control over those outcomes. Nevertheless, assessing outcomes on multi-informant behavioral rating scales will prove important to future inquiry, particularly in studies involving group-design methodology (e.g., randomized controlled trials) examining RRR impact, given some of the challenges with using academic engagement DBRs (see Limitations and Future Directions).
Treatment Integrity
In these preliminary studies of RRR, overall levels of treatment integrity were high for research personnel implementing RRR Instruction lessons. Given that RRR Instruction was carried out by a co-developer of RRR (Study 2) and by research personnel with direct training and modeling from co-developers (Study 1), it remains to be seen whether classroom teachers can implement this component of the program with similar levels of fidelity. Nevertheless, it is promising that research personnel in Study 1 achieved very high levels of fidelity with a level of training comparable to standard professional educational learning opportunities (≈ 2 hr). Prior to recommending RRR, it is important to test the feasibility and effectiveness of RRR when instruction is provided by general education teachers within the context of the school day.
Regarding student participation during RRR, results indicated most students were very engaged, with four of five students from Study 1 and two of three from Study 2 scoring > 90% in participation. Interestingly, Belle from Study 1 showed the greatest improvement in academic engagement of any student in either study, yet her participation (77.78% during lessons; 53.33% during Booster lessons; see Table 3) was lowest among all students. It may be that students can still benefit from RRR Instruction even if they are slightly less engaged, which is potentially noteworthy, as engaging some students with internalizing behaviors may prove challenging. So, it is promising that even students who are slightly reticent may be able to acquire and use skills taught during RRR Instruction, although this also requires careful consideration of social validity from the student perspective (see Stakeholder Views). We also note potential differences in how students engaged in RRR Instruction between Study 1 and Study 2. In Study 2, students received RRR Instruction as a small group, whereas students in Study 1 received instruction in a 1:1 setting. Research personnel indicated anecdotally the small group experience offered in Study 2 created an environment of peer social support, which may have facilitated productive participation and group conversations compared to the 1:1 offering in Study 1. While some students in Study 1 appeared to enjoy the 1:1 lessons with research personnel, others required more scaffolding to engage, and participation occasionally lagged. It is possible situating RRR Instruction in a small-group setting will produce better student participation, which possibly could enhance outcomes (although again, we note the less engaged student in Study 1 demonstrated increases in academic engagement).
In terms of the implementation of RRR In-Class, the self-monitoring component of RRR, we observed treatment integrity scores were somewhat lower for RRR In-Class than RRR Instruction. Yet, given RRR In-Class was implemented by general education teachers in both studies, it is perhaps not surprising that fidelity was slightly lower than for RRR Instruction, which was implemented by university staff. Furthermore, although scores were lower, results nevertheless indicated moderate to high levels of fidelity. We note Ms. Avery reported somewhat lower fidelity (73.75% for Isabel in Study 1; 74.34%–77.08% for students in Study 2) than Ms. Driscoll (88.19%–90.87%). Further investigation of fidelity data indicated both teachers tended to rate their fidelity lower than a secondary observer did, suggesting they may have been overly harsh when reporting implementation. Similar patterns have emerged in other studies of fidelity assessment (e.g., Lane et al., 2020), which highlights the added value of assessing treatment integrity from multiple perspectives.
Stakeholder Views
Across Studies 1 and 2, we observed mostly positive findings for the acceptability of the goals, procedures, and outcomes of RRR. Teachers tended to rate RRR highly prior to the intervention and to maintain a high degree of approval after the study. This pattern held except for some slight reticence regarding Leo in Study 1 and Arlo in Study 2. In both instances, teachers raised concerns about these students’ self-awareness skills. The implication was they struggled somewhat with the self-monitoring component, particularly recognizing when they were academically engaged. These concerns warrant potential consideration of whether additional instruction or scaffolds could be provided if students enter the intervention with significant needs in the area of this critical social-emotional skill. For example, some students may need more instruction in the Record portion of RRR Instruction, which involves developing skills to observe one’s own behavior. Alternatively, future RRR intervention inquiry may find the intervention is more suited to some students than others based on developmental levels. The literature on self-monitoring suggests some students do not yet have the meta-cognitive abilities to fully self-monitor their behavior, so a different intervention approach may be preferred in those instances (Lane, Menzies, et al., 2011).
From the parent perspective, again, most ratings reflected positive acceptability of RRR. In the few instances where this was not the case (e.g., Leo in Study 1; Arlo in Study 2), we sought insight in item-level and descriptive data, as there was minimal qualitative data (e.g., written comments). Leo’s parents (Study 1) indicated disagreement with the item “This was an acceptable intervention for my child’s needs” but indicated agreement with the item “I would be willing to use this intervention in the home setting.” These responses, in conjunction with parent-completed descriptive measures, suggest increasing concerns related to internalizing behaviors for this student over the course of the study (see Supplemental Table S4). Leo’s parents may have felt that the goals of RRR were important but that the intervention was either not intensive enough or not ideally matched to the student’s needs (e.g., mirroring his teacher’s concern about self-awareness skills). Given this student’s parents were willing to use this intervention in the home setting, it may be important to increase opportunities for information, materials, and instruction from RRR to be shared with parents and families. This may empower family members in the home to add intensity to the intervention and increase opportunities for generalization. Teachers also suggested the value of increasing opportunities for interfacing with parents and families through the RRR intervention, giving further support to this interpretation.
Finally, most students indicated very positive experiences with the intervention, even when treatment integrity data for those students suggested decreased participation (e.g., Belle). With the exception of Brayden, who expressed some ambivalence even prior to participating, all students’ social validity ratings at post were well above 80%. This presents an interesting contrast, suggesting most students found value in their experience and enjoyed the procedures, even in cases where adults expressed less acceptability (e.g., in the case of Leo and Arlo). Anecdotally, we noted students in Study 2 particularly seemed to enjoy the group aspect of RRR Instruction, as we regularly observed students providing social support tied to terminology introduced during instruction both within and beyond the group. Future inquiry may further examine how students experience the intervention and seek to understand how specific components (e.g., RRR Instruction, RRR In-Class) support their skill development.
Implications
These studies offer a methodologically sound illustration of Tier 2 inquiry focused on supporting students with internalizing challenges, with potential implications for measurement in future related inquiry. Study 1 was designed and implemented with aims to achieve quality indicators set out in the Council for Exceptional Children Standards for Evidence-based Practices in Special Education (Cook et al., 2014; see Supplemental Table S11). A noteworthy strength of both studies was the prioritization of reliability of measurement. Specifically, we used DBRs of academic engagement as the primary dependent variable and were able to achieve a rigorous level of IOA both in terms of proportion of observations (i.e., 47.11% of observations with reliability data in Study 1, 50.98% in Study 2) and level of IOA (i.e., 86.84% IOA across all observations in Study 1; 92.31% IOA across all observations in Study 2). Although DBR has been used as an outcome in Tier 2 internalizing inquiry, it has often been used without a secondary rater, leading to calls for more rigorous measurement procedures (Eklund et al., 2021). Our emphasis on the use of secondary raters, and of including student self-reported data, contributes to refining measurement approaches in this area of literature. Yet, we note our DBRs were completed by research staff. Future inquiry should investigate the extent to which teacher-reported DBR can attain similar levels of reliability while implementing the multiple demands of a lively general education setting (see Limitations and Future Directions). In addition, challenges related to ceiling effects reinforce difficulties related to assessment of internalizing behaviors, which tend to be covert. Our attempt to assess a target replacement behavior (i.e., academic engagement) was only somewhat successful. Future studies of RRR may need to engage in dual measurement schemes to more directly assess internalizing challenges alongside replacement behaviors or rely more heavily on multi-informant rating scales for determining impact.
Another important implication is the potential shown for RRR to impact student outcomes. We report modest evidence of a functional relation between the introduction of RRR Instruction and reduced variance of academic engagement (Study 1, see caveats in Impact for Students), as well as nonexperimental evidence suggesting some students reduced their overall level of anxiousness on self-reported measures and teacher-completed rating scales. Coupled with evidence the self-monitoring component of RRR appeared feasible to implement and socially valid from multiple perspectives, these preliminary findings suggest interventions like RRR may prove a valuable addition to the growing body of literature of potentially efficacious Tier 2 interventions to support students with internalizing behaviors.
Limitations and Future Directions
We encourage readers to interpret results relative to the following considerations, some of which have been mentioned previously. First, while not specifically a limitation, it is important to note researchers led the RRR Instruction phase. It will be important for future studies to explore the impacts of general education teachers implementing all components of RRR. This is an important future research objective to determine the feasibility, practicality, and impact of RRR in more authentic implementation contexts. Relatedly, having research personnel serve as the RRR instructor may reduce generalization of skills being reinforced beyond the training session.
Second, as noted for both Studies 1 and 2, measurement issues created some challenges in assessing intervention effects. Several students’ DBR data suggested ceiling effects for academic engagement during baseline. These ceiling effects left little or no room for a change in level, thereby impeding detection of a functional relation between the introduction of RRR and changes in students’ level of engagement. Moreover, for the ethical and practical reasons described previously, we did not require students to self-report anxious feelings during baseline. This limited our ability to determine a functional relation on this theoretically important construct. We intentionally prioritized academic engagement as an observable indicator of adaptive functioning and anxiety regulation within the classroom context. Future studies should include direct observations of behaviors associated with internalizing concerns (e.g., avoidance, self-critical statements) and validated rating scales (e.g., ASEBA, SSiS) collected across all phases to more directly assess changes in anxiety-related symptoms. In addition, subsequent trials might employ group-design methodologies (e.g., randomized controlled trials) to test RRR impact on engagement and internalizing outcomes, while also examining cost and cost-effectiveness (Levin et al., 2017).
Third, we noted some students reported low levels of anxious feelings during RRR In-Class. While not inherently a limitation, it suggests the possibility of using validated rating scales to determine areas of need to inform intervention efforts. Future studies might consider obtaining consent for teachers, parents, and students to complete diagnostic rating scales to determine students’ specific needs (e.g., anxious/withdrawn, reduced engagement) to warrant participation in RRR as a Tier 2 intervention. This process has been used extensively in reading, math, writing, and social skills interventions (e.g., Common et al., 2019; Lane, Harris, et al., 2011).
Fourth, input from social validity data suggested the importance of refining parent involvement in RRR. At a minimum, it will be important to create additional intervention materials (e.g., relaxation strategy cards) to share with parents so they can learn more about the content of the lessons taught. It would be particularly valuable for parents to be aware of the language used to describe anxious feelings as well as the relaxation strategies and self-monitoring procedures. By being more familiar with intervention components, parents could support the generalization of newly learned skills in the home and other settings beyond the school day (e.g., programming for generalization; Cooper et al., 2020).
Fifth, although RRR was designed for students with moderate levels of internalizing behaviors, three students with high levels of internalizing behaviors in fall were included at teachers’ requests. When interpreting intervention outcomes, it is possible that some students with more intensive needs from the outset could have benefited from a more intensive intervention (e.g., functional assessment-based interventions; Umbreit et al., 2024; cognitive behavior therapy; Tse et al., 2023). It will be important for the next test of RRR to explore the impact of RRR on students with moderate levels of internalizing behaviors specifically.
Finally, we note this study focused on a fairly homogeneous sample in terms of geographic locale and participant characteristics. In addition, both participating teachers selected a common subject as the focus for self-monitoring. These aspects of the present study limit the generalizability of findings and require future replication and extension. Despite these limitations, this initial test of RRR provides important insights into the utility and feasibility of this newly developed intervention to assist students with internalizing behaviors. Also, we have gained important information about next steps for testing RRR with teachers as interventionists, as well as using group-design methodology, which will enable the use of other validated tools to assess covert internalizing challenges (e.g., anxious/withdrawn).
Summary
In this article, we report outcomes of two initial studies examining the impact of a new Tier 2 intervention, RRR. We designed RRR to support fourth- and fifth-grade students with internalizing behaviors, with the intent to increase their academic engagement by teaching them strategies for managing anxious feelings. We conducted this inquiry in partnership with teachers in schools implementing Ci3T, providing a methodological illustration of data-informed intervention efforts for students needing more than Tier 1 practices. We collaborated with teachers to use schoolwide systematic screening and attendance data collected as part of regular school practices to determine which students might benefit from RRR. Results of a multiple-baseline across participants design with five students yielded modest evidence of a functional relation between the introduction of RRR Instruction (provided 1:1 by research personnel) and decreased variability in students’ academic engagement during academic instruction, as measured by DBR, for four students. Belle, Brayden, Isabel, and Leo demonstrated decreases in variability between baseline and RRR Instruction phases, all of which were at a minimum sustained, and in some cases further decreased, into RRR In-Class phases. Yet, as mentioned previously, we note these shifts were mostly small-magnitude changes evident in variability metrics (e.g., SD, Syx) and in need of replication and future inquiry before generalizing results.
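As an illustration of the variability metrics named above, the following minimal sketch computes a within-phase standard deviation (SD) and a standard error of the estimate (Syx). It assumes Syx is taken around an ordinary least-squares trend line fitted to session number within a phase, which may differ from the exact procedure used in these studies. The function name `phase_variability` and the ratings shown are hypothetical and are not data from either study.

```python
import math
import statistics


def phase_variability(ratings):
    """Return (SD, Syx) for one phase of session-by-session ratings.

    SD is the sample standard deviation of the ratings.
    Syx is the standard error of the estimate around an ordinary
    least-squares line fitted to session number (an assumption made
    for this sketch, not a procedure confirmed by the article).
    """
    n = len(ratings)
    if n < 3:
        raise ValueError("need at least three sessions to estimate Syx")

    sessions = range(1, n + 1)
    sd = statistics.stdev(ratings)

    mean_x = statistics.fmean(sessions)
    mean_y = statistics.fmean(ratings)
    sxx = sum((x - mean_x) ** 2 for x in sessions)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sessions, ratings))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Sum of squared residuals around the fitted trend line.
    sse = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(sessions, ratings))
    syx = math.sqrt(sse / (n - 2))
    return sd, syx


# Hypothetical engagement ratings (percent of time engaged) for two phases.
baseline = [60, 90, 70, 95, 65]
instruction = [80, 85, 82, 88, 84]

baseline_sd, baseline_syx = phase_variability(baseline)
instruction_sd, instruction_syx = phase_variability(instruction)
print(f"Baseline:    SD={baseline_sd:.2f}, Syx={baseline_syx:.2f}")
print(f"Instruction: SD={instruction_sd:.2f}, Syx={instruction_syx:.2f}")
```

In this hypothetical series, both metrics are smaller in the second phase even though several ratings sit near the top of the scale, which is the kind of pattern a change in variability (rather than level) would reflect.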
Treatment integrity data suggested high levels of implementation for both RRR Instruction and RRR In-Class (where students used self-monitoring to support their newly learned skills). Furthermore, most teachers, parents, and students rated the intervention favorably, although there was some variability in ratings across students, suggesting the intervention was more acceptable for supporting some students than others and, in some instances, was rated more favorably by students than by adults. Results of an A–B nonexperimental study conducted with students who showed ceiling effects for academic engagement during baseline provided descriptive data suggesting the potential benefits of implementing RRR Instruction in a small-group format. Collectively, lessons learned from this initial inquiry have provided clear next steps for refining intervention components (e.g., family connections), measurement, and experimental designs as we continue to examine RRR implementation, impact, and feasibility when implemented by general education teachers during the traditional school day.
Supplemental Material
Supplemental material, sj-docx-1-ebx-10.1177_10634266261417641 for Preliminary Testing of Recognize. Relax. Record.: A Tier 2 Intervention for Elementary Students With Internalizing Behaviors by Mark Matthew Buckman, Kathleen Lynne Lane, Wendy Peia Oakes, Eric Alan Common, Amy A. Buffington, Allison M. Bernard and Kathleen N. Tuck in Journal of Emotional and Behavioral Disorders
Footnotes
Funding
The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R324X220067 to the University of Kansas and from the Office of Special Education and Rehabilitative Services, Office of Special Education Programs, U.S. Department of Education, through Grant H325D220011 to Arizona State University. The opinions expressed are those of the authors and do not represent views of the U.S. Department of Education.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
Research materials are available from the first and second authors. Additional supplemental materials can also be accessed at the Open Science Framework page associated with this study (Lane et al., 2022).