Abstract
The sudden global closure of schools due to the COVID-19 pandemic created an urgent need for quality distance learning options for students with autism spectrum disorder (ASD). The purpose of this study was to explore asynchronous implementation of Modified Schema-based Instruction (MSBI), an established evidence-based practice, through caregiver-assisted video-based instruction for secondary students with ASD. An embedded experimental mixed-methods design was used to evaluate both quantitative effects on and qualitative indicators of problem solving. Visual analysis of quantitative data indicated five clear demonstrations of effect and one weak demonstration of effect across tiers, supported by two estimates of effect size. Qualitative analysis of recorded sessions revealed changes in students’ frustration, confidence, and use of effective mathematical practices, which corresponded with the magnitude of intervention effects. Integration of both data sets through a joint display strengthened interpretation of findings and identified contextual and affective factors that may have influenced variability in outcomes.
Keywords
School closures in March 2020 due to the COVID-19 pandemic left schools needing to find the best way to provide instruction to all students, including those with autism spectrum disorder (ASD). The specialized instruction students with ASD receive at school not only provides them access to the general curriculum but also promotes development of communication, social skills, independence in daily routines, and function-based intervention for problem behavior. The abrupt shift to distance learning forced students with ASD as well as their teachers and caregivers (e.g., parents, guardians) to adapt quickly. Distance learning required new skill repertoires for teachers, students, and caregivers and “illuminated the complexity of providing FAPE [free appropriate public education] for students with ASD in the face of local and national crises” (Stenhoff et al., 2020, p. 211).
This societal crisis amplified the need for collaboration between teachers and caregivers in instructional delivery for learners with ASD (Hurwitz et al., 2022; Stenhoff et al., 2020), especially those with co-occurring intellectual disability (ID). Whereas the research base for caregiver (e.g., parent) mediated or assisted interventions for typically developing students has focused primarily on academic outcomes, the focus for students with ASD has been on reducing challenging behavior and improving communication, adaptive, and social skills (Ratliff-Black & Therrien, 2021). Despite this dearth of empirical research on caregiver-assisted academic instruction for students with ASD, the COVID-19 pandemic made such collaboration unavoidable; it became an essential component of instruction during school closures. A survey of caregivers of students with intellectual and developmental disabilities (including ASD) in the United States about their child’s instructional experiences during the COVID-19 pandemic found most (68.8%) were either attending school in a hybrid model or were still fully remote in the fall of 2020 (Root et al., 2023). Furthermore, most caregivers reported spending 15–30 min (33.3%) or 30–60 min (38.3%) supporting math instruction each weekday during the fall of 2020. This was a stark increase from pre-COVID caregiver reports, in which 25% reported spending 15–30 min daily and just 7.8% reported spending 30–60 min daily with their child on math instruction.
The change in instructional roles also impacted special education teachers, who reported employing numerous strategies to assist caregivers of students with ASD, including both modifying interventions so caregivers could help with delivery and developing new ways to collect data to monitor progress (Hurwitz et al., 2022). Baweja and colleagues (2022) commented on the challenge distance learning presented for caregivers, acknowledging that they can be both the best expert on their child and not have the required training and experience to effectively support their child to learn academic content.
Despite the abrupt and unprecedented shift to fully digital learning, the use of technology to deliver instruction to students with disabilities is not novel. Technology-assisted instruction has been classified as an evidence-based practice (EBP) for students with ASD (Root et al., 2017), including for teaching mathematics to students with ASD and co-occurring ID (Root et al., 2021; Spooner et al., 2019). Video-based instruction (VBI) falls under the umbrella of technology-assisted instruction and has extensive research support for teaching students with ASD. Distance learning is an opportune scenario to utilize VBI to deliver high-quality mathematics instruction because it provides opportunities for repeated viewing, increasing efficiency and consistency in the delivery of instruction (Cox et al., 2021). A particular benefit of VBI is the ability to asynchronously provide high-quality explicit instruction with consistent, structured language and video models that can be accessed as many times as needed (Yakubova et al., 2020). Students with ASD and co-occurring ID have demonstrated they can acquire mathematical problem-solving skills through VBI, including additive word problems (Saunders, 2014), estimating money needed for a purchase and calculating change (Burton et al., 2013), comparing prices (Weng & Bouck, 2014), and multiple step problems requiring multiplication and division (Kellems et al., 2016).
VBI has been used in combination with other EBPs to develop mathematics skills. For example, modified schema-based instruction (MSBI) is an EBP for teaching mathematical word problem solving to students who have ASD (with and without co-occurring ID) that has been combined with VBI (Root et al., 2021). Saunders (2014) taught elementary students with ASD and co-occurring ID to solve and discriminate between two types of additive word problems through MSBI delivered via VBI that incorporated principles of explicit instruction (Archer & Hughes, 2011). Students first watched a model video that was a screen recording of the researcher using think-alouds with consistent academic language while solving problems and then had the opportunity to practice solving similar problems on the computer using the same program. Results of the multiple-probe across-participants single-case design found a functional relation between MSBI delivered via VBI and acquisition of mathematical problem-solving skills.
Current Study
Identifying efficient, effective, and equitable ways to teach students with ASD during remote instruction, which was a novel pedagogical environment, was a high priority for teachers, families, and education researchers in the initial months of the COVID-19 pandemic. In the summer of 2020, we were not able to identify any experimental studies that evaluated the effects of parent- or caregiver-mediated intervention on academic skill development for students with ASD, the very task given to caregivers and teachers across the world. Hurwitz et al. (2022) surveyed special education teachers in Indiana about their efforts to implement EBPs in response to the pandemic; 77% of the 106 respondents reported making at least one contextual modification to how interventions were delivered. Thus, while caregivers were suddenly positioned as key instructional partners, the field lacked experimental guidance on how academic interventions could be feasibly mediated at home. Recognizing that teachers’ adaptations during the pandemic represented natural experiments in implementation, we designed the current study to test one such approach through an intervention mixed-methods framework (Fetters et al., 2013).
Specifically, this embedded experimental mixed-methods study aimed to evaluate the effects of caregiver-assisted video-based MSBI for secondary students with ASD. We used a single-case experimental design to generate quantitative data on the impact of the intervention on student word problem-solving behaviors as operationally defined and measured in prior MSBI studies. Screen recordings were analyzed qualitatively to capture students’ engagement in problem solving across phases. By integrating quantitative outcomes with qualitative process data, we evaluated both whether the intervention improved performance and how students enacted problem-solving strategies across phases. The following research questions guided our inquiry:
1. Is there a functional relation between MSBI when implemented asynchronously through caregiver-assisted VBI and an increase in word problem-solving behaviors for secondary students with ASD?
2. How do secondary students with ASD engage in problem solving before and during the intervention?
3. How does integration of qualitative and quantitative data on student word problem solving impact the interpretation of the effects of the intervention?
Method
Participants and Setting
After obtaining approval from the university human subjects committee, participants were recruited via social media. To be eligible for participation, students had to meet the following inclusion criteria: (a) enrolled in 6th–12th grade, (b) parent report of ASD diagnosis or eligible for special education services under the area of autism, and (c) satisfactory performance on a researcher-created mathematical screening measure. Potential participants and their caregivers were invited to a virtual orientation meeting over Zoom. All potential participants who attended the orientation were emailed a link to electronic consent forms. Following consent, potential student participants engaged in a brief individual screening over Zoom with a member of the research team. The screening assessed mathematical skills including one-to-one correspondence, whole number identification, mathematical symbol recognition, operations with whole numbers, and word problem solving. A copy of the screening can be accessed at https://shorturl.at/lyLAP. Students were excluded if they could already solve multiplicative comparison word problems.
A total of 31 students were screened, 9 were eligible, and 7 chose to participate in the study. Demographic information about the students and their caregivers can be seen in Table 1. Specific data on socioeconomic status and academic performance (grades) were not requested. Research took place virtually in participant homes in the fall of 2020.
Table 1. Student and Caregiver Demographics.
Note. F2F = face to face, SPED = special education, CG = caregiver, DIST = distance learning, HYB = hybrid, GEN = general education.
Parent reported ASD with co-occurring ID.
Research Design
This study used an embedded experimental convergent mixed-methods design (Creswell, 2022), as shown in the procedural diagram found in Figure 1. Integration occurred at the design, methods, and interpretation and reporting levels (Fetters et al., 2013). At the design level, an intervention mixed-methods framework was used to collect qualitative data to support understanding results of the experimental design. The qualitative strand was embedded to enhance fit by capturing process-level evidence aligned with experimentally measured outcomes. Merging was used for integration at the methods level by bringing the quantitative and qualitative datasets together for analysis and comparison. Integration at the interpretation and reporting level began with data transformation (qualitative to quantitative) through content analysis (Krippendorff, 2019). A joint display was used as both a process for analysis and product for reporting the mixed findings (Fetters et al., 2022; Root & Lindström, 2024).

Figure 1. Procedural diagram of embedded experimental convergent design.
Single-Case Experimental Design
A single-case multiple-probe across-participants experimental design (Ledford & Gast, 2018) was used to evaluate the effects of the intervention on mathematical problem-solving behaviors with two experimental conditions (baseline and intervention). A priori decisions about participant start-points (i.e., baseline lengths) and dosage of the intervention were made to increase feasibility and acceptability of participation (Ledford, 2018). Participants were randomly assigned to tiers, with the first tier completing four baseline probes, the second tier completing five baseline probes, and the third tier completing six baseline probes. This a priori decision was made to reduce potential frustration of caregivers and students that may have resulted from a prolonged baseline (Ledford et al., 2023). Researchers predetermined a dosage of six intervention sessions based on rates of learning reported in prior MSBI research with secondary students with ASD. Consistent with prior MSBI studies, the dependent variable was word problem-solving behaviors, measured by a 6-point researcher-created rubric (see Table 2).
Table 2. Expected Student Response for Each Measured Behavior and Participant Average Correct Responses by Phase.
Note. BL = average correct in baseline; IV = average correct in intervention.
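The random assignment of participants to tiers with predetermined baseline lengths, described under Single-Case Experimental Design, can be sketched in a few lines. This is only an illustration and not the authors' actual procedure: the participant labels (P1 to P7), the seed, and the function name are hypothetical, and the tier sizes (2, 2, 3) are inferred from how the Results section groups students by tier.

```python
import random

# Baseline probes per tier, fixed a priori as described in the design section.
BASELINE_PROBES = {1: 4, 2: 5, 3: 6}

# Tier sizes are an assumption inferred from the Results section (2, 2, and 3 students).
TIER_SIZES = {1: 2, 2: 2, 3: 3}


def assign_tiers(participants, tier_sizes=TIER_SIZES, seed=None):
    """Randomly assign participants to tiers and attach each tier's baseline length."""
    if sum(tier_sizes.values()) != len(participants):
        raise ValueError("tier sizes must sum to the number of participants")
    rng = random.Random(seed)  # a seed makes the assignment reproducible and auditable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    assignment = {}
    start = 0
    for tier in sorted(tier_sizes):
        size = tier_sizes[tier]
        for name in shuffled[start:start + size]:
            assignment[name] = {"tier": tier, "baseline_probes": BASELINE_PROBES[tier]}
        start += size
    return assignment


# Hypothetical participant labels; the study's pseudonyms are not used here.
example_assignment = assign_tiers(["P1", "P2", "P3", "P4", "P5", "P6", "P7"], seed=2020)
for name, info in sorted(example_assignment.items()):
    print(name, info)
```

Fixing the seed before enrollment is one way to document that baseline lengths were set in advance rather than chosen in response to the data.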
Intervention
The independent variable was MSBI delivered asynchronously via VBI with assistance of caregivers. Participants accessed the intervention through SeeSaw, a learning platform available via web browser or app. Three participants borrowed an iPad and Apple Pencil from researchers while the remaining four used personal technology. In baseline and intervention sessions, student materials included (a) electronic worksheets containing a multiplicative comparison word problem and the heuristic (see Figure 2), (b) Unifix cubes and a calculator that were mailed to them, and (c) anchor videos. Anchor videos were 30-second videos that showed a person engaged in the setting described in the word problem, such as a restaurant or a home kitchen. In intervention, participants also accessed instructional videos (described further below).

Figure 2. Annotated screenshot of materials in the SeeSaw platform.
Word problems were written by members of the research team following guidelines from Spooner et al. (2017) and contextualized the targeted problem type (multiplicative comparison) within high-interest real-world situations based on participants’ interests indicated in a preintervention survey (described further in social validity section below). The first sentence introduced the context, the second sentence made a comparison, and the final sentence concluded with the question (see Figure 2). Problems used single- and two-digit numbers with quantities that made sense given the context with products of 40 or fewer. Problems were never repeated.
Following procedures from prior MSBI studies, anchor videos were 20–30 s in length and provided background information on the real-world application of mathematics within the selected “theme” for that session (e.g., coffee shop). Anchor videos had voiceover narration to describe the setting and what behaviors may occur there (e.g., you might go to a coffee shop to purchase a drink and talk to friends). In intervention sessions, students also watched researcher-created instructional videos that displayed the instructor’s computer screen and a small picture-in-picture view of the instructor (third author) talking. Instructional videos were created using Loom, a screen-recording program that was free for educators at the time of the study. The instructor modeled how to use the heuristic and tools in the SeeSaw platform (Figure 2). Content of instructional videos is described in further detail below.
Procedures
Study procedures were intended to balance the need for feasibility for caregivers to implement and facilitate the intervention with experimental control (Ledford et al., 2023). All study activities were conducted virtually through Zoom and SeeSaw. Prior to baseline, students completed a “getting to know you” activity via SeeSaw to give them an opportunity to practice using the tools within the platform and build rapport with researchers.
Prestudy Social Validity Measures
Participants completed a short prestudy survey via Google Forms prior to baseline. They were provided a list of themes (e.g., art, candy store, cooking) and asked to rank which were of interest to them. Responses were used to tailor word problems assigned to each student to their interests. Participant feelings about mathematics were captured using yes/no and Likert-type scale questions. Caregivers also completed a short survey via Google Forms prior to baseline to (a) gather demographic information about themselves and their student; (b) assess their current perceptions of their student’s strengths, needs, and interest in mathematics; and (c) assess their comfort supporting their student in mathematics. Caregivers were invited to participate in a prestudy interview via Zoom with a researcher. A semi-structured interview protocol was used that asked how their role in their student’s instruction changed during the COVID-19 pandemic, their perception of their student’s strengths and barriers in mathematics, and whether they felt their student’s current instruction at school was meeting those needs.
Caregiver Support
Prior to baseline, all caregivers were mailed an anticipated timeline and schedule, description of research activities with the purpose and anticipated time each would take, and a handbook (protocol) for supporting their student. The handbook included login information, the sequence of activities in each lesson, and what their role was in each lesson as it differed by phase (e.g., baseline, intervention). The handbook also included QR codes to access Zoom meetings and SeeSaw, as well as researcher contact information for quick reference.
Baseline
During baseline sessions, students completed between four and six problem-solving probes depending on whether they were randomly assigned to the first (4), second (5), or third (6) tier. They were asked to complete only one activity per day. Word problems were assigned to students based on the interests they indicated in the prestudy survey. Each activity included watching an anchor video aligned with the theme of the word problem and solving one word problem independently. The caregiver handbook instructed caregivers to record the screen within the SeeSaw platform, read the problem aloud, ask the student to solve the problem, provide technical assistance when needed, and provide only encouraging statements without academic instruction or feedback on accuracy.
Intervention
The first two intervention lessons provided instruction on the mathematical vocabulary and problem-solving routine via researcher-created instructional videos. The videos instructed students to follow along by completing the same steps on the electronic worksheets. The first two sessions were not recorded, and no data were collected as students did not have an opportunity for an independent response. For the subsequent six intervention sessions, students were assigned themed activities based on the preferences they indicated on their prestudy survey. Just as in baseline, activities began with the student watching the anchor video aligned with the theme of the word problem. Next, students engaged in lessons that used a model (e.g., watching a model video), guided practice (e.g., completing a problem with their caregiver), and independent practice (e.g., completing a problem on their own) format (Archer & Hughes, 2011). Each video used in intervention activities lasted 3–4 min and featured the same researcher using think-alouds to model following the problem-solving routine (see Figure 2).
The handbook directed caregivers to support their student to complete the model problem along with the video by opening the video in a separate tab, watching and following along with the model video, and providing problem-specific feedback. Next, the students and caregivers were directed to complete the guided practice problem together, with the caregiver handbook directing them to watch the guided practice video to review after completing their own work if needed. The students then completed the independent practice problem on their own. The caregiver handbook prompted caregivers to record the screen within the SeeSaw platform for independent practice problems, read the problem aloud, and not provide any corrective or confirmative feedback to students. After completing the independent practice problem, students were directed to watch the independent practice model video and check their work.
Researchers monitored participant engagement daily by reviewing data within SeeSaw (e.g., completion of assigned learning activities, video recordings) and provided feedback regarding adherence to the protocol to caregivers via email as needed. A researcher observed approximately 30% of sessions in each phase for each participant via Zoom. During these sessions, they provided feedback and support to caregivers and students as needed or requested.
Poststudy Social Validity Measures
At the conclusion of the study, students and caregivers completed a poststudy survey via Google Forms with parallel questions to the prestudy survey and additional questions regarding their experience engaging in the research study, including feedback on (a) training and support, (b) feasibility of the intervention, (c) impact of the intervention, and (d) appropriateness of the intervention. Four caregivers also chose to engage in individual, brief poststudy interviews via Zoom with a member of the research team.
Data Analysis
In this embedded experimental mixed-methods design, we followed the steps for data analysis outlined by Creswell (2022). To answer the first research question, we conducted visual analysis of quantitative data (e.g., word problem-solving behaviors as measured by a researcher-created rubric). Next, we analyzed qualitative data (e.g., problem-solving behaviors observed through screen recordings) to answer Research Question 2. Finally, we created a joint display to bring the two datasets together for analysis to answer the third mixed-methods research question. Presurveys, postsurveys, and interviews were analyzed in concert to gain understanding of social validity of the intervention.
Quantitative Data Analysis
The first and second authors independently conducted visual analysis of Figure 3 to determine whether there was a functional relation between the intervention and mathematical word problem-solving behaviors. Descriptive statistics (i.e., mean, range, percentages) were also calculated between phases for each participant. We also included an effect size estimate focused on goal attainment. Rather than being rated by expected outcome on a scale as with other effect size estimates (e.g., -2 to 2), the percent of goal obtained (PoGo) criterion is aligned to the goal level for a specific behavior and aligns with graphic displays common in single-case designs (Ferron et al., 2020). Effects can be interpreted as little to none (<20%), small (20%–40%), medium (40%–60%), moderately large (60%–80%), and large (>80%).
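As a rough illustration of how a PoGo value and its interpretation band could be computed, the sketch below uses the index as defined by Ferron et al. (2020): the change from the baseline level to the treatment level, expressed as a percentage of the distance from the baseline level to the goal. Several things here are assumptions rather than details reported in this study: using phase means as the level estimates, treating the rubric maximum of 6 as the goal, the example values, and how values falling exactly on a band boundary (20, 40, 60, 80) are classified.

```python
def pogo(baseline_level, treatment_level, goal_level):
    """Percent of goal obtained: 100 * (treatment - baseline) / (goal - baseline)."""
    if goal_level == baseline_level:
        raise ValueError("goal level must differ from the baseline level")
    return 100.0 * (treatment_level - baseline_level) / (goal_level - baseline_level)


def interpret_pogo(value):
    """Map a PoGo percentage to the interpretation bands listed in the text.

    Boundary handling is an assumption: 20, 40, and 60 go to the higher band,
    and exactly 80 stays in 'moderately large' because 'large' is stated as >80%.
    """
    if value < 20:
        return "little to none"
    if value < 40:
        return "small"
    if value < 60:
        return "medium"
    if value <= 80:
        return "moderately large"
    return "large"


# Hypothetical example: baseline mean of 1 correct, intervention mean of 5 correct,
# goal of 6 correct (the rubric maximum, assumed here to be the goal).
example = pogo(baseline_level=1.0, treatment_level=5.0, goal_level=6.0)
print(example, interpret_pogo(example))  # 80.0 moderately large
```

Because the index is anchored to a goal rather than to baseline variability, a student who starts near the goal can obtain a modest PoGo despite reaching high absolute accuracy, which is one reason to read it alongside the graph.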

Figure 3. Graph of problem-solving behaviors.
Reliability and Fidelity
Interobserver Agreement (IOA) and fidelity were assessed for a minimum of 30% of sessions for each student in each condition. Sessions to measure IOA were randomly selected before the study began to reduce bias. Point-by-point agreement was calculated by taking the number of agreements and dividing it by the total number of agreements plus the total number of disagreements and then multiplying by 100%. Disagreements were discussed weekly by the research team until consensus was reached. Average IOA during baseline was 91% (range 75%–100%) and was 97% (range 83%–100%) during intervention. Table A in the Supplemental Material (https://shorturl.at/lyLAP) reports IOA and fidelity by participant by phase.
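The point-by-point agreement calculation described above can be written out as a short function. This is a minimal sketch: the function name and the example scores are hypothetical, and it assumes each observer scored the same rubric items as correct (1) or incorrect (0) for a single session.

```python
def point_by_point_ioa(observer_a, observer_b):
    """Agreements / (agreements + disagreements) * 100 for item-by-item scores."""
    if len(observer_a) != len(observer_b):
        raise ValueError("both observers must score the same number of items")
    if not observer_a:
        raise ValueError("at least one scored item is required")
    agreements = sum(1 for a, b in zip(observer_a, observer_b) if a == b)
    disagreements = len(observer_a) - agreements
    return 100.0 * agreements / (agreements + disagreements)


# Hypothetical session: two observers score the six rubric behaviors
# and disagree on one of them.
primary = [1, 1, 0, 1, 1, 0]
secondary = [1, 1, 0, 1, 0, 0]
print(round(point_by_point_ioa(primary, secondary), 1))  # 83.3
```

With only six items per session, a single disagreement lowers agreement for that session to about 83%, so session-level IOA can take only a few distinct values.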
A member of the research team observed sessions with each caregiver-student pair via Zoom to collect fidelity data live, which are marked with open circles in Figure 3 because the presence of an observer altered setting conditions. In baseline, the fidelity checklist assessed four components: (a) materials provided, (b) watched anchor video, (c) caregiver or student read problem aloud, and (d) caregiver did not provide any prompting or feedback. During intervention, the checklist assessed an additional three components: (a) watched model problem, (b) caregiver or student read guided practice problem aloud, and (c) caregiver and student completed guided practice problem together. The total number of procedural elements correctly implemented was divided by the total number of procedural elements and then multiplied by 100%. Average fidelity during baseline was 94% (range 75%–100%) and 100% in intervention. Figure 4 reports fidelity by participant by phase.
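The fidelity percentage described above can be sketched in the same way. The item labels below paraphrase the checklist components named in the text, and the dictionary-based scoring and example observation are hypothetical; the actual checklist wording and data format are not reported here.

```python
# Checklist components paraphrased from the text; exact wording is an assumption.
BASELINE_ITEMS = (
    "materials provided",
    "watched anchor video",
    "problem read aloud",
    "no prompting or feedback",
)
INTERVENTION_ITEMS = BASELINE_ITEMS + (
    "watched model problem",
    "guided practice problem read aloud",
    "guided practice completed together",
)


def fidelity_percent(checklist, observed):
    """Elements implemented correctly / total elements * 100.

    `observed` maps each checklist item to True or False; an item that is
    missing from the mapping is counted as not implemented.
    """
    correct = sum(1 for item in checklist if observed.get(item, False))
    return 100.0 * correct / len(checklist)


# Hypothetical baseline observation in which the caregiver gave a prompt.
baseline_obs = {
    "materials provided": True,
    "watched anchor video": True,
    "problem read aloud": True,
    "no prompting or feedback": False,
}
print(fidelity_percent(BASELINE_ITEMS, baseline_obs))  # 75.0
```

Counting unrecorded items as not implemented is a conservative choice; a study could instead exclude them from the denominator, which would need to be stated explicitly.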

Figure 4. Joint display of quantitative index of effect and observed frequency of qualitative codes.
Qualitative Data Analysis
To document the problem-solving behaviors students demonstrated across baseline and intervention sessions, researchers independently watched and listened to screen recordings of students completing independent practice problems and noted their impressions. As a team, researchers then met to discuss impressions from these observations and develop consensus codes. A coding manual with definitions of each code along with examples and nonexamples is available from authors on request. Once all videos were double coded for consensus themes by two members of the research team, the second author conducted a thematic analysis to identify patterns of student as well as caregiver behavior. Differences between baseline and intervention were examined by comparing codes between phases. Steps were taken to address credibility and trustworthiness of our interpretations through triangulation of data from multiple sources and peer debriefing to consider how our own positionality was impacting our interpretations.
Integrated Analysis
We used data-transformation merged analysis (Creswell & Plano Clark, 2011) through content analysis (Krippendorff, 2019) to transform qualitative data into numeric information so that both databases could be compared and analyzed. The first author transformed qualitative codes into frequency counts before creating a joint display (Figure 4) to merge the datasets in a way that enabled drawing meta-inferences (Creswell, 2022). Participants were arranged vertically based on quantitative index of effect using PoGo (Ferron et al., 2020) from the largest estimated effect (Carl) to the smallest estimated effect (Ann). Qualitative codes are displayed horizontally, with frequency of observed codes for each experimental condition in corresponding cells. Researchers also compared thematic codes to visual analysis of the graph, with attention to potential impact of researcher presence during the session (open circles in Figure 3). This led to development of two additional codes (i.e., caregiver adherence, deviation from protocol). All recordings were double coded for adherence and deviation by two members of the research team. Once these additional codes were added to the joint display, the first author once again compared thematic codes to visual analysis of the graph, noting patterns used to draw conclusions and interpretations from both dimensions. The second author reviewed the identified patterns and resulting meta-inferences for conversion legitimization (Onwuegbuzie & Johnson, 2006).
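The data transformation and merging steps described above (codes to frequency counts, rows ordered by PoGo, one cell per code and condition) can be sketched as follows. This is an illustrative layout under stated assumptions, not the authors' analysis: the coded segments and short code labels are hypothetical placeholders, and only the two PoGo values reported in the Results (Carl, 95%; Ann, 5%) are taken from the study.

```python
from collections import Counter

CONDITIONS = ("baseline", "intervention")


def code_frequencies(coded_segments):
    """Transform coded segments into counts per (participant, condition, code)."""
    return Counter(
        (seg["participant"], seg["condition"], seg["code"]) for seg in coded_segments
    )


def joint_display(pogo_by_participant, coded_segments, codes):
    """Build rows ordered from largest to smallest PoGo, one cell per code and condition."""
    counts = code_frequencies(coded_segments)
    ordered = sorted(pogo_by_participant.items(), key=lambda kv: kv[1], reverse=True)
    rows = []
    for name, pogo_value in ordered:
        row = {"participant": name, "pogo": pogo_value}
        for code in codes:
            for condition in CONDITIONS:
                # Counter returns 0 for combinations that were never observed.
                row[f"{code} ({condition})"] = counts[(name, condition, code)]
        rows.append(row)
    return rows


# PoGo values for Carl and Ann come from the Results; the segments are hypothetical.
pogo_values = {"Ann": 5, "Carl": 95}
segments = [
    {"participant": "Carl", "condition": "baseline", "code": "frustration"},
    {"participant": "Carl", "condition": "baseline", "code": "frustration"},
    {"participant": "Carl", "condition": "intervention", "code": "confidence"},
    {"participant": "Ann", "condition": "intervention", "code": "confidence"},
]
display_rows = joint_display(pogo_values, segments, codes=("frustration", "confidence"))
for row in display_rows:
    print(row)
```

Keeping zero counts in the display, rather than leaving cells blank, makes it easier to distinguish a code that was never observed from a session with missing audio or video.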
Positionality Statement
This study was motivated by both personal and professional experiences of the research team in supporting autistic students as parents, family members, teachers, and related service providers. In our current work as researchers and teacher educators, we are committed to advancing the field’s understanding of and supports for ensuring access to the general curriculum. Our dual roles as researchers and parents/educators of autistic children shaped our assumptions and interpretations. Three members of the research team were special education teachers or administrators in the spring of 2020 and had their own experiences of striving to provide continuity of instruction to students with ASD through distance learning. These intersecting personal and professional experiences played a role in our decision to pivot from teacher-implemented to caregiver-assisted MSBI in a way that foregrounded flexibility and authentic problem contexts. At the time of data collection, three members of the research team were also experiencing the impact of the COVID-19 pandemic on the learning experiences of our own children, including those with ASD. This gave us heightened sensitivity to families’ circumstances and the realities of remote learning. We engaged in and documented reflexive dialogue weekly to balance insider understanding with systematic inquiry (The QR Collective et al., 2023).
Results
Research Question 1: Quantitative Analysis of Word Problem-Solving Behaviors
Figure 3 displays the independent correct word problem-solving behaviors (from the six-item rubric in Table 2) for each of the seven students for each session across conditions. The gap between baseline and intervention data for each student represents the two sessions of teaching lessons when no data were collected. Open circles in Figure 3 represent when a member of the research team observed the session live via Zoom to assess procedural fidelity. The average number of correct responses for each behavior by student and phase is also shown in Table 2. PoGo is reported for each participant in Figure 4, ranging from 95% (Carl, large effect) to 5% (Ann, little to no effect); five of the seven participants had a moderately large or large effect.
There was a stable and predictable pattern for five of the seven students (Lisa, Jon, Ann, Ray, and Tim) during baseline. After beginning the intervention, an immediate effect was observed for both students in Tier 1 (Carl and Lisa), one of the students in Tier 2 (Jon), and one of the students in Tier 3 (Tim). Intervention data for Carl, Lisa, Ray, Zion, and Tim followed an accelerating trend. Jon’s accuracy during intervention was variable, while Ann’s accuracy did not meaningfully increase. There was no observed overlap in data between baseline and intervention conditions for Carl, Lisa, Jon, or Tim. Ray’s first intervention data point overlapped with baseline, but his performance increased in his second intervention session and remained stable thereafter (range = 5–6). Similarly, the first two intervention sessions for Zion overlapped with baseline, but his performance improved on the third intervention session and remained between 5 and 6 for his final intervention sessions.
Both Carl (range = 2–3) and Lisa (range = 0–2) had a slight accelerating trend in baseline, but both showed an immediate, clear level change in the number of accurate behaviors after the introduction of the intervention. Possible explanations for the variability observed in baseline are explored in the discussion of our meta-inferences. With stable and nonoverlapping data, there is evidence to support a basic effect for both Carl and Lisa.
In Tier 2, Jon had stable baseline responding (0 correct) with a variable pattern of performance in intervention. Researchers analyzed problem-solving tasks and hypothesized that the variability resulted from the student using addition rather than multiplication if the numbers presented were outside of his known mental math facts (e.g., 7 and 3). After four intervention sessions with variable performance, a member of the research team began to lead the guided practice sessions for his final two intervention sessions (indicated with a solid star on Figure 3). With highly variable intervention data, Jon’s data provide weak evidence of a basic effect.
Ann’s data do not demonstrate an effect, as her data remained stable between 0 and 1 throughout her baseline and intervention sessions. The only behavior Ann completed independently was Step 1 (identifying the problem type) during Intervention Sessions 2 and 4. During the fifth intervention session, researchers worked with the caregiver to introduce a token economy to encourage independence. During model and guided problems, she attended to and completed each step. During independent problems, she indicated she was finished with the problem without completing all steps. With the addition of the token economy in Session 5, she added information to each section of the worksheet, but it was not accurate. Due to the family’s schedule changes, no additional data were collected.
All three students in the third tier demonstrated a clear level change between baseline and intervention. Ray’s baseline data were mostly consistent at 0 correct, with one outlier on his final baseline session of 2 correct (i.e., diagram the relationship, solve). Interpretation of an effect for Ray must consider the threat of the accelerating trend in baseline along with the one overlapping intervention datapoint, though the accuracy increased and remained stable for his final five intervention sessions. Zion’s baseline performance also increased after three sessions from 0 correct to 3 correct (i.e., wrote the equation, solved, reasoned answer makes sense); however, his baseline performance stabilized prior to implementation of the intervention. Zion reached ceiling performance during the third and fourth intervention sessions. Tim’s performance in baseline was stable (range = 0–1) and immediately increased to 5 after the implementation of the intervention with a clear level change, immediacy of an effect, and no overlapping data.
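To make the overlap judgments described above concrete, the following minimal sketch counts how many intervention data points exceed the highest baseline data point for one participant. It is an illustration only and not the visual analysis procedure used in this study. The function name `overlap_summary` and the data values are hypothetical and were chosen solely to show a case in which one intervention session overlaps with a baseline outlier.

```python
def overlap_summary(baseline, intervention):
    """Count intervention points that exceed the highest baseline point.

    Returns a tuple: (number of nonoverlapping intervention points,
    percentage of intervention points that are nonoverlapping).
    """
    highest_baseline = max(baseline)
    nonoverlapping = [score for score in intervention if score > highest_baseline]
    percentage = 100 * len(nonoverlapping) / len(intervention)
    return len(nonoverlapping), percentage


# Hypothetical values for illustration only (not data from this study):
# number of correct problem-solving steps per session.
baseline = [0, 0, 0, 2]
intervention = [2, 5, 6, 6, 5, 6]

count, percentage = overlap_summary(baseline, intervention)
print(f"{count} of {len(intervention)} intervention points exceed baseline "
      f"({percentage:.1f}%)")
```

A summary of this kind captures only overlap. Judgments about level, trend, variability, and immediacy of effect still require inspection of the graphed data.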
Research Question 2: Qualitative Analysis of Word Problem-Solving Behaviors
Four initial qualitative codes were identified, and researchers compared observations of baseline and intervention session recordings: (a) signs of frustration, confusion, or anxiety; (b) signs of confidence; (c) effective use of mathematical practices; and (d) ineffective use of mathematical practices (see joint display in Figure 4). Interference with data interpretation (i.e., missing audio and/or screen recording) was added as an additional code after the initial creation of the joint display to demonstrate how much data were available for coding. Zion and Lisa had the least amount of qualitative data available, but they both had high procedural fidelity when sessions were observed live by a member of the research team.
Multiple participants consistently demonstrated signs of frustration, confusion, or anxiety at a much higher rate during baseline than in intervention sessions. For example, Carl repeatedly asked for confirmation from their caregiver and said things like “I can’t do this” and “I can’t do them all” in baseline, but said things in intervention like “It’s 12 minutes, right?” (correct answer) and “You will see if I get it right!” Most participants demonstrated signs of increased confidence after beginning the intervention by saying things like “I’m doing it!”, having a faster pace of completion, and self-correcting errors. Ann, whose quantitative data demonstrated the least amount of growth, said, “I did good, right?” and “I’m doing it!” in her final intervention sessions. Students also demonstrated the use of effective mathematical practices by correctly diagraming the mathematical relationship, articulating their reasoning, and explaining how they chose the letter to represent the variable during intervention sessions. This was a change from baseline sessions when students were observed using ineffective mathematical practices such as drawing unrelated pictures, using addition to solve all problems, and giving multiple answers.
Research Question 3: Integration of Quantitative and Qualitative Data
The joint display in Figure 4 was created to both analyze and report integrated findings (Root & Lindström, 2024). This integrated analysis refined our interpretation of the quantitative results by revealing affective, behavioral, and contextual patterns that helped explain individual differences in responsiveness to the intervention. Our analysis resulted in three meta-inferences.
Meta-Inference 1: Influence of Caregiver Adherence to and Deviation From Protocol
Observations of caregiver adherence to the protocol increased, and observations of deviations from the protocol decreased when participants moved from baseline to intervention. Although no pattern was detected between observed adherence to the protocol and quantitative outcomes, there was a pattern between observed deviations and outcomes in baseline for two participants. Therapeutic trends in baseline for participants Carl and Lisa occurred when the researcher was not observing the session (closed circles in Figure 3) and corresponded with observed deviations from the research protocol (noted in Figure 4). Their caregivers provided prompts or cues (e.g., “use the calculator,” “check your answer”) during the baseline sessions where the participants’ mathematical problem-solving behaviors increased prior to intervention. Although there were observed deviations in intervention as well, the only noted pattern was that they decreased between baseline and intervention for all participants except Ann (for whom the intervention was least effective).
Meta-Inference 2: Changes in Frustration and Confidence
We observed a pattern between changes in observed instances of both frustration and confidence and the quantitative index of effect. If PoGo indicated the intervention had a large effect, observed instances of frustration decreased and confidence increased when participants moved from baseline to intervention. When PoGo indicated the intervention had a moderate to large or medium effect, an increase in signs of confidence between phases was the only discernible pattern. Observed instances of both frustration and confidence increased between baseline and intervention when the intervention had little to no effect (Ann).
Meta-Inference 3: Impact on the Use of Mathematical Practices
We identified patterns between the type of mathematical practices observed and experimental condition for students whose PoGo indicated the intervention had a medium or greater effect. For these students, their effective use of mathematical practices increased, and ineffective use of mathematical practices decreased when they moved from baseline to intervention. As students gained familiarity with the problem schema and heuristic, they appeared to shift from “trial-and-error” behaviors to deliberate use of cognitive and metacognitive strategies.
Social Validity
Results from the prestudy and poststudy surveys completed by student and caregiver participants are reported in Table 3. Student survey data indicated an increase in positive feelings toward mathematical problem solving and confidence in mathematics and problem solving. Two notable changes were in the number of students who responded “yes” to being able to solve word problems (28% pre to 66% post) and caregivers who indicated their child could solve word problems (0% yes and 29% not sure pre to 16% yes and 33% not sure post). When asked in the prestudy interview about their student’s strengths in mathematics, all except Ann’s caregiver stated computation (i.e., procedures to carry out operations). Five caregivers described difficulty comprehending mathematical concepts or situations, three stated word problems, and four described their student’s frustration with processing or organization as barriers to mathematical learning. While all seven caregivers identified “functional math” as an instructional priority (e.g., “daily living,” “money skills,” and “understanding what he is doing”), three stated reading comprehension was their primary academic priority. Caregivers indicated the skills they learned to support their student during this intervention could be used to help their student in other areas, they felt this intervention was beneficial for both their student and themselves, and that they would be interested in participating in future studies.
Table 3. Student and Caregiver Prestudy and Poststudy Survey Responses.
Discussion
Conducted during the fall of 2020, a period of widespread school closures, this study responded to the instructional challenges created by the COVID-19 pandemic when most students with intellectual and developmental disabilities were engaged in distance learning (Root et al., 2023). The pandemic fundamentally altered how special educators used and evaluated the effects of EBPs for their students with ASD (Hurwitz et al., 2022). Recognizing these shifts, the primary aim of this study was to evaluate the effects of MSBI, an established EBP for students with ASD (Root et al., 2021), when delivered asynchronously through VBI with the assistance of a caregiver. The qualitative and quantitative data were then brought together using a joint display to generate meta-inferences regarding the impact of the intervention. Below, we first discuss how quantitative outcomes of MSBI delivered asynchronously through caregiver-assisted VBI compared to quantitative findings of prior MSBI studies before discussing our meta-inferences regarding how the integrated findings impacted interpretation of effects of the intervention.
Interpreting Variability in Quantitative Outcomes
Quantitative results of the multiple-probe across-participants single-case design indicated a basic effect for both participants in the first tier and all three participants in the third tier, but a weak basic effect for one participant in the second tier and no effect for another. As a result, there are five demonstrations of effect at two points in time (first and third tiers) and a sixth weak demonstration of effect at a third point in time (second tier), which is an artifact of random assignment of participants to tiers (Ledford et al., 2023). PoGo aligned with visual analysis for each participant, as shown in Figure 4.
The intraparticipant variability and need for tailored support are not unique to this study. Secondary students with ASD (with and without co-occurring ID) in prior studies where researchers implemented MSBI targeting multiplicative problem solving had similar response patterns (e.g., Root et al., 2018, 2020). Researchers reported addressing this variability through corrective feedback, as was provided to Jon, and adding contingent reinforcement, self-monitoring, and additional visual supports, as was provided to Ann. Ann is an important outlier in published research using MSBI. Her caregiver stated in the poststudy interview, “If we had continued to practice those problems, I think she would have gotten it. Unfortunately, part of the issue was, she’s just really busy right now and I didn’t have much time to devote to that.” To benefit from explicit instruction, students need to receive it as intended and at an adequate dosage (Fuchs et al., 2017). Ann may have benefited from more frequent and/or synchronous MSBI. It is also possible that Ann’s caregiver would have benefited from a differently structured intervention or coaching, as she made multiple requests during interviews and fidelity observations for training to support her new homeschooling endeavor.
Impact of Integration on Interpretation of Intervention Effects
The third research question focused on how integration of qualitative and quantitative findings shaped interpretation of intervention effects. The joint display analysis revealed that qualitative process indicators provided essential context for understanding outcome variability, strengthening causal and construct inferences. Three integration-based interpretations emerged, refining conclusions about effectiveness of the intervention in this authentic context.
Clarified Internal Validity Threats
Regarding the meta-inference about caregiver adherence to and deviation from protocol, the integration of qualitative observations revealed that unobserved baseline sessions may have included unplanned caregiver assistance, decreasing confidence that observed quantitative gains reflected true baseline performance (i.e., internal validity). This integrated finding changes interpretation of Tier 1 baseline trends. Rather than suggesting early acquisition prior to the onset of intervention, the qualitative strand indicated baseline contamination via caregiver prompts, increasing confidence that the level change at intervention reflects the independent variable rather than maturation or practice effects. Without the qualitative strand, these baseline improvements could be misread as weakened experimental control. This underscores a broader methodological challenge in special education and applied behavior analysis of ensuring treatment integrity when interventions are implemented in authentic environments (Detrich et al., 2017; Falakfarsa et al., 2022). In such contexts, traditional quantitative fidelity measures may be insufficient to detect contextual adaptations that affect outcomes (Tolmatcheff et al., 2024). Qualitative data can surface validity threats not otherwise apparent from quantitative data alone, underscoring the benefits of methodological pluralism that can be achieved through mixed-methods research (Root & Lindström, 2024).
Identified Mechanism-Adjacent Process Indicators
The meta-inference about the observed relationship between frustration, confidence, and intervention outcomes informs interpretation of differential responsiveness by suggesting affective regulation may function as a mediator or enabling condition for MSBI enactment during independent practice. Students who remained dysregulated showed attenuated outcome gains, which aligns with broader cognitive-affective models of learning. Muis et al. (2018) demonstrated that students’ perceptions of control and task value predict emotional trajectories during mathematical problem solving and that emotions such as confusion or frustration can be either productive or detrimental depending on learners’ self-regulatory skills. Similar to their findings, participants in our study who exhibited larger quantitative effects also demonstrated decreased frustration and increased confidence, suggesting affective regulation may be a proximal mechanism through which MSBI supports problem-solving success. In contrast, persistence of frustration for nonresponders resembles the “unproductive confusion to frustration” pathway described by Muis et al. Methodologically, these initial emerging patterns justify exploring how incorporating brief, session-level affect indicators as planned secondary outcomes and integrating them into joint displays may support mechanism-adjacent inferences, as integration could reframe variability as potentially systematic variation associated with context rather than random “noise.”
Strengthened Construct Interpretation of Outcomes
This inverse relationship between effective and ineffective use of mathematical practices across experimental conditions when the intervention had a medium or greater effect on outcomes may reflect students’ growing internalization of the verbal problem-solving routine modeled in instructional videos. Integration of the two datasets strengthened our construct interpretation of quantitative gains: Improvements coincided with observable shifts toward schema-consistent representations and reasoning, suggesting changes reflected strategic problem solving rather than isolated correct responses. This meta-inference also reinforces the role of explicit schema instruction in promoting strategic and conceptual engagement during problem solving. Saunders (2014) demonstrated MSBI delivered through VBI can promote metacognitive self-talk for elementary students with ASD and co-occurring ID by repeatedly exposing students to an expert model’s explicit think-alouds while solving problems. Similarly, participants in the present study who had larger quantitative effects often verbalized problem-solving steps or reasoning aloud, suggesting the videos may have supported transfer of modeled self-talk to independent practice.
Integrated Conclusions About Effects
Taken together, the integrated findings support a conclusion that the intervention did improve word problem solving for many secondary students with ASD, while also revealing that responsiveness is contingent on both implementation conditions and learner engagement. Integration strengthened internal validity for several cases by identifying baseline contamination and clarified that outcome gains were accompanied by strategy-consistent practices. At the same time, integration highlighted boundary conditions: When dosage or independent engagement is insufficient, affective optimism may increase without corresponding performance change. Thus, the intervention’s effectiveness should be interpreted as robust for most participants but sensitive to implementation supports and opportunities for productive independent practice.
Limitations and Suggestions for Future Research
The findings of the current study point to a need for future research that deepens understanding of both instructional and learner-level mechanisms underlying mathematics learning for students with ASD, especially those with co-occurring ID. One limitation of our study is that we did not control for potential confounding variables such as consistency of instructional delivery time, which may have impeded the students’ ability to demonstrate growth. Future research could employ frameworks such as FRAME (Wiltsey Stirman et al., 2019) to document how adaptations naturally occur as caregivers and educators implement interventions in diverse naturalistic contexts and to clarify how such adaptations influence fidelity and outcomes (Hurwitz et al., 2022). Parallel efforts should explore how learners’ affective responses, particularly frustration and confidence, can inform data-based decision-making. Future studies might also analyze the role of self-talk modeled in VBI as a bridge between explicit modeling and students’ emerging metacognitive strategies. Collectively, these efforts will illuminate how to optimize both delivery of and engagement with MSBI.
Finally, the findings of this study highlight risks of making randomized a priori decisions in single-case research instead of response-guided decisions (Ledford, 2018). We aimed to balance methodological rigor with feasibility and favorability for students and caregivers. Future research teams should weigh the ethical, practical, and methodological consequences of rigid adherence to predetermined baseline lengths (i.e., intervention start-points). While randomization can strengthen internal validity, it also reduces opportunities to make instructional adjustments that reflect participant needs. Response-guided approaches, when implemented transparently and systematically, may provide a more context-responsive way to preserve rigor. Future research teams could integrate adaptive or hybrid approaches, combining prespecified randomization with real-time decision rules guided by quantitative and/or qualitative data. Such mixed-methods innovations could enhance both experimental control and ecological validity by allowing researchers to systematically document and justify necessary adaptations while preserving transparency (Ledford et al., 2023).
Implications for Practice
While the impetus and context for this study were the COVID-19 pandemic, we can use these findings to inform how we teach word problem solving to students with ASD in our postpandemic educational settings. Findings highlight the potential for video-based, caregiver-assisted instruction to maintain critical features of EBPs. When educators design or recommend interventions for home or hybrid learning, caregivers should be provided structured guidance, clear role expectations, and ongoing feedback to support fidelity and reduce unintentional adaptations. Incorporating modeled self-talk and explicit problem-solving routines within instructional videos may help students internalize effective mathematical practices and foster independence. Practitioners should also attend to students’ affective engagement by monitoring signs of frustration and confidence and providing supports that transform confusion into productive persistence.
Footnotes
Ethical Considerations
All procedures performed in the reported study involving human participants were in accordance with the ethical standards of the institution and the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The study was approved by the Human Subjects Committee at Florida State University.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This manuscript was developed with financial support from the Institute of Education Sciences under award number R324B190019, the Office of Special Education Programs under award number H325D190024, the Autism Science Foundation, and American Educational Research Association Division K.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Consent to Participate
Informed consent was obtained from all legal guardians of participants.
Editor in Charge: Fred Spooner
