Abstract
Background
The Goal-Based Outcomes (GBO) tool tracks patient goals. Its implementation and effectiveness, especially in child and youth mental healthcare (CYMH) clinics using the Choice and Partnership Approach (CAPA) framework, have been understudied.
Methods
We used a Hybrid Type II randomized controlled trial (RCT) and the Reach Effectiveness Adoption Implementation Maintenance (RE-AIM) framework; 152 caregiver-patient (aged 5 to 17 years) dyads were randomized to GBO + treatment as usual (TAU) or TAU. Effectiveness outcomes were pre-post treatment scores on the Strengths and Difficulties Questionnaire (SDQ) and Children’s Global Assessment Scale (CGAS), analyzed using bivariate, ANCOVA (intention-to-treat), and Reliable Change Index analyses. The post-treatment Experience of Services Questionnaire (ESQ) measured care experience. Implementation outcomes included patient flow (assessed using the total number of appointments across partnerships and the number of core partnership appointments), a fully recorded session, and the cost of ensuring data collection for what was essentially a measurement-based care paradigm. Content of goals and patient/clinician qualitative reports were subjected to thematic analyses.
Results
Reach was high in patient and clinician populations. Effectiveness data did not show a group effect. Goal progress exceeded SDQ changes; goal themes were different from SDQ constructs. Adoption was high, as was implementation. All participants had positive experiences with GBO. Clinic function was not negatively affected by the GBO tool.
Conclusions
The GBO tool was implemented successfully into a CAPA-based CYMH clinic, but group effectiveness was mixed. However, the GBO tool identified unique treatment targets, suggesting potential to enhance tailored, precision CYMH.
Plain Language Summary
Why was the Study Done?
The Goal-Based Outcomes (GBO) tool helps young people and their families set personal goals and track their progress during mental health treatment. It could be especially useful in personalized child and youth mental health care. But until now, no studies have tested how well it works in regular clinical settings.
What did the Researchers do?
This study looked at how well the GBO tool could be used and how effective it was in a real-world child mental health clinic that follows the Choice and Partnership Approach (CAPA). Researchers ran a randomized controlled trial (RCT), which means participants were randomly put into two groups: one received usual care with the GBO tool (GBO + TAU), and the other received usual care only (TAU).
What did the Researchers Find?
A total of 152 child-caregiver pairs took part, split evenly between the two groups. Both groups showed improvement over time, but there were no major differences between them in standard mental health scores. However, patients made more noticeable progress on the personal goals they set using the GBO tool. The goals identified were different from what standard questionnaires measured. Families and clinicians liked the tool, and most clinicians said they would keep using it.
What do the Findings Mean?
The GBO tool was easy to use in a real clinic without disrupting care. Families and clinicians found it helpful. It helped identify personal goals that are not captured by standard mental health tools and may help clinicians create more personalized treatment plans.
Keywords
Introduction
Child and youth mental health (MH) is now a worldwide crisis (Benton et al., 2021; Racine et al., 2021) exemplified by the increasing incidence of mental disorders and problematic behaviours such as self-harm (Cairns et al., 2019; Griffin et al., 2018; Morgan et al., 2017). Inadequate access to care is blamed for many of these problems, resulting in long delays in receiving help while children and youth get worse (Morales et al., 2020; Zayed et al., 2016). Although access is a problem, once children and youth get into care, 30–50% of them will not respond to treatments (Bridge et al., 2007; Dobson et al., 2019). This statistic is driving a surge of interest in the often-repeated child and youth MH question: “What works for whom?” (Fonagy, 2015).
The precision child and youth mental health (PCYMH) care paradigm promotes more precise or tailored treatments for the individual, based on data collected about a patient’s biological function, lifestyle, and environment. This is in contrast to the current practice of relying only on information about symptoms, age, sex and gender, and possibly family history (Pajer et al., 2024; Posner, 2018).
Unfortunately, although PCYMH has great potential to transform research and MH care outcomes, there are significant challenges (Pajer et al., 2024; Passos et al., 2022; Posner, 2018; Szatmari & Susser, 2022). One of these is the lack of standardized collection of patient-based high-quality data in real world clinical settings, also known as measurement-based care (MBC). Despite strong evidence that MBC can improve the quality and effectiveness of care, few clinical practices or academic centres have valid and well-organized MBC systems (Aboraya et al., 2018).
Empirical data from child and youth MH services indicate that many systems label themselves as using MBC, but in practice, measurement tends to rely mostly on patient- or caregiver-reported outcome measures (PROMs), with only a small proportion of clinician-reported measures, and comparatively few capturing other outcome domains (e.g., goal attainment) or using multi-informant/multi-modal data (Thapa Bajgain et al., 2023; Whitmyre et al., 2024). Although PROMs and clinician-reported measures go a long way toward addressing the problem, their use has limitations. For example, although standardized instruments have the advantage of good psychometric properties enabling us to compare patients and caregivers to others (Barkham et al., 2001; Green, 2016; Lutz et al., 2005), they do not collect and track data about the individual’s goals and needs, a level of detail required for truly tailored care.
Idiographic-PROMs (i-PROMs) are an alternative type of instrument that can fill this gap (Sales et al., 2023). Although they also rely on self-report from patients and caregivers, they differ from standardized instruments in that the questions are stimuli to help respondents determine their own unique needs or care goals. Filling out the instrument with the clinician’s guidance, the patient characterizes the treatment goals most meaningful to them, issues often not apparent from symptom questionnaires (Sales et al., 2023) or even clinical interviews. Examples of i-PROMs include the Goal-Based Outcomes (GBO) tool (Law, 2019), the Goals Form (Cooper & Xu, 2023), and Goal Attainment Scaling (Kiresuk & Sherman, 1968).
The GBO tool is a commonly used tool in child and youth MH care settings. It facilitates clinician-patient/caregiver collaborative goal setting, the goals comprising a yardstick by which treatment progress is measured per session and longitudinally (Jacob et al., 2016; Law, 2011). Advantages of the tool are that it: (1) can be used with any treatment; (2) personalizes care; (3) facilitates shared decision making; (4) is sensitive to goal changes; and (5) helps clarify conditions for ending treatment, often a problem in MH care (Miller et al., 2008).
Despite the intuitive appeal of the GBO tool, there are several questions about how it fits into real-world practice, e.g., its feasibility, tolerability, and effects on patient flow. It is also not clear what such an individualized tool can contribute to the assessment of service-level outcomes (Elliott et al., 2016). Goal attainment scaling can be time-consuming and thus may disrupt care flow (Turner-Stokes, 2009). And although the GBO tool may be a more sensitive measure of individual change than standardized nomothetic measures, such as the Strengths and Difficulties Questionnaire (SDQ) and the Children’s Global Assessment Scale (CGAS), there is limited information about whether the GBO tool generates unique data compared to PROMs or clinician-reported measures used in clinical trials (Bergman et al., 2018; Jacob et al., 2023).
Additionally, it is unknown how well the GBO tool can be integrated into mental healthcare efficiency management systems such as the Choice and Partnership Approach (CAPA) (The Choice and Partnership Approach, 2025). CAPA is rapidly developing into a leading framework to provide efficient and effective care across the UK, Australia, New Zealand, Belgium, Iceland, and Canada (Clark et al., 2018; Pajer et al., 2023). Thus, questions about the practicality of integrating the GBO tool into a CAPA-based clinical setting are important.
Therefore, we conducted a study to answer the following questions in a child and youth MH outpatient (MH OP) clinic operating within the CAPA framework: (1) Implementation outcomes: Can the GBO tool be successfully implemented in this real-world setting (RWS) without disrupting care? (2) Effectiveness outcomes: Does the GBO tool provide unique data about patient change compared to other instruments, e.g., the SDQ? We hypothesized that, compared to treatment as usual (TAU), the GBO tool would be associated with greater patient improvement in symptoms of mental illness and in function, as well as greater patient satisfaction.
Materials and Methods
Study Design
This mixed methods study used a Hybrid II Randomized Controlled Trial (RCT) design, as it best met the characteristics of our questions and setting, e.g., face validity for the GBO tool is strong, and valid, feasible methods are available to measure real-world implementation in a MH Outpatient (MH OP) clinic (Curran et al., 2012). The RCT was a parallel-group design with an allocation ratio of 1:1 (TAU: TAU + GBO tool). This paper was structured according to the Consolidated Standards of Reporting Trials (CONSORT) reporting guidelines (Supplement, Table S1).
Implementation Framework
Reach Effectiveness Adoption Implementation Maintenance (RE-AIM) Framework: Study Variables
Setting
The study occurred at the MH OP Service of a pediatric academic health science centre in Eastern Ontario, Canada. The service provides multidisciplinary outpatient services, including consultation, assessment and treatment, to children and youth (0-17 years old) with urgent psychiatric symptoms and early onset of major psychiatric disorders. The clinician complement comprises psychiatrists, pediatricians, psychologists, social workers, occupational therapists, and child and youth care workers. Numerous types of psychotherapies and pharmacotherapies are provided.
Working within the CAPA (The Choice and Partnership Approach, 2025) model, the centre sees approximately 3000 unique patients a year for initial appointments. A CAPA pillar is a stepped approach to care, starting with the Choice appointment, during which a clinician-family or youth collaborative assessment and plan are made. If the plan is for the patient to be treated within the clinic, they are given an appointment with the person who will provide treatment, called the Partnership clinician.
Recruitment and Participants
Recruitment ran from November 2018 to December 2019. The CONSORT diagram is shown in Figure 1. A total of 398 patients/families waiting for Choice appointments were approached by a research team member about interest in study participation. Of those, 186 signed consent or assent forms, depending on their age. Inclusion criteria were presentation to MH OP and a Choice appointment referral for Partnership care. The only exclusion criterion was an inability to speak and write English or French.
Figure 1. The CONSORT diagram of participant recruitment.
Simple randomization of participants to one of the two arms was done when the referral to Partnership was made with allocation based on a computer-generated random numbers table. Blinding was not possible for patients, caregivers, or clinicians, as the intervention was an add-on to TAU and carried out by clinician and patient or caregiver. Moreover, all clinicians had received GBO training, which further precluded blinding.
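As an illustration only, simple 1:1 randomization of the kind described above can be sketched as follows. This is a minimal sketch, not the study's actual allocation procedure: the function name, the arm labels as strings, and the seed value are all hypothetical.

```python
import random

# Arm labels follow the text; representing them as strings is an assumption.
ARMS = ("TAU", "GBO + TAU")


def simple_randomization(n_participants: int, seed: int) -> list[str]:
    """Assign each participant independently, with equal probability, to one arm."""
    rng = random.Random(seed)  # a fixed seed makes the allocation list reproducible
    return [rng.choice(ARMS) for _ in range(n_participants)]


# Hypothetical usage: one allocation per dyad referred to Partnership.
allocation = simple_randomization(152, seed=2018)  # the seed value is arbitrary
print({arm: allocation.count(arm) for arm in ARMS})
```

Each allocation is drawn independently of the others, which is what distinguishes simple randomization from blocked or stratified schemes.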
Clinician Training
All clinicians were trained and participated in the study, as it was a MH program continuous quality improvement (CQI) study. They used the paper version of the GBO tool (Law, 2019; Law & Jacob, 2013) within the shared decision-making framework integral to CAPA. Two in-person GBO tool training workshops, held several months apart (2 and 1.5 days in duration), were conducted for all clinicians by co-author DL, the tool developer. Fidelity assessments by DL, with feedback to clinicians, were planned based on reviews of patient session audio or video recordings.
All clinicians were also trained in use of the Children’s Global Assessment Scale (CGAS) by co-author KP using the CGAS training manual.
Study Protocol
Consenting families completed the baseline standardized instrument before Choice and the clinician-reported measure was done immediately after Choice. Parents or caregivers (we use these terms interchangeably) and patients were notified of final eligibility in the study, based on remaining in the clinic for Partnership, a few days after the Choice appointment.
A research assistant (RA), blinded to group assignment, opened an envelope containing study arm allocation before the Partnership appointment and then met with the Partnership clinician if the assignment was TAU + GBO. Materials and a reminder about the GBO tool protocol were given to each such clinician, and any questions were answered.
From then on, reminders were sent to clinicians within 24 hours of their next appointment with a TAU + GBO patient to remind them to assess goals with the patient, complete the clinician-reported measure and let the RA know if this appointment would be the discharge session.
For every session in which a TAU + GBO patient was going to be seen, the RA hand-delivered a study folder with the patient’s goals, tracking, and session completion tracking of all materials. The RA picked up the folder at the end of the appointment, checked everything for completion and rectified omissions immediately.
Upon discharge, patients and caregivers (if involved) completed the same instrument filled out at baseline, an experience of care survey, and an experience of study participation survey. These could be done, at patients’ and caregivers’ convenience, as paper-and-pencil instruments in the clinic right after the appointment, online through an emailed link to the REDCap surveys, or orally over the phone with the RA. Final GBO tool information was recorded in the session with the clinician.
This process was also triggered if a patient did not want to return after a subsequent Partnership appointment, even if they had not officially ended care or if they did not show up for at least three appointments in a row.
Caregivers and youth completing all end-of-treatment measures each received a $10 gift card for their time and effort. Data collection was stopped on November 15, 2021 because the majority of people had completed care, but for 12 of them (7 in the GBO + TAU group and 3 in the TAU group), it was stopped on May 15, 2022 whether they were still in care or not. This was to allow enough time for data analyses and completion of the project.
When the entire study was complete, youth and parents were invited on a random basis to participate in one of four separate focus groups to discuss their experiences, based on their treatment allocation: 1. Youth TAU focus group, 2. Youth GBO tool + TAU focus group, 3. Caregiver TAU focus group, and 4. Caregiver GBO tool + TAU focus group. Groups were led by a trained facilitator, lasting 1 to 1.5 hours each. Groups were conducted by Zoom because of COVID-19 restrictions and were recorded and transcribed following de-identification procedures. Focus group participants received a $20 gift card.
Standardized Instruments
MH feelings, behaviours, and symptoms were measured with the Strengths and Difficulties Questionnaire (SDQ), a brief self- and caregiver-reported screening questionnaire (Goodman, 1997). It contains 25 items scored on a three-point scale and grouped into five subscales: emotional symptoms, conduct problems, hyperactivity/inattention, peer relationship problems, and prosocial behaviour. It is a component of standard care in the Outpatient Mental Health Program at CHEO. It was administered before the Choice appointment and at the end of treatment to both parents and youth aged 12 and older. The SDQ is a global standard child mental health outcome instrument, with moderate but acceptable internal consistency (mean Cronbach α = .73), 6-month reliability (r = 0.62) and predictive validity (Goodman, 2001).
Clinician assessment of function was performed with the Children’s Global Assessment Scale (CGAS), a clinician-reported measure used to assess social and psychological functioning in children and youth aged 6–17 (Shaffer et al., 1983). Level of functioning is scored on an ordinal scale, ranging from 1 to 100, which is divided into 10 categories ranging from ‘extremely impaired’ (1–10) to ‘doing very well’ (91–100). The instrument was built into the electronic health record with a hard stop requiring it be completed before the note could be closed, but the last score obtained at treatment end was the one used in the analyses.
Primary effectiveness outcomes were measured by the pre-post SDQ Total Difficulties summary score on the caregiver SDQ and the pre-post clinician-derived CGAS scores. Secondary outcomes were pre-post subscale scores of the SDQ, as well as the youth-reported SDQ Total Difficulties and subscale scores.
Goal Based Outcome (GBO) Tool
The GBO tool was used as described by Law and colleagues (Law, 2019; Law & Jacob, 2015). Patients and caregivers allocated to the GBO tool + TAU group were guided by clinicians in declaring up to three treatment goals at their first Partnership appointment. At each session, the tool then records how far a patient or caregiver feels they have progressed toward achieving each goal, based on a rating between 0 and 10. The components of the GBO tool are: the Goals Record Sheet, which records the three goals; the Goals Rating Sheet, which records progress on each goal at each session on a scale from 0 to 10; and the Goals Progress Chart, which allows patients to track progress on their goals longitudinally.
The first rating was made at the first Partnership appointment (“pre-”) and the final rating was made the last time the patient or caregiver saw the clinician (“post-”) (Law, 2019, 2024). We also used the session ratings for a qualitative analysis of the material selected as important to patients or caregivers (see Data Analysis section below).
Patient/Family Experiences of Care
We used the 7-item subscale focused on care received from the Experience of Services Questionnaire (ESQ), with a 4-point Likert scale scoring for this construct. This sub-scale score range is 9–27, higher scores indicating greater satisfaction. Three free-text sections elicit what the respondent liked about the service, what they felt needed improving, and any other comments (Kotsis et al., 2024).
Satisfaction With Participating in the Study
A brief survey was created to assess patient and caregiver satisfaction with participating in the study. The survey contained three questions rated on a 5-point Likert scale: 1. How satisfied are you with your overall experience of participating in this study? 2. How likely would you be to participate in this study again? 3. How likely would you be to recommend this study to a friend? And one open-text question for feedback (Kotsis et al., 2024).
Experience With the GBO Tool
Patients and parents in the GBO + TAU arm were invited to take part in a focus group regarding their experiences with the GBO tool. Sessions were conducted virtually with a trained facilitator who was given a script containing questions based on the RE-AIM framework and prompts to initiate discussion. Sessions were recorded, transcribed, and de-identified for qualitative analysis.
Clinicians were offered a feedback survey or focus group for their experiences with the GBO tool. Both contained the same 20 open-ended questions (Supplement, Table S2).
Data Analyses
This study used mixed methods analysis. Quantitative analyses were conducted using R statistical software version 4.2.1, released in 2022 by the R Core Team, except for the Reliable Change Index (RCI) calculations, which were conducted using Stata version 16.1 (StataCorp, College Station, Texas, USA). Baseline characteristics were summarized descriptively by group. Differences between treatment groups were tested using a two-proportion z-test for categorical variables and a t-test for continuous variables.
Multiple imputation by chained equations (MICE) was used to impute missing values for SDQ, CGAS, and ESQ scores, permitting all patients enrolled in the study to be included in intention to treat (ITT) analyses. Because the SDQ and ESQ instruments are not applicable for self-report at all ages, missing values for parents and youth were imputed separately. Both imputations used predictive mean matching and 50 imputed datasets for parents and 77 imputed datasets for youth were produced. These numbers were determined based on the percentage of rows with missing information on any item, corresponding to 50% and 77% of rows with missing data, respectively. Results from multiple imputed datasets were pooled according to Rubin’s rules (Rubin, 1987).
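The two computational steps in this paragraph, choosing the number of imputed datasets from the percentage of incomplete rows and pooling results with Rubin's rules, can be sketched as below. This is a minimal sketch under stated assumptions: the function names and the example numbers are hypothetical, and the imputation itself (predictive mean matching) is not shown.

```python
import math
from statistics import mean, variance


def n_imputations(rows_with_missing: int, total_rows: int) -> int:
    """One imputed dataset per percentage point of rows with any missing item."""
    return max(1, round(100 * rows_with_missing / total_rows))


def pool_rubin(estimates: list[float], variances: list[float]) -> tuple[float, float]:
    """Pool per-dataset estimates and their squared standard errors (Rubin, 1987).

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, math.sqrt(total)


# Hypothetical usage: 50% incomplete rows lead to 50 imputed datasets.
print(n_imputations(rows_with_missing=50, total_rows=100))
# Hypothetical group-effect estimates from three imputed datasets.
print(pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))
```

The pooled standard error grows with the spread of the estimates across imputed datasets, which is how uncertainty due to missing data is carried into the final inference.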
Quantitative analyses for effectiveness analyzed pre-post scores for the SDQ and CGAS (first value [time 1] and last value [time 2]) using intention-to-treat (ITT) analyses comparing the TAU arm with the TAU + GBO Tool arm. The primary and secondary analyses were carried out using the ITT population. For the primary analysis, caregiver- and youth-reported SDQ scores at follow-up [time 2] (dependent variable, DV) were compared between the groups (independent variable, IV) with an ANCOVA model that included baseline scores [time 1] as a covariate. Total SDQ scores and SDQ scores for each dimension (Emotional, Conduct, Hyperactivity, Peer, Prosocial, and Impact) were compared separately for parents and youth. An ANCOVA model was also used for the secondary analysis of the following DVs at time 2, with the baseline score as a covariate: parent-reported Satisfaction with Care (ESQ), youth-reported Satisfaction with Care (ESQ), and Children’s Global Assessment Scale (CGAS) scores. Benchmark eta squared values of 0.01 (small), 0.06 (medium), and 0.14 (large) were used to assess the effect size (Cohen, 1988).
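To make the ANCOVA structure concrete, the sketch below fits the time 2 score on group and baseline score by ordinary least squares and derives an effect size for the group term. It is an illustrative sketch only, not the study's R code: it uses a single complete dataset, reports partial eta squared (an assumption, since the text says only "eta squared"), and all function names, the "negligible" label, and the data are hypothetical.

```python
def solve(a: list[list[float]], b: list[float]) -> list[float]:
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(columns: list[list[float]], y: list[float]) -> tuple[list[float], float]:
    """Least-squares fit; returns the coefficients and the residual sum of squares."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[j] * columns[j][i] for j in range(k)) for i in range(len(y))]
    return beta, sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def ancova(post: list[float], baseline: list[float], group: list[float]) -> tuple[float, float]:
    """Return the baseline-adjusted group difference and partial eta squared."""
    ones = [1.0] * len(post)
    beta, sse_full = ols([ones, group, baseline], post)   # model with the group term
    _, sse_reduced = ols([ones, baseline], post)          # model without the group term
    ss_group = sse_reduced - sse_full
    return beta[1], ss_group / (ss_group + sse_full)


def benchmark_label(eta_sq: float) -> str:
    """Cohen (1988) benchmarks quoted in the text; 'negligible' is our own label."""
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "medium"
    return "small" if eta_sq >= 0.01 else "negligible"


# Hypothetical data: group is coded 0 = TAU, 1 = GBO + TAU.
baseline = [18.0, 22.0, 15.0, 20.0, 25.0, 17.0, 21.0, 19.0]
group = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
post = [15.0, 20.0, 13.0, 16.0, 19.0, 12.0, 17.0, 14.0]
difference, eta_sq = ancova(post, baseline, group)
print(round(difference, 2), round(eta_sq, 3), benchmark_label(eta_sq))
```

In the study itself this model would be fitted within each imputed dataset and the results pooled; the sketch shows only the single-dataset step.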
A Reliable Change Index (RCI) was calculated as proposed by Jacobson and Truax (1991) by dividing the difference between the pre-treatment and post-treatment scores by the standard error of the difference between the two scores. The standard error of the difference was calculated directly from the standard error of measurement, which in turn was calculated as a function of the reliability of the scale (i.e., Cronbach’s alpha) and the standard deviation of the sample at the first measurement point. If the absolute value of the RCI is greater than 1.96, the difference is considered reliable: a change of that magnitude would not be expected due to the unreliability of the measure alone. Conversely, if the absolute value of the RCI is 1.96 or less, the change is not considered reliable: it could have occurred just due to the unreliability of the measure. The RCIs calculated for the caregiver SDQ Total Difficulties score and the GBO tool were compared using a z-test.
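The RCI calculation described above can be sketched in a few lines. This is a minimal sketch under stated assumptions: the function names are hypothetical, the sign convention (post minus pre) is our choice, and the pre and post scores in the usage lines are invented. The standard deviation and Cronbach's alpha in the usage lines are the values reported for the caregiver-rated SDQ in the Results.

```python
import math


def standard_error_of_measurement(sd_baseline: float, reliability: float) -> float:
    """SEM = SD at the first measurement point * sqrt(1 - reliability)."""
    return sd_baseline * math.sqrt(1 - reliability)


def standard_error_of_difference(sd_baseline: float, reliability: float) -> float:
    """S_diff = sqrt(2 * SEM^2), following Jacobson and Truax (1991)."""
    return math.sqrt(2) * standard_error_of_measurement(sd_baseline, reliability)


def reliable_change_index(pre: float, post: float, sd_baseline: float, reliability: float) -> float:
    """RCI = (post - pre) / S_diff; the sign convention is an assumption."""
    return (post - pre) / standard_error_of_difference(sd_baseline, reliability)


def classify_change(rci: float, higher_is_better: bool, threshold: float = 1.96) -> str:
    """Apply the 1.96 cut-off to the absolute value, then use the sign for direction."""
    if abs(rci) <= threshold:
        return "no reliable change"
    improved = rci > 0 if higher_is_better else rci < 0
    return "reliable improvement" if improved else "reliable deterioration"


# SD = 5.23 and alpha = 0.85 are the caregiver SDQ values given in the Results.
print(round(standard_error_of_measurement(5.23, 0.85), 2))  # 2.03
# Hypothetical patient: SDQ Total Difficulties falls from 20 to 12 (lower is better).
rci = reliable_change_index(pre=20, post=12, sd_baseline=5.23, reliability=0.85)
print(round(rci, 2), classify_change(rci, higher_is_better=False))
```

For the SDQ a decrease is an improvement, whereas for GBO goal ratings an increase is, which is why the direction is passed explicitly rather than inferred from the sign.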
Qualitative analyses were conducted using thematic analysis to determine the most common themes identified by children, youth, and caregivers. As recommended by Braun and Clarke (Braun & Clarke, 2006), the transcribed data was read and re-read several times, and the recordings were listened to several times to ensure the accuracy of the transcription. The initial thoughts and ideas were noted down, followed by a coding phase, in which features of the data that were considered pertinent to the research question were identified. The whole dataset was given equal attention so that full consideration could be given to repeated patterns within the data. Codes that were considered very similar or had been considered to cover the same aspect within the data were combined into themes. The themes were defined, named, and accompanied by a detailed analysis. Examples from transcripts were selected to illustrate the identified themes. Finally, themes were compared to the sub-scale constructs on the SDQ standardized for Canadian clinical populations to see if the constructs differ for the SDQ and the GBO tool. Overall, the qualitative analysis was informed by a constructivist epistemological and ontological stance, accompanied by ongoing reflexive consideration of how our clinical backgrounds and involvement in the service might have influenced data interpretation.
Results
This section is organized by the RE-AIM components: Reach, Effectiveness, Adoption, Implementation, and Maintenance.
Reach
The CONSORT diagram in Figure 1 displays the recruitment, enrollment, and participation flow of patients and caregivers, as well as the Reach for patients. A total of 453 Choice appointments were scheduled during the recruitment period. Of these, 398/453 (88%) patients and caregivers were informed of the study. Those missed were due to factors such as families not showing up for appointments, permanently cancelling their referral to the clinic, or research staff unavailability.
Of the 398 families approached, 186/398 enrolled (47%), understanding that they would not proceed with the study unless they were referred from Choice to a Partnership appointment. A total of 152 patients and families were referred to Partnership appointments; 76 were randomized to the GBO + TAU arm and 76 to TAU. A total of 72 received the intervention in the GBO + TAU group, while 70 received the intervention in the TAU group (Figure 1).
Patient and Clinician Characteristics
TAU: Treatment as Usual; GBO: Goal-Based Outcomes; SD: standard deviation.
Clinician Reach was 100%. All clinicians participated, as this was a clinic CQI project.
Effectiveness Outcomes
The ANCOVA Models Comparing all SDQ Scores and CGAS Scores Between the Study Groups From Baseline to Follow-up Among Caregivers and Youths
ANCOVA: Analysis of covariance; SDQ: Strengths and Difficulties Questionnaire; CGAS: Children’s Global Assessment Scale; CI: confidence interval.
Total difficulties score: This is generated by summing scores from all the scales except the prosocial scale. The resultant score ranges from 0 to 40, and is counted as missing if one of the 4 component scores is missing.
SDQ: caregiver and youth rated.
CGAS: clinician rated.
RCI was calculated for the caregiver-rated SDQ Total Difficulties score. Based on the standard deviation of the sample at the first measurement point (SD = 5.23) and the reliability of the parent-rated SDQ in the sample (Cronbach’s alpha = 0.85), a standard error of measurement of 2.03 was obtained and used to classify reliable change. Overall, 6.8% of participants showed reliable improvement between pre- and post-scores, 72.8% showed no reliable change, and 20.4% showed a reliable deterioration. The distribution of reliable change categories did not differ significantly between treatment groups (z = 0.20, p = 0.839).
To calculate the RCI for the GBO tool, Cronbach’s alpha (α = 0.73) for those who had completed two goals and the standard deviation for Goal 1 at the first measurement point (SD = 2.42) were used, giving a standard error of the difference of 1.78. Of those who completed at least two goals defined at baseline (n = 60), just under one-third (30.0%) showed reliable improvement in progress towards goals, 68.3% showed no reliable change, and 1.7% displayed a reliable deterioration.
Comparisons between the RCI categories from the SDQ and the GBO tool showed statistically significant differences for reliable improvement (z = 3.29; p < 0.001) and reliable deterioration (z = 3.29; p < 0.001), but not for the no reliable change category (z = 0.61; p = 0.541).
A total of 192 goals were set during the study. The number of goals per participant ranged from 0 to 6, with a mean of 3. Qualitative analysis of the types of goals revealed three themes: personal growth and functioning, coping with specific problems and symptoms, and managing relationship/interpersonal difficulties. These goals captured different constructs than those measured by SDQ sub-scales (Goodman, 1997).
On the Care Satisfaction sub-scale of the ESQ, mean caregiver scores were similar in the two groups (Mean = 23.3; SD = 4.7 in the GBO + TAU group vs. Mean = 23.7; SD = 3.9 in the TAU group), whereas mean youth scores were higher in the GBO + TAU group (Mean = 20.8; SD = 7.5 in the GBO + TAU group vs. Mean = 16.3; SD = 6.8 in the TAU group). However, there were no statistically significant group differences for either caregivers (effect size = −0.30; 95% CI = −2.09, 1.49; p = 0.74) or youth (effect size = 0.57; 95% CI = −4.36, 5.49; p = 0.82), and effect sizes were small.
More than half of patients and parents (97/152; 64%) completed the satisfaction survey. Results indicated that 59.4% of families in the GBO + TAU group were satisfied with their overall experience of participating in the study, 75% indicated that they would participate again, and 66% indicated that they would recommend this study to a friend.
Clinician experience was explored through a survey and a focus group, but only 8/24 (33.3%) of clinicians completed the feedback survey and 3/24 (12.5%) attended the focus group. Results showed that the majority of these clinicians (75%) wanted to continue using the GBO tool in their practice. They also identified areas of improvement, including transferring the tool from paper copies to the electronic health record, creating tutorial videos for refreshers, and sharing clinicians’ success stories to reinforce engagement.
Clinicians also indicated that the GBO tool worked best with patients having clear single-problem goals.
“I think it works best with the patients who treatment works best with in general. i.e., those who can stick with one set of goals throughout (get to school 4 days/week, see friends in person at least once/weekend, spend a half hr/night doing homework).” (Quoted text from the survey)
In contrast, clinicians indicated that the GBO tool was not suitable for patients with intellectual deficiency, those in crisis, those with very complex goals, or families needing immediate relief.
“Families arrive overwhelmed and just want to feel better. I did not feel that it was helpful with the two families that were in the study.” (Quoted text from the survey)
“They have to be functioning at a high enough level in order to define a goal. Some families are so distressed that you really... It’s stressful, then, to even think about a goal, you know, how can I be helpful, I do not know. And then, we do our best to try and find lower-level goals, like, you know, basic needs, housing, shelter. Or even... Sometimes, even, that’s too high a level and we have to go with, like, just connection, so... But I would say or super-highly distressed people, it’s harder to come up with a goal.” (Quoted text from the focus group)
Adoption
To facilitate adoption, all clinicians were trained, as described in Methods above. Included was training on goal-focused interventions and on how to guide youth and caregivers in developing and rating goals as part of therapy. No adaptations of the GBO tool were made. All clinicians used the protocol for the duration of the study, and the qualitative data supported good adoption, as clinicians found that patients were generally willing to participate in the goal setting and rating processes.
“I can think of two families that I work with that were part of the GBO. One patient was a DBT therapy patient, so I found, like for him, it was really easy because we were defining very specific goals for his DBT therapy, so that person was quite engaged. And then, the other family, like, the child was younger, so it was the parents who were defining the goals, and, like, they seemed to be pretty willing.” (Quoted text from the survey)
“And I had a similar experience. It helps to structure things. And I think, like, often in our heads we have goals, but they are not always super-clear... We think they are clear, but this sort of, set a nice way to make it clear with the client that each time we were talking about the same thing. So, they were... It was easy to get them to participate, and it helped structure the session.” (Quoted text from the focus group)
In addition, the RCT protocol included reminders to pick up and drop off the forms for each appointment to encourage the adoption of the intervention. All clinicians participated in the pickup and drop-off of the folders, returning filled-out rating forms to the RA.
Implementation Outcomes
Patient flow was not negatively affected by the addition of the GBO tool, although the two appointment measures differed between groups in opposite directions. The total number of appointments across partnerships (i.e., all appointments attended by a service user across multiple service partnerships) was significantly higher for the GBO + TAU group than for the TAU group (Mean = 12.64; SD = 1.26 for TAU vs. Mean = 13.14; SD = 1.50 for GBO + TAU; t = −2.15; p = 0.033). However, the core partnership number of appointments (i.e., only appointments within the primary, or core, service partnership) was significantly higher for the TAU group than for the GBO + TAU group (Mean = 10.11; SD = 0.98 for TAU vs. Mean = 9.13; SD = 1.01 for GBO + TAU; t = 5.87; p < 0.001).
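For readers who want to check group comparisons of this kind from summary statistics alone, the following is a minimal sketch of an unpooled two-sample t statistic. It is not the authors' analysis code. The per-arm group sizes are not reported in this section, so the value of 71 per arm used here is a hypothetical placeholder, and the function name `t_from_summary` is likewise illustrative.

```python
import math


def t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Unpooled two-sample t statistic for the difference mean_a - mean_b."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Hypothetical group sizes: the per-arm n is NOT reported in this section.
N_TAU = 71
N_GBO = 71

# Means and SDs are taken from the text (TAU first, GBO + TAU second).
t_total = t_from_summary(12.64, 1.26, N_TAU, 13.14, 1.50, N_GBO)
t_core = t_from_summary(10.11, 0.98, N_TAU, 9.13, 1.01, N_GBO)

print(f"total appointments: t = {t_total:.2f}")
print(f"core partnership appointments: t = {t_core:.2f}")
```

With these placeholder sizes the sketch gives values close to the reported statistics, but this should not be read as confirming the actual group sizes or the exact test the authors used.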
Follow-up rates did not differ significantly between arms for either caregivers or youth: 74.29% of TAU caregivers and 42.11% of TAU youth vs. 70.83% of GBO caregivers and 46.34% of GBO youth.
As an additional measure of implementation of the GBO tool, we planned for each clinician to record a session and send it to co-author DL, who would assess fidelity to the model and give feedback for improvement. Only four of 34 clinicians (11.8%) completed this task, all using audio recordings. These recordings indicated fidelity to the model, with some minor adjustments that were shared with clinicians. A second one-day in-person training was held at the clinic site, in which clinicians worked through mock practice cases with group discussion to reinforce GBO tool skills and solve problems, including the problem with recording sessions.
Two barriers emerged in this supervision. First, many youths or caregivers did not want to be recorded, citing privacy concerns or embarrassment. Second, there was clinician discomfort with the process, despite being assured that recordings were for training only.
Another implementation outcome was the cost to ensure data collection for what was essentially a measurement-based care paradigm. Independent of the research costs to organize the study and recruit participants, it became clear that a full-time person was needed to ensure data collection from and about patients and caregivers, including auditing and reporting instrument completion. One RA had to be assigned to these tasks full-time at a cost of $55,000 CDN per year.
Maintenance
As this was an experiment, maintenance was not planned. However, families felt satisfied and expressed interest in continuing to use the GBO tool. Clinicians were also interested in continuing, but only with extra support, such as that provided by the RA described above.
Discussion
This hybrid type II RCT was conducted to determine if the GBO tool could be implemented into a real-world child and youth MH OP setting using the CAPA framework without negatively affecting clinical outcomes. We also investigated whether the tool provided unique information about patient progress compared to the information obtained by a standard PROMS. To our knowledge, this is the first controlled study comparing the GBO tool with other such instruments within the CAPA framework.
The study showed that the GBO tool neither enhanced clinical care nor negatively affected it, and that there were no detrimental effects on patient flow or service satisfaction. Among service users and practitioners who participated, the GBO tool was generally acceptable for routine outcome measurement and offered a new dimension of patient treatment progress unique to the individual. However, some individuals declined to use the GBO when it was the only measure being added, so we cannot determine how those who refused would have perceived its usefulness. Nevertheless, our findings represent an important addition to the PCYMH care armamentarium, including the possibility of using large language models for assisting the youth or caregiver in identifying their unique goals within the electronic health record (Terheyden et al., 2025).
One of the key benefits of the GBO tool is that it allows goals to be defined by the patient, ensuring that the outcomes measured are personally meaningful. However, our findings also highlight that goals are influenced not only by the patient but also by clinicians and, in the case of younger children, by parents or caregivers. This interplay can shape both the focus and specificity of the goals. Consistent with the observations of participating practitioners, the GBO tool appeared to work most effectively when there was a single, clearly defined focus, such as a specific therapeutic target in dialectical behaviour therapy (DBT), suggesting that clarity and alignment among the patient, family, and clinician may enhance engagement and the utility of the tool.
Three overarching themes were identified for goals set in this study using the GBO tool: personal growth and functioning, coping with specific problems and symptoms, and managing relationship/interpersonal difficulties. These themes are similar to those previously identified by O'Reilly and colleagues using the GBO tool in young people receiving care in a community youth MH service (O'Reilly et al., 2022).
Moreover, clinical effectiveness as measured by parent- and youth-reported SDQ pre-post scores did not differ when the GBO tool was added to TAU, even after replacing missing data. These findings are somewhat consistent with those from a previous study that found that change scores on the GBO tool were not correlated with the SDQ (Wolpert et al., 2012). Similarly, Duncan et al. (2023) found that the association between the GBO tool and the SDQ was very weak. Collectively, these findings suggest that the GBO tool adds unique information to the measurement of clinical outcomes by one of the most commonly used PROMS, the SDQ. The convergence of our work and that of the O'Reilly group suggests that the GBO tool captures different constructs than those measured by the SDQ subscales, and that both types of instruments are useful in measuring the response to treatment.
The GBO tool demonstrated significant pre-post changes based on the RCI, with more reliable improvement and less deterioration than the SDQ. This finding may be due to the GBO tool being a more sensitive measure of individual change than standardized nomothetic measures, because it focuses on individual goals or problems formulated in terms of patients’ own unique life experiences (Sales et al., 2023). However, the fact that both the intervention and TAU groups showed significant pre-post improvements on all measures is also validating for this particular CAPA MH OP clinic. Given that there are no other RCTs like this, we have no other studies with which to compare our findings.
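To make the reliable change logic concrete, here is a minimal sketch of the Reliable Change Index in its standard Jacobson and Truax form: the pre-post difference divided by the standard error of the difference. The baseline SD, the reliability coefficient, and the example scores below are hypothetical placeholders rather than values from this trial, and the function names are illustrative only.

```python
import math


def reliable_change_index(pre, post, sd_baseline, reliability):
    """RCI = (post - pre) / S_diff, where S_diff = sqrt(2) * SD * sqrt(1 - r)."""
    se_measurement = sd_baseline * math.sqrt(1.0 - reliability)
    s_diff = math.sqrt(2.0) * se_measurement
    return (post - pre) / s_diff


def classify_change(pre, post, sd_baseline, reliability,
                    higher_is_better=True, threshold=1.96):
    """Label a pre-post change as reliable improvement, deterioration, or neither."""
    rci = reliable_change_index(pre, post, sd_baseline, reliability)
    if not higher_is_better:
        # For scales where lower scores mean fewer difficulties, flip the sign.
        rci = -rci
    if rci > threshold:
        return "reliable improvement"
    if rci < -threshold:
        return "reliable deterioration"
    return "no reliable change"


# Hypothetical goal rating (higher = closer to the goal).
print(classify_change(pre=3, post=8, sd_baseline=2.5, reliability=0.72))

# Hypothetical difficulties score (lower = better).
print(classify_change(pre=20, post=10, sd_baseline=6.0, reliability=0.80,
                      higher_is_better=False))
```

The `higher_is_better` flag is a design choice for this sketch: goal progress ratings typically rise with improvement, whereas difficulties scores fall, so the same threshold can be applied to both once the direction is made explicit.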
The internal consistency of the GBO data documented in this study was acceptable, with a Cronbach’s alpha of 0.72. This finding is comparable to other studies (Edbrooke-Childs et al., 2015; O'Reilly et al., 2022; Jacob et al., 2017) and suggests that, despite diverse goals, the GBO tool is likely measuring a single underlying construct. Previous research has shown that the GBO tool could monitor change both for individual goals and for population-level outcomes in conjunction with other measures of MH and wellbeing (Duncan et al., 2023). However, applying traditional statistical techniques, such as reliable change, to these data implies an assumption that the measurement items remain static (Bollen & Diamantopoulos, 2017). This assumption may not hold for the GBO tool because goals will likely differ between participants and may change over time. The use of traditional statistical techniques on GBO data could have several implications. First, it could introduce interpretational confounding because diverse items are being aggregated into the calculations. Second, it could result in unexplained variance due to the use of inappropriate techniques (Wilcox et al., 2008). Finally, it could lead to inaccurate conclusions about the nature of the structural relationships between constructs resulting from misspecification of the analysis model (Podsakoff et al., 2003). Therefore, we are circumspect in drawing final conclusions about these issues and recommend that other studies try to replicate our findings.
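As an illustration of how an internal consistency estimate of this kind is obtained, the sketch below computes Cronbach’s alpha from a respondents-by-items table using the standard formula. It assumes, hypothetically, that every participant rated the same number of goals and that the first, second, and third goals are treated as "items"; this is the very aggregation of diverse goals that the paragraph above cautions about. The ratings are invented for illustration.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in rows):
        raise ValueError("every respondent must have the same number of items")
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: five participants, three goals each.
example_ratings = [
    [2, 3, 3],
    [5, 4, 6],
    [7, 8, 6],
    [4, 4, 5],
    [9, 8, 9],
]
alpha = cronbach_alpha(example_ratings)
print(f"alpha = {alpha:.2f}")
```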
Strengths and Limitations
This study has several strengths. First, it used a hybrid type II RCT design, with a dual focus on effectiveness and implementation outcomes and the use of mixed methods. This design allowed for the simultaneous testing or piloting of implementation strategies during an effectiveness trial (Landes et al., 2019). Second, a fairly high percentage of the target patient population was reached and informed of the study, and all clinic providers participated. Finally, we had a high response rate for the post-treatment measurements, with over 70% of families (either a parent or youth) filling out the questionnaires.
Several issues limit the interpretation or generalizability of our findings. Our implementation plan was undermined by poor uptake of the GBO tool fidelity protocol. The degree of resistance was surprising, and none of our attempts to address it resolved the problem. We hope this experience can serve as a cautionary note to others wanting to ensure that the GBO tool is being used properly. Instead of our fidelity protocol, we suggest that mock training patients be used.
Moreover, the low participation rate restricts the extent to which conclusions about the overall acceptability of the GBO can be generalized to the broader service population. While the addition of the GBO represented the primary change to the care process, uptake was limited, and therefore the findings primarily reflect the experiences of those who chose to engage with the tool. Nevertheless, among participating youth and caregivers, satisfaction ratings and qualitative feedback were consistently positive, suggesting good acceptability within this subgroup. These results should therefore be interpreted as indicative of acceptability among users who adopted the GBO rather than as evidence of universal acceptability across the full eligible cohort.
A possible explanation for finding no significant group differences in the effectiveness outcomes is that our sample size gave us inadequate power to detect significant change. However, we calculated effect sizes for all results, and they were quite small, suggesting that a larger sample may not have changed the results. Additionally, a post hoc power analysis indicated that, given our current sample size, we had strong statistical power (95.4%) at an alpha level of 0.05.
Another limitation was also a strength. We conducted the study in a CAPA clinic, as the GBO tool is frequently used in such services. Therefore, we have unique data that can be relevant to CAPA clinics throughout the world. However, this setting also may have limited our ability to detect significant differences between the groups. A key component of CAPA is its focus on collaborative care in which the patient’s goals are central to high-quality practice (The Choice and Partnership Approach, 2025). The addition of the GBO tool may be very different from what is done in many clinical settings but may have been only a small addition to CAPA TAU. We recommend that the effectiveness of the GBO tool also be evaluated in a non-CAPA MH OP setting.
Finally, open science practices were partially constrained by the sensitive nature of child mental health data, which prevents public data deposition. However, the trial was reviewed and approved by our institutional ethics board and was preregistered at ClinicalTrials.gov (ID NCT03527914). We aimed to enhance transparency by clearly outlining our analytic processes and providing access to study materials where feasible.
Conclusions
In this large child and youth MH OP CAPA-based clinic, the GBO tool was successfully incorporated to address individualized patient change without disruption of care, and patients, parents, and clinicians generally found the tool quite helpful. While there were no significant differences in outcomes as measured by PROMS instruments, the GBO tool identified and addressed goals distinct from the constructs measured by the SDQ and CGAS. Furthermore, progress on the goals was better than progress on the SDQ and CGAS. This suggests that the GBO tool may provide unique data augmenting clinicians’ understanding of their patients’ needs, which can help them develop better-informed, tailored treatment plans for PCYMH care.
Supplemental Material
Supplemental Material for Hybrid Type II Randomized Controlled Trial of the Goal-Based Outcomes Tool in a Child and Youth Outpatient Mental Healthcare Clinic by Kathleen Pajer, William Gardner (in memoriam), Hugues Sampasa-Kanyinga, Amanda Helleman, Nicole Sheridan, Vid Bijelić, Nick Barrowman, David Murphy, Marjorie Robb, Lisa M. Currie, Leigh Dunn, Duncan Law in Clinical Child Psychology and Psychiatry
Footnotes
Ethical Considerations
The study was conducted in accordance with the Declaration of Helsinki and approved by the CHEO (Children’s Hospital of Eastern Ontario) Research Institute Research Ethics Board (CHEO REB (18/03E)).
Consent to Participate
Participants aged under 12 years gave their assent, in addition to signed parental consent, before they participated in the study. Participants over the age of 12 years gave consent. Clinicians participated as part of the clinic’s continuous quality improvement program.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Canadian Hospital Academic Medical Organization Innovation Fund (CHAMO). The funders had no role in the design, data collection, analyses, interpretation of data, writing of the manuscript, or the decision to publish the results.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The data that support the findings of this study are not publicly available due to their containing information that could compromise the privacy of research participants.
Supplemental Material
Supplemental material for this article is available online.
