Abstract
Psychosocial interventions are widely used in dementia care, yet standardized outcome measurement remains highly variable, and recent frameworks emphasize outcomes prioritized by people living with dementia and their care partners. This narrative, measurement-focused review does not appraise or synthesize treatment effects. Instead, it aims to map outcome measures to the International Consortium for Health Outcomes Measurement (ICHOM) dementia set plus an additional carer-wellbeing domain, and to organize them into a taxonomy of wellbeing domains that highlights patterns and gaps in measurement practice. Eligible studies included participants with Alzheimer's disease and related dementias, evaluated a psychosocial intervention, and reported standardized pre- and post-intervention outcome measures at short and/or long-term follow-up. A total of 136 studies met inclusion criteria. Interventions encompassed arts and creative therapies, cognitive and reminiscence approaches, education and psychosocial support, physical and movement-based therapies, sensory and relaxation therapies, environmental and daily living support, and animal/robot-assisted programs. Outcome measures clustered on neuropsychiatric symptoms (205 instances) and cognitive functioning (146 instances), with fewer measures of social functioning (22 instances) and health-related quality of life (13 instances). Measurement approaches were highly variable (43 distinct neuropsychiatric measures, 47 cognitive measures, 14 social functioning measures). Outcomes were predominantly assessed using short-term measures, with some long-term follow-up, and few observational in-the-moment measures capturing engagement, enjoyment, reciprocity or mastery. This review presents a taxonomy of outcome measures that highlights the mismatch between current evaluation practices and person-centered psychosocial priorities in dementia care, and guides more purposeful measure selection.
Pharmacological treatments have long been used to manage psychological and behavioral symptoms in Alzheimer's disease and related dementias, yet antipsychotics provide only modest benefit and carry significant risks, including stroke, falls, and mortality. 1 As a result, research and practice have shifted toward non-pharmacological approaches, such as psychosocial interventions. 2 These approaches use psychological, social, and activity-based methods to maintain or improve functioning, wellbeing, and quality of life (QoL). 3 However, the evidence base is complex, with diverse study designs, outcome domains, and measurement tools producing inconsistent findings.4,5 To address this, international initiatives now seek to define what constitutes meaningful outcomes in dementia care.6,7 Lived experience research highlights that people living with dementia and their families prioritize outcomes such as social connection and feeling supported.8,9 Our review therefore takes a person-centered approach to examine how well current outcome measures reflect these priorities.
A growing body of evidence supports the use of psychosocial interventions in dementia care.10–13 Recent reviews describe benefits across cognitive functioning (see 14 for an example), activities of daily living, QoL (see 15 for an example) and neuropsychiatric symptoms like depression, anxiety and apathy (see 16,17 for examples). Multicomponent programs combining exercise and group cognitive stimulation show consistent gains in cognition, functioning, and QoL.14,15 Interventions such as multisensory stimulation, music therapy, and pet therapy show promise for apathy, though results remain mixed.15,17 By contrast, reminiscence, art therapy, relaxation, and staff education often yield low-certainty or inconsistent findings once methodological quality is considered.14,15
Despite this growth, progress is constrained by a lack of consensus on appropriate outcome measures. 18 Substantial heterogeneity exists in intervention content, duration, and targeted domains, 14 limiting comparability and synthesis.6,7 Many studies measure broad or distal endpoints rather than the immediate psychosocial targets intended by the intervention, making subtle experiential changes hard to detect. Reviews also emphasize short-term post-intervention outcomes, with limited examination of in-the-moment indicators of engagement and social connection, or long-term effects.2,19 Evidence remains sparse on how psychosocial interventions influence the ongoing experiences of people living with dementia and the immediate benefits they most value.18,20 Further, there is a mismatch between quantitative and qualitative evaluations. A mixed-methods review found that quantitative trials often reported limited or inconsistent effects, while qualitative accounts from participants and staff described clear benefits. 20 This gap suggests that commonly used quantitative measures are poorly aligned with the outcomes that psychosocial interventions most influence. When outcome measures are misaligned and lack sensitivity to change, meaningful benefits are obscured,6,7 costly trials may fail to capture lived experiences, and “research waste” accumulates. 21
Outcome measurement has traditionally taken a deficit-oriented view, focusing on reducing impairments such as memory loss or “behavioral problems”.11,22,23 This narrow focus has been criticized for overlooking what matters most to people with dementia—the capacity to live well—despite wellbeing being a central aim of many psychosocial approaches.8,9 In response, lived-experience research and core outcome frameworks increasingly prioritize wellbeing domains over symptom reduction. Initiatives such as the International Consortium for Health Outcomes Measurement (ICHOM) dementia standard set and the HOMEDEM project now emphasize QoL, mood, social health, and carer wellbeing alongside clinical outcomes.6,7 Qualitative and consensus work similarly identifies social connection, identity, enjoyment, autonomy, and feeling supported as central to living well with dementia.8,9 Yet these outcomes are rarely primary endpoints in psychosocial trials.8,14 Our review therefore adopts a person-centered lens, mapping outcome measures from psychosocial intervention studies onto these priority domains to assess how well current evaluation practices reflect what matters most to people living with dementia and their families.
Previous reviews have primarily used a meta-analytic approach to assess the magnitude and quality of intervention effects. 24 We conducted a narrative, measurement-focused review rather than a Joanna Briggs Institute scoping review or COSMIN-based systematic review because our aim was to map and taxonomize outcome measures, rather than comprehensively scope all psychosocial evidence or synthesize intervention effects. Our primary aim was to identify the outcome measures used to evaluate psychosocial interventions and organize them into a taxonomy of wellbeing domains, highlighting patterns and gaps in measurement practice. Rather than evaluating effectiveness, we aimed to map measures to the ICHOM dementia standard set, with the addition of a carer wellbeing domain to align with recent outcome-measurement guidance (e.g., PRISMA-COSMIN) that emphasizes carer-focused outcomes in determining what should be measured.7,25,26 We also aimed to compare the use of short-term, long-term, and in-the-moment measures to support more purposeful outcome selection and methodological decision-making in psychosocial dementia-care research.
This narrative, measurement-focused review has three objectives: (1) to identify the ICHOM psychosocial domains within which interventions have been evaluated, including short-term, long-term, and in-the-moment outcomes; (2) to develop a comprehensive overview of available measurement tools to enhance standardization across studies and guide measure selection and development; and (3) to identify gaps in domains targeted by psychosocial interventions.
Methods
Narrative review approach
Consistent with narrative review requirements and our aim to map and taxonomize outcome measures, we did not undertake a formal risk-of-bias or quality appraisal of individual studies, nor did we register the protocol. We followed guidance for high-quality narrative reviews (SANRA 27) and incorporated PRISMA-style and PRISMA-COSMIN-informed elements (e.g., transparent search reporting and a flow diagram) to enhance methodological transparency. SANRA is a six-item tool for assessing the quality of narrative (non-systematic) review articles. Its items cover: (1) justification of the article's importance, (2) statement of concrete aims/questions, (3) description of the literature search, (4) appropriateness of referencing, (5) scientific reasoning, and (6) presentation of relevant endpoint data (see Supplemental Material 1).
Inclusion and exclusion criteria
Publications were eligible if they evaluated a psychosocial intervention and included participants with dementia. All evaluative studies (quantitative or qualitative) were included if they met the criteria in Table 1 and reported pre- and post-intervention outcomes using standardized measures. Studies were excluded if they lacked a standardized outcome measure, were literature reviews, or involved brain training or physical exercise without a psychosocial component. By “without a psychosocial component” we mean interventions delivered individually, without supervision or any structured opportunity for social interaction or relational engagement. Mixed-sample studies were also excluded when outcomes for people living with dementia were not reported separately from those of other participant groups in the sample, such as carers or people with mild cognitive impairment. The time window for article selection (2012 to 2022) balances recency with coverage of the modern psychosocial dementia literature. From around 2012 onwards, several key developments occurred: rapid growth in trials and reviews of non-pharmacological interventions designed to improve QoL and wellbeing for people living with dementia; rising concern about the heterogeneity of outcome measures, with calls for more standardized, person-centered outcome frameworks (see 7,18 for examples); and the development of standardized outcome frameworks such as the ICHOM dementia standard set, which we used as the organizing structure for this review. 25
Table 1. Inclusion and exclusion criteria.
Search strategy
The research team selected five databases that index dementia research. We searched the Medline and PsycINFO databases via Ovid, as well as PubMed, CINAHL, and Scopus, for relevant peer-reviewed manuscripts. For example, the full Ovid MEDLINE search string was: (((dementia* OR alzheimer*) AND (intervention* OR treatment* OR program* OR therapy OR “cognitive therapy” OR “art* therapy” OR aromatherapy OR massage* OR touch OR “animal assisted therapy” OR exercise* OR “horticultural therapy” OR “virtual reality” OR “telerehabilitation” OR psychotherapy OR “recreation therapy” OR “sensory intervention” OR “stimulation intervention” OR “light therapy” OR “music therapy” OR Snoezelen OR “doll therapy” OR “robot therapy” OR “multimodal therapy” OR “occupational therapy” OR “behavio$r therapy” OR “computer assisted” OR “reminiscence therapy” OR “creative writing” OR “diversional therapy”) AND (assess* OR measur* OR tool* OR effect* OR scale*))). A search diary was maintained detailing the names of the databases searched, the keywords used, the search results, and the search date. Titles and abstracts of studies to be considered for retrieval were recorded in an EndNote database. Duplicates were removed using EndNote and the library of relevant studies was uploaded to Covidence. 28
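The three-block structure of the search string (population AND intervention AND measurement terms) can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure the review team used: the helper names `or_group` and `build_query` are hypothetical, the intervention list is abbreviated to three of the terms above, and straight quotation marks are used where the published string shows typographic ones.

```python
def or_group(terms):
    """Join terms with OR inside parentheses, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*groups):
    """Combine several OR-groups with AND into one Boolean search string."""
    return "(" + " AND ".join(or_group(g) for g in groups) + ")"


# Term blocks taken from the search string above (intervention block abbreviated).
population = ["dementia*", "alzheimer*"]
interventions = ["intervention*", "music therapy", "reminiscence therapy"]
measurement = ["assess*", "measur*", "tool*", "effect*", "scale*"]

query = build_query(population, interventions, measurement)
print(query)
```

Keeping each block as a separate list makes it easier to record in a search diary exactly which terms were used for each database.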
Study selection
Covidence software 28 was used to manage the screening and inter-rater process and to facilitate data extraction. At least two reviewers independently screened each title and abstract. Where two reviewers disagreed, a third reviewer made the final decision as to whether the paper was included or excluded. Relevant manuscripts were retrieved for full-text review, and reasons for exclusion were recorded in Covidence.
Data extraction
Data from the final set of studies were extracted into an Excel spreadsheet, using pre-determined criteria as headings, to analyze study-level data. Several reviewers extracted and entered data from these studies into the spreadsheet. Next, the extracted data for each paper were reviewed a second time (i.e., checked and edited for accuracy) by at least one other reviewer/author. Extracted information included study title, author names, year of publication, as well as study design (e.g., qualitative, RCT, cohort), sample size (including subgroups), dementia severity, study setting (e.g., residential aged care facility), intervention type (e.g., music listening), duration and mode of delivery (group or individual). In addition, the names of outcome measures were extracted for each paper and categorized according to the psychosocial domain of interest (e.g., cognitive functioning, neuropsychiatric symptoms, etc.). Intervention categories and types were developed iteratively by multiple reviewers. Categories were defined first, followed by the identification and consolidation of intervention types within each category to balance conceptual distinctiveness with analytical interpretability. Low-frequency intervention types were retained where they represented substantively unique approaches.
Short-term and long-term measures
Short-term outcome measures assess change over a pre–post intervention window, generally over several weeks of an intervention, with evaluation taken at the end of the intervention period. 29 They often draw from standardized questionnaires or rating scales that summarize overall symptoms, mood, or QoL in the recent past. Long-term measures are typically administered over a broader time window after the completion of an intervention, for example at follow-up points of 3, 6, 9, or even 36 months.
Consistent with prior dementia intervention research that distinguishes immediate post-intervention outcomes from follow-up assessments over subsequent months, we classified measures taken during the intervention or less than four weeks after its completion as short-term, and measures taken four weeks or more after completion as long-term (see 30 for an example). Although many dementia psychosocial trials schedule follow-up at 3, 6, or 12 months, we used a ≥ 4-week threshold pragmatically to capture all assessments occurring beyond the immediate post-intervention period, recognizing that these reflect an intention to measure maintenance or evolution of effects over time, rather than in-session or immediate outcomes.
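The classification rule can be expressed as a short function. The sketch below is illustrative rather than the coding procedure actually used: the function name, the convention that zero means "at or before the end of the intervention", and the example timepoints are all assumptions. An assessment at exactly four weeks is treated as long-term, following the ≥ 4-week threshold stated above.

```python
LONG_TERM_THRESHOLD_WEEKS = 4.0  # the >= 4-week threshold described in the text


def classify_timepoint(weeks_after_completion):
    """Label an assessment as short-term or long-term.

    weeks_after_completion: weeks elapsed since the intervention ended;
    use 0 for assessments made during or at the end of the intervention
    (an assumed convention for this sketch).
    """
    if weeks_after_completion >= LONG_TERM_THRESHOLD_WEEKS:
        return "long-term"
    return "short-term"


# Hypothetical assessment schedule (weeks after completion), for illustration.
follow_ups = {
    "end of intervention": 0,
    "2-week check": 2,
    "4-week follow-up": 4,
    "6-month follow-up": 26,
}
labels = {name: classify_timepoint(weeks) for name, weeks in follow_ups.items()}
print(labels)
```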
In-the-moment measures
In-the-moment measures typically refer to observational or ecological momentary assessments (EMA) that capture a person's affect, engagement, or behavior as it occurs during or immediately surrounding an activity session.31,32 They are taken in the real-world context, rather than relying on retrospective reports over days or weeks. 33 As such, they provide an alternative to proxy ratings of wellbeing by carers, which have been shown to be consistently lower than self-reports of people living with dementia, demonstrating a low correspondence between the two. 34 EMA also provides an alternative to self-report, which can sometimes exclude people with more severe dementia. 35
Domains of interest
Domains of interest were derived from the ICHOM initiative, 25 which developed standardized outcome sets for specific medical conditions to enable consistent measurement, reporting, and benchmarking across clinical practice. 7 The ICHOM dementia standard set is an internationally developed, dementia-specific and patient-centered outcome framework, designed through multi-stakeholder consensus to capture relevant outcomes, including those considered most meaningful to people living with dementia and their carers. It is increasingly used as a common structure for organizing outcome measurement in dementia research and practice. 25 It describes seven core outcome domains: 1. Neuropsychiatric symptoms, 2. Cognitive functioning, 3. Social functioning, 4. Functional status, 5. General QoL, 6. Health-related QoL, and 7. Clinical status (see Table 2). These domains mirror the main areas targeted by psychosocial interventions, such as mood, behavior, social engagement, daily functioning and QoL.6,36 They also provide a shared rubric for taxonomizing intervention measures and comparing care across settings. 25 ICHOM also sits within broader efforts to rationalize and standardize dementia outcome measurement, reinforcing its role as a recognized reference point for structuring fragmented literature on this topic. 37 We therefore mapped outcome measures onto the ICHOM dementia standard set to provide a conceptually coherent way to show which aspects of living with dementia are being assessed (and with what measures), and which may be neglected.
Table 2. Domains of interest: ICHOM Dementia Standard Set plus “carer wellbeing”.
In addition to the ICHOM dementia standard set, we included an eighth outcome domain of “carer wellbeing” to further reflect what matters most to people living with dementia and their families, and to acknowledge that the sustainability and impact of psychosocial interventions can depend critically on carers’ health and QoL. This decision also aligns with recent outcome-measurement guidance (e.g., PRISMA-COSMIN and core outcome set initiatives) that emphasizes carer-focused outcomes and stakeholder involvement in determining what should be measured. 26 Evidence shows that people living with dementia rank “not being a burden” and the QoL of their care partners among their highest priorities. 38 Research also shows that psychosocial interventions frequently aim to reduce carer burden, distress, and depression alongside benefits for people living with dementia. 39 Recent work on core outcome sets in dementia care also identifies carer-focused outcomes (e.g., burden and QoL) as common and important measurement constructs that should be considered alongside outcomes for the person with dementia.6,40 Including carer wellbeing as a distinct domain therefore aligns the review with stakeholder-defined priorities and provides a more complete framework for organizing the fragmented measurement literature in psychosocial dementia research.
A taxonomy of standardized outcome measures
Collectively, the domains of interest provide a comprehensive framework for taxonomizing the identified measures, covering cognitive, functional, behavioral, social, and QoL aspects of living with dementia. To achieve this, a second Excel spreadsheet was used to record additional information about each outcome measure, including its name, versions, purpose, assessment type, number of items, administration time, and administrator. This information was extracted by two members of the research team (RB, MA). The psychosocial domain indexed by each tool and the number of studies that had adopted it were also recorded. For dual and multi-domain instruments, we mapped each measure to a single primary outcome domain using the ICHOM Dementia Standard Set plus “carer wellbeing” domain. The primary domain was defined as the construct the instrument was originally designed to assess, in order to avoid double-counting and to maintain a clear, non-overlapping taxonomy of outcomes. Measures that were not standardized, or that were administered as part of a cognitive intervention rather than to evaluate outcomes, were not included.
Article selection
From a total of 6065 publications, 136 were eligible for inclusion. A PRISMA flow diagram was used to document study selection, and contemporary outcome-measurement guidance (including PRISMA-COSMIN) informed the transparent reporting of our narrative review processes (Figure 1).

Figure 1. Article identification and PRISMA flow diagram.
Results
Evaluating the evidence
The included studies and their outcome measures are summarized in the sections below and in Table 3. Results are organized to describe study characteristics (participants, setting, and intervention type), followed by the timing of outcome assessment, the ICHOM domains of interest, and the distribution of measures across the domains.
Table 3. Study, intervention, and measure characteristics, including participant number (N) and type (ω = person with dementia; ε = person with MCI; Ω = professional carer; φ = family/friend carer); intervention setting (ζ = residential aged care; ς = hospital inpatient; λ = hospital outpatient; μ = memory/health clinic or day center; ψ = home/retirement village; V = virtual); intervention mode and length (In = individual; Gr = group; M = mixed; D = dyad; y = years; m = month/s; w = week/s; d = day/s); domains of interest (1. neuropsychiatric 2. cognitive 3. social 4. functional 5. general QoL 6. health-related QoL 7. clinical status 8. carer wellbeing); and administration time points: long-term (LT), short-term (ST) and in-the-moment (δ). Note. ✓ = Yes; × = No.
Note. * This study is listed twice in the table as the intervention aligns with 2 categories. # Outcome measure acronyms are spelled out in full in Table 5.
Participants
Sample sizes of people living with dementia varied across studies from 1 to 726 participants (M = 91.42). Over a quarter (n = 39, 28.7%) did not specify the stage of dementia. Studies that reported dementia stage (n = 97, 71.3%), included participants with mild dementia only (n = 16; 11.8%), moderate dementia only (n = 4; 2.9%), severe dementia only (n = 7; 5.1%), mild to moderate only (n = 36; 26.5%), moderate to severe only (n = 16; 11.8%), or participants at a variety of dementia stages (n = 18; 13.2%). Only a few studies (4.4%) included other participant groups in addition to people living with dementia, such as people with an intellectual disability or mild cognitive impairment. Some studies also included care partners alongside people living with dementia, comprising family members (n = 35, 25.7%), or aged care staff (n = 13, 9.6%), with a subset evaluating these participants as dyads with the person living with dementia (n = 20, 14.7%).
Setting
Approximately half of the included studies evaluated interventions delivered to people living with dementia in community-based settings (n = 67, 49.3%), including day centers (n = 14), clinical or outpatient healthcare settings (n = 23), retirement homes (n = 3), and participants’ homes (n = 19). Just over half were conducted in institutional settings (n = 76, 55.9%), most commonly residential aged care facilities (n = 69), followed by inpatient healthcare settings (n = 7). Percentages sum to more than 100% because some studies included both community and institutional settings.
Psychosocial intervention type
In line with our inclusion and exclusion criteria, this review focused on psychosocial interventions, defined as those incorporating social, quality-of-life, and/or mood-enhancing components. As such, interactive physical and movement-based therapies were defined as interventions in which structured physical movement (e.g., Tai Chi, dance, exergaming, online yoga, dyadic or group exercise) was delivered in a supervised, interactive context that provided structured opportunities for social connection, communication, shared enjoyment, or relational engagement between the person with dementia and others, including partners, family carers, staff, or group members. In total, we classified the interventions into seven categories (see Figure 2): arts and creative therapies (n = 34, 23.4%), education and psychosocial support (n = 33, 22.8%), cognitive and reminiscence approaches (n = 30, 20.7%), animal/robot-assisted programs (n = 11, 7.6%), interactive physical and movement-based therapies (n = 9, 6.2%), sensory and relaxation therapies (n = 9, 6.2%), and environmental and daily living support (n = 7, 4.8%). Interventions spanning three or more categories were classified as multicomponent activities (n = 12, 8.3%).

Figure 2. Psychosocial intervention categories. This figure is designed to be viewed in the interactive HTML version (see Supplemental Material 2). Selecting categories in the inner circle reveals additional detail, and hovering over individual slices displays proportional values. Slice sizes reflect weighted study contributions (N = 136), with each paper contributing a fixed total weight distributed across all reported intervention types to prevent over-representation of studies with multiple interventions.
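The weighting described in the caption can be illustrated with a minimal sketch. It assumes that each paper carries a total weight of 1 split equally across its distinct intervention types; the caption states only that the weight is fixed and distributed, so the equal split, the function name, and the four example studies are assumptions made for illustration.

```python
from collections import defaultdict
from fractions import Fraction


def weighted_shares(studies):
    """Return each intervention type's share of the total study weight.

    studies: mapping of study id -> list of intervention types reported.
    Each study contributes a total weight of 1, split equally across its
    distinct types (an assumed scheme, not confirmed by the source).
    """
    totals = defaultdict(Fraction)
    contributing = 0
    for types in studies.values():
        unique = set(types)
        if not unique:
            continue  # a study with no coded intervention adds no weight
        contributing += 1
        for t in unique:
            totals[t] += Fraction(1, len(unique))
    return {t: w / contributing for t, w in totals.items()}


# Hypothetical studies, for illustration only.
example = {
    "study_a": ["music listening"],
    "study_b": ["music listening", "exercise"],
    "study_c": ["exercise"],
    "study_d": ["cognitive stimulation therapy"],
}
shares = weighted_shares(example)
for name, share in sorted(shares.items()):
    print(f"{name}: {float(share):.3f}")
```

Under this scheme the shares always sum to 1, so a study reporting two intervention types counts no more heavily than a study reporting one.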
While these categories capture the broad patterns in the types of approaches, individual papers operationalized them through specific intervention types. Accordingly, we examined the frequency of individual types across categories (see Figure 2). Across the sample, cognitive stimulation therapy (CST) was the most frequently evaluated intervention (n = 12, 8.3%), followed by music listening (n = 9, 6.2%). Other commonly included interventions were active music therapy (n = 7, 4.8%), carer and staff coaching (n = 7, 4.8%), dyadic coping coaching (n = 7, 4.8%), exercise (n = 6, 4.1%), and robot interactions (n = 6, 4.1%). Multicomponent activity studies (n = 12, 8.3%) were typically delivered as coordinated care packages in which activities were flexibly selected and combined (often including reminiscence, music, movement activities, behavior support, and staff coaching) to address psychosocial or behavioral goals. In addition, five papers examined interventions spanning two categories. These were either comparative (e.g., comparing the outcomes of cognitive training versus music or music therapy) or involved parallel interventions, in which two distinct approaches or therapeutic aims were delivered concurrently, such as exercise combined with cognitive stimulation therapy or dyadic coping coaching.
Short-term, long-term and in-the-moment measures
Our review found that short-term measures predominated in the included studies. In total, 129 studies (94.9%) employed short-term measures, whereas 40 studies (29.4%) included long-term measures, either alone or in combination with other measures. In-the-moment measures were very rarely used, with only 8 papers (5.9%) reporting these. A detailed breakdown of the combinations of short-term, long-term, and in-the-moment measures is provided in Table 4.
Table 4. Outcome assessment timepoints relative to the intervention.
Measurement domains of interest
The ICHOM framework was adopted a priori to define outcome domains, and all reported measures were categorized accordingly (see Table 5 for the full list of measures identified; see Figure 3 for their relative frequency of adoption). In total, we identified 201 unique standardized measures used across reviewed studies. The number of distinct measures being adopted varied across outcome domains as follows (from highest to lowest number): cognitive functioning (n = 47; 23.4%), neuropsychiatric symptoms (n = 43; 21.4%), carer wellbeing (n = 33; 16.4%), general QoL (n = 28; 13.9%), functional status (n = 23; 11.4%), social functioning (n = 14; 7.0%), clinical status (n = 8; 4.0%), and health-related QoL (n = 5; 2.5%).

Figure 3. Measurement use across outcome domains (1. neuropsychiatric 2. cognitive 3. social 4. functional 5. general QoL 6. health-related QoL 7. clinical status 8. carer wellbeing). Note. This figure is best viewed in the interactive HTML version (see Supplemental Material 2), which enables inspection of lower-level categories not fully visible in the static image. The static figure is provided to illustrate relative proportional contributions across domains. Slice sizes reflect weighted study contributions (N = 136), with each paper contributing a fixed total weight distributed across all reported measures to prevent over-representation of multi-measure studies.
Table 5. Standardized measures (N = 201) mapped onto primary outcome domains (1. neuropsychiatric 2. cognitive 3. social 4. functional 5. general QoL 6. health-related QoL 7. clinical status 8. carer wellbeing)*. Measure characteristics include acronym, name, first author/year, purpose, dementia specific (DS), number of items (N), test-time (Min), assessment (Ax) type, and administrator (Admn).
Note. * = For dual and multi-domain instruments, we mapped each measure to a single primary outcome domain, defined as the construct the instrument was originally designed to assess, to avoid double-counting and to maintain a clear, non-overlapping taxonomy of outcomes. Ω = Partial validation only; AD = Alzheimer's Disease; ADL = Activities of Daily Living; Admn. = Administrator: person completing the measure, i.e., λ = Self ν = Carer/proxy person; υ = clinician or trained researcher; Assesses (measures) = This tool is designed to assess (or measure); Ax Type = Assessment type, i.e., ▴ = Performance based, π = Rating, μ = Interview, Δ = Observation; BPSD = behavioral and psychological symptoms of dementia; DS = dementia specific; DSM-5 = Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (American Psychiatric Association; 2013); EF = Executive Function; Full = Full version; HR-QoL: Health-related quality of life; IADL: Instrumental Activities of Daily Living; M = modified version; MCI = mild cognitive impairment; QoL = Quality of Life; Scr = Screen version; SF = short form; SS = subscale; ST = Short term; VAS = Visual Analogue Scale.
A similar pattern was observed in the frequency with which measures were used across studies (from most to least frequent): neuropsychiatric symptoms (n = 205; 31.9%), cognitive functioning (n = 146; 22.7%), general QoL (n = 94; 14.6%), carer wellbeing (n = 76; 11.8%), functional status (n = 66; 10.3%), social functioning (n = 22; 3.4%), clinical status (n = 21; 3.3%), and health-related QoL (n = 13; 2.0%). Because some studies employed multiple measures within the same domain, the number of instances recorded exceeds the total number of papers reviewed. Also, although measures are organized in Table 5 according to their primary intended domains and populations, studies often employed them more flexibly, applying measures across participant groups (e.g., using dementia-focused measures with carers and vice versa), and in some cases administering the same measure to both groups.
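The reported percentages can be reproduced from the instance counts alone, which also makes the denominator explicit (643 instances in total across the eight domains). The short check below uses only the counts given above; the variable names are illustrative.

```python
# Instance counts per domain, as reported in the text.
instances = {
    "neuropsychiatric symptoms": 205,
    "cognitive functioning": 146,
    "general QoL": 94,
    "carer wellbeing": 76,
    "functional status": 66,
    "social functioning": 22,
    "clinical status": 21,
    "health-related QoL": 13,
}

total_instances = sum(instances.values())

# Percentage share of all recorded instances, rounded to one decimal place.
shares = {
    domain: round(100 * count / total_instances, 1)
    for domain, count in instances.items()
}

for domain, pct in shares.items():
    print(f"{domain}: n = {instances[domain]}, {pct}%")
```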
Measure distribution
The distribution of individual measures varied greatly across domains (see Figure 3). In some domains, usage was highly concentrated in a small number of measures, as reflected in their two-item concentration ratio (CR2), which represents the proportion of total usage accounted for by the two most frequently used measures. 378 The highest concentration ratio was observed for HR-QoL, which relied primarily on the EQ-5D (n = 8) and SF-36 (n = 2), giving a CR2 of 0.769. This was followed by clinical status (PMs: n = 8; CDR: n = 6; CR2 = 0.667) and general QoL (QoL-AD: n = 45; DEMQOL: n = 12; CR2 = 0.606). Moderate concentration was observed for neuropsychiatric symptoms (NPI: n = 59; CSDD: n = 28; CR2 = 0.424) and cognitive functioning (MMSE: n = 43; ADAS: n = 16; CR2 = 0.404). In contrast, functional status (IADL: n = 12; BI: n = 9; CR2 = 0.318), social functioning (HCS: n = 3; QUIS: n = 3; CR2 = 0.273), and carer wellbeing (ZBI: n = 11; EQ-5D: n = 5; CR2 = 0.211) showed a greater diversity in their use of measures. These findings suggest uneven levels of measurement standardization across outcome domains, whereby some areas have converged around a small set of commonly used measures, while others are characterized by greater heterogeneity, with important implications for cross-study comparability and evidence synthesis.
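As a worked check, CR2 for a domain is the summed usage of its two most frequently used measures divided by the total number of measure instances in that domain. The sketch below recomputes the reported values from the counts given in the text; the function name and data layout are illustrative, not taken from the review's own analysis files.

```python
def concentration_ratio(top_counts, total_instances, k=2):
    """Share of total usage accounted for by the k most-used measures."""
    top_k = sorted(top_counts, reverse=True)[:k]
    return sum(top_k) / total_instances


# (usage counts of the two most-used measures, total instances in the domain),
# all taken from the figures reported in the text.
domains = {
    "health-related QoL": ([8, 2], 13),          # EQ-5D, SF-36
    "clinical status": ([8, 6], 21),             # PMs, CDR
    "general QoL": ([45, 12], 94),               # QoL-AD, DEMQOL
    "neuropsychiatric symptoms": ([59, 28], 205),  # NPI, CSDD
    "cognitive functioning": ([43, 16], 146),    # MMSE, ADAS
    "functional status": ([12, 9], 66),          # IADL, BI
    "social functioning": ([3, 3], 22),          # HCS, QUIS
    "carer wellbeing": ([11, 5], 76),            # ZBI, EQ-5D
}

cr2 = {
    name: round(concentration_ratio(counts, total), 3)
    for name, (counts, total) in domains.items()
}

for name, value in cr2.items():
    print(f"{name}: CR2 = {value:.3f}")
```

A CR2 near 1 indicates that a domain is measured almost entirely with two instruments, whereas a low CR2 indicates that usage is spread across many different measures.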
Building on this domain-level analysis, standardized measures were cross-tabulated against intervention categories, to assess whether similar intervention types were likely to adopt similar outcome measures. Across intervention categories, concentration was consistently moderate to low, with CR2 values ranging from 0.304 for sensory and relaxation therapies to 0.141 for physical and movement-based therapies, with intermediate values observed for multicomponent activities (CR2 = 0.217), animal/robot-assisted interventions (CR2 = 0.214), arts and creative therapies (CR2 = 0.214), education and psychosocial support (CR2 = 0.176), environmental and daily living support (CR2 = 0.167), and cognitive and reminiscence approaches (CR2 = 0.166). In contrast to the uneven concentration observed across ICHOM domains, no intervention category showed strong concentration around a small subset of instruments, suggesting a broadly heterogeneous distribution of standardized measures across intervention categories.
The type of assessments also varied within outcome domains. Measures evaluating cognitive functioning, for example, varied in administration time from brief screening tests (5–10 min) to full neuropsychological batteries (30–45 min). In some studies, only single subtests were utilized from larger test batteries to measure specific outcomes (e.g., WAIS-IV Digit Span subtest).
Discussion
This narrative review examined outcome measures in 136 psychosocial intervention studies for people living with dementia. Two key findings emerged. First, despite growing recognition of what matters to people living with dementia, outcome measures remain dominated by deficit-based indicators, particularly cognitive screening tools and symptom scales. By mapping these measures onto the ICHOM dementia standard set, plus an additional carer wellbeing domain, this review offers a structured taxonomy to guide more deliberate and transparent measure selection in future work.
This synthesis also highlights marked imbalances in domain coverage, most notably the heavy emphasis on neuropsychiatric symptoms and cognition, alongside sparse assessment of social functioning and a striking underuse of in-the-moment observational measures of benefit. These patterns suggest that the evidence base for psychosocial dementia care is shaped as much by what is measured as by the interventions themselves, and highlight the need for codesigned, dementia-relevant tools that capture outcomes valued by people with dementia and their care partners. Second, we found a predominant focus on short-term assessment, which may obscure delayed or sustained benefits and limit understanding of how psychosocial interventions influence longer-term adjustment, wellbeing, and care trajectories.
A menu of measures for nuanced effects
Our findings highlight a wide range of standardized measures across multiple wellbeing domains for people living with dementia. The review identifies widely used measures as well as less common tools that may be well suited to capturing outcomes of diverse interventions. By synthesizing measures used to evaluate psychosocial interventions and classifying them by domain, we aimed to provide a resource that encourages researchers to look beyond the most familiar tools and consider measures with more nuanced foci that align with their intervention targets. Table 4 provides a taxonomy to support more considered measurement selection.
The predominant use of short-term measures
The discrepancy between the number of short-term post-intervention measures and both in-the-moment and longer-term measures is striking. In-the-moment measures such as EMA are designed to capture fine-grained fluctuations in enjoyment, engagement, reciprocity, and mastery during intervention sessions, maximizing ecological validity and minimizing recall bias. 31 Their underuse overlooks immediate improvements in enjoyment and engagement that may occur during sessions but are unlikely to be captured by post-intervention questionnaires. 379 The dementia and geriatric psychiatry literature explicitly recommends EMA for assessing subjective experiences in real time within daily life, distinguishing it conceptually from conventional pre–post measures that aggregate experiences. 33 Low uptake may reflect the difficulty of indexing transient, experiential benefits with standardized tools and the practical constraints of real-world data collection, which often requires labor-intensive observation and qualitative coding.379,380
Our review also found limited use of long-term follow-up measures, which assess the maintenance of treatment effects and inform the real-world sustainability and clinical value of interventions. 381 Although the progressive nature of dementia complicates long-term evaluation, minimal use of follow-up measures limits understanding of how benefits are sustained over time. 7 It also remains unclear whether the same or different measures are best suited to short- versus long-term effects, as long-term follow-up most often relied on the same tools used at immediate post-intervention.
Measurement type: what are we missing?
Our synthesis identified cognitive functioning and neuropsychiatric symptoms as the most commonly assessed domains, positioning cognitive decline and behavioral symptoms as primary endpoints. In contrast, social functioning was rarely assessed, even though this review focused on psychosocial interventions. This pattern reflects a persistent overreliance on cognitive measures in dementia research, 7 where intervention success is often judged by effects on cognition, with other psychosocial benefits overlooked. Outcome measurement in psychosocial dementia research therefore remains largely rooted in a deficit-based framework, drawing heavily on tools that operationalize deficits such as memory, functional impairments, and “behavioral problems”. 23
This focus persists despite a substantial body of work calling for outcomes that reflect what people with dementia and their care partners value: social engagement and inclusion, dignity, reciprocity, enjoyment, emotional connection, mastery and identity, confidence, control, independence, feeling safe and well cared for, continuity of self and everyday life, meaningful activity and purpose, and connection to community and place.6,18,35,380,382 In our review, relatively few measures indexed social engagement, relationship building, autonomy, or enjoyment.6,35 Positively framed QoL measures were also rarely used, particularly those capturing aspects of QoL prioritized by people with dementia, such as mastery, enjoyment, and meaningful engagement.6,7,36,380
Our analysis of measure distribution across domains suggests that standardized cognitive screening tools such as the MMSE remain frequently used to evaluate potential cognitive benefits. 7 Although these measures are standardized, easy to administer, and sensitive to broad long-term cognitive changes, they present several limitations. Brief cognitive screeners like the MMSE and MoCA were not designed to assess intervention effects and may lack sensitivity to subtle cognitive change. 383 Repeated administration can generate practice effects, where apparent improvement reflects familiarity with test content rather than intervention efficacy. 384 Small gains may also reflect item-specific artefacts (for example, learning the day of the week because an intervention runs on that day) rather than meaningful global cognitive change.
Ongoing reliance on neuropsychiatric and cognitive measures may be driven by researcher traditions, regulatory expectations, and the practical convenience of standardized tools.7,385 While the MMSE and MoCA remain useful screeners for overall cognitive status, their limitations underscore the need to supplement them with more sensitive, domain-specific assessments closely aligned with the hypothesized mechanisms of psychosocial interventions.386,387
Measurement distribution: risks and benefits
Our measurement distribution analysis suggests a partial move toward harmonization, but one largely confined to traditional biomedical endpoints. In the general QoL, HR-QoL, and clinical status domains, a small set of instruments (e.g., EQ-5D, SF-36, CDR, QoL-AD, DEMQOL) dominates use, as reflected in high concentration ratios (CR2). This convergence on a core set of standard measurement tools has the benefit of facilitating cross-study comparison and meta-analysis. However, there is also a risk of narrowing the construct space. That is, when the same scales become the default choice, they repeatedly capture only certain aspects of wellbeing (e.g., mobility, self-care, depression), while other constructs that are important to people living with dementia (e.g., reciprocity, mastery, identity) remain undermeasured.316,388
Generic QoL instruments, such as the SF-36, the EQ-5D, and some older-adult QoL scales, were not developed specifically for people living with dementia and may miss nuances related to stigma, identity change, fluctuating capacity, and relational care that are central to their experience. 325 Heavy reliance on such measures can distort how intervention benefits are represented, limiting “benefit” to what these tools capture rather than to what people with dementia identify as meaningful change.36,316,388 By contrast, domains such as social functioning, functional status, and carer wellbeing show low concentration and a proliferation of measures, reflecting conceptual fragmentation and the absence of clear “go-to” tools, which complicates evidence synthesis.6,7,36 This diversity may stem from uncertainty about how best to operationalize these constructs in dementia care. 36 Overall, both measure convergence and diversity carry risks and benefits, underscoring the need to critically appraise widely used tools for fit with dementia-relevant priorities and to build consensus around codesigned instruments in under-standardized domains, particularly social functioning and carer outcomes.316,388
The need for codesigned new measures
Outcome selection in psychosocial dementia research needs to move decisively beyond its longstanding emphasis on cognition and symptom reduction, and a negative medicalized understanding of dementia. Despite repeated calls since at least 2016 to broaden outcomes, 7 our findings highlight that most intervention studies still prioritize cognitive and neuropsychiatric measures. This finding is striking given that we only included psychosocial interventions. Future work would benefit from a positive reframe, adopting strength-based, resilience-focused, and person-centered models that emphasize preserved abilities rather than deficits or “problem behaviors”. 325 This shifts evaluation toward a “living well with dementia” framework and broadens what counts as benefit, so psychosocial interventions are not judged solely on cognition or symptom change.
Reliable ways of measuring wellbeing, grounded in how people living with dementia themselves define it, are needed to determine whether specific interventions truly enhance wellbeing. 379 Measures would be strengthened by focusing on person-determined priorities rather than responses to behaviors others deem undesirable. 389 Codesigned measures that reflect “what matters most” to people with dementia and care partners, and that are selected a priori to match intervention content, are therefore essential for making research both more person-centered and more cumulative. Outcomes highlighted by those with lived experience include finding life meaningful, maintaining a positive sense of identity and agency, and having satisfying relationships.35,390
Social engagement is a core determinant of wellbeing and protective against negative health outcomes in dementia, yet few validated instruments exist and even fewer are routinely used.35,391 Future research could increase the development of in-the-moment measures specifically codesigned to capture such benefits, particularly for people with more advanced dementia. Emerging technologies, including automated face and voice analysis, may offer new ways to assess real-time mood and engagement in efficient and scalable ways. 392
Future directions
Overall, our findings reflect the growing shift toward non-pharmacological dementia research, the clinical value of psychosocial interventions, and the need for reliable, meaningful, codesigned measurement. 393 Using these insights to refine and extend psychosocial programs may better align them with end-user priorities. Meaningful involvement of carers can further enhance intervention effectiveness. 394 Recommended supports include training and education to build shared language with professionals, and to improve recognition and communication of behaviors, needs, and symptoms, thereby supporting targeted delivery, progress monitoring, and sustained use of skills over time. 394
Future psychosocial intervention studies should ensure that primary outcomes are conceptually aligned with intervention targets. When an intervention aims to improve social connection, meaningful activity, or emotional wellbeing, core outcomes should sit squarely in these psychosocial domains rather than defaulting to global cognitive screeners. At the same time, reducing heterogeneity in outcome domains and instruments through agreed core outcome sets and codesigned standardized tools would support comparability across studies and strengthen the evidence base for practice and policy.6,7
Beyond psychosocial interventions, dementia trials, including pharmacological studies, should consistently embed person-centered outcomes. Establishing a shared backbone of such outcomes, informed by ICHOM and lived-experience sets and supplemented by intervention-specific endpoints, would enable comparability across psychosocial, pharmacological, and care-system trials while ensuring accountability to what matters most to people with dementia and their care partners.6–9,25 These measures also need systematic validation and review. 5 This includes codesigned assessments of content validity, responsiveness, cross-cultural relevance, and feasibility in routine care, to ensure tools accurately reflect intervention effects across contexts.
Limitations
First, although all screening and extraction decisions were made independently by at least two reviewers, with discrepancies resolved by consensus or third-reviewer adjudication, we did not calculate or retain formal inter-rater reliability statistics for these processes, which limits the extent to which the consistency of judgements can be quantified. Second, the heterogeneity of included studies in design, setting, intervention type, duration, and dose constrained comparison across trials and prevented inferences about the relative effectiveness of specific psychosocial approaches or measurement strategies. Third, the review period, restriction to English, and focus on peer-reviewed publications may have introduced publication and language bias and excluded innovative or culturally specific tools reported elsewhere.
As a narrative rather than a systematic review, we did not formally appraise or synthesize treatment effects, study quality, or risk of bias; consequently, the distribution and uptake of measures across domains should not be interpreted as evidence of their psychometric robustness, feasibility, or sensitivity to change. This approach provides breadth and a descriptive “map” of measurement practice, but it limits the strength of conclusions that can be drawn about the comparative quality or performance of specific instruments. Also, newer trials will need to be considered in subsequent updates of this evidence base. Finally, limited reporting of dementia subtype, severity, and sociodemographic characteristics further restricted conclusions about how well commonly used measures perform across stages of dementia, care settings, and diverse cultural or linguistic groups.
Conclusions
This narrative review shows that measurement practices in psychosocial dementia intervention research remain largely anchored in short-term, deficit-oriented tools, with neuropsychiatric symptoms and global cognition far more frequently assessed than social functioning, autonomy, mastery, and other positive outcomes valued by people with dementia and their carers. By mapping 136 studies and their measures onto the ICHOM dementia outcome framework plus carer wellbeing, the review offers a structured resource to support more deliberate measure selection and highlights substantial gaps in long-term and in-the-moment assessment. Future work should prioritize codesign of new and adapted measures with people living with dementia and care partners, embed strengths-based and person-centered concepts into outcome frameworks, and move toward more standardized yet flexible measurement strategies capable of capturing nuanced, meaningful benefits of psychosocial interventions over time.
Supplemental Material
Supplemental material, sj-docx-1-alz-10.1177_13872877261459125 for Measures used to evaluate psychosocial interventions in dementia care: A narrative review and synthesis by Ruth Brookman, Justin Christensen, Olivia R. Maurice, Mina Aghaei, Angela Cass, Sahba Monzaviyan, Eman Shatnawi, Marlee Carter, Jeremy Tran, Nina McIlwain, Sandra Garrido, Joyce Siette, Paul Strutt and Celia B. Harris in Journal of Alzheimer's Disease
Footnotes
Acknowledgements
The authors are grateful to colleagues at the MARCS Institute for Brain, Behaviour and Development, Western Sydney University, and the MARCS AgeLab academics for opportunities to discuss and refine the ideas that shaped and guided this project.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.
References