Abstract
Background:
Despite its advantages, the Patient-Reported Outcomes Measurement Information System (PROMIS) remains underused in patients recovering from foot and ankle fractures, and it is unclear whether condition- and region-specific PROs, such as the Foot and Ankle Outcome Score (FAOS), offer additional clinical insight over PROMIS in this cohort. Therefore, PROMIS Computerized Adaptive Test (CAT) measurement properties were evaluated and compared with region-specific and lower-extremity instruments. In addition, variables associated with poorer outcomes were identified.
Methods:
In this prospective, cross-sectional study, patients with foot or ankle fractures completed questionnaires. PROMIS Mobility (MOB) and Pain Interference (PI) CATs were compared with FAOS, Lower Extremity Functional Scale (LEFS), and Short Musculoskeletal Function Assessment (SMFA). Convergent validity (correlations), reliability (SE, Cronbach α), efficiency (number of items/completion time), and floor/ceiling effects of each instrument were assessed. Multivariable linear regression was used to evaluate factors associated with PROMIS outcomes.
Results:
Seventy-three patients were included (mean age 50 [SD 17] years, 60% female). PROMIS-MOB showed particularly high correlations with domains focused on lower extremity functioning (FAOS-Sport, r = 0.74; FAOS-ADL, r = 0.76; LEFS, r = 0.86; and SMFA-dysfunction index, r = −0.83). PROMIS-PI showed particularly high correlations with pain-focused domains (FAOS-Pain, r = −0.78; SMFA-bother index, r = 0.72). PROMIS CATs and (most) other legacy instruments showed excellent reliability (SE = 2.1, α = 0.81-0.96). PROMIS CATs required fewer items (mean 6 and 4 items, respectively, vs 20-46) and less time to complete (mean 56 and 36 seconds, respectively, vs 118-295 seconds; P < .001). No floor or ceiling effects were observed for PROMIS CATs; a limited proportion of patients reached score extremes on some FAOS subscales (up to 12%), remaining below the 15% threshold. Greater pain intensity and depressive symptoms were independently associated with worse mobility and greater pain interference.
Conclusion:
Condition- and region-specific instruments may provide limited information beyond that obtained with PROMIS CATs. PROMIS (MOB, PI) CATs captured comparable constructs of functional recovery and pain interference, as demonstrated by high correlations with commonly used instruments, while substantially reducing patient burden. The relation between depressive symptoms and worse outcomes underscores the significance of psychological factors in recovery in patients with foot and ankle fractures.
Levels of Evidence:
Level II, diagnostic study.
Introduction
Patient-reported outcome measures (PROMs) are increasingly considered integral to the assessment of recovery after orthopaedic conditions.1-3 In foot and ankle fracture care, traditional PROMs, such as the Foot and Ankle Outcome Score (FAOS), Lower Extremity Functional Scale (LEFS), and Short Musculoskeletal Function Assessment (SMFA), are widely used for this purpose.4-6 These legacy instruments are, however, often lengthy, have suboptimal measurement properties, and are burdensome to use in routine trauma follow-up.7,8 These limitations have become more pertinent as PROMs move from research settings into high-volume fracture clinics, where efficiency and patient engagement are essential.9,10
The Patient-Reported Outcomes Measurement Information System (PROMIS) was initiated to address these challenges through the development of calibrated item (question) banks and computerized adaptive testing (CAT), enabling efficient and effective outcomes measurement with limited response burden.11 Yet PROMIS remains underused in orthopaedic trauma care, and it is uncertain whether generic PROMIS domains, such as Mobility (MOB) and Pain Interference (PI), adequately reflect recovery across specific fracture types.12 For foot and ankle injuries specifically, it remains unclear whether condition- and region-specific measures such as the FAOS offer additional clinical insight.
This study evaluated PROMIS-MOB and PROMIS-PI CATs in a cohort of patients with foot or ankle fractures and compared their measurement properties with those of the FAOS, LEFS, and SMFA. Convergent validity, reliability, efficiency, and floor/ceiling effects were assessed, and psychosocial and other factors potentially associated with worse outcome scores were explored.
It was hypothesized that, in this patient population, PROMIS CATs would show overall high correlations with traditional PROMs while requiring significantly less time to complete, and that depression would be an important independent factor influencing mobility and pain interference.
Methods
Design
This prospective, cross-sectional, single-center study was approved by the institutional review board. Patients were recruited prospectively as they presented to the orthopaedic trauma outpatient clinic of a level 1 academic trauma center during a 21-month study period (October 1, 2021–July 1, 2023).
Patients
Adult patients treated for an isolated foot or ankle fracture who had at least 1 month of follow-up were eligible for inclusion, provided they had no weightbearing or motion restrictions at the time of enrollment. Exclusion criteria were age <18 years, multiple traumatic injuries, cognitive impairment, or insufficient Dutch proficiency to complete the questionnaires independently.
Potential participants were identified from the outpatient clinic schedule. Study information was provided in person, and informed consent was obtained prior to participation. Patients who enrolled were instructed in the use of a wireless touchscreen tablet (iPad; Apple Inc) for questionnaire completion.
Of 204 eligible patients approached, 73 (36%) consented to participate.
Data Collection
Patient characteristics and PROMs (PROMIS CATs, FAOS, LEFS, and SMFA) were collected in Dutch, at the time of recruitment, through an online PROM platform, KLIK (www.promis-trauma.nl), accessed through a tablet computer.13 To compare instrument efficiency, completion time (in seconds) was automatically recorded for each PROM. The number of items administered per PROMIS CAT was also captured. To minimize response fatigue and order bias, PROMs were presented in a randomized sequence.
Patient Characteristics
Sociodemographic data collected for all participants included age, gender, country of birth, education level, employment status, and social status. Fracture characteristics with respect to location, treatment type, and duration of follow-up from start of treatment to study inclusion were retrieved from the electronic health record. Patients also reported their current pain intensity at enrollment on a 5-point scale (none, slight, moderate, severe, or extreme).
Measures
PROMIS CATs
The Dutch-Flemish PROMIS item banks for Mobility (version 2.1) and Pain Interference (version 1.1) contain 15 and 40 items, respectively, and were administered as CAT.14 Mobility items target lower-extremity function and assess activities such as walking, stair climbing, and jumping, whereas Pain Interference items reflect the extent to which pain hinders engagement in daily, social, and recreational activities. All items are rated on a 5-point Likert scale.
Depressive symptoms were assessed using the Dutch-Flemish PROMIS Depression CAT (version 1.0), which selects a variable number of items from a 28-item bank per patient.15
PROMIS item banks were administered as CATs, programmed to terminate once an SE ≤2.2 was achieved (corresponding to a reliability of approximately 0.95) or after a maximum of 12 items, with a minimum of 2 items administered per test. PROMIS CAT scores were reported as T scores (mean 50, SD 10) calibrated to the US general population, with higher scores indicating more of the construct being measured (ie, better mobility but greater pain interference).16
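The stopping rule described above can be sketched as follows. This is a minimal illustration and not the implementation used by the study platform: the function names `reliability_from_se` and `should_stop` are hypothetical, and the conversion from SE to reliability assumes the common approximation reliability = 1 − (SE/SD)² on the T-score metric (SD 10).

```python
def reliability_from_se(se: float, sd: float = 10.0) -> float:
    """Approximate reliability of a score with standard error `se`
    on a metric whose population SD is `sd` (assumed: T scores, SD 10)."""
    return 1.0 - (se / sd) ** 2


def should_stop(n_items: int, se: float, se_target: float = 2.2,
                min_items: int = 2, max_items: int = 12) -> bool:
    """Return True when a CAT should terminate under the rule in the text:
    at least `min_items` administered, then stop once SE <= `se_target`
    or once `max_items` have been administered."""
    if n_items < min_items:
        return False          # minimum number of items not yet reached
    if n_items >= max_items:
        return True           # hard cap on test length
    return se <= se_target    # precision target reached
```

Under this approximation, an SE of 2.2 corresponds to a reliability of about 0.95, which is why the precision target and the reliability statement in the text are interchangeable.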
FAOS
The FAOS is a condition-specific patient-reported outcome measure designed to assess symptoms, function, and quality of life in patients with foot or ankle disorders, and has previously been translated and validated in Dutch.4,17 It consists of 42 items divided into 5 subscales: symptoms (such as stiffness, swelling) (7 items), pain (such as during standing, walking) (9 items), activities of daily living (such as walking, stair climbing) (17 items), sports and recreation (such as running, jumping) (5 items), and quality of life (such as lifestyle change, confidence) (4 items). Items are scored on a 5-point Likert scale (range 0-4). Subscale scores are calculated by summing the responses to the individual items and transforming these to a 0-100 scale, with higher scores indicating better function or fewer symptoms.
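The transformation to a 0-100 scale can be illustrated with a short sketch. It assumes items coded 0 (no problems) to 4 (extreme problems) and the conventional normalization 100 − (mean item score × 100 / 4); the function name `subscale_score` is hypothetical, and handling of missing items is deliberately omitted.

```python
def subscale_score(item_responses: list[int], max_item: int = 4) -> float:
    """Transform raw item responses (0 = no problems ... max_item = extreme)
    to a 0-100 score where 100 means no symptoms or limitations."""
    if not item_responses:
        raise ValueError("at least one item response is required")
    if any(r < 0 or r > max_item for r in item_responses):
        raise ValueError("item responses must lie between 0 and max_item")
    mean_item = sum(item_responses) / len(item_responses)
    return 100.0 - mean_item * 100.0 / max_item
```

Because each subscale is normalized by its own number of items, subscales of different length (for example, 4 versus 17 items) end up on the same 0-100 metric.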
LEFS
The LEFS is aimed exclusively at measuring lower-extremity function. It is composed of 20 questions related to activities such as walking between rooms and running on uneven ground, and has previously been translated and validated in Dutch.5,18 Items are scored on a 5-point Likert scale (range 0-4). The total score is obtained by summing the points across all 20 items and ranges from 0 to 80, with higher scores indicating a higher functional level.
SMFA
The SMFA is a 46-item measure of overall physical functioning, scored on a 5-point Likert scale, and has previously been translated and validated in Dutch.6,19 It comprises 2 subscales: the Dysfunction Index (DI) (34 items) and the Bother Index (BI) (12 items). The DI assesses perceived difficulty and frequency of difficulty during daily and functional activities, whereas the BI reflects how much patients are bothered by limitations across domains such as work, recreation, and mobility. Total scores are obtained by summing item responses and transforming them to a 0-100 scale, with higher scores indicating greater disability.
Analyses
Convergent validity
We assessed convergent validity through Pearson correlations between PROMIS (MOB and PI) CATs, FAOS subscales, LEFS, and SMFA (DI and BI).
A high correlation (r ≥ 0.70) was expected between the PROMIS-MOB CAT and both the FAOS (Sport, ADL) and the LEFS, given that these instruments specifically assess lower-extremity functioning.20,21 A high correlation was also expected between PROMIS-MOB CAT and the more general SMFA (DI and BI), based on prior research.22 Similarly, a high correlation was expected between PROMIS-PI CAT and both FAOS-Pain and SMFA-BI, as these instruments assess domains of discomfort and pain.20,21,23 By contrast, FAOS-Symptoms was expected to have a low to moderate correlation (r < 0.70) with PROMIS (MOB and PI) CATs, as this subscale measures an array of symptoms that may be distinct from broad (pain-limiting) functional recovery.20,21
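For readers who want to reproduce this type of analysis, the sketch below computes a Pearson correlation and an approximate 95% CI using the Fisher z transformation. It is an illustration only: the study analyses were run in SPSS, the function names `pearson_r` and `fisher_ci` are hypothetical, and the Fisher interval is an assumption about how a CI for r might be obtained rather than a description of the authors' procedure.

```python
import math
from statistics import NormalDist


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson product-moment correlation between paired scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r: float, n: int, level: float = 0.95) -> tuple[float, float]:
    """Approximate confidence interval for r via the Fisher z transformation."""
    if n < 4:
        raise ValueError("n must be at least 4")
    z = math.atanh(r)                      # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)            # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)
```

A correlation can then be compared against the prespecified threshold (r ≥ 0.70) together with its interval, which shows how precisely the threshold is met at a given sample size.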
Reliability
Reliability was based on a single measurement and therefore refers to internal consistency. With PROMIS, each T score is associated with an SE. To evaluate reliability of PROMIS (MOB and PI) CATs, the mean SE was calculated for the total sample. An SE of ≤2.2 on the T-score metric corresponds to a reliability of approximately 0.95 and was considered sufficient for individual assessment. The Cronbach α was calculated for all FAOS subscales, the LEFS, and the SMFA (DI and BI). An α greater than 0.70 was considered to indicate sufficient internal consistency.24
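As an illustration of the internal consistency statistic, the following sketch computes the Cronbach α from item-level responses using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the total score). The function name `cronbach_alpha` and the data layout (one list per item) are assumptions made for this example.

```python
from statistics import variance


def cronbach_alpha(items: list[list[float]]) -> float:
    """Cronbach alpha; items[i][j] is the response of respondent j to item i."""
    k = len(items)
    if k < 2:
        raise ValueError("at least 2 items are required")
    n = len(items[0])
    if n < 2 or any(len(col) != n for col in items):
        raise ValueError("each item needs the same number (>= 2) of responses")
    totals = [sum(col[j] for col in items) for j in range(n)]
    total_var = variance(totals)           # sample variance of total scores
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / total_var)
```

Note that α is not bounded below by 0: when items covary negatively, the summed item variances exceed the variance of the total score and α becomes negative, which signals that the items do not form a coherent scale.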
Efficiency
Efficiency was defined as the total number of items as well as the time (seconds) needed for test completion. Completion times were compared using the Wilcoxon signed-rank test and P < .05 was considered statistically significant.
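A paired comparison of completion times of this kind can be sketched as below. This is a simplified illustration using the normal approximation without tie or continuity corrections, so results may differ slightly from those of statistical packages such as SPSS; the function name `wilcoxon_signed_rank` is hypothetical.

```python
import math
from statistics import NormalDist


def wilcoxon_signed_rank(x: list[float], y: list[float]) -> tuple[float, float, float]:
    """Return (W+, z, two-sided P) for paired samples x and y,
    using the normal approximation (no tie or continuity correction)."""
    if len(x) != len(y):
        raise ValueError("x and y must be paired")
    diffs = [a - b for a, b in zip(x, y) if a != b]   # zero differences are dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank absolute differences, assigning average ranks to ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, z, p
```

Here `x` and `y` would hold, per patient, the completion times of two instruments being compared, so that each patient serves as their own control.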
Floor and ceiling effects
Test scores were additionally examined for floor effects (indicating the lowest level of health) and ceiling effects (representing the highest level of health). An instrument was deemed to exhibit significant floor or ceiling effects if more than 15% of respondents had attained the minimum or maximum values on all items administered, respectively.24
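The 15% criterion can be expressed as a small check. The sketch below is illustrative only; the function name `floor_ceiling` and the returned dictionary keys are hypothetical, and it assumes that the scores of patients at the extremes equal the instrument's minimum or maximum exactly.

```python
def floor_ceiling(scores: list[float], min_score: float, max_score: float,
                  threshold: float = 0.15) -> dict:
    """Proportion of respondents at the minimum and maximum score, and whether
    each proportion exceeds the threshold (more than 15% by default)."""
    n = len(scores)
    if n == 0:
        raise ValueError("scores must not be empty")
    floor = sum(1 for s in scores if s == min_score) / n
    ceiling = sum(1 for s in scores if s == max_score) / n
    return {
        "floor": floor,
        "ceiling": ceiling,
        "floor_effect": floor > threshold,
        "ceiling_effect": ceiling > threshold,
    }
```

With this definition, a subscale on which 12% of patients reach the maximum score does not meet the criterion, whereas one with 20% at the maximum does.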
Associated factors
Univariable and multivariable linear regression analyses were performed to identify factors most strongly associated with worse PROMIS (MOB and PI) CAT scores (ie, lower MOB and higher PI scores). Candidate variables were all sociodemographic and fracture-specific factors as well as depressive symptoms and pain intensity. Variables with P < .10 in the univariable analyses were subsequently included in the multivariable linear regression analysis. On this basis, age, gender, education level, social status, fracture type, and treatment were not included in either multivariable model. Additionally, employment status (for PROMIS-PI) and follow-up duration (for PROMIS-MOB) were excluded from the respective multivariable analysis based on the univariable analyses. No multicollinearity was detected among included variables (r < 0.80). Assumptions of normality and linearity were inspected using histograms, probability-probability (P-P) plots, residual plots, and scatter plots, and all assumptions for linear regression were met.
Data were analyzed using SPSS, version 25 (IBM Corp).
Results
Study population
The study population comprised 73 patients with a mean age of 50 (SD 17; range 18-84) years. The majority were female (60%), had sustained an ankle fracture (68%), and had received operative fracture fixation (84%) (Table 1). Mean follow-up was 190 (SD 212) days and median follow-up was 107 (IQR 122) days (range 36-821).
Table 1. Sociodemographic Variables and Fracture Specifics (n = 73). Unless otherwise noted, values are n (%).
Convergent validity
Correlations between PROMIS CATs and the associated legacy instruments were high, as hypothesized (Table 2). Specifically, PROMIS-MOB showed strong correlations with FAOS-ADL (r = 0.76), FAOS-Sport (r = 0.74), and the LEFS (r = 0.86), and strong negative correlations with SMFA-DI (r = −0.83) and SMFA-BI (r = −0.76). PROMIS-PI demonstrated strong negative correlations with FAOS-Pain (r = −0.78) and FAOS-ADL (r = −0.78), and a strong positive correlation with SMFA-BI (r = 0.72). In contrast, FAOS-Symptoms showed, at best, weak correlations with PROMIS-MOB (r = 0.16) and PROMIS-PI (r = −0.25), also in accordance with the predefined hypotheses.
Table 2. Correlations: Pearson Coefficients. Data are expressed as r (95% CI). High correlations (r ≥ 0.70) are presented in bold. *P = .04, **P = .16; all other P values <.001.
Reliability
PROMIS CATs and most legacy PROMs showed very high reliability (Table 3). The exception was FAOS Symptoms, which performed poorly (Cronbach α = −0.11), suggesting that the items within this subscale do not measure a single coherent construct.
Table 3. Reliability (Internal Consistency) of PROMIS CATs and Legacy PROMs.
Efficiency
The mean number of items needed for test completion for PROMIS-MOB and PROMIS-PI CATs was 6 (SD 2; range 3-12) and 4 (SD 3; range 2-12), respectively, compared with the fixed 42 items for the FAOS, 20 items for the LEFS, and 46 items for the SMFA.
Time to completion for both PROMIS-MOB (mean 56 [SD 24] seconds) and PROMIS-PI CATs (mean 36 [SD 24] seconds) was shorter than for the FAOS (mean 261 [SD 100] seconds), LEFS (mean 118 [SD 38] seconds), and SMFA (mean 295 [SD 130] seconds) (P < .001 for all comparisons).
Floor and ceiling effects
Neither the PROMIS (MOB and PI) CATs nor the legacy instruments demonstrated significant floor or ceiling effects (Table 4). Nonetheless, up to 12% of patients reached the minimum or maximum scores on some FAOS subscales.
Table 4. Floor and Ceiling Effects (n = 73). Values higher than 15% indicate that floor and/or ceiling effects are present.
Associated factors
Multivariable linear regression showed that greater depressive symptoms and moderate to extreme pain intensity were independently associated with worse PROMIS-MOB and greater PROMIS-PI scores.
For PROMIS-MOB, higher depressive symptom scores were associated with lower mobility (β = −0.222, 95% CI −0.310 to −0.006; P = .041), and patients reporting moderate to extreme pain intensity demonstrated substantially worse mobility (β = −0.450, 95% CI −8.044 to −3.044; P < .001). With the numbers available, no significant association could be detected between employment status and mobility outcomes. The final model explained 31.2% of the variance in PROMIS-MOB scores (adjusted R² = 0.312).
For PROMIS Pain Interference, higher depressive symptom scores (β = 0.350, 95% CI 0.121-0.417; P < .001) and moderate to extreme pain intensity (β = 0.449, 95% CI 3.447-8.461; P < .001) were independently associated with greater pain interference. With the numbers available, no significant association could be detected between follow-up duration and pain interference. The model explained 41.9% of the variance in PROMIS Pain Interference scores (adjusted R² = 0.419).
Discussion
As hypothesized, PROMIS-MOB scores correlated strongly with legacy measures focused on foot-ankle and lower-extremity functioning (FAOS ADL and Sport, LEFS), as well as with the more general SMFA indices, whereas PROMIS-PI aligned particularly well with pain-focused domains (FAOS Pain, SMFA BI). As such, generic PROMIS CATs demonstrated convergent validity, that is, the ability to capture the same core constructs of functional recovery and pain burden as condition- and region-specific legacy instruments. Importantly, CATs do so without compromising reliability, and while offering important practical advantages.
Research validating PROMIS measures against legacy instruments in patients with foot and ankle conditions remains limited. In a cohort of patients who underwent fixation for an unstable ankle fracture, PROMIS Lower Extremity CATs showed moderate to high correlations with FAOS subscales (r = 0.50-0.65), with FAOS Symptoms showing the weakest correlation.20 In patients with common nontraumatic foot and ankle pathologies (eg, hallux valgus, ankle arthritis), PROMIS (MOB and PI) CATs similarly demonstrated moderate to high correlations with most FAOS subscales (r = 0.59-0.75), again with the lowest association for FAOS Symptoms.21 In a broader population of patients treated for any type of lower-extremity fracture (ranging from pelvis to foot), PROMIS-MOB CATs were also found to be highly correlated (r = 0.77-0.84) with several widely used legacy PROMs, and strong correlations were observed between measures of more general physical functioning (PROMIS-PF CAT and SMFA).22
In line with prior work, the FAOS Symptoms subscale in the present study demonstrated, at best, weak correlations with PROMIS CATs and a negative Cronbach α (−0.11; values between 0 and 1 are typically expected).20,21 In contrast, PROMIS CATs and other legacy PROMs exhibited very high reliability. These findings suggest a conceptual limitation of the FAOS Symptoms subscale. Although the heterogeneous range of symptoms it captures may be clinically relevant and important to assess in addition to PROMIS measures, these symptoms do not appear to reflect a single coherent construct and align poorly with domains of functional recovery or pain interference.
Importantly, patients in this study completed PROMIS CATs significantly faster and with far fewer items, while achieving reliability comparable to that of legacy instruments. Consistent with earlier studies, PROMIS CATs typically require well under 1 minute to complete while maintaining excellent reliability.8,20-22 Administering lengthy questionnaires may be feasible in research settings; however, if PROMs are to be used in routine clinical care, such as in time-pressured fracture clinics, minimizing test burden is critical to improve patient compliance.9,10 Taken together, our findings support the use of PROMIS CATs, rather than a range of (overlapping) legacy instruments, as a foundation for efficient clinical outcome assessment in daily foot and ankle care.
Neither the PROMIS (MOB and PI) CATs nor the legacy instruments demonstrated significant floor or ceiling effects in the present study: no patients achieved minimum or maximum scores on the PROMIS CATs, and only a limited proportion reached these extremes on the LEFS and SMFA (DI, BI). In contrast, up to 12% of patients reached minimum or maximum scores on several FAOS subscales (while still remaining below the 15% threshold). Consistent with these findings, prior studies have reported more normally distributed score patterns and smaller (or no) ceiling effects for PROMIS CATs compared with the FAOS and other foot and ankle–specific instruments, thereby enhancing the ability of CATs to discriminate between “good” and “very good” outcomes at the upper end of the scale in foot and ankle pathology.8,20-22
An important finding in this study is the association of depressive symptoms with worse mobility and greater pain interference, even after adjustment for pain intensity and the other covariates retained in the multivariable models. This finding aligns with a mounting body of literature indicating that psychological health is strongly associated with outcomes following foot and ankle surgery.25-28
More longitudinal studies are nonetheless needed to evaluate the course of depressive symptoms during recovery.
Overall, PROMIS CATs may be recommended for use both in research and clinical care as they adequately capture core health domains (such as physical function and pain). When clinically relevant, additional assessment may be warranted of highly specific symptoms (such as joint noises and swelling).
Several limitations of this study must be acknowledged. The 36% participation rate may have introduced response bias. The sample size, with its broad follow-up range, although adequate for evaluation of measurement properties, limits more detailed subgroup analyses across specific fracture types or according to specific time intervals.24 Only patients with isolated foot or ankle injuries were included, so results cannot be generalized to patients with other fracture types or polytrauma. Digital completion of questionnaires in Dutch on a computer tablet may have introduced selection bias, potentially excluding older or digitally less literate patients, as well as patients with insufficient Dutch proficiency. The correlations between PROMIS CATs and legacy instruments were specified a priori and are considered confirmatory. Other correlations, as well as the regression analyses to identify associated factors, are considered exploratory and should be interpreted with caution, as no correction for multiple testing was applied. Finally, this study’s findings should be validated in a longitudinal study design, allowing assessment of change over time and of the stability of measurement properties across different stages of recovery.
Conclusion
In this study, PROMIS (MOB, PI) CATs demonstrated strong correlations with commonly used legacy instruments (FAOS, LEFS, SMFA), capturing comparable constructs of functional recovery and pain interference. Importantly, PROMIS CATs achieved this with excellent reliability, substantially lower patient burden, and no floor or ceiling effects. Condition- and region-specific instruments may therefore provide limited additional information beyond that obtained with PROMIS CATs in patients with foot and ankle fractures.
The observed limitations of the FAOS Symptoms subscale underline the value of conceptually coherent outcome measures. The independent association between depressive symptoms and worse functional outcomes further supports the importance of integrating mental health assessment into routine outcome evaluation in foot and ankle fracture care.
Supplemental Material
Supplemental material, sj-pdf-1-fao-10.1177_24730114261454983 for PROMIS Computerized Adaptive Testing Demonstrates Strong Convergent Validity and Lower Patient Burden Compared With Legacy Instruments in Foot and Ankle Fracture Care by Michiel A. J. Luijten, Lotte Haverman, Martijn Poeze and Diederik O. Verbeek in Foot & Ankle Orthopaedics
Footnotes
Ethical Considerations
Ethical approval for this study was obtained from the MUMC+ Institutional Review Board (METC 2021-2922).
Consent to participate
Written informed consent to participate was obtained.
Consent for publication
Not applicable.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article. Disclosure forms for all authors are available online.
Data Availability Statement
Available on request.
