Abstract
Background:
There is uncertainty regarding which outcomes tools should be used to report the results of treatment for patients with foot and ankle disorders. This study compared the responsiveness of the Foot Function Index (FFI), American Orthopaedic Foot and Ankle Society (AOFAS) Clinical Rating Systems, and Medical Outcomes Study Short Form-36 (SF-36) in patients undergoing foot and ankle surgery.
Methods:
Twenty-five patients were recruited at a tertiary referral foot and ankle practice. The mean age of the patient sample was 40 years (range 21 to 69) and 19 were women (76%). Thirteen patients (52%) had conditions affecting the ankle, hindfoot, or midfoot, while 12 patients (48%) had conditions affecting the forefoot. Patients completed packets preoperatively and 6 months postoperatively that included informed consent forms, the FFI, the AOFAS, and the SF-36 questionnaires. Standardized response means (SRM) and effect sizes (ES) were used as the measures of responsiveness and were calculated for the AOFAS scores, the three domains of the FFI, the eight SF-36 sub-scales, and the two SF-36 summary scales.
Results:
The standardized response mean (SRM) for the AOFAS scores was 1.10 and the effect size (ES) was 1.12. The SRM for the three FFI domains ranged from −0.39 to −0.83, while the ES ranged from −0.55 to −0.86. The SRM for the SF-36 ranged from 0.09 to 0.72 (ES ranged from 0.09 to 0.77) with the highest values occurring with the Bodily Pain sub-scale (SRM 0.72, ES 0.77) and Physical Component Summary scale (SRM 0.76, ES 0.68).
Conclusions:
This study demonstrated increased responsiveness of foot and ankle specific outcomes tools compared to the SF-36. However, the Bodily Pain sub-scale and Physical Component Summary scale of the SF-36 had levels of responsiveness approaching those of the FFI and AOFAS Systems after foot and ankle surgery. This suggests that the SF-36 may be used alone to monitor the outcomes in these patients without sacrificing adequate sensitivity to clinical change.
INTRODUCTION
The best choice of outcomes tools to report the results of treatment of patients with foot and ankle disorders remains uncertain. 7,24 Several outcomes assessment instruments have been proposed for use in monitoring foot and ankle patients, including generic instruments, such as the Medical Outcomes Study Short Form-36 (SF-36) that are designed for broad use in a variety of medical conditions, as well as more specialized questionnaires such as the Foot Function Index (FFI) and the American Orthopaedic Foot and Ankle Society (AOFAS) Clinical Rating Systems. 6,10,20 The SF-36 is composed of 11 questions sub-divided into a total of 36 items that generate sub-scale scores for eight domains of health as well as summary scores for overall physical and mental functioning. 27,28 The FFI is a three-part questionnaire that is used to report clinical scores for the domains of activity limitation, pain, and disability in patients with foot and ankle conditions. 6 The AOFAS Clinical Rating Systems consist of four scales, each specific to the function of one region of the foot and ankle. 10
The criteria for selecting among these instruments are their validity, reliability, and responsiveness in evaluating the health of the targeted population. 7,11,16 The validity and reliability of the SF-36, FFI, and AOFAS Systems have been reported in previous studies. 2,3,6,7,10,12,14,16,17,18,20–25 The validity of the AOFAS systems has been questioned, while the FFI has been validated only in patients with rheumatoid arthritis. The AOFAS and FFI questionnaires are included in the current study because they remain the most commonly used tools for foot and ankle conditions despite the limited evidence supporting their validity. 7 In contrast, the SF-36 has been extensively validated and tested for reliability but is designed for use as a generic tool and has been shown to have lower levels of responsiveness in patients with orthopaedic disorders than disease or region-specific tools. 1,4,5,9,15,18,19
More focused instruments, such as the FFI and AOFAS systems, have been designed to increase sensitivity to clinical change in patients with conditions affecting the foot and ankle. 6,10 However, the relative responsiveness of the SF-36, AOFAS Systems, and FFI has not been definitively established.
The purpose of this study was to compare the responsiveness of the FFI and AOFAS Clinical Rating Systems to the SF-36 by calculating the standardized response means (SRM) and effect sizes (ES) for each questionnaire in a group of patients undergoing foot and ankle surgery. The standardized response mean and effect size are accepted measures of the responsiveness of an outcomes tool to clinical change. 11,13,16 Our hypothesis was that the FFI and AOFAS systems would have improved responsiveness and higher SRM and ES than the SF-36 in patients with foot and ankle disorders. This would support the continued need for validated foot and ankle specific questionnaires for use in conjunction with generic tools to follow the clinical outcomes of these patients. Conversely, similar levels of responsiveness of at least some of the SF-36 scales to the FFI and AOFAS systems would suggest that the SF-36 could be used alone to monitor foot and ankle patients. This would decrease the questionnaire load administered to patients and allow direct comparison to other health conditions without sacrificing sensitivity to clinical change.
MATERIALS AND METHODS
Human Subjects
This project was approved by the institutional review board at University of California, Los Angeles (UCLA) School of Medicine. Informed consent was obtained from all patients before participation.
Patient Sample
Patients were recruited through the foot and ankle clinic at UCLA Medical Center. Patients were included if they presented with a chronic condition affecting the foot and ankle that required operative intervention. Patients were excluded if their presenting complaint was acutely traumatic. This resulted in the exclusion of acute fractures. Additional inclusion criteria were the ability to read English, age greater than 18 years unless accompanied by a parent, and consent to participate. Patients were not otherwise excluded on the basis of age or race.
Thirty patients were recruited and had adequate data to allow complete scoring of the SF-36 for the preoperative visit and 6-month postoperative followup. One patient had inadequate followup on the FFI activity limitation and disability domains, while one patient had inadequate data to score the pain domain. Three patients had inadequate followup on the AOFAS systems. The patients with missing FFI and AOFAS data were excluded from further analysis. The remaining 25 patients composed the study population. Demographic data of the patient sample are shown in Table 1. The patients included in the study were 19 women (76%) and six men (24%), with a mean age of 40 (range 21 to 69) years.
Because of the exclusion of acute traumatic injuries, the patients selected all had chronic foot and ankle conditions. Thirteen patients (52%) had conditions affecting the ankle, hindfoot, or midfoot and 12 patients (48%) had conditions affecting the forefoot. The forefoot conditions included hammertoe deformity or metatarsalgia (six patients), hallux rigidus (two), and hallux valgus (four). The ankle/hindfoot and midfoot conditions were degenerative arthritis (eight patients), flatfoot deformity with posterior tibial tendinitis (two), accessory navicular (one), ankle instability (one), osteochondral lesion of the talus (one), and plantar fasciitis (one).
Data Collection
Recruited patients were given a packet that included an Informed Consent Form approved by the Institutional Review Board at UCLA Medical Center, the Medical Outcomes Study Short Form-36 version 2 (SF-36), the FFI, and the AOFAS Clinical Rating System appropriate to their anatomic area of complaint. The questionnaires in this packet were administered during the initial patient visit for all enrolled subjects. The same questionnaires were then administered again at the 6-month postoperative visit for each patient.
Data Analysis
Raw data for the SF-36 and FFI were recorded using Microsoft Excel 2002 (Microsoft, Redmond, WA, 2001). The AOFAS Systems, FFI, and SF-36 were scored using standard scoring techniques. 6,10,27,28 The SF-36 data were scored using the SF-36 Health Outcomes Scoring Software Version 1.0 (QualityMetric Incorporated, Lincoln, RI, 2003). This software package generates output for all eight sub-scales of the SF-36, as well as for the Physical (PCS) and Mental Component Summary scales (MCS).
Missing data were handled using the scoring algorithms appropriate for each outcomes tool. There is no established technique for missing data estimation with the AOFAS systems or FFI, so scales with incomplete data points on either of these tools were excluded from further calculation of the standardized response means and effect sizes. 6,10 The SF-36 software scored missing data using standard algorithms. The SF-36 allows complete scoring for patients with a limited number of missing items. More extensive missing data lead to incomplete scoring of some sub-scales or summary scores. 27,28
Table 1. Demographic data on patient sample (n = 25)
Statistical Analysis
The mean value for each of the scales was calculated both preoperatively and postoperatively. The mean change in score for each scale also is reported. The SRM and ES were calculated for each instrument including the AOFAS systems, the three FFI domains, the eight SF-36 sub-scales, and the two SF-36 summary scales. Microsoft Excel 2002 (Microsoft, Redmond, WA, 2001) was used for all statistical comparisons. The SRM and ES are accepted measures of responsiveness. 11,13,16 Higher values indicate instruments more responsive to clinical change while lower values reflect less sensitivity to underlying changes in health status. The SRM is calculated as the mean change in scores divided by the standard deviation of these changes. The ES is calculated as the mean change in scores divided by the standard deviation of the preoperative scores. Effects greater than 0.20 were considered small, effects greater than 0.50 moderate, and effects greater than 0.80 large. 11,16
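The SRM and ES definitions above can be illustrated with a minimal Python sketch. It is not the spreadsheet procedure used in the study: the function names, the "negligible" label for values at or below 0.20, and the sample scores are hypothetical, and the use of the sample (n − 1) standard deviation and of absolute values when classifying magnitude are assumptions, since the text does not specify them.

```python
from statistics import mean, stdev


def responsiveness(pre, post):
    """Return (SRM, ES) for paired preoperative and postoperative scores.

    SRM = mean change / standard deviation of the changes
    ES  = mean change / standard deviation of the preoperative scores
    Assumption: sample (n - 1) standard deviation is used for both.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need at least two paired scores")
    changes = [after - before for before, after in zip(pre, post)]
    mean_change = mean(changes)
    srm = mean_change / stdev(changes)
    es = mean_change / stdev(pre)
    return srm, es


def magnitude(value):
    """Classify an SRM or ES using the 0.20 / 0.50 / 0.80 thresholds.

    Assumption: the absolute value is used, so that scales on which a
    lower score means better health (negative change) are comparable.
    """
    size = abs(value)
    if size > 0.80:
        return "large"
    if size > 0.50:
        return "moderate"
    if size > 0.20:
        return "small"
    return "negligible"  # hypothetical label, not defined in the text


# Hypothetical scores for five patients, for illustration only.
pre_scores = [40, 50, 60, 55, 45]
post_scores = [60, 65, 70, 80, 55]
srm, es = responsiveness(pre_scores, post_scores)
print(f"SRM = {srm:.2f} ({magnitude(srm)}), ES = {es:.2f} ({magnitude(es)})")
```

Because the two statistics share a numerator and differ only in the denominator, the SRM exceeds the ES whenever the change scores vary less than the baseline scores do.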
RESULTS
Baseline Scores
Mean scores were calculated for the patient sample for the AOFAS scores, three FFI domains, eight SF-36 sub-scales, and two SF-36 summary scales (Table 2). The changes in scores for each patient are shown in Table 3. The FFI reports three scores covering Activity Limitation, Pain, and Disability. Higher FFI scores indicate worse health status. The mean score was 31.65 (range 0 to 79.60) for Activity Limitation, 48.14 (5 to 90.40) for Pain, and 40.36 (0.10 to 94.40) for Disability. The mean AOFAS scale score was 54.00 with a range of 8 to 85.
Mean raw scores for each of the eight SF-36 subscales are shown in Table 2. The eight sub-scales include physical functioning (PF), role-physical (RP), bodily pain (BP), general health (GH), vitality (VT), social functioning (SF), role-emotional (RE), and mental health (MH). These scales are scored from 0 to 100 with higher scores indicating better health status. The mean scores ranged from a high of 80.67 for the Role-Emotional sub-scale to a low of 44.72 for the Bodily Pain sub-scale.
Mean baseline scores for the Physical (PCS) and Mental (MCS) Component Summary scales of the SF-36 are shown in Table 2. The summary scales are based on established norms with scores calculated as variations from a reference value of 50 for healthy populations. Each 10 points higher or lower is consistent with one standard deviation from the normal population reference value of 50. 28 The Physical Component Summary scale had a preoperative value of 39.58 for this patient sample. This is approximately one standard deviation below the mean for healthy subjects. In contrast, the mean Mental Component Summary Scale score of 52.44 in this group of patients is near the reference value of 50. These findings suggest that foot and ankle pathology has significant effects on the physical components of health status measured by the SF-36.
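The norm-based interpretation of the summary scales can be checked with a short Python sketch. The function name is hypothetical; the reference mean of 50 and the 10 points per standard deviation are taken from the description above.

```python
def sd_from_reference(score, ref_mean=50.0, ref_sd=10.0):
    """Express a norm-based summary score in standard deviations
    from the reference value (50, with 10 points per SD)."""
    return (score - ref_mean) / ref_sd


pcs_offset = sd_from_reference(39.58)  # preoperative PCS mean
mcs_offset = sd_from_reference(52.44)  # preoperative MCS mean
print(f"PCS: {pcs_offset:+.2f} SD, MCS: {mcs_offset:+.2f} SD")
```

With the reported means, the PCS lies roughly one standard deviation below the reference value, while the MCS lies within a quarter of a standard deviation above it.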
Table 2. Mean baseline scores, mean followup scores, mean change in scores, effect sizes, and standardized response means for the AOFAS Systems, FFI, and SF-36 (n = 25)
AOFAS = American Orthopaedic Foot and Ankle Society; FFI = Foot Function Index; SF-36 = Short Form-36.
Postoperative Scores
The mean postoperative scores and the mean changes in scores are shown in Table 2. The mean change in score for the AOFAS scale was an improvement of 19.40 points (range −18 to +55). The mean FFI scores also improved, with the lower mean scores postoperatively indicating improvement in health status for all three domains. The mean change was −12.37 (range −66.00 to +71.40) for the Activity Limitation domain of the FFI, −17.94 (range −62.80 to +34.60) for the Pain domain, and −16.90 (range −81.27 to +19.60) for the Disability domain.
Some of the SF-36 subscales showed similar improvements in scores when comparing preoperative and postoperative scores. The mean changes were highest for the Role-Physical (mean 17.50, range −68.75 to +81.25) and Bodily Pain (mean 16.60, range −43 to +62) sub-scales. The Vitality (mean 4.25) and Mental Health (mean 1.20) sub-scales showed the least amount of change. The Physical Component Summary scale showed a higher mean change, with improvement by 6.74 points as compared to a mean increase of 0.48 points for the Mental Component Summary scale.
Responsiveness Testing
The SRM and ES for each instrument also are shown in Table 2. The AOFAS Clinical Rating Systems had an SRM of 1.10 and an ES of 1.12, while the three FFI domains had values of SRM −0.39 and ES −0.55 for the Activity Limitation domain, SRM −0.83 and ES −0.86 for Pain, and SRM −0.68 and ES −0.75 for Disability. The SF-36 had a wider range of SRM and ES. Higher SRM and ES were seen in the scales that are overweighted in the Physical Component Summary scale. These include the Bodily Pain (SRM 0.72 and ES 0.77), Role-Physical (SRM 0.63 and ES 0.56), Physical Functioning (SRM 0.53 and ES 0.53), and General Health (SRM 0.37 and ES 0.24) sub-scales. Lower SRM and ES were seen in the sub-scales overweighted in the Mental Component Summary scale. These include the Mental Health (SRM 0.09 and ES 0.09), Vitality (SRM 0.25 and ES 0.32), Role-Emotional (SRM 0.31 and ES 0.26), and Social Functioning (SRM 0.35 and ES 0.37) sub-scales. These differences resulted in a higher SRM for the Physical Component Summary scale (SRM 0.76 and ES 0.68) than the Mental Component Summary scale (SRM 0.06 and ES 0.05).
DISCUSSION
Several studies have examined the responsiveness of generic tools such as the SF-36 and more specific instruments in patients with orthopaedic conditions. The absolute values of the SRM seen in this study for the AOFAS scale, FFI Pain domain, SF-36 Bodily Pain sub-scale, and SF-36 Physical Component Summary scale were all within the range of 0.72 to 1.10, while the absolute values of the ES for these components were in the range of 0.68 to 1.12. There are no established thresholds for classifying a questionnaire as having adequate responsiveness. However, these SRM are of a similar magnitude to those of outcomes tools used for studies of patients undergoing treatment of shoulder disorders, orthopaedic trauma, and hip fractures. 4,8,11,18 In addition, SRM and ES values greater than or equal to 0.5 are generally accepted as moderate in size. 11,16 This indicates that these components of the AOFAS, FFI, and SF-36 have an acceptable responsiveness to clinical change in the population of foot and ankle patients studied.
The findings of this study highlight the importance of monitoring pain as an outcome. The AOFAS scale relies largely on one question pertaining to pain. Similarly, the most responsive components of the FFI and SF-36 also rely on patient reports of pain. A study of spine patients suggested that disease-specific measures do not provide significant increases in responsiveness when compared to more generic tools used to measure pain. The authors suggested that generic tools such as the SF-36 may be sufficient to follow clinical outcomes as long as they include some component that measures pain. 26
The main limitation of this study is the small sample size. The sample included patients with ankle, hindfoot, and midfoot conditions as well as forefoot disorders. This study did not recruit enough patients to determine the SRM or ES for individual diagnoses. However, since each patient was asked to complete all three outcomes tools, the results of this study are still useful for understanding the comparative magnitude of the SRMs for the questionnaires studied in a population undergoing surgery for chronic foot and ankle disorders. A larger sample size is necessary to identify the precise value of the ES and SRM of these tools for specific foot and ankle conditions. An additional limitation of this study is the lack of comorbidity data. The effects of comorbidity on the relative responsiveness of these tools are not reported in the current study.
Outcomes assessment has become increasingly important in evaluating the efficacy of medical and surgical treatments. There has been ongoing uncertainty regarding the best tool or combination of tools to use for reporting the outcomes of patients with foot and ankle disorders. Outcomes tools should be valid, reliable, and responsive. Previous studies have demonstrated the validity and reliability of the SF-36, but there has been concern that it is not adequately responsive to clinical change in foot and ankle patients. 7 The findings of this study indicate that the Bodily Pain sub-scale and Physical Component Summary scale of the SF-36 have a level of responsiveness approaching that of the AOFAS Clinical Rating Systems and Foot Function Index. This supports the use of the SF-36 alone as a method of monitoring outcomes after foot and ankle surgery. This would allow for a decreased burden of questions on patients involved in clinical studies without sacrificing adequate responsiveness to clinical change. Further study of these tools in a larger sample of patients would be useful to confirm the responsiveness of the SF-36 in patients with specific foot and ankle diagnoses.
