Abstract
Background: The International Knee Documentation Committee (IKDC) forms are commonly used to measure outcomes after anterior cruciate ligament (ACL) reconstruction. The knee examination portion of the IKDC forms includes a radiographic grading system to grade degenerative changes. The interrater and intrarater reliability of this radiographic grading system remain unknown.
Hypothesis: We hypothesize that the IKDC radiographic grading system will have acceptable interrater and intrarater reliability.
Study Design: Case series (diagnosis); Level of evidence, 4.
Methods: Radiographs of 205 ACL-reconstructed knees were obtained at 5-year follow-up. Specifically, weightbearing posteroanterior radiographs of the operative knee in 35° to 45° of flexion and a lateral radiograph in 30° of flexion were used. The radiographs were independently graded by 2 sports medicine fellowship–trained orthopaedic surgeons using the IKDC 2000 standard instructions. One surgeon graded the same radiographs 6 months apart, blinded to patient identity and prior IKDC grades. The percentage agreement was calculated for each of the 5 knee compartments as defined by the IKDC. Interrater reliability was evaluated using the intraclass correlation coefficient (ICC) 2-way mixed-effects model with absolute agreement. The Spearman rank-order correlation coefficient (rs) was applied to evaluate intrarater reliability.
Results: The interrater agreement between the 2 surgeons was 59% for the medial joint space (ICC = 0.46; 95% confidence interval [CI] = 0.35-0.56), 54% for the lateral joint space (ICC = 0.45; 95% CI = 0.27-0.58), 49% for the patellofemoral joint (ICC = 0.40; 95% CI = 0.26-0.52), 63% for the anterior joint space (ICC = 0.20; 95% CI = 0.05-0.34), and 44% for the posterior joint space (ICC = 0.28; 95% CI = 0.15-0.40). The intrarater agreement was 83% for the medial joint space (rs = .77, P < .001), 86% for the lateral joint space (rs = .76, P < .001), 81% for the patellofemoral joint (rs = .79, P < .001), 91% for the anterior joint space (rs = .48, P < .001), and 69% for the posterior joint space (rs = .64, P < .001).
Conclusions: While intrarater reliability was acceptable, interrater reliability was poor. These findings suggest that multiple raters may score the same radiographs differently using the IKDC radiographic grading system. The use of a single rater to grade all radiographs when using the IKDC radiographic grading system maximizes reliability.
The International Knee Documentation Committee (IKDC) forms are commonly used to evaluate outcomes after anterior cruciate ligament (ACL) reconstruction. Since its inception, the IKDC system has proven to be a reliable tool for both clinicians and researchers to capture and study operative outcomes.
The IKDC was formed in 1987 by members of the European Society for Knee Surgery and Arthroscopy and the American Orthopaedic Society for Sports Medicine. Before that time, there was no universal system for standardizing outcomes after knee surgery, thus impeding the ability to perform research in the field. To help resolve this problem, the committee took upon itself the task of creating a standardized method of measuring outcomes after knee surgery. The result was the initial IKDC form produced at the International Knee Society meeting in 1991. 3 The form was later modified and published in 1993. 2 To address some of the shortcomings with the initial IKDC form, a separate subjective knee evaluation form was recently added and published in 2000. 4
Currently, the IKDC system is composed of 6 forms: demographic, current health assessment, subjective knee evaluation, knee history, surgical documentation, and knee examination. The subjective knee evaluation form and the knee examination form are often used together as outcome measures after ACL reconstruction. The knee examination form itself is composed of 7 groups: effusion, passive motion deficit, ligament examination, compartment findings, harvest site problems, radiographic findings, and functional testing. While the reliability and validity of much of the IKDC system have been established, those of the IKDC radiographic findings have not. The purpose of this study was to determine the interrater and intrarater reliability of the IKDC radiographic findings grading system.
Materials and Methods
Institutional Review Board approval was obtained before commencement of this study. Radiographs of 205 ACL-reconstructed knees were obtained at 5-year surgical follow-up. These radiographs consisted of a weightbearing posteroanterior view of the operative knee in 35° to 45° of flexion and a lateral view in 30° of flexion (Figure 1). The radiographs were independently graded by 2 sports medicine fellowship–trained orthopaedic surgeons using the IKDC 2000 standard instructions (Figures 2 and 3). The surgeons were given a copy of the instructions to refer to while grading the radiographs. One surgeon graded the same radiographs 6 months apart, blinded to patient identity and prior IKDC grades. The percentage agreement was calculated for each of the 5 knee compartments as defined by the IKDC. Interrater reliability was evaluated using the intraclass correlation coefficient (ICC) 2-way mixed-effects model with absolute agreement. The Spearman rank-order correlation coefficient (rs) was applied to evaluate intrarater reliability.
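The interrater statistics described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code. It assumes that the IKDC grades A through D are coded as the ordinal scores 1 through 4 and that the single-measures form of the absolute-agreement ICC, often written ICC(A,1), was used; the source specifies neither. The function names and the ten sample grades are hypothetical.

```python
from statistics import mean

# Assumed ordinal coding of the IKDC grades; the source does not state one.
GRADE_TO_SCORE = {"A": 1, "B": 2, "C": 3, "D": 4}


def percent_agreement(a, b):
    """Percentage of radiographs given the identical grade by both raters."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def icc_absolute_single(a, b):
    """ICC(A,1) for two raters: two-way model, absolute agreement.

    Built from the mean squares of a two-way ANOVA without replication:
    rows are radiographs (subjects), columns are the k = 2 raters.
    """
    n, k = len(a), 2
    grand = mean(a + b)
    row_means = [(x + y) / 2 for x, y in zip(a, b)]
    col_means = [mean(a), mean(b)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in a + b)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-radiograph variance
    ms_cols = ss_cols / (k - 1)                # systematic rater difference
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual disagreement

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical grades for one compartment on ten radiographs.
rater_1 = [GRADE_TO_SCORE[g] for g in "AABBCCDABA"]
rater_2 = [GRADE_TO_SCORE[g] for g in "ABBCCDDAAA"]

print(f"agreement: {percent_agreement(rater_1, rater_2):.0f}%")
print(f"ICC(A,1):  {icc_absolute_single(rater_1, rater_2):.2f}")
```

Because the absolute-agreement form keeps the rater mean square in the denominator, a rater who grades every knee one step higher than a colleague lowers the ICC even though the two rank the knees identically.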

Figure 1. A tunnel-view posteroanterior radiograph is used to evaluate the medial (A) and lateral (B) joint spaces. A lateral radiograph is used to evaluate the posterior (C) and anterior (D) joint spaces. A Merchant view is used to evaluate the patellofemoral compartment (E).

Figure 2. International Knee Documentation Committee radiographic findings grading instructions.

Figure 3. International Knee Documentation Committee radiographic findings grading system.
Results
The interrater agreement between the 2 surgeons was 59% for the medial joint space (ICC = 0.46; 95% confidence interval [CI] = 0.35-0.56), 54% for the lateral joint space (ICC = 0.45; 95% CI = 0.27-0.58), 49% for the patellofemoral joint (ICC = 0.40; 95% CI = 0.26-0.52), 63% for the anterior joint space (ICC = 0.20; 95% CI = 0.05-0.34), and 44% for the posterior joint space (ICC = 0.28; 95% CI = 0.15-0.40). The intrarater agreement was 83% for the medial joint space (rs = .77, P < .001), 86% for the lateral joint space (rs = .76, P < .001), 81% for the patellofemoral joint (rs = .79, P < .001), 91% for the anterior joint space (rs = .48, P < .001), and 69% for the posterior joint space (rs = .64, P < .001).
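As an illustration of how intrarater figures of this kind can be computed, the sketch below calculates percentage agreement and the Spearman rs for two readings by the same rater. It is not the authors' code. The grades are assumed to be coded 1 through 4, tied grades receive the mean of their ranks, and the function names and the twenty sample grades are hypothetical.

```python
from statistics import mean


def percent_agreement(a, b):
    """Percentage of radiographs given the identical grade in both readings."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = tied_rank
        i = j + 1
    return ranks


def spearman_rs(first, second):
    """Spearman rs: the Pearson correlation of the tie-averaged ranks.

    Undefined (division by zero) if either reading contains a single grade.
    """
    rx, ry = average_ranks(first), average_ranks(second)
    mx, my = mean(rx), mean(ry)
    cov = sum((x - mx) * (y - my) for x, y in zip(rx, ry))
    var_x = sum((x - mx) ** 2 for x in rx)
    var_y = sum((y - my) ** 2 for y in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical readings of one compartment, 6 months apart, on twenty knees.
# Most knees are graded 1 (normal) in both readings.
first_reading = [1] * 16 + [2, 2, 2, 1]
second_reading = [1] * 16 + [2, 2, 1, 2]

print(f"agreement: {percent_agreement(first_reading, second_reading):.0f}%")
print(f"rs:        {spearman_rs(first_reading, second_reading):.2f}")
```

In this hypothetical sample the two readings agree on 18 of 20 knees, yet rs is only moderate because nearly all grades fall in one category. Whether a similar clustering of grades contributes to the anterior joint space combining the highest percentage agreement with the lowest rs was not examined in this study.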
Discussion
To meaningfully compare outcomes after ACL reconstruction, it is essential to have reliable outcome measures. Clinicians and researchers have used a multitude of knee outcome scales, and many continue to be popular. Perhaps the most popular of these outcome scales is the IKDC system.
Before the development of the IKDC subjective knee form, there were few attempts to test the utility and reliability of the original IKDC form. Irrgang et al 7 found the IKDC score to be useful in describing outcomes after ACL reconstruction and demonstrated that the IKDC score correlated with the patients’ subjective rating of knee function. Sernert et al 9 followed 527 patients after ACL reconstruction and found the IKDC score to be a reliable and useful tool for evaluating postoperative outcomes.
The original IKDC form captured a limited amount of information, which represented the minimal required criteria to evaluate results. The IKDC subjective knee form was developed in 2000 to be a more comprehensive form that would capture symptoms and limitations in function and sports activity due to knee impairment. 6 A secondary goal was to maximize the test-retest reliability, responsiveness, and validity. The IKDC subjective knee form has been more thoroughly tested than the original IKDC form. It has been found to be both valid and reliable in its ability to measure symptoms, function, and sports activity. 5,6 Kocher et al 8 have also demonstrated it to be significantly associated with patient satisfaction after ACL reconstruction. More recently, normative data on the IKDC subjective knee form have been published, allowing outcomes to be compared with age- and gender-matched peers. 1
Reliable and valid instruments by which to measure outcomes after ACL reconstruction are essential to performing research and continuing progress in the field. These outcome data can be subjective or objective in nature. The IKDC subjective knee form captures the subjective outcomes after ACL reconstruction and has been shown to be a valid and reliable way of doing so. The objective data of the IKDC knee examination form have been less well studied. One component of these data is the radiographic findings section. While it is important to measure the subjective outcomes and objective physical examination outcomes, it is equally important to be able to quantify and study the radiographic outcomes. This is of specific interest after ACL reconstruction to help determine if a procedure or lack of a procedure has led to any degenerative changes in the knee. Surprisingly, the interrater and intrarater reliability of the radiographic findings section of the IKDC knee examination form has not been tested. Knowing the reliability of this section is crucial, as many large ACL outcome studies are multicenter, multisurgeon studies in which multiple clinicians grade the radiographs. In this study, we quantified the reliability of this system and demonstrated that it has acceptable intrarater reliability and poor interrater reliability.
A limitation of this study is that the radiographs obtained are not exactly those described in the IKDC system. The IKDC radiographic system uses a bilateral weightbearing posteroanterior radiograph in 35° to 45° of flexion and a Merchant view at 45°. We used a weightbearing posteroanterior radiograph in 35° to 45° of flexion of the operative knee only and a lateral radiograph in 30° of flexion. It is possible that the difference in radiographs obtained may affect the reliability of the grading system.
In summary, we demonstrate that while the intrarater reliability of the IKDC radiographic grading system is acceptable, the interrater reliability is poor. These findings suggest that different raters may score the same radiographs differently using the IKDC radiographic grading system. To maximize reliability, we recommend that a single rater grade all radiographs when using the IKDC radiographic grading system. This, however, is not always possible, as in the case of multicenter studies. When the use of a single rater is not possible, more precise definitions and guidelines for raters may help to increase the interrater reliability.
