Abstract
Background
Ankle ultrasound imaging could be an option with higher priority due to its lack of radiation and its cost- and time-effectiveness. However, previous studies regarding anterior talofibular ligament and calcaneofibular ligament injuries have shown varied results.
Purpose
To evaluate the diagnostic performance of ankle ultrasound for anterior talofibular ligament and calcaneofibular ligament injuries.
Material and Methods
PubMed and EMBASE databases were searched for diagnostic accuracy studies that used ultrasound for diagnosing anterior talofibular ligament and calcaneofibular ligament injuries. Bivariate and hierarchical summary receiver operating characteristic modeling were used to evaluate diagnostic performance. Subgroup analysis was performed according to the severity of the injury (complete and partial anterior talofibular ligament tear). Meta-regression analyses were performed to explore heterogeneity.
Results
Ten articles involving a total of 380 patients were included. For anterior talofibular ligament injury, the summary sensitivity, summary specificity, and area under the hierarchical summary receiver operating characteristic curve (AUC) were 0.99, 0.92, and 0.99, respectively. For calcaneofibular ligament injury, the summary sensitivity, summary specificity, and AUC were 0.95, 0.99, and 0.95, respectively. In subgroup analysis, for complete anterior talofibular ligament tear, the summary sensitivity, summary specificity, and AUC were 0.96, 0.82, and 0.96, respectively. For partial anterior talofibular ligament tear, the summary sensitivity, summary specificity, and AUC were 0.90, 0.82, and 0.93, respectively. Among the various potential covariates, proportion of anterior talofibular ligament tear, ultrasound interpreter, and reference standard were associated with sensitivity heterogeneity.
Conclusion
Ankle ultrasound demonstrates high diagnostic performance in the diagnosis of anterior talofibular ligament and calcaneofibular ligament injuries. We recommend ultrasound performed by a musculoskeletal radiologist as a first-line diagnostic tool to diagnose anterior talofibular ligament and calcaneofibular ligament injuries.
Introduction
Lateral ankle sprains are the most common ankle joint injury, frequently occurring in daily and sports activities (1,2). It has been reported that >50% of adults may experience an ankle sprain (2). In lateral ankle sprains, the lateral collateral ligament (LCL) complex of the ankle, including the anterior talofibular ligament (ATFL), calcaneofibular ligament (CFL), posterior talofibular ligament, and anterior tibiofibular ligament can be injured. Among them, the ATFL and CFL are commonly damaged in ankle inversion because of their anatomical position (3).
ATFL and CFL injuries need adequate treatment plans (immobilization, rehabilitation, neuromuscular training) according to the injury grade; non-surgical treatment is usually chosen as a first-line option (4,5). If treatment after ATFL and CFL injury is inadequate, the injured joint is misused before healing, or rupture of any damaged ankle ligaments occurs, ankle instability may develop. Therefore, the early detection and accurate diagnosis of ATFL and CFL injury severity is crucial (3).
Clinical examination was initially used for the diagnosis of ATFL and CFL injuries in patients with ankle sprains. However, clinical examination alone may underdiagnose ATFL and CFL injuries (6); thus, imaging studies are commonly used. Ankle stress X-ray may visualize joint space, bony abnormalities including fractures, and abnormal soft tissue densities including swelling. It is especially useful as a primary screening tool for fractures, which may require timely intervention and which could be missed by ultrasound (US) (7). Computed tomography (CT) may visualize soft tissue more clearly than a plain X-ray and also shows cross-sectional structures of bones (8,9). However, both modalities expose patients to ionizing radiation. Magnetic resonance imaging (MRI) does not use radiation and is useful in imaging soft tissue (including ligament, tendon, and muscle), bleeding, and swelling; therefore, MRI may be used in a variety of soft tissue abnormalities, including ankle injuries. However, MRI is relatively expensive and time-consuming, and requires the patient to remain in a fixed position during image acquisition (10).
With this background, for the evaluation of the ATFL and CFL injuries, ankle US imaging could be an option with higher priority due to its lack of radiation, and cost- and time-effectiveness (10). However, previous US evaluation studies regarding ATFL and CFL injuries have shown varied results. Therefore, we believe that the performance of ankle US for diagnosing ATFL and CFL injuries needs further exploration and high-level evidence needs to be presented via quantitative synthesis of data from existing studies. Additionally, the pooling of results will be interesting because published studies have used different methodologies including injury severity (complete or partial tear) and reference standard (surgical finding or MRI finding).
This systematic review and meta-analysis aimed to evaluate the diagnostic performance of ankle US for diagnosing ATFL and CFL injuries. In addition, we performed a subgroup analysis to evaluate the diagnostic performance of ankle US according to severity of the injury (complete and partial tear).
Material and Methods
This meta-analysis followed the revised guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses of Diagnostic Accuracy Studies (PRISMA-DTA) statement (11).
Data sources
The PubMed and EMBASE databases were searched up to 2 January 2018 to identify English-language reports regarding US for diagnosing ATFL and CFL injuries. Search terms that were related to ‘ATFL,’ ‘CFL,’ and ‘ultrasound’ were combined with ‘sensitivity,’ ‘specificity,’ or ‘receiver operating characteristic’ as follows: ((‘anterior talofibular ligament’) OR (‘ATFL’) OR (‘calcaneofibular ligament’) OR (‘CFL’)) AND ((ultrasound) OR (ultrasonography) OR (sonogram) OR (sonography)) AND ((diagnosis) OR (accuracy) OR (sensitivity) OR (specificity) OR (receiver operating characteristic) OR (ROC curve)). The bibliographies of the identified articles were also screened to identify additional relevant studies. Two investigators screened the titles and abstracts for potential eligibility and disagreements were resolved through discussion.
Study selection
We included studies that fulfilled the following criteria: (i) patients with ankle injury; (ii) ankle US as the index test; (iii) use of surgical finding or MRI as the reference standard for confirmation of ATFL and CFL tears; (iv) availability of sufficient information to reconstruct 2 × 2 contingency tables regarding sensitivity and specificity; and (v) original research article published in the English language.
The exclusion criteria were as follows: (i) case report or case series; (ii) review articles, guidelines, consensus statements, letters, editorials, clinical trials, and conference abstracts; (iii) studies not pertaining to the field of interest; (iv) studies not performed on humans; (v) studies with insufficient data for a 2 × 2 table; (vi) studies focused only on the diagnostic performance of indirect signs of ankle ligament tear, such as joint effusion; and (vii) studies without use of surgical finding or MRI as the reference standard.
Data extraction and quality assessment
Two investigators independently extracted patient and study characteristics data. The same investigators evaluated methodological quality using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool (12). Inconsistencies between the reviewers were resolved through discussion.
A standardized form was used to extract data regarding: (i) patient characteristics (number of total patients, proportion of the ligament tear, mean age, age range, and sex); (ii) study characteristics (study location, publication year, study design, reference standard, and blinding to reference standard); and (iii) US characteristics (probe, technical parameters, and US performer). Study outcomes were also extracted to create 2 × 2 tables (i.e. true-positive, true-negative, false-positive, and false-negative results). The 2 × 2 tables were calculated using the Bayesian method if only sensitivity and specificity were presented for an eligible study. If two or more reviewers independently assessed the diagnostic accuracy, the result with the highest accuracy was extracted.
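The back-calculation of a 2 × 2 table from reported accuracy values can be illustrated with a minimal sketch. It assumes that the numbers of torn and intact ligaments on the reference standard are known and that counts are recovered by simple rounding; the function name and the example values are hypothetical and are not taken from any included study, and the exact procedure used in this meta-analysis may differ.

```python
def reconstruct_2x2(sensitivity, specificity, n_diseased, n_non_diseased):
    """Back-calculate (TP, FN, FP, TN) from reported accuracy values.

    Assumption: counts are recovered by rounding sensitivity * n_diseased
    and specificity * n_non_diseased to the nearest integer.
    """
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity <= 1.0):
        raise ValueError("sensitivity and specificity must lie in [0, 1]")
    if n_diseased < 0 or n_non_diseased < 0:
        raise ValueError("group sizes must be non-negative")
    tp = round(sensitivity * n_diseased)       # true positives
    fn = n_diseased - tp                       # false negatives
    tn = round(specificity * n_non_diseased)   # true negatives
    fp = n_non_diseased - tn                   # false positives
    return tp, fn, fp, tn


# Hypothetical study: 40 torn and 20 intact ligaments on the reference standard.
tp, fn, fp, tn = reconstruct_2x2(0.95, 0.90, 40, 20)
```

Because of rounding, the reconstructed counts should be checked against any totals the study reports before they are pooled.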
Data synthesis and analysis of diagnostic performance
Patient demographic characteristics and extracted covariates were summarized using standard descriptive statistics. Continuous variables are expressed as means and 95% confidence intervals (CIs), while categorical variables are expressed as frequencies or percentages unless stated otherwise.
We used a bivariate random-effects model for analyzing and pooling the diagnostic performance (sensitivity and specificity) measurements across studies. To derive summary estimates of diagnostic performance, we plotted estimates of the observed sensitivities and specificities for each test in forest plots and hierarchical summary receiver operating characteristic (HSROC) curves derived from individual study results (13–15). These results were plotted using HSROC curves with 95% CIs and prediction regions.
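As an illustration of the per-study estimates that feed a coupled forest plot, the sketch below computes sensitivity and specificity with 95% Wilson score intervals from a single 2 × 2 table. This is not the bivariate random-effects model itself, which was fitted with dedicated Stata and R routines; the choice of the Wilson interval and all names in the code are assumptions made for illustration.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half = (z / denom) * math.sqrt(
        p * (1.0 - p) / total + z * z / (4.0 * total * total)
    )
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half), min(1.0, centre + half)


def study_accuracy(tp, fn, fp, tn):
    """Per-study sensitivity and specificity with Wilson intervals."""
    return {
        "sensitivity": tp / (tp + fn),
        "sens_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "spec_ci": wilson_interval(tn, tn + fp),
    }


# Hypothetical 2 x 2 table (not taken from an included study).
result = study_accuracy(tp=38, fn=2, fp=2, tn=18)
```

The Wilson interval is used here because it remains well behaved when a study reports a sensitivity or specificity of exactly 1.00, which occurs in several diagnostic accuracy studies.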
Heterogeneity was determined using Cochran’s Q test (P < 0.05 indicated the presence of heterogeneity) and the I2 statistic (0–40% = heterogeneity might not be present; 30–60% = moderate heterogeneity; 50–90% = substantial heterogeneity; and 75–100% = considerable heterogeneity) (16). When heterogeneity was noted, a ‘threshold effect’ was analyzed by visual assessment of the coupled forest plots of sensitivity and specificity. A meta-analysis of diagnostic test accuracy studies simultaneously evaluates a pair of outcomes (i.e. sensitivity and specificity). Sensitivity and specificity are commonly inversely correlated and influenced by the threshold (cut-off) value (13–15). In addition, Spearman’s correlation coefficient between the sensitivity and false-positive rate was calculated to determine any threshold effect; a coefficient of >0.6 was considered to indicate a considerable threshold effect (17). In accordance with the PRISMA-DTA, we omitted Deeks’ funnel plot (18) of individual studies for checking publication bias.
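The two heterogeneity checks described above can be sketched as follows. The I2 calculation uses the relation I2 = 100% × (Q − df)/Q, truncated at zero, and the threshold-effect check applies the >0.6 cut-off to Spearman’s correlation between sensitivity and false-positive rate. The function names and the example inputs are hypothetical; the analyses in this study were run with the Stata and R packages named in the Methods, not with this code.

```python
import math


def i_squared(q, df):
    """I2 in percent: 100 * (Q - df) / Q, truncated at 0."""
    if q <= 0:
        return 0.0
    return max(0.0, 100.0 * (q - df) / q)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def has_threshold_effect(sensitivities, false_positive_rates, cutoff=0.6):
    """True when the correlation exceeds the cut-off used in this study (>0.6)."""
    return spearman_rho(sensitivities, false_positive_rates) > cutoff
```

A coefficient at or below the cut-off, under this rule, is read as no considerable threshold effect.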
Subgroup analysis
For detailed evaluation of the diagnostic performance of ankle US for diagnosing complete or partial ligament tear, we performed a subgroup analysis. We extracted the eligible studies in which ligament injury was classified as partial/complete or grade 2/3 for subgroup analysis. Grade 2 injuries were regarded as partial ligament tears and grade 3 injuries were regarded as complete ligament tears (19).
Meta-regression analysis
Meta-regression analyses using several covariates were performed to explore the potential causes of heterogeneity: (i) patient characteristics (acute ankle injury versus chronic ankle instability with or without acute injury); (ii) total patients (≥50 versus <50); (iii) including pediatric patients (yes versus no); (iv) proportion of ATFL tear (≥50% versus <50%); (v) proportion of CFL tear (≥50% versus <50%); (vi) US performer (musculoskeletal radiologists versus others); and (vii) reference standard (operation versus MRI).
All statistical analyses were performed by one author who has three years of experience performing systematic reviews and meta-analyses. The statistical analyses were performed using the ‘midas’ and ‘metandi’ modules in Stata software (version 10.0; StataCorp LP, College Station, TX, USA) and the ‘mada’ package in R software (version 3.4.1; R Foundation for Statistical Computing, Vienna, Austria). Results were considered statistically significant at a P value < 0.05.
Results
Literature search
Fig. 1 shows a flow diagram summarizing the literature search. During the initial search, 227 studies were identified. After removing 34 duplicates, we reviewed 193 titles and abstracts, then excluded 166 studies for the following reasons: case reports/letters/editorials/conference abstracts (n = 24); review articles/guidelines/consensus statements (n = 24); not in the field of interest (n = 116); and not human study (n = 2). After reviewing the full text of 27 eligible articles, we excluded 17 for the following reasons: studies with insufficient data to build 2 × 2 tables (n = 11) (20–30); studies focusing only on the diagnostic performance of indirect signs of ankle ligament tear (n = 1) (31); and studies without use of surgical finding or MRI as the reference standard (n = 5) (6,8,32–34). Ultimately, 10 original research articles (19,35–43) including a total of 380 patients were included in the meta-analysis.

Study selection process for the meta-analysis.
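The counts reported for the study selection process can be checked for internal consistency with a few lines of arithmetic. The sketch below uses only the numbers stated in the text above; the variable names are illustrative.

```python
# Counts as reported in the literature search paragraph.
identified = 227
duplicates = 34

screening_exclusions = {
    "case reports/letters/editorials/conference abstracts": 24,
    "review articles/guidelines/consensus statements": 24,
    "not in the field of interest": 116,
    "not human study": 2,
}

full_text_exclusions = {
    "insufficient data for 2 x 2 tables": 11,
    "indirect signs of ligament tear only": 1,
    "no surgical or MRI reference standard": 5,
}

screened = identified - duplicates                            # titles and abstracts reviewed
full_text = screened - sum(screening_exclusions.values())     # full-text articles assessed
included = full_text - sum(full_text_exclusions.values())     # studies in the meta-analysis

print(screened, full_text, included)
```

Each stage of the flow diagram follows from the previous one, so a mismatch at any step would indicate a transcription error in the counts.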
Characteristics of the studies and included patients
The patient characteristics are summarized in Table 1. The sample size was in the range of 7–120 (age range = 12.1–34.0 years). The study and US characteristics are summarized in Table 2. All studies were prospective and single-centered, with blinding to the reference standard. Nine studies (19,35–40,42,43) performed consecutive enrollment and one study (41) performed case-control enrollment. All studies used a linear array US transducer for evaluation of the ATFL and CFL.
The included patients’ demographic characteristics.
CAI, chronic ankle instability; ATFL, anterior talofibular ligament; CFL, calcaneofibular ligament; NR, not reported.
Study and US characteristics.
US, ultrasound; MRI, magnetic resonance imaging; NR, not reported; OS, orthopedic surgeon; MSK, musculoskeletal; EP, emergency physician.
Quality assessment
Fig. 2 shows the risk of bias and applicability concerns for the 10 included studies. Overall, none of the studies were considered to be seriously flawed according to the QUADAS-2 tool. All the studies satisfied ≥4 of the seven items.

Grouped bar charts showing the risk of bias (left) and applicability concerns (right) for the 10 included studies using the Quality Assessment of Diagnostic Accuracy Studies-2 domains.
The risk of bias regarding the patient selection domain was considered high in one study (41) because this study performed case-control enrollment. Regarding the index test and reference standard domains, all studies were considered to have a low risk of bias. Regarding the flow and timing domain, seven studies (36–40,42,43) had an unclear risk of bias because the mean interval between US and the reference standard was not reported. All studies exhibited low concerns regarding applicability to our research question in the patient selection, index test, and reference standard domains.
Overall diagnostic performance of US for diagnosing ATFL injuries
The 10 studies (19,35–43) had sensitivity values that were in the range of 0.50–1.00 and specificity values that were in the range of 0.25–1.00. The pooled sensitivity and specificity values were 0.99 (95% CI = 0.92–1.00) and 0.92 (95% CI = 0.76–0.97), respectively. The Q test revealed significant heterogeneity (Q = 11.012, P = 0.002), with considerable and substantial heterogeneity detected for sensitivity (I2 = 85.01%) and specificity (I2 = 80.52%), respectively. A threshold effect was not observed in the coupled forest plot of sensitivity and specificity (Fig. 3) or in the correlation between sensitivity and the false-positive rate (0.110, 95% CI = −0.592 to 0.659), which was below the 0.6 cut-off. The area under the HSROC curve was 0.99 (95% CI = 0.98–1.00) (Fig. 4).

Coupled forest plots for the summary sensitivity and specificity of ankle ultrasound for anterior talofibular ligament injury. Dots in squares represent sensitivity and specificity. Horizontal lines represent the 95% confidence interval (CI) for each included study. The combined estimate (“Combined”) is based on the random-effects model and is indicated with diamonds. Corresponding heterogeneities (I2) with 95% CIs are provided at the bottom right corner: I2 = 100% × (Q − df)/Q, where Q is Cochran’s heterogeneity statistic and df is the degrees of freedom.

Hierarchical summary receiver operating characteristic (HSROC) curve for using ankle ultrasound for anterior talofibular ligament injury. The summary point (red box) indicates that the summary sensitivity was 0.99 and the summary specificity was 0.92. The 95% confidence region represents the 95% confidence intervals (CIs) of summary sensitivity and specificity, and the 95% prediction region represents the 95% CIs of sensitivity and specificity for each included study. The study estimates indicate the sensitivity and specificity estimated using the data from each study. The size of the marker is scaled according to the total number of patients in each study.
Overall diagnostic performance of US for diagnosing CFL injuries
Five studies (19,35,36,40,42) had sensitivity in the range of 0.94–1.00 and specificity in the range of 0.60–1.00. The pooled sensitivity and specificity values were 0.95 (95% CI = 0.89–0.98) and 0.99 (95% CI = 0.61–1.00), respectively. The Q test revealed significant heterogeneity (Q = 8.690, P = 0.006), with substantial and considerable heterogeneity detected for sensitivity (I2 = 76.14%) and specificity (I2 = 93.50%), respectively. A threshold effect was not observed in the coupled forest plot of sensitivity and specificity (Fig. 5) or in the correlation between sensitivity and the false-positive rate (0.179, 95% CI = −0.514 to 0.792), which was below the 0.6 cut-off. The area under the HSROC curve (AUC) was 0.95 (95% CI = 0.93–0.97) (Fig. 6).

Coupled forest plots for the summary sensitivity and specificity of ankle ultrasound for calcaneofibular ligament injury.

Hierarchical summary receiver operating characteristic (HSROC) curve for using ankle ultrasound for calcaneofibular ligament injury. The summary point (red box) indicates that the summary sensitivity was 0.95 and the summary specificity was 0.99.
Subgroup analysis: diagnostic performance of ankle US according to severity of ATFL injury
Four studies (19,35,39,41) were included for subgroup analysis. In terms of complete ATFL tear, the pooled sensitivity and specificity were 0.96 (95% CI = 0.90–0.98) and 0.82 (95% CI = 0.72–0.89), respectively. The AUC was 0.96 (95% CI = 0.94–0.98). In terms of partial ATFL tear, the pooled sensitivity and specificity were 0.90 (95% CI = 0.83–0.94) and 0.82 (95% CI = 0.66–0.92), respectively. The AUC was 0.93 (95% CI = 0.90–0.95). We did not perform subgroup analysis with regard to CFL due to the small number of studies.
Meta-regression analysis
The results of the meta-regression analyses with regard to ATFL tear are summarized in Table 3. We did not perform meta-regression analysis with regard to CFL tear due to the small number of studies. The significant sources of heterogeneity in terms of sensitivity were the proportion of ATFL tears (P < 0.01), US interpreter (P = 0.03), and reference standard (P < 0.01). Sensitivity was higher in studies with a relatively high proportion of ATFL tears (≥50%), with musculoskeletal radiologists as interpreters, and with surgical findings as the reference standard than in studies with a relatively low proportion of ATFL tears (<50%), with emergency physicians as interpreters, and with MRI as the reference standard. There was no significant source of heterogeneity in terms of specificity.
Meta-regression analyses for potential sources of heterogeneity in terms of anterior talofibular ligament (ATFL) tear.
Italic text indicates statistical significance (P < 0.05).
CI, confidence interval; No., number; ATFL, anterior talofibular ligament; CFL, calcaneofibular ligament; MSK, musculoskeletal; MRI, magnetic resonance imaging; US, ultrasound.
Discussion
The present meta-analysis revealed that ankle US was excellent for diagnosing ATFL (sensitivity = 99%, specificity = 92%) and CFL injuries (sensitivity = 95%, specificity = 99%). Considering these findings, ankle US is a useful imaging modality for diagnosing ATFL and CFL injuries. In subgroup analysis, the diagnostic performance of ankle US for diagnosing complete and partial ATFL tears was excellent (complete ATFL tear, sensitivity = 96%, specificity = 82%; partial ATFL tear, sensitivity = 90%, specificity = 82%).
Traditionally, the initial assessment of the extent of ankle injuries has been done using radiography, including stress views (44,45). However, the value of stress radiographs is debatable, especially in the acute phase, because the findings are severely influenced by the radiographic technique, the amount of force applied to the joint, and patient cooperation. The latter depends on the amount of pain, refractory muscle spasm, and edema of the surrounding soft tissue (44). With this background, ankle US for diagnosing ATFL and CFL injuries has been investigated and recent advances in US equipment have considerably increased the diagnostic usefulness of this technique (7,46,47).
Our results may have important clinical implications. First, point-of-care US is increasingly being used to facilitate accurate and timely diagnoses and to plan treatment. Moreover, it may improve patient satisfaction because it can expedite clinical decision-making and is performed at the bedside (48–50). Our results can be expanded on by making an indirect comparison between US and stress radiograph findings. A recent study (21) comparing ankle US with stress radiographs demonstrated that US is more accurate. In stress radiographs, anterior translation and talar tilt angle showed variable sensitivity (15.8–78.9%) and specificity (0–57.9%). These sensitivity and specificity values of stress radiographs are lower than those of US in our meta-analysis. Therefore, we speculate that US may be more useful than stress radiographs for the evaluation of ATFL and CFL injuries. Second, US is a dynamic study (and can include a stress test) and so is able to gauge how unstable the affected ankle is. Third, it can evaluate the entire portion of the ligament in longitudinal view (46). Fourth, it can evaluate the contralateral ankle, allowing comparison to determine whether the ligament or tendon is thinned or thickened. Finally, the use of US may reduce the radiation dose. Therefore, on the basis of the ‘first do no harm’ principle and the ‘as low as reasonably achievable’ concept, we believe the use of US is warranted to reduce radiation exposure in patients with suspected ATFL and CFL injuries.
Our meta-regression analysis revealed that the US interpreter was a source of heterogeneity. In particular, the pooled sensitivity was higher in studies with musculoskeletal radiologists than in studies with emergency physicians. Thus, we recommend that musculoskeletal radiologists perform the initial US for diagnosing ATFL and CFL injuries. If clinical physicians, including emergency physicians, perform ankle US, we speculate that focused musculoskeletal US training is required. Indeed, emergency US has recently been formally incorporated into emergency medicine fellowship training because US can show different diagnostic performance according to the operator’s proficiency (51).
This meta-analysis only examined studies where the diagnostic performance of US was based on conclusive cases, as the eligible studies did not include cases with equivocal or inconclusive findings. Moreover, almost all studies using US emphasized its diagnostic performance alone and did not compare it to other modalities. Thus, a comprehensive study that includes equivocal and inconclusive cases and other methodology may be needed to confirm the usefulness of ankle US as an initial diagnostic tool in routine clinical practice.
The present study has several limitations. The first limitation is the relatively small number of included studies. Nevertheless, we were able to draw several important conclusions regarding the diagnostic performance of US and related factors, which we believe provides a useful overview because we used broad search terms and only included easily accessible studies (published in English and available in the PubMed and EMBASE databases). The second limitation is that all included studies revealed positive results, and that fact could be attributed to publication bias. Although we omitted Deeks’ funnel plots according to the PRISMA-DTA guidelines, we observed a low probability of publication bias (overall, P = 0.75–0.90; subgroup analysis, P = 0.26–0.96), which suggests that this factor did not undermine our results. The third limitation is the methodological differences between the included studies, and the extensive meta-regression analysis revealed that these variables were also significant sources of heterogeneity. This methodological diversity might affect the pooled estimates, especially as the US technical parameters were not assessed in the meta-regression analysis because not all studies reported the values for gain, dynamic range, and mechanical index. Further studies with larger sample sizes are needed to determine the optimal parameters for ankle US.
In conclusion, ankle US, which is a non-invasive, radiation-free modality, demonstrates high diagnostic performance in the diagnosis of ATFL and CFL injuries. We recommend that US should be performed by musculoskeletal radiologists as a first-line diagnostic tool to more accurately diagnose ATFL and CFL injuries.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
