Abstract
Objective
Pediatric head trauma is common, but computed tomography exposes children to ionizing radiation. This systematic review and meta-analysis evaluated the diagnostic accuracy of point-of-care ultrasound for pediatric skull fractures and clarified its role as an adjunct to clinical assessment rather than a replacement for computed tomography when intracranial injury is suspected.
Methods
We conducted a systematic review and bivariate random-effects diagnostic test meta-analysis guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses 2020 statement and registered in the International Prospective Register of Systematic Reviews (Registration Number: CRD420251139217). PubMed, Embase, the Cochrane Library, and Web of Science were searched from inception through 3 September 2025. Two reviewers independently screened studies, extracted 2 × 2 diagnostic data, and assessed risk of bias using the Quality Assessment of Diagnostic Accuracy Studies-2 tool.
Results
Nine studies conducted in emergency department settings met the inclusion criteria. Point-of-care ultrasound demonstrated a pooled sensitivity of 0.90 (95% confidence interval: 0.84–0.94), specificity of 0.98 (95% confidence interval: 0.94–0.99), and an area under the summary receiver operating characteristic curve of 0.96 (95% confidence interval: 0.94–0.97). The summary positive likelihood ratio was 41.73 (95% confidence interval: 15.85–109.87), and the negative likelihood ratio was 0.10 (95% confidence interval: 0.07–0.17). Deeks’ funnel plot showed no evidence of small-study effects (P = 0.80).
Conclusions
Point-of-care ultrasound shows high diagnostic accuracy for detecting pediatric skull fractures and may support bedside risk stratification in selected children with low- or intermediate-risk mild head trauma. However, most isolated linear skull fractures are managed conservatively, and point-of-care ultrasound does not evaluate intracranial injury. Computed tomography decisions should therefore remain anchored in neurological status, injury mechanism, validated pediatric head injury decision rules, and clinician judgment.
Keywords
Background
Traumatic head injury is among the most frequent reasons for pediatric emergency department (ED) visits. 1 Skull fractures occur in 2%–20% of pediatric head-trauma presentations, with higher rates in younger children because of thinner calvaria and greater susceptibility to impact. 2 Epidemiological studies indicate that pediatric head trauma remains a substantial ED and public health burden across regions.3,4 The clinical challenge is to identify children who need computed tomography (CT) for possible intracranial injury while avoiding routine CT in children at very low risk.
CT remains the reference standard for evaluating clinically important traumatic brain injury (ciTBI) and can identify skull fractures and associated intracranial pathology when imaging is indicated.5–7 However, CT exposes children to ionizing radiation. This concern is clinically relevant because children have developing tissues and a longer lifetime during which radiation-associated effects may emerge.8–11 The large Pediatric Emergency Care Applied Research Network (PECARN) cohort identified children at very low risk of ciTBI for whom CT may be unnecessary, and subsequent validation work has reinforced the clinical value of decision rules such as PECARN, Canadian Assessment of Tomography for Childhood Head Injury (CATCH), and Children’s Head Injury Algorithm for the Prediction of Important Clinical Events (CHALICE).5,6 Current pediatric mild traumatic brain injury (mTBI) guidance also recommends against routine imaging when mTBI is diagnosed clinically and no risk factors for more serious intracranial injury are present.7,12
Point-of-care ultrasound (POCUS) has emerged as a promising non-ionizing alternative for bedside evaluation of skull fractures in children, offering advantages such as portability, real-time imaging, lack of radiation, and cost-effectiveness compared with CT.13,14 Operationally, trained emergency physicians using high-frequency linear probes can identify cortical discontinuity, irregularity, or step-off suggestive of fracture, enabling rapid ED triage at the bedside.14,15 Preliminary studies report diagnostic accuracies with sensitivities ranging from 77% to 100% and specificities up to 100% for identifying pediatric skull fractures, suggesting that POCUS could reduce CT utilization in appropriately selected children.16–18 Despite these benefits, variation in operator training, image-acquisition protocols, and reference standards across studies limits comparability and underscores the need for formal evidence synthesis to inform practice. 19
This systematic review and meta-analysis aimed to synthesize the available evidence on POCUS for detecting pediatric skull fractures; estimate pooled sensitivity, specificity, and overall accuracy; and contextualize the clinical role of skull ultrasound in pediatric head injury decision making.
Methods
This systematic review and meta-analysis was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses 2020 statement (PRISMA 2020). 20 The protocol was prospectively registered in the International Prospective Register of Systematic Reviews (PROSPERO; CRD420251139217). A comprehensive literature search of PubMed, Embase, the Cochrane Library, and Web of Science Core Collection was performed from database inception through 3 September 2025 to identify studies assessing the diagnostic accuracy of POCUS in detecting pediatric skull fractures.
The search strategy incorporated controlled vocabulary terms (e.g. Medical Subject Headings (MeSH) in PubMed and Emtree in Embase) and free-text keywords related to “skull fracture,” “ultrasound,” and “children.” To maximize sensitivity, no language or publication-status restrictions were applied, and backward and forward citation tracking complemented database queries by screening the reference lists of eligible studies and relevant reviews. Detailed search strategies specific to each database are provided in Supplementary Attachment 1.
Inclusion and exclusion criteria
Studies were eligible for inclusion if they met the following criteria: (a) involved pediatric patients (aged 0–18 years) with suspected skull fractures; (b) evaluated ultrasound as the index test for diagnosing skull fractures against a prespecified radiologic reference standard (CT with or without radiography), interpreted independently of the index test; (c) reported, or allowed derivation of, 2 × 2 contingency data (true positives (TPs), false positives (FPs), false negatives (FNs), true negatives (TNs)) enabling calculation of sensitivity, specificity, and diagnostic odds ratio (DOR); and (d) were original research articles, including prospective cohort, diagnostic cross-sectional, or case–control designs.
Exclusion criteria included the following: (a) studies involving animals or focusing on fractures other than skull fractures; (b) non-original articles, such as reviews, editorials, letters, conference abstracts, and case reports or case series without extractable accuracy data; and (c) studies with partial verification, incorporation bias, an unclear reference standard, or otherwise lacking complete diagnostic accuracy data. Where multiple reports overlapped by population, site, or timeframe, we included the most comprehensive, non-duplicative dataset, prioritizing the largest sample size and most complete verification, to avoid double counting.
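As a minimal sketch of why complete 2 × 2 data were required, the following Python function derives sensitivity, specificity, likelihood ratios, and the DOR from TP, FP, FN, and TN counts. The function name, the 0.5 continuity correction for zero cells, and the example counts are assumptions made for illustration; they are not taken from any included study or from the analysis software used in this review.

```python
def accuracy_from_2x2(tp, fp, fn, tn, correction=0.5):
    """Derive diagnostic accuracy measures from 2 x 2 counts.

    Assumption of this sketch: a continuity correction is added to every
    cell only when at least one cell is zero, so that ratios stay finite.
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (c + correction for c in (tp, fp, fn, tn))
    sensitivity = tp / (tp + fn)           # diseased correctly detected
    specificity = tn / (tn + fp)           # non-diseased correctly cleared
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    dor = (tp * tn) / (fp * fn)            # equals lr_pos / lr_neg
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": dor,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = accuracy_from_2x2(tp=45, fp=2, fn=5, tn=98)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A study that reports only sensitivity and specificity, without the underlying counts or group sizes, cannot feed this calculation, which is the reason such reports were not eligible.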
Data extraction and quality assessment
Two independent reviewers screened titles, abstracts, and full texts in parallel using a piloted protocol, with disagreements resolved first by discussion and, if needed, by a third reviewer. Data were extracted independently and in duplicate using a standardized, piloted form and included study characteristics (first author, year, country, and sample size), patient demographics (age group and clinical setting), ultrasound details (device, probe, prespecified fracture signs, and operator training/experience), reference standard (CT with or without radiography), time interval and blinding between index and reference tests, derivable 2 × 2 data (TP, FP, FN, and TN), and diagnostic performance metrics (sensitivity, specificity, and area under the curve (AUC)).
Methodological quality was evaluated independently by two reviewers using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool. 21 This assessment covered four domains: patient selection, index test, reference standard, and flow and timing. Risks of bias and applicability concerns were rated as low, high, or unclear, with disagreements adjudicated as above, enabling an objective appraisal of study rigor and potential sources of heterogeneity. Domain-level judgments were summarized graphically as traffic-light plots and used to guide interpretation of pooled estimates and exploration of heterogeneity.
Statistical analysis
Meta-analyses were conducted using Stata/SE version 18.0 (StataCorp, College Station, TX, USA). A bivariate random-effects model, equivalent to the hierarchical summary receiver operating characteristic (HSROC) approach, was employed to pool sensitivity and specificity, construct the summary receiver operating characteristic (SROC) curve, and calculate the AUC. 22 From the bivariate model, the pooled DOR and the positive and negative likelihood ratios (LRs; LR+ and LR−) were derived, whereas heterogeneity was summarized through between-study variance components and quantified using Cochran's Q and the I² statistic. Publication bias was assessed using Deeks’ funnel plot asymmetry test (P < 0.10 indicating potential small-study effects). For clinical translation, an LR scattergram and an illustrative Fagan nomogram (pretest probability fixed at 50%) were generated to estimate posttest probabilities. All analyses used the MIDAS module for diagnostic test accuracy meta-analysis, with cross-checks using the METANDI command where appropriate. Two-sided statistical significance was set at P < 0.05.
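For readers unfamiliar with Deeks’ test, the sketch below shows the regression that underlies it as it is commonly described: the log DOR of each study is regressed on the inverse square root of its effective sample size (ESS), with ESS as the weight, and a slope that differs from zero suggests small-study effects. This is an illustrative Python reimplementation under those assumptions, not the MIDAS code used for the analysis; the function name, the continuity correction, and the example counts are hypothetical. The P value would come from a t distribution with k − 2 degrees of freedom and is omitted here to keep the example within the standard library.

```python
import math


def deeks_regression(studies, correction=0.5):
    """Weighted regression of ln(DOR) on 1/sqrt(ESS), with weights = ESS.

    studies: iterable of (tp, fp, fn, tn) tuples.
    Returns (intercept, slope, se_slope).
    """
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in studies:
        if 0 in (tp, fp, fn, tn):  # assumed zero-cell handling
            tp, fp, fn, tn = (c + correction for c in (tp, fp, fn, tn))
        n_dis, n_non = tp + fn, fp + tn
        ess = 4 * n_dis * n_non / (n_dis + n_non)  # effective sample size
        xs.append(1 / math.sqrt(ess))
        ys.append(math.log((tp * tn) / (fp * fn)))
        ws.append(ess)

    k = len(xs)
    if k < 3:
        raise ValueError("need at least three studies")

    w_sum = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(ws, ys)) / w_sum
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum(w * (y - intercept - slope * x) ** 2
              for w, x, y in zip(ws, xs, ys))
    se_slope = math.sqrt(rss / (k - 2) / sxx)
    return intercept, slope, se_slope


# Hypothetical counts in which every study has the same DOR (= 4),
# so no relationship with study size is expected.
equal_dor = [(10, 5, 5, 10), (20, 10, 10, 20), (40, 20, 20, 40)]
intercept, slope, se_slope = deeks_regression(equal_dor)
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

In this constructed example the slope is essentially zero and the intercept equals ln(4), which is the pattern expected when accuracy does not depend on study size.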
Results
Study selection and characteristics
The literature search yielded 457 records. After removing 133 duplicates, 324 titles and abstracts were screened, resulting in the exclusion of 304 records (primarily case reports, conference abstracts, editorials, and reviews). Of the remaining 16 full-text articles assessed for eligibility, 7 were excluded (reasons detailed in Supplementary Table S1) for not meeting inclusion criteria. Ultimately, nine studies were included in the meta-analysis.13–17,23–26 The selection process is illustrated in the PRISMA 2020 flow diagram (Figure 1).

Figure 1. PRISMA 2020 flow diagram of study identification and selection.
All included studies were prospective diagnostic accuracy investigations conducted in pediatric EDs and primarily used CT as the reference standard (one study permitted CT and/or radiography). Participants were children with head trauma or suspected skull fractures, with sample sizes ranging from 21 to 538. Ultrasound operators were predominantly emergency medicine physicians (frequently pediatric emergency specialists), employing high-frequency linear probes (typically 5–10 MHz). One large study used a 2–5 MHz phased-array probe for initial screening. Positive ultrasound findings were generally defined as cortical discontinuity, cortical irregularity, or cortical step-off, confirmed in multiple imaging planes and distinguished from sutures and open fontanelles using anatomical landmarks and contralateral comparison. Comprehensive study-level characteristics, patient demographics, ultrasound parameters, and diagnostic measures are summarized in Tables 1 and 2.13,14,16,17,23–26
Table 1. Characteristics of included studies and patients.
CT: computed tomography; GCS: Glasgow Coma Scale.
Table 2. Characteristics of ultrasound devices, operators, and diagnostic measures in the included studies.
EM: emergency medicine; US: ultrasound.
Publication bias
Deeks’ funnel plot asymmetry test revealed no significant evidence of publication bias or small-study effects (P = 0.80) (Figure 2). Accordingly, no trim-and-fill or other adjustment methods were applied to the pooled estimates.

Figure 2. Deeks’ funnel plot asymmetry test for small-study effects.
Risk of bias and quality assessment
The QUADAS-2 assessment indicated an overall moderate, domain-specific risk of bias across the included studies (Figure 3). High risk was noted in patient selection for several studies because of convenience sampling or unclear enrollment methods. The index-test domain showed generally low risk, as ultrasound interpretations were often blinded to the reference standard. Risks in the reference-standard and flow/timing domains were mostly low, although a subset of studies had uncertain or variable intervals between index and reference tests. Applicability concerns were low overall, confirming good alignment with the review question. Taken together, these findings suggest that threats to validity stem chiefly from sampling frames and verification timing rather than from test execution or reference-standard quality.

Figure 3. QUADAS-2 quality assessment of the included studies.
Diagnostic performance
The SROC curve demonstrated excellent overall diagnostic accuracy (AUC = 0.96; 95% confidence interval (CI): 0.94–0.97) (Figure 4). Pooled sensitivity was 0.90 (95% CI: 0.84–0.94), and pooled specificity was 0.98 (95% CI: 0.94–0.99). These pooled estimates, together with the high AUC, support strong rule-in capability and moderate rule-out value for skull fracture detection, whereas clinical exclusion of intracranial injury still requires neurological assessment and validated head injury decision pathways.

Figure 4. Summary receiver operating characteristic curve for the diagnostic accuracy of ultrasound for pediatric skull fractures.
Quantitative synthesis
Forest plots of diagnostic metrics are presented in Figure 5. Sensitivity ranged from 77% to 100%, with low heterogeneity (Q = 10.34, df = 8, P = 0.24; I² = 22.66%, 95% CI: 0%–80.3%) (Figure 5(a)). Specificity ranged from 85% to 100%, with moderate heterogeneity (Q = 26.30, df = 8, P = 0.001; I² = 69.58%, 95% CI: 48.59%–90.58%) (Figure 5(b)). LR+ ranged from 6.14 to 250.17, showing moderate heterogeneity (Q = 21.66, df = 8, P = 0.01; I² = 36.43%, 95% CI: 0%–89.68%) (Figure 5(c)). LR− ranged from 0.01 to 0.25, with low heterogeneity (Q = 12.30, df = 8, P = 0.14; I² = 34.95%, 95% CI: 0%–85.41%) (Figure 5(d)).

Figure 5. Forest plots of pooled diagnostic performance: (a) sensitivity; (b) specificity; (c) positive likelihood ratio; and (d) negative likelihood ratio.
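The relationship between Cochran's Q and the I² statistic used in this section can be illustrated with a short Python sketch based on the standard Higgins formula, I² = max(0, (Q − df)/Q) × 100%. The function name is hypothetical, and because the Q values in the text are rounded to two decimals, recomputed I² values may differ from the reported ones in the second decimal place.

```python
def i_squared(q, df):
    """Higgins' I^2 (in percent) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (q - df) / q) * 100


# Q and df as reported in the text for sensitivity and specificity.
i2_sensitivity = i_squared(10.34, 8)
i2_specificity = i_squared(26.30, 8)
print(f"I2 sensitivity: {i2_sensitivity:.2f}%")
print(f"I2 specificity: {i2_specificity:.2f}%")
```

The statistic expresses the share of total variation across studies that exceeds what chance alone would produce, which is why a Q close to its degrees of freedom yields an I² near zero.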
Diagnostic score and DOR
The diagnostic score ranged from 4.31 to 7.09, with low-to-moderate heterogeneity (Q = 12.44, df = 8, P = 0.13; I² = 35.69%, 95% CI: 0%–85.60%) (Figure 6(a)). The pooled diagnostic score was 5.99 (95% CI: 5.01–6.96). By contrast, DORs varied widely from 74.25 to 1197.00, and the pooled DOR was 397.56 (95% CI: 150.07–1053.22), with very high heterogeneity (Q = 3875.20, df = 8, P < 0.001; I² = 99.79%, 95% CI: 99.76%–99.82%) (Figure 6(b)). Because the DOR integrates sensitivity and specificity, this extreme variability likely reflects greater dispersion in specificity rather than instability in sensitivity.

Figure 6. Forest plots of (a) diagnostic score and (b) diagnostic odds ratio.
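The pooled values above are consistent with the diagnostic score being the natural logarithm of the DOR, which is how the two quantities are usually related in diagnostic meta-analysis output. The short Python check below, with a hypothetical function name, reproduces the pooled diagnostic score and its confidence limits from the reported pooled DOR.

```python
import math


def diagnostic_score(dor):
    """Diagnostic score, taken here to be the natural log of the DOR."""
    return math.log(dor)


# Pooled DOR and 95% CI limits as reported in the text.
pooled = diagnostic_score(397.56)
lower = diagnostic_score(150.07)
upper = diagnostic_score(1053.22)
print(f"score = {pooled:.2f} (95% CI: {lower:.2f} to {upper:.2f})")
```

Working on the log scale compresses very large odds ratios, so the same between-study differences look far less dispersed as diagnostic scores than as raw DORs.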
LR scattergram and Fagan plot
The LR scattergram indicated that ultrasound performs well for confirming skull fractures (summary LR+ >10) and moderately for ruling them out (summary LR− approximately 0.1) (Figure 7(a)). In the Fagan nomogram, assuming a pretest probability of 50%, a positive ultrasound result increased the posttest probability to approximately 98%, whereas a negative result decreased it to approximately 9% (Figure 7(b)).

Figure 7. (a) Likelihood ratio scattergram and (b) Fagan nomogram for clinical interpretation.
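The posttest probabilities read from the Fagan nomogram follow from Bayes' theorem in odds form: posttest odds equal pretest odds multiplied by the likelihood ratio. The Python sketch below, with a hypothetical function name, applies this to the summary likelihood ratios and the 50% pretest probability used in the text.

```python
def posttest_probability(pretest, likelihood_ratio):
    """Convert a pretest probability to a posttest probability via odds."""
    pretest_odds = pretest / (1 - pretest)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


# Summary likelihood ratios and pretest probability as reported in the text.
after_positive = posttest_probability(0.50, 41.73)
after_negative = posttest_probability(0.50, 0.10)
print(f"after a positive scan: {after_positive:.1%}")
print(f"after a negative scan: {after_negative:.1%}")
```

Because the 50% pretest probability is illustrative, the same function can be rerun with a locally appropriate fracture prevalence to obtain posttest probabilities for a given clinical setting.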
Discussion
The aim of this study was to evaluate the diagnostic performance of POCUS for identifying skull fractures in children presenting to the ED after blunt head injury. The pooled results showed high sensitivity and specificity. These findings support POCUS as a useful bedside test for detecting skull fractures, particularly when the area of impact is localized and the operator is trained in differentiating fractures from sutures, fontanelles, and normal cranial contours. The clinical interpretation, however, should remain narrow: skull ultrasound detects calvarial cortical disruption, not intracranial hemorrhage or ciTBI.
These findings align closely with prior systematic reviews and meta-analyses, reinforcing the growing evidence base for POCUS in this context.19,27 For example, a 2021 meta-analysis of six studies reported a pooled sensitivity of 91% and specificity of 96%, with an LR+ of 14.4, concluding that POCUS substantially alters posttest probabilities for skull fractures. 27 Similarly, a 2022 review of seven studies yielded comparable pooled estimates (sensitivity, 91%; specificity, 96%), emphasizing POCUS as a valid diagnostic alternative in EDs with minimal methodological concerns. 19 Our analysis extends this literature by incorporating more recent studies and by explicitly linking performance to practical determinants, including operator training, probe selection, and fracture definitions. This addition helps clarify why specificity showed greater dispersion than sensitivity. The consistency across reviews suggests reproducible diagnostic performance, but the slightly lower sensitivity in our meta-analysis likely reflects a broader range of operator experience and variation in fracture criteria, including requirements for cortical disruption in multiple planes.
The main clinical value of diagnosing a skull fracture in mild pediatric head trauma is not that the fracture itself usually requires surgery. Most simple isolated linear skull fractures without intracranial findings are managed conservatively. 1 Rather, a skull fracture can act as a marker of injury energy and may increase concern for associated intracranial injury in selected children. Therefore, POCUS should not be used to justify CT solely to document a fracture in a clinically stable child whose symptoms and decision-rule assessment do not support imaging. Conversely, a positive POCUS result may increase posttest probability and support closer observation, CT, neurosurgical consultation, or shared decision making when clinical features suggest higher risk.
This distinction is important because clinicians managing pediatric head injury are primarily concerned with intracranial injury rather than skull fracture alone. The PECARN rule was derived to identify children at very low risk of ciTBI for whom CT may be unnecessary, and external validation studies have compared PECARN, CATCH, and CHALICE in large pediatric cohorts.5,6 Current pediatric mTBI guidance also recommends that imaging should not be performed routinely to diagnose mTBI and that validated decision rules should guide imaging when more serious intracranial injury is suspected. 12 In this framework, POCUS can complement but should not replace neurological examination, mechanism-of-injury assessment, age-specific risk factors, observation, or CT when indicated.
The radiation-sparing rationale remains clinically relevant. The Lancet cohort study by Pearce et al. found that childhood CT radiation exposure was associated with increased risks of leukemia and brain tumors, particularly at higher cumulative doses. 11 More recent evidence has also renewed concern about hematologic malignancy and cancer risk from CT radiation exposure in children and young adults.8–10 However, these risks should not be interpreted as a reason to avoid CT when intracranial injury is suspected. Rather, they support careful CT stewardship: CT should be performed promptly when decision rules or clinical deterioration indicate risk, whereas alternative bedside tools, observation, and shared decision making may help avoid imaging in appropriately selected low-risk presentations.
Evidence in infants deserves separate interpretation. Several included studies enrolled infants and very young children, including cohorts aged 0–4 years, younger than 2 years, and 0–6 years.15,16,23 These data suggest that POCUS can be feasible in infants when performed by trained emergency clinicians. However, neonatal-specific diagnostic accuracy evidence remains insufficient. Open fontanelles, developing sutures, ongoing calvarial ossification, and a higher need to consider nonaccidental trauma may complicate interpretation in neonates. Therefore, POCUS in neonates should be considered investigational or highly operator-dependent until prospective neonatal studies define its accuracy, reliability, and clinical pathway role.
Several limitations should be acknowledged. The number of included studies was modest, and most were conducted in pediatric EDs with trained emergency clinicians, which may limit generalizability to settings with less POCUS experience. Some studies used convenience sampling, and timing between ultrasound and CT was not uniformly reported. Specificity heterogeneity was moderate and may reflect differences in ultrasound criteria, operator experience, fracture location, probe type, and methods for distinguishing fractures from sutures or open fontanelles. The meta-analysis evaluated diagnostic accuracy for skull fractures, not direct detection of intracranial injury or clinical outcomes such as CT avoidance, ED length of stay, missed ciTBI, return visits, neurosurgical intervention, or parental satisfaction.
Future research should move beyond test accuracy alone. Multicenter pragmatic studies should evaluate standardized scanning protocols, structured training thresholds, interobserver reliability, age-stratified performance, and integration with PECARN- or CHALICE-based pathways. Studies should also measure downstream outcomes, including CT utilization, observation duration, ED throughput, return visits, and missed intracranial injuries. Neonates and infants younger than 3 months require dedicated study because their anatomy and safeguarding considerations differ from those of older children.
Conclusion
POCUS has high diagnostic accuracy for detecting pediatric skull fractures in ED settings. It may support bedside assessment and risk stratification in selected low- or intermediate-risk children, but it should not be presented as a standalone screening test for ciTBI. CT decisions should remain based on neurological status, mechanism of injury, validated pediatric head injury decision rules, and clinician judgment. Multicenter studies with standardized training and outcome-based endpoints are needed to confirm clinical benefit and implementation feasibility.
Supplemental Material
sj-docx-1-imr-10.1177_03000605261460310 - Supplemental material for Diagnostic performance of point-of-care ultrasound for pediatric skull fractures: A systematic review and meta-analysis
Supplemental material, sj-docx-1-imr-10.1177_03000605261460310 for Diagnostic performance of point-of-care ultrasound for pediatric skull fractures: A systematic review and meta-analysis by Xiaoyang Wang, Gaofeng Rao, Gang Yang, Yangtian Ye, Xiaoshuang Jiang, Jiuzhou Lin, Min Tang, Lihui Chen, Liuxian Pan, Weiting Chen and Xianlong Wu in Journal of International Medical Research
Supplemental Material
sj-docx-2-imr-10.1177_03000605261460310 - Supplemental material for Diagnostic performance of point-of-care ultrasound for pediatric skull fractures: A systematic review and meta-analysis
Supplemental material, sj-docx-2-imr-10.1177_03000605261460310 for Diagnostic performance of point-of-care ultrasound for pediatric skull fractures: A systematic review and meta-analysis by Xiaoyang Wang, Gaofeng Rao, Gang Yang, Yangtian Ye, Xiaoshuang Jiang, Jiuzhou Lin, Min Tang, Lihui Chen, Liuxian Pan, Weiting Chen and Xianlong Wu in Journal of International Medical Research
Footnotes
Acknowledgments
This manuscript was edited for language clarity using Grammarly Premium and ChatGPT (OpenAI). The authors reviewed and approved all AI-assisted edits and take full responsibility for the scientific content, data accuracy, and conclusions.
Author contributions
Conceptualization: Xiaoyang Wang, Weiting Chen, and Xianlong Wu. Methodology: Xiaoyang Wang, Gaofeng Rao, Gang Yang, and Weiting Chen. Systematic search and study selection: Yangtian Ye, Xiaoshuang Jiang, Jiuzhou Lin, and Min Tang. Data curation: Yangtian Ye, Xiaoshuang Jiang, Jiuzhou Lin, and Min Tang. Risk of bias and quality assessment: Lihui Chen and Liuxian Pan. Formal analysis: Xiaoyang Wang and Yangtian Ye. Visualization: Xiaoyang Wang and Yangtian Ye. Writing – original draft: Xiaoyang Wang, Gaofeng Rao, and Gang Yang. Writing – review and editing: Weiting Chen, Xianlong Wu, and all authors. Supervision: Xiaoyang Wang, Gang Yang, Gaofeng Rao, and Weiting Chen. Project administration: Xiaoyang Wang, Weiting Chen, and Xianlong Wu. All authors read and approved the final manuscript and agree to be accountable for all aspects of the work.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Ethics approval and consent to participate
Not applicable. This systematic review and meta-analysis used data from previously published studies and did not involve new individual patient data collection.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
All data generated or analyzed during this study are included in this article and its supplementary materials.
Preprint statement
This manuscript has not been posted as a preprint.
Supplemental material
Supplemental material for this article is available online.
References
