Abstract
Background:
Orthopedic patients, especially those with bone tumors, are prone to perioperative acute kidney injury (AKI). This study integrates ultrasound radiomics with machine learning to predict AKI risk.
Methods:
A retrospective cohort of 120 patients from this center with fractures or bone tumors was analyzed. Ultrasound images were preprocessed and manually segmented to define kidney regions of interest. Morphological, texture, intensity, and higher order features were extracted. Feature selection was performed using least absolute shrinkage and selection operator, random forest importance ranking, and support vector machine–recursive feature elimination. Ten machine learning algorithms were trained with internal cross validation, and their performance was assessed by area under the curve (AUC), accuracy, calibration, and decision curve analysis. The final optimized model was applied to real-world ultrasound images, and class activation mapping was used to visualize AKI-related regions through interpretable heatmaps.
Results:
Three radiomic features consistently associated with AKI were identified across all the selection methods. Among the tested algorithms, the extreme gradient boosting (XGBoost) model achieved the best performance, with AUCs of 0.932, 0.922, and 0.893 in the training, internal, and external validation sets, respectively. The model demonstrated good calibration and high clinical net benefit. An interpretable risk-scoring system visualized individualized postoperative AKI risk, revealing higher predicted risk in bone tumor and complex fracture patients.
Conclusion:
This study highlights three ultrasound-derived features as critical determinants of postoperative AKI. The XGBoost model built on these features provides accurate and interpretable prediction in orthopedic patients and holds promise for guiding individualized perioperative kidney protection.
Introduction
Acute kidney injury (AKI) is a sudden decrease in kidney function that develops over hours to days, usually signaled by a rapid rise in serum creatinine or reduced urine output. A global pooled analysis found that AKI occurs in approximately 21.6% of hospitalized adults and 33.7% of hospitalized children. 1 AKI carries a marked short-term mortality burden ranging from 40% to 52%, and outcomes are particularly poor for patients who require renal replacement therapy.2,3 If not promptly recognized and managed, AKI often leads to incomplete renal recovery and progression to chronic kidney disease (CKD), with survivors demonstrating a markedly increased risk of incident CKD. 4 In orthopedic patients, multiple overlapping insults, such as substantial blood loss and intraoperative hypotension causing renal hypoperfusion and ischemic tubular injury, myoglobin release from muscle crush or rhabdomyolysis producing tubular toxicity and obstruction, exposure to nephrotoxic agents (e.g., nephrotoxic antibiotics, NSAIDs, chemotherapeutics) and iodinated contrast, and postoperative infection or systemic inflammation leading to endothelial and microcirculatory dysfunction, combine to raise the risk of AKI.5,6 Preexisting conditions such as advanced age, baseline renal impairment, diabetes, massive transfusion, and prolonged operative time further increase risk and impede recovery.6,7 A previous study reported that AKI incidence in orthopedic cohorts ranges from 8% to 30%, with higher rates in trauma- and tumor-related patients. 1 AKI in these populations is associated with greater short-term mortality and an increased risk of progression to CKD.4,8 Therefore, early identification and risk stratification of AKI in orthopedic patients are extremely important.
Advances in early detection have followed two complementary paths: biochemical biomarkers and data-driven prediction models. Urinary cell-cycle arrest markers (TIMP-2 and IGFBP7) and tubular injury markers such as NGAL have been prospectively validated to identify patients at imminent risk of moderate-to-severe AKI, often earlier than changes in serum creatinine.9,10 Concurrently, imaging-based approaches, including quantitative ultrasound (ultrasomics) and other radiomic methods, seek to capture subtle changes in tissue structure and blood flow that indicate renal vulnerability or systemic injury.11 These techniques are promising but remain at the pilot stage for AKI prediction. In a retrospective single-center cohort of patients undergoing cardiovascular surgery, investigators developed a radiomic machine learning model that identified left ventricular ejection fraction (LVEF), red-blood-cell transfusion, and intensive care unit (ICU) mortality as independent predictors of AKI and used seven hippocampal radiomic features to predict AKI-related delirium.12 In another retrospective study, a contrast-enhanced ultrasound (CEUS)-based radiomic nomogram accurately predicted microvascular invasion in hepatocellular carcinoma preoperatively. Although ultrasomic applications have made progress in cardiovascular risk prediction, oncology, and surgical outcome forecasting, few studies have specifically integrated ultrasomics with advanced algorithms to predict AKI in orthopedic fracture and bone tumor populations. Biochemical biomarkers and existing risk scores have shown promise, but they were mainly developed in cardiac or ICU cohorts and can be confounded in orthopedic patients by muscle injury, transfusion, and systemic inflammation, which limits their specificity and bedside utility. Ultrasound radiomics offers a portable, repeatable bedside approach to quantify tissue and perfusion signals over time, which motivated the present study.
In this retrospective study, the authors develop and validate an interpretable machine-learning model that integrates postoperative ultrasound radiomic features to predict the risk of perioperative AKI in patients with fractures or bone tumors. By providing a validated, interpretable prediction model that integrates ultrasound radiomics with routine clinical data, this study seeks to guide perioperative decision-making, optimize monitoring, and ultimately reduce AKI incidence and its downstream consequences.
Materials and Methods
Collection of clinical and ultrasound data
This study included patients with fractures and bone tumors who were treated at this hospital. The study was approved by the Institutional Ethics Committee (Approval No. GLMC202405054). All enrolled subjects met the inclusion and exclusion criteria and provided written informed consent before systematic data collection. Clinical data included demographic characteristics, medical history, imaging examination results, and surgery-related information. Ultrasound data were acquired by experienced sonographers. All patients underwent preoperative and postoperative ultrasound examinations to ensure comparability. During the examination, patients were placed in an appropriate position, and a high-frequency linear array probe was used for scanning. All images and video data were digitally stored and imported into the research database for subsequent analysis. The study workflow is illustrated in Figure 1.

Figure 1. Study workflow. Flowchart illustrating the overall study design, including patient enrollment, ultrasound image acquisition, preprocessing, feature extraction, feature selection, model construction, and validation.
Ultrasound data preprocessing
To ensure the quality and consistency of ultrasound images, preprocessing was performed after collecting the raw images. All ultrasound images were converted into a lossless format and anonymized to remove personal identifiers. The images were then resized to 512 × 512 pixels and normalized in grayscale to reduce variability caused by different ultrasound equipment and scanning parameters. To minimize noise, median filtering and Gaussian filtering were applied for smoothing while preserving lesion edge features as much as possible. Based on this, region of interest (ROI) segmentation was performed. The ROIs were independently delineated by two experienced sonographers, and disagreements were resolved by a third expert to ensure accuracy and reliability of kidney region segmentation.
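The normalization and median-filtering steps described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function names, the 3 × 3 window, and the min-max scaling rule are assumptions chosen for illustration, and resizing and Gaussian filtering are omitted for brevity.

```python
from statistics import median

def min_max_normalize(image):
    """Rescale gray levels to [0, 1]; a constant image maps to all zeros."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in image]
    return [[(p - lo) / span for p in row] for row in image]

def median_filter_3x3(image):
    """3x3 median filter; border pixels use only the neighbours that exist."""
    height, width = len(image), len(image[0])
    out = []
    for r in range(height):
        new_row = []
        for c in range(width):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
            ]
            new_row.append(median(window))
        out.append(new_row)
    return out

# Hypothetical 4x4 gray-level patch with one bright speckle at (1, 1).
toy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 50],
]
normalized = min_max_normalize(toy)      # speckle becomes 1.0, background 0.0
smoothed = median_filter_3x3(normalized)  # isolated speckle is suppressed
```

In this toy patch the isolated bright pixel is removed by the median filter while the uniform background is unchanged, which is the behavior that motivates median filtering for speckle-like noise.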
Ultrasound feature extraction
After ROI annotation, imaging features were extracted from multiple dimensions. Morphological features included area, perimeter, long-axis diameter, short-axis diameter, and long-to-short axis ratio. Texture features were calculated using the gray-level co-occurrence matrix and gray-level run-length matrix, yielding metrics such as energy, entropy, contrast, and homogeneity. Intensity features included mean gray value, standard deviation, skewness, and kurtosis of pixels within the ROI. In addition, to capture higher order imaging patterns, wavelet transform and multiscale filter bank analyses were applied to some samples, extracting multiscale and multidirectional texture features. All features were exported in standardized formats and stored in the research database.
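As a hedged illustration of the texture metrics named above, the sketch below builds a gray-level co-occurrence matrix for one pixel offset and derives energy, entropy, contrast, and homogeneity from it. The definitions used here (energy as the sum of squared probabilities, homogeneity as the inverse difference moment, entropy in bits) are common conventions and are assumptions, since the study does not state which software or formula variants were used.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for r in range(height):
        for c in range(width):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < height and 0 <= c2 < width:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the reverse pair so the matrix is symmetric
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("image is too small for the requested offset")
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Four classic GLCM descriptors computed from a normalized matrix p."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "energy": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }

# Hypothetical two-level patches: a uniform region and a checkerboard.
flat = [[1, 1], [1, 1]]
checker = [[0, 1], [1, 0]]
flat_features = texture_features(glcm(flat, levels=2))
checker_features = texture_features(glcm(checker, levels=2))
```

The uniform patch gives maximal energy and homogeneity with zero contrast and entropy, whereas the checkerboard gives higher contrast and entropy, which matches the intuition that these metrics separate smooth from heterogeneous parenchyma.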
Important AKI-related feature selection
After feature extraction, three machine learning methods were used to perform feature selection and importance ranking. First, least absolute shrinkage and selection operator (LASSO) regression was applied to penalize redundant variables and retain those most relevant to outcomes. Second, the random forest (RF) algorithm was used to evaluate feature importance scores and identify variables with the greatest contribution to classification. Finally, support vector machine (SVM) combined with recursive feature elimination was used to further optimize the feature set. Integrating the results of these three approaches yielded a stable and interpretable feature subset.
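The integration step can be read as taking the features retained by all three selectors. The sketch below shows that intersection rule; the feature names are hypothetical placeholders rather than the features identified in the study, and the exact integration rule used by the authors is assumed, not confirmed.

```python
def consensus_features(*selections):
    """Return the features present in every selector's output, sorted by name."""
    if not selections:
        return []
    sets = [set(s) for s in selections]
    return sorted(set.intersection(*sets))

# Hypothetical selector outputs (names are placeholders, not study results).
lasso_selected = ["glcm_entropy", "wavelet_LH_energy", "shape_area", "firstorder_skewness"]
rf_selected = ["glcm_entropy", "wavelet_LH_energy", "firstorder_skewness", "glrlm_run_entropy"]
svm_rfe_selected = ["glcm_entropy", "wavelet_LH_energy", "firstorder_skewness", "shape_perimeter"]

shared = consensus_features(lasso_selected, rf_selected, svm_rfe_selected)
```

A strict intersection favors stability over sensitivity: a feature survives only if penalized regression, tree-based importance, and margin-based elimination all retain it.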
Construction and evaluation of an ultrasound-based AKI prediction model
After the key ultrasound imaging features were obtained, these variables were input into 10 machine learning algorithms to identify the optimal model for predicting AKI. The dataset was randomly divided into a training set (70%) and a validation set (30%). The following algorithms were tested: logistic regression, SVM, RF, extreme gradient boosting (XGBoost), gradient boosting decision tree, k-nearest neighbors, naïve Bayes, decision tree, quadratic discriminant analysis, and multilayer perceptron. During training, k-fold cross-validation was applied to reduce the influence of any single random split. Model performance was evaluated using area under the curve (AUC), accuracy, sensitivity, specificity, F1 score, calibration curves, and decision curve analysis. The model with the best overall performance was selected as the predictive tool, and an interpretable risk scoring system was constructed based on feature weights.
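For readers less familiar with the evaluation metrics listed above, the following sketch computes AUC (as the probability that a randomly chosen AKI case is scored higher than a randomly chosen non-AKI case) together with accuracy, sensitivity, specificity, and F1 score at a fixed cutoff. The labels, scores, and the 0.5 cutoff are hypothetical, and the functions assume that both classes are present.

```python
def auc_score(y_true, scores):
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def threshold_metrics(y_true, scores, threshold=0.5):
    """Confusion-matrix metrics when scores >= threshold are called positive."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Hypothetical outcomes (1 = AKI) and predicted probabilities for eight patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

auc = auc_score(y_true, scores)
metrics = threshold_metrics(y_true, scores, threshold=0.5)
```

AUC is threshold-free, whereas the other four metrics depend on the chosen cutoff, which is why both kinds of measure are usually reported together.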
Results
Baseline characteristics of enrolled patients
A total of 120 patients were included in this study, comprising 72 with fractures and 48 with bone tumors. No statistically significant differences were observed between the two groups in terms of age, gender, smoking history, hypertension, diabetes mellitus, or postoperative AKI. A significant difference was found only in alcohol consumption, which was more prevalent in the fracture group than in the bone tumor group (87.5% vs. 66.7%, p = 0.01) (Fig. 2).

Figure 2. Baseline characteristics of enrolled patients. Comparison of demographic and clinical features between fracture and bone tumor patients. Alcohol consumption showed a significant difference between the two groups (p = 0.01), while other variables were comparable.
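As a consistency check on the alcohol-consumption comparison, the reported percentages can be converted back to counts (87.5% of 72 is 63; 66.7% of 48 is approximately 32) and tested with a 2 × 2 chi-square test. The counts are derived here from the reported percentages, and the choice of test is an assumption, because the statistical test used in the study is not stated.

```python
import math

def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square statistic and p value (1 degree of freedom) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]
    stat = 0.0
    for o, e in zip(observed, expected):
        diff = abs(o - e)
        if yates:
            diff = max(0.0, diff - 0.5)  # continuity correction
        stat += diff ** 2 / e
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Counts reconstructed from the reported percentages (an assumption):
# fracture group: 63 drinkers, 9 non-drinkers; bone tumor group: 32 and 16.
stat_yates, p_yates = chi_square_2x2(63, 9, 32, 16, yates=True)
stat_plain, p_plain = chi_square_2x2(63, 9, 32, 16, yates=False)
```

Under these assumptions, both the corrected and uncorrected tests give p values that round to 0.01, so the reported percentages and p value are mutually consistent.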
Ultrasound features associated with AKI identified by machine learning
After standardized feature extraction, ultrasound features were classified into four major categories: shape features, texture features, first-order features, and higher order features (Fig. 3A, B). The RF algorithm identified 27 important variables, mainly texture and wavelet features, with the top five AKI-related variables shown in Figure 3C. LASSO regression reduced the original features to four candidate variables (Fig. 3D). SVM-RFE stably selected 90 key features with a low error rate (0.09) (Fig. 3E). By integrating the results of all three approaches, the authors obtained three overlapping features that were consistently identified across algorithms (Fig. 3F).

Figure 3. Feature extraction and selection.
Predictive performance of the XGBoost model
When the three shared features were applied to the 10 machine learning algorithms, the XGBoost model demonstrated the best performance. The prediction accuracies in the training, internal validation, and external validation cohorts were 0.994, 0.935, and 0.909, respectively (Fig. 4A, B), with corresponding AUCs of 0.932, 0.922, and 0.893 (Fig. 5A–E). The confusion matrix and calibration curves showed good agreement between predicted probabilities and observed outcomes (Fig. 5B–F). Decision curve analysis (DCA) indicated that XGBoost provided the highest net clinical benefit across most threshold probabilities. These findings suggest that the XGBoost model based on three key features achieves robust and stable predictive performance across different disease types and demographic subgroups.

Figure 4. Predictive performance of machine learning models.

Figure 5. Evaluation of the extreme gradient boosting (XGBoost) model.
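The net clinical benefit reported by decision curve analysis follows the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold probability. The sketch below applies that definition to hypothetical labels and predicted risks and compares the model with a treat-all strategy; none of the numbers come from the study.

```python
def net_benefit(y_true, risks, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, risks) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risks) if r >= threshold and y == 0)
    odds = threshold / (1 - threshold)  # harm-to-benefit weight implied by the threshold
    return tp / n - fp / n * odds

def net_benefit_treat_all(y_true, threshold):
    """Net benefit of treating every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical outcomes (1 = AKI) and predicted risks for eight patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
risks = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

nb_model_20 = net_benefit(y_true, risks, 0.2)
nb_all_20 = net_benefit_treat_all(y_true, 0.2)
nb_model_50 = net_benefit(y_true, risks, 0.5)
nb_all_50 = net_benefit_treat_all(y_true, 0.5)
```

A decision curve plots these quantities over a range of thresholds; a model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero (treat none).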
Visualization of AKI risk based on real-world patients
An individualized risk scoring system was constructed based on the feature weights of the XGBoost model to estimate postoperative AKI risk in different patients, with the distribution of risk scores illustrated in Figure 6A. Patients were stratified into low-, intermediate-, and high-risk groups according to tertiles of the risk score. Taking preoperative ultrasound images of patients with bone tumors, fractures, and hand trauma as examples, standardized heatmaps were generated and risk scores were calculated as 1.61, 1.14, and 0.17, respectively (Fig. 6B–D). Among these, the bone tumor and fracture patients developed postoperative AKI, whereas the hand trauma patient did not during follow-up. This risk scoring system and its visualization provide clinicians with an intuitive tool for risk interpretation and decision-making support.

Figure 6. Risk score visualization based on extreme gradient boosting (XGBoost).
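Stratifying patients by tertiles of the risk score amounts to finding two cut points that split the score distribution into thirds. The sketch below uses hypothetical scores, not the study's score distribution, and assumes that scores equal to a cut point fall into the lower group; the interpolation rule is the default of Python's `statistics.quantiles`.

```python
from statistics import quantiles

def tertile_groups(scores):
    """Return the two tertile cut points and a risk label for each score."""
    low_cut, high_cut = quantiles(scores, n=3)
    labels = []
    for s in scores:
        if s <= low_cut:
            labels.append("low")
        elif s <= high_cut:
            labels.append("intermediate")
        else:
            labels.append("high")
    return (low_cut, high_cut), labels

# Hypothetical risk scores for nine patients.
example_scores = [0.2, 0.4, 0.5, 0.7, 0.8, 0.9, 1.1, 1.3, 1.6]
cuts, labels = tertile_groups(example_scores)
```

Because tertile cut points are computed from the cohort itself, they would need to be fixed from the development data and reused unchanged when the scoring system is applied to new patients.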
Discussion
In this retrospective cohort, an interpretable machine-learning model that combined postoperative ultrasound radiomic features with routine clinical data demonstrated robust predictive performance for postoperative AKI and highlighted a compact set of imaging and clinical predictors that drive individual risk estimates. Ultrasound is safe, widely available, and easily repeatable in the early postoperative period. These properties are particularly important for hemodynamically unstable patients and are not readily matched by CT or MRI, which often require patient transport, radiation or contrast exposure, and longer scan times. The application of ultrasound radiomics provides orthopedic patients with timely bedside risk stratification for postoperative AKI, enabling earlier detection of evolving renal dysfunction, guidance of hemodynamic management, and prompt nephrology involvement without the added risks of contrast or patient transfer. Previous AKI prediction efforts have largely focused on biochemical biomarkers or electronic health records (EHRs) in general surgical and intensive care patients. In two multicenter observational cohorts of critically ill adults at risk for AKI, the primary endpoint was development of moderate-to-severe AKI (KDIGO stages 2–3) within 12 h. A previous study developed a deep-learning model trained on longitudinal EHRs from 703,782 adults across 172 inpatient and 1062 outpatient sites to continuously predict future deterioration. The model detected 55.8% of inpatient AKI episodes and 90.2% of dialysis-requiring AKI up to 48 h in advance, with about two false alerts per true alert.13 In addition, Xu et al 11 used a rabbit model of acute renal vein thrombosis (ARVT), in which multimodal ultrasound images were acquired, segmented, and subjected to radiomic feature extraction to build a machine learning model. 
Experimental kidneys showed marked structural, perfusion, and stiffness changes, and multimodal ultrasound-radiomic models achieved high diagnostic accuracy (AUCs up to 0.899), supporting their value for early AKI detection after ARVT. In a retrospective cohort, 145 RICU patients with ARDS complicated by AKI received continuous renal replacement therapy (CRRT). The clinical, CRRT-parameter, and ultrasomic features were used to build and compare four prognostic models, showing the best discrimination and highest net benefit by DCA in both training and validation sets, indicating superior ability to predict mortality and potential utility for individualized prognostic stratification. 14
Currently, some studies use clinical data and biomarkers to predict the occurrence of perioperative AKI in orthopedic patients, such as a prospective cohort of 237 orthopedic trauma patients undergoing open reduction and internal fixation. Clinical data and 28 blood biomarkers were measured to develop a tool for early identification of patients at risk for postoperative AKI. 7 Liu et al 15 trained and compared 10 machine-learning models to predict AKI at 24, 48, and 72 h and derived compact, clinically actionable models using routinely available predictors from a cohort of 1596 patients in the Medical Information Mart for Intensive Care IV database. Integrating ultrasound radiomics with machine learning could meaningfully complement conventional biochemical surveillance: biomarkers reflect molecular injury signals, whereas radiomics captures spatial and perfusion heterogeneity at the bedside. Multimodal fusion can optimally weight imaging and biochemical signals over time to provide dynamic, individualized risk estimates that prioritize biomarker testing, guide bedside interventions, and trigger timely nephrology referral. 16 However, studies specifically addressing orthopedic fracture and bone tumor cohorts by ultrasomics are sparse. There are plausible pathophysiologic reasons why ultrasound-derived features may predict AKI risk. 
Ultrasound techniques such as CEUS, Doppler flow indices, and shear-wave elastography can detect early alterations in renal microcirculation, perfusion dynamics, and tissue stiffness that often precede rises in serum creatinine and reflect mechanisms known to drive AKI, such as hypoperfusion/ischemia, microcirculatory dysfunction, inflammation, and interstitial edema.2,17 Quantitative imaging metrics, therefore, provide complementary hemodynamic or structural information relevant to AKI detection and, when combined with established clinical risk factors, improve the ability of predictive models to identify patients whose kidneys lack resilience to perioperative or systemic stress.16 In this study, the ultrasound-radiomics model achieved AUCs of 0.922 and 0.893 in the internal and external validation sets, respectively (0.932 in the training set), which are comparable with or exceed those of previously reported biomarker- and clinical-based models,7,10,14 indicating that ultrasomics can provide complementary predictive information for AKI risk stratification in orthopedic patients.
Integrating ultrasound-radiomic outputs into clinical workflows could permit more precise allocation of perioperative resources. Several randomized trials and subsequent reviews show that screening to identify high-risk patients and then applying a targeted KDIGO-style bundle can reduce the incidence of moderate-to-severe AKI. For example, Meersch et al 18 used urinary [TIMP-2]·[IGFBP7] to triage high-risk cardiac-surgery patients and reported a reduction in postoperative AKI after implementation of a KDIGO care bundle. It should be noted that most evidence for a “screen-then-target” approach comes from cardiac and major surgery populations. Therefore, extending this strategy to orthopedic trauma or bone tumor cohorts requires first validating screening performance in those patients and then testing whether targeted interventions reduce AKI in prospective trials. Importantly, this study demonstrates that the ultrasound-radiomic model assigns systematically different postoperative AKI risk scores across orthopedic conditions, with bone tumor patients showing the highest scores, followed by oblique fractures and then hand trauma. This disease-specific risk gradient is consistent with biological expectations: bone tumors and their treatments often entail greater operative complexity, blood loss, and exposure to nephrotoxins than isolated hand injuries. Future validation should therefore report prespecified subgroup analyses by diagnosis to confirm discrimination and calibration within each subgroup.19 In addition, reproducibility must be demonstrated across scanners and operators, and radiomic features must be harmonized to exclude acquisition bias.20 Notably, variability in ultrasound acquisition protocols and vendor-specific image processing can introduce substantial nonbiological variation in pixel intensities, speckle patterns, spatial resolution, and contrast, which in turn reduces the reproducibility of radiomic features and harms cross-machine generalizability.21 Parameters such as transducer frequency, focal depth, harmonic imaging, beamforming, frame rate, probe pressure, patient positioning, and in-scanner postprocessing all change the appearance of renal parenchyma and may alter intensity- and texture-based features, lower intraclass correlation coefficients (ICCs), and cause models to pick up machine signatures rather than biological signals.22 To address this, future studies should report acquisition metadata, evaluate feature robustness, perform phantom calibration across systems, and apply scale normalization and harmonization.
In summary, this study is novel in targeting an understudied, high-risk orthopedic population and in combining ultrasomics with machine learning to produce an interpretable model for AKI risk stratification. However, the retrospective, single-center design and limited external validation restrict generalizability, and heterogeneity in ultrasound acquisition protocols and scanner models may affect feature reproducibility. One limitation concerns interpretability. Although the authors used class activation maps (CAMs) to visualize image regions driving predictions, CAMs are qualitative, architecture-dependent, and can be unstable to small input or model changes, sometimes highlighting background or contextual cues rather than causal tissue signals. Future work should therefore complement CAMs with quantitative, model-agnostic methods, such as SHAP for per-feature attributions, permutation-importance and ablation tests, occlusion and deletion–insertion experiments for spatial validation, and reproducibility metrics for top features. Validating feature robustness across different ultrasound manufacturers and imaging settings is also essential; future studies should perform phantom calibration and test–retest and interoperator ICC assessments, and apply statistical harmonization or domain-adaptation methods to confirm that retained features are not scanner dependent. Furthermore, the available sample size and event rate limit the complexity of models that can be stably trained. Future work should therefore prioritize prospective, multicenter validation, standardization and harmonization of imaging protocols across devices, and randomized trials testing whether model-guided, targeted kidney-protective interventions reduce AKI and hard clinical endpoints. With these steps, ultrasomic-guided prediction could become a practical and trustworthy component of perioperative kidney protection in orthopedics. 
Notably, another important limitation is the potential overrepresentation of specific anatomical sites in this cohort, which may imprint site-specific acquisition and anatomical patterns on ultrasound radiomic features and thereby bias feature selection and model performance. Consequently, the model’s performance and identified predictors should be interpreted with caution until validated across anatomically diverse and external cohorts. Future work should therefore assess subgroup performance by site, include anatomical subsite as a covariate or stratification factor, and validate the model in multicenter datasets to confirm generalizability.
Authors’ Contributions
The conceptualization of the study was led by J.D. and W.S., while the methodology was developed by W.S., T.R.L.A., and X.H. B.Y. conducted the formal analysis. X.H. was responsible for preparing the data. X.H., B.Y., Q.Q., L.X., and R.L.D.W. drafted the original article and, along with W.S., supervised the entire project. The article revisions were undertaken by X.H., B.Y., and J.D., who also managed the project administration and funding acquisition alongside W.S. All the authors have reviewed and approved the final version of the article.
Footnotes
Funding Information
This project was supported by the Yulin Science Research and Technology Development Program Project (No. 202324079).
Data Availability
All data can be obtained from the corresponding author.
Ethics Approval Statement
Ethics approval was granted by the Affiliated Hospital of Guilin Medical University (Approval No. GLMC202405054).
Permission to Reproduce Material from Other Sources
All analysis code can be obtained from the corresponding author.
Disclosure Statement
All authors declare no conflicts of interest.
