Abstract
Background
Splenic artery angioembolization (SAAE) for high-grade blunt splenic injuries is an accepted adjunct to nonoperative management. We aimed to determine SAAE failure rates and identify predictive factors for SAAE failure (F-EMBO).
Methods
We conducted a retrospective review of TQIP (2018-2022) for adult patients with AAST Grade III-V blunt splenic injuries who underwent SAAE within 24 hours of arrival. F-EMBO was defined as requiring subsequent splenectomy within 4 days or repeat SAAE ≥1 hour after initial procedure. Univariable analysis compared successful SAAE (S-EMBO) vs F-EMBO characteristics and outcomes. A gradient boosting machine model with SHapley Additive exPlanations (SHAP) identified F-EMBO predictive factors.
Results
Among 6055 patients, 5694 (94.0%) had S-EMBO and 361 (6.0%) had F-EMBO. Of F-EMBO patients, 167 (2.8%) required splenectomy while 204 (3.4%) required repeat SAAE. F-EMBO patients had higher rates of anticoagulation therapy (11.6% vs 6.1%, P < 0.001) and cirrhosis (4.2% vs 2.2%, P < 0.05). They demonstrated higher pulse rates (97 vs 92 bpm, P < 0.01), lower systolic blood pressure (108 vs 122 mmHg, P < 0.001), and received higher red blood cell volumes within 4 hours (600cc vs 0cc, P < 0.001). F-EMBO patients had higher mortality (9.1% vs 3.2%, P < 0.001), more complications, longer stays, and higher nonhome discharge rates. The top predictive factors were RBC volume (SHAP = 1.12), systolic blood pressure (0.23), injury severity score (0.16), age (0.16), and pulse rate (0.10).
Discussion
RBC transfusion requirements, SBP, ISS, age, and pulse rates are the top influential factors predicting F-EMBO in adult patients with AAST III-V spleen injuries due to blunt trauma.
Key Takeaways
• Identified key predictors of splenic artery embolization failure in grade III-V blunt spleen trauma
• Top 5 factors: RBC transfusion volume, systolic blood pressure, injury severity score, age, and pulse rate
• RBC >200cc, SBP 70-90 mmHg, ISS >30, age 25-30, pulse <60/>120 bpm predicted higher failure rates
Background
Blunt trauma resulting in splenic injuries is common in the United States, making up 42% of all abdominal trauma. 1 Splenic injuries account for approximately 25% of abdominal trauma cases, representing roughly 800-1200 admissions annually in the United States.2,3 Traumatic spleen injuries may result in significant bleeding either from the organ itself or its vascular supply, which, if not managed properly, can result in hemorrhagic shock or even death, with mortality rates as high as 18% in the adult population. 1
Operative management (OM) is the preferred management strategy in hemodynamically unstable patients or patients who are nonresponsive to volume resuscitation. However, management strategies have shifted in recent decades, emphasizing splenic salvage to preserve immune function and avoid other complications associated with splenectomy. 4 Nonoperative management (NOM) strategies have proven successful in approximately 80% of patients with blunt splenic injuries. 5 A study by Peitzman et al demonstrated that higher blood pressure, higher hematocrit, and low injury severity are associated with successful NOM. 6 NOM should only be considered in facilities that have appropriate monitoring capabilities and urgent operating room access if NOM strategies fail. In recent years, splenic artery angioembolization (SAAE) has become a widely accepted adjunct to NOM.
SAAE is a minimally invasive procedure that occludes either the main or distal branches of the splenic artery to preserve as much of the spleen as possible. SAAE is typically offered to patients with a grade III or higher splenic injury, who are hemodynamically stable, with presence of contrast blush on CT scan with IV contrast, moderate hemoperitoneum on CT scan, or have evidence of ongoing bleeding. 7
While some studies report success rates as high as 92%, failure rates can reach 23.8% depending on patient selection criteria, institutional definitions of success, and follow-up duration. 4,8 Despite its minimally invasive nature and the advantage of spleen preservation, SAAE is not without complications. A systematic review demonstrated that rebleeding, splenic infarction, splenic abscess, and hematoma were the most frequently reported significant complications following SAAE. 4 However, information pertaining to failure rates, risk factors for failure, and complications associated with SAAE is sparse, with only one study identifying red blood cell transfusion requirements as being associated with SAAE failure using a logistic regression. 9 Through this retrospective study, we aim to identify risk factors predictive of failure of SAAE with a gradient boosting machine (GBM) model.
Methods
We queried the 2018-2022 Trauma Quality Improvement Program (TQIP) for patients aged 18 years or older who sustained blunt, AAST Grade III-V spleen injuries and underwent SAAE for hemorrhage control within 24 hours of arrival. We used both the TQIP embolization site variable, defined as embolization for hemorrhage control, and International Classification of Diseases, 10th Revision (ICD-10), procedure codes to identify patients who underwent SAAE. We used a combination of ICD-10 diagnosis codes and Abbreviated Injury Scale, 2005, as reported by Vanderbilt University Medical Center, 10 to determine spleen injury grade. SAAE was considered successful (S-EMBO) if no subsequent embolization or splenectomy procedure was required. SAAE was considered a failure (F-EMBO) if patients underwent subsequent SAAE ≥1 hour after the initial SAAE or splenectomy within 4 days of SAAE. Patients were excluded if they underwent concurrent exploratory laparotomy, concurrent embolization of liver, kidney, pelvis, vascular, or “other” sites as defined by TQIP, or if they underwent splenectomy greater than 4 days after SAAE. Patients who were discharged from the emergency department to the operating room, home, or other (jail, mental health institution, etc.), left the emergency department against medical advice, were transferred to another facility, or died in the emergency department were also excluded.
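The outcome definition above can be sketched as a small classification rule. This is only an illustration under stated assumptions: the function and argument names are hypothetical (they are not TQIP variable names), a repeat SAAE recorded less than 1 hour after the initial procedure is assumed to count as part of that procedure, and the other exclusion criteria are not modeled.

```python
def classify_saae(hours_to_repeat_saae=None, days_to_splenectomy=None):
    """Label one patient as 'S-EMBO', 'F-EMBO', or 'excluded'.

    Arguments are hypothetical per-patient fields measured from the
    initial SAAE; None means the procedure did not occur.
    """
    # Splenectomy more than 4 days after SAAE: excluded from the cohort.
    if days_to_splenectomy is not None and days_to_splenectomy > 4:
        return "excluded"
    # Splenectomy within 4 days of SAAE: failure.
    if days_to_splenectomy is not None:
        return "F-EMBO"
    # Repeat SAAE at least 1 hour after the initial SAAE: failure.
    if hours_to_repeat_saae is not None and hours_to_repeat_saae >= 1:
        return "F-EMBO"
    # No subsequent procedure (or, by assumption, a repeat within 1 hour).
    return "S-EMBO"


labels = [
    classify_saae(),                          # no further procedure
    classify_saae(hours_to_repeat_saae=6),    # repeat SAAE at 6 hours
    classify_saae(days_to_splenectomy=2),     # splenectomy on day 2
    classify_saae(days_to_splenectomy=7),     # splenectomy on day 7
]
```

Here `labels` holds one label per hypothetical patient, in the order of the comments.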
Indication for procedure is not provided in the TQIP database. However, the minimum time frame from SAAE to associated post-procedure complications, such as splenic abscess, is 4 days. 11 Additionally, the Eastern Association for the Surgery of Trauma reported failure rates of NOM as high as 75% within 48 hours of injury and 88% within 5 days. 7 Therefore, a 4-day cutoff was chosen as the maximum time frame from SAAE to splenectomy classified as F-EMBO, to reduce falsely inflated failure rates. Our primary objective was to identify risk factors predictive of F-EMBO.
Univariable analysis was performed to compare patient characteristics, including age, sex, ethnicity, height, weight, comorbid conditions, vitals, Glasgow Coma Scale (GCS) score, injury severity score (ISS), and red blood cell (RBC) volume administered, between the S-EMBO and F-EMBO groups. We also compared hospital characteristics, including bed size, trauma center verification level, and teaching status, as well as outcomes, including inpatient complications, length of stay (LOS), discharge disposition, and inpatient mortality. Vitals, including systolic blood pressure (SBP), pulse rate, pulse oximetry, respiratory rate, and GCS, were recorded within the first 30 minutes of arrival. If SBP was recorded as <90 mmHg, the patient was presumed to have been responsive to volume resuscitation prior to SAAE. RBC volume was included if it was administered within 4 hours of arrival and was reported in cubic centimeters (cc). The amount of blood in a standard unit was reported per facility. The lowest amount of blood in a standard unit was 250 cc. Therefore, patients were excluded if the amount reported was less than 250 cc or if they had missing information pertaining to the amount of blood they received. Chi-square tests were performed for categorical variables, and Student’s t and Mann-Whitney U tests were performed for continuous variables.
We developed a gradient boosting machine (GBM) model using a 60/20/20 train/validation/test split to identify patient factors that were predictive of F-EMBO. We chose a GBM model because of its computational efficiency, capability to identify nonlinear relationships, robustness to outliers and multicollinearity, and efficient interpretability with SHapley Additive exPlanations (SHAP). A GBM is an ensemble learning method that builds predictive models by combining multiple weak learners (typically decision trees) sequentially. Each subsequent model corrects errors made by previous models, resulting in improved overall prediction accuracy. GBMs are particularly effective at identifying nonlinear relationships and interactions between variables that traditional regression methods may miss. SHAP was applied for model interpretation of feature importance. SHAP is a method for interpreting machine learning model predictions based on game theory. It calculates the contribution of each input feature to individual predictions by considering all possible combinations of features. SHAP values quantify how much each patient characteristic increases or decreases the predicted probability of embolization failure, allowing for intuitive interpretation of complex models. 12 Variables with >5% missing information were excluded from the model. Remaining missing values were handled inherently by the GBM model. Model calibration was assessed with the Brier score. Discrimination metrics were calculated at the F1-maximizing threshold. Mean absolute SHAP values were calculated for the top 5 influential factors that were predictive of F-EMBO.
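To make the two evaluation steps named above concrete, the following is a minimal sketch of a Brier score and an F1-maximizing threshold search. It is not the study's code (the analysis was run in STATA and R); the function names are illustrative, and the outcome labels and predicted probabilities are hypothetical toy values, not study data.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0 is perfect)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def f1_at_threshold(y_true, y_prob, threshold):
    """F1 score when patients with probability >= threshold are flagged as failures."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    fn = sum(1 for y, p in zip(y_true, y_prob) if p < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_maximizing_threshold(y_true, y_prob):
    """Try each distinct predicted probability as a cutoff; return (threshold, f1)."""
    candidates = sorted(set(y_prob))
    best = max(candidates, key=lambda t: f1_at_threshold(y_true, y_prob, t))
    return best, f1_at_threshold(y_true, y_prob, best)


# Hypothetical toy data: 1 = F-EMBO, 0 = S-EMBO.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.02, 0.05, 0.10, 0.20, 0.25, 0.30, 0.60, 0.80]

brier = brier_score(y_true, y_prob)
threshold, best_f1 = f1_maximizing_threshold(y_true, y_prob)
print(f"Brier score: {brier:.4f}; F1-maximizing threshold: {threshold} (F1 = {best_f1:.3f})")
```

With a rare outcome such as F-EMBO, the F1-maximizing cutoff typically falls well below 0.5, which is one reason to search for the threshold rather than assume a default.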
Finally, we conducted a lift analysis, which evaluates how much better our model performs compared to random chance. This approach involved dividing our cohort into risk percentiles based on model-predicted probability ranges. We calculated lift values for each percentile by dividing the true positive rate at specific predictive probability thresholds by the overall outcome (F-EMBO) prevalence in the cohort. A lift value exceeding 1.0 indicates better-than-random performance. For example, a lift of 3.95 means our model is nearly 4 times better than random selection at identifying high-risk patients. We generated both a lift chart (plotting lift against population percentage sampled) and a cumulative gains chart (showing captured true positives against population percentage sampled) to visualize these relationships. 13
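The lift and cumulative gains calculations can be sketched as follows. This assumes the conventional definition, in which patients are ranked by predicted probability and cumulative lift equals the share of all failures captured divided by the share of the cohort sampled; the function name and the 20-patient toy data are hypothetical and are not study data.

```python
def decile_lift_table(y_true, y_prob, n_bins=10):
    """Return rows of (fraction_sampled, cumulative_gain, cumulative_lift)."""
    # Rank outcomes from highest to lowest predicted probability.
    ranked = [y for _, y in sorted(zip(y_prob, y_true), key=lambda pair: -pair[0])]
    n = len(ranked)
    total_pos = sum(ranked)
    rows = []
    for b in range(1, n_bins + 1):
        k = round(n * b / n_bins)       # patients sampled so far
        frac = k / n                    # share of the cohort sampled
        gain = sum(ranked[:k]) / total_pos  # share of all failures captured
        rows.append((frac, gain, gain / frac))
    return rows


# Hypothetical toy cohort: 20 patients, 4 failures (1 = F-EMBO).
y_prob = [0.90, 0.80, 0.70, 0.60, 0.50, 0.45, 0.40, 0.35, 0.30, 0.25,
          0.20, 0.18, 0.16, 0.14, 0.12, 0.10, 0.08, 0.06, 0.04, 0.02]
y_true = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1,
          0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

rows = decile_lift_table(y_true, y_prob)
top_frac, top_gain, top_lift = rows[0]

# Same relation applied to a top-decile lift of 3.95: capturing 39.5% of
# failures in 10% of patients gives 0.395 / 0.10.
reported_check = 0.395 / 0.10
```

In the toy cohort the top decile holds 2 of the 4 failures, so its gain is 0.5 and its lift is 5.0; by the last decile the whole cohort is sampled and the cumulative lift returns to 1.0, the random-chance line.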
This study was deemed exempt from Institutional Review Board review as it is a retrospective review utilizing de-identified data from the American College of Surgeons Trauma Quality Improvement Program (TQIP) database. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guideline was used to ensure proper reporting of methods, results, and discussion. STATA (Version 17) and R Studio Statistical Analysis Software (Version 2023.12.1 Build 402) were used for data preprocessing, model development, and analysis. A P-value of <0.05 was considered statistically significant. A copy of the STROBE guidelines as well as flow diagram defining patient cohort can be referenced in Supplemental Digital Content (SDC 1-2).
Results
We identified 6055 patients with blunt trauma who sustained AAST Grade III-V spleen injuries and underwent SAAE. The median age was 44 years (interquartile range 30-60). Of these patients, 31.8% were female, 67.9% were male, and 0.3% had missing information pertaining to sex. A total of 634 (10.5%) patients were of Hispanic/Latino ethnicity. The median ISS was 25 (interquartile range 17-30).
A total of 3565 (58.9%) patients underwent SAAE <4 hours from arrival. SAAE was successful (S-EMBO) in 5694 (94.0%) patients and failed (F-EMBO) in 361 (6.0%). Of the patients in the F-EMBO group, 167 (2.8%) required splenectomy, while 204 (3.4%) required repeat SAAE. Ten patients who underwent repeat SAAE ultimately required splenectomy.
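The subgroup counts sum to more than 361 because of the ten patients who had both procedures. A short arithmetic check, assuming those ten patients are counted in both the splenectomy and repeat SAAE subgroups (an inference from the reported totals, not something stated explicitly), reconciles the figures:

```python
# Counts as reported in the Results.
total_patients = 6055
s_embo = 5694
f_embo = 361
splenectomy = 167
repeat_saae = 204
both_procedures = 10  # assumption: counted in both subgroups above

# Inclusion-exclusion: patients with either failure event.
f_embo_from_subgroups = splenectomy + repeat_saae - both_procedures

# Percentages use the full cohort as the denominator.
failure_rate_pct = round(100 * f_embo / total_patients, 1)
splenectomy_pct = round(100 * splenectomy / total_patients, 1)
repeat_saae_pct = round(100 * repeat_saae / total_patients, 1)

print(f_embo_from_subgroups, failure_rate_pct, splenectomy_pct, repeat_saae_pct)
```

Under that assumption the subgroups reproduce the 361 F-EMBO patients, and each percentage matches the reported value when the denominator is all 6055 patients rather than the F-EMBO group.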
Patient and Hospital Characteristics
All data presented as n (%) unless otherwise specified. *Indicates data represented as median (interquartile range). †Indicates mean (standard deviation).
Patient Outcomes
All data presented as n (%) unless otherwise specified. *ICU length of stay and length of stay presented as median (interquartile range).
Top 5 Most Influential Predictors of F-EMBO
RBC—recorded as cc within the first 4 hours of presentation.
Systolic Blood Pressure—recorded within the first 30 minutes of presentation.
Lift Analysis of Risk Deciles

Cumulative lift chart for gradient boosting machine model performance. The cumulative lift chart demonstrates the model’s ability to identify patients at high risk for splenic artery angioembolization failure (F-EMBO) compared to random selection. The x-axis represents deciles sorted by predicted probability, and the y-axis shows the cumulative lift value. The model achieves optimal performance at the highest risk deciles, with a maximum lift of 3.95 for the top decile. The dashed red line at y = 1.0 represents random chance performance.
Cumulative Lift Analysis Probability Threshold Cutoffs
Discussion
We developed a machine learning (ML) model to identify patient factors that tend to increase the predicted probability of F-EMBO following blunt trauma resulting in AAST Grade III-V spleen injuries. Overall, 6.0% of patients had F-EMBO, with 2.8% requiring splenectomy and 3.4% requiring repeat embolization. RBC volumes >200cc, SBP 70-90 mmHg, ISS >30, ages 25-30 years, and pulse rates <60 bpm and >120 bpm tended to increase the predicted log odds of F-EMBO following SAAE for hemorrhage control. The optimal threshold that provides the maximum lift is ≥0.15.
While Bhangu et al's meta-analysis of 4 prospective and 21 retrospective studies demonstrated improved failure rates with SAAE, it did not specifically address risk factors for failure of embolization (F-EMBO). Our findings partially align with theirs in that increasing Injury Severity Score (ISS) tended to increase the predicted log odds of F-EMBO. However, our analysis revealed a notable distinction regarding age: younger age, particularly 25-30 years, tended to increase the predicted log odds of F-EMBO, whereas increasing age has been associated with higher rates of failure of NOM alone. Younger patients may have more robust physiological responses to injury, potentially masking early signs of decompensation, or may sustain higher-energy mechanisms of injury not fully captured by conventional grading systems. These findings highlight the potential limitation of viewing clinical parameters in isolation or through simple linear relationships.
In addition to ISS and age, our model identified specific ranges across several other influential predictors, including RBC volume requirements within the first 4 hours, SBP, and pulse rate, that tended to increase the predicted log odds of F-EMBO. Unlike traditional regression analyses that are interpreted with odds ratios and confidence intervals, our machine learning approach in combination with SHAP identifies nonlinear relationships and interaction effects that may not be captured by conventional statistics. RBC volume requirement >200cc within the first 4 hours was the most influential factor in our model. Although this aligns with Bankhead-Kendall et al's finding that transfusion requirements were independently associated with an increased risk of F-EMBO, 9 our model provides specific volume thresholds and relative feature importance. Similarly, we identified that SBP between 70-90 mmHg, a bimodal pattern in pulse rate (bradycardia <60 bpm or tachycardia >120 bpm), ISS >30, and age 25-30 years tended to increase the predicted log odds of F-EMBO. The interactions between these parameters create a more complex risk profile than can be captured by conventional thresholds alone. While our model does not provide specific odds ratios for these variables, their relative importance across all predictions, quantified by SHAP values, suggests that closer monitoring and potentially more aggressive intervention strategies may be warranted for patients whose clinical parameters fall within these identified high-risk ranges.
We identified one study that evaluated risk factors associated with F-EMBO in splenic trauma. Bankhead-Kendall et al performed a logistic regression and found that transfusion requirements within the first 24 hours were the only factor independently associated with increasing odds of F-EMBO. However, these findings should be interpreted with caution. Their study lacked transparency regarding model development, variable selection processes, and performance metrics necessary for proper evaluation and interpretation of results. Our study offers several strengths, including enhanced statistical power through a larger cohort derived from a national database, comprehensive reporting of model development and performance metrics, and validation through lift analysis, which demonstrated our model’s superiority to random prediction, with optimal performance at predictive probabilities ≥0.15.
In addition to the lack of reporting transparency, traditional statistical models, including logistic regression, have several limitations. Incorporating too many variables into a regression model leads to overfitting, which decreases the validity and generalizability of results. While several variable selection strategies exist, variable selection inherently introduces selection bias, which in turn biases the logistic coefficients estimated by traditional statistical methods. 14 To mitigate these biases, we identified risk factors associated with increased predicted probability of F-EMBO using machine learning. Machine learning techniques focus on making accurate predictions by identifying nonlinear relationships, as opposed to the linear relationships identified by traditional statistical methods. ML also affords the ability to elicit complex relationships between features. 15 Our GBM model enables us to determine which of these factors should be assigned a heavier weight when performing a risk assessment on these patients. RBC volumes administered within the first 4 hours had the highest mean absolute SHAP value, indicating that this was the most influential factor in the model, followed by SBP, ISS, age, and pulse rate.
ML is often criticized for its complexity and lack of interpretability. However, we aimed to improve the understanding of ML by applying SHAP. SHAP uses a game theory approach to measure the contribution that each patient factor makes to each predicted probability of F-EMBO. Each patient factor is assigned a SHAP value, quantifying its contribution to each predicted log odds. Not only does SHAP allow us to interpret which factors are important in making predictions, but it also allows identification of the significance of each patient factor in relation to one another. 16
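The game theory idea behind SHAP can be illustrated with an exact Shapley calculation on a toy model. This sketch is not the study's GBM or the SHAP library: `toy_risk_score`, its cutoffs and weights, the example patient, and the single reference patient used as a baseline are all hypothetical, chosen only to show how an interaction between two factors is shared between them.

```python
from itertools import permutations
from math import factorial


def toy_risk_score(features):
    """Hypothetical log-odds-style score with one interaction; NOT the study's model."""
    score = 0.0
    if features["rbc_cc"] > 200:
        score += 1.0
    if features["sbp"] < 90:
        score += 0.4
    if features["rbc_cc"] > 200 and features["sbp"] < 90:
        score += 0.2  # extra risk only when both are present
    return score


def exact_shapley(model, patient, baseline):
    """Average each feature's marginal contribution over all feature orderings."""
    names = list(patient)
    totals = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)      # start from the reference patient
        previous = model(current)
        for name in order:
            current[name] = patient[name]  # reveal this patient's value
            updated = model(current)
            totals[name] += updated - previous
            previous = updated
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in totals.items()}


# Hypothetical patient and reference values (illustrative only).
patient = {"rbc_cc": 600, "sbp": 85, "pulse": 97}
baseline = {"rbc_cc": 0, "sbp": 122, "pulse": 92}

phi = exact_shapley(toy_risk_score, patient, baseline)
print(phi)
```

The contributions sum to the difference between the patient's score and the baseline score, and the 0.2 interaction term is split evenly between RBC volume and SBP. Enumerating every ordering is feasible only for a handful of features, so tree-based SHAP implementations rely on faster algorithms to obtain the same kind of attribution.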
In our risk prediction model for F-EMBO, lift analysis demonstrated significant discriminative ability, with the highest performance observed at probability thresholds ≥0.15 (lift = 3.95), capturing 39.5% of true positives. Although lower thresholds were associated with increasing sensitivity, they also corresponded with decreasing lift values, representing the importance of the trade-off between predictive power and patient classification by the model. While our model consistently outperforms random chance at multiple thresholds, the modest maximum predicted probability of 0.31 suggests room for improvement. External validation followed by prospective evaluation of risk-stratified interventions based on the model’s performance is essential prior to implementation into clinical practice.
Our study has several limitations, one of which is that it is a retrospective database review. Although this allows for increased study power reflected by a large sample size, it is accompanied by a lack of data granularity. For instance, several factors influence the decision to perform initial or repeat SAAE, including CT scan findings, arteriography findings, and specifics related to SAAE technique, such as the type of embolizing material used or the arterial branch embolized, but these were not present in TQIP. However, the aforementioned study by Bankhead-Kendall et al demonstrated no difference in failure rates by the presence of a contrast blush on CT or by the branch that was embolized. 9 Additionally, a multicenter review by Haan et al evaluating complications associated with SAAE found that significant hemoperitoneum did not affect success of embolization. 17 They did, however, find that arteriovenous fistula was associated with higher failure rates following embolization. Another study by Sclafani et al 18 found that the absence of extravasation on arteriography was a reliable indicator of successful NOM. Future ML models should incorporate initial CT findings as well as arteriography findings and evaluate their feature importance in predicting F-EMBO following SAAE.
TQIP also lacks information pertaining to the patient’s hospital course following initial procedure including lab values, transfusion requirements, imaging, and indication for subsequent procedure including splenectomy vs repeat SAAE. Therefore, our model should be taken into consideration for risk stratification on patient arrival and post-procedural monitoring as opposed to guiding ongoing management decisions following SAAE.
Conclusion
RBC transfusion, SBP, age, ISS, and pulse rates are the top 5 influential factors predicting F-EMBO. RBC volumes >200cc, SBP 70-90 mmHg, ISS >30, ages 25-30 years, and pulse rates <60 bpm and >120 bpm tended to increase the predicted log odds of F-EMBO following SAAE for hemorrhage control. Further studies should be conducted to identify post-procedure patient factors related to F-EMBO to improve upon risk stratification, post-procedure monitoring, and management strategies in patients with blunt spleen trauma. 19
Supplemental Material
Supplemental Material - Predicting Angioembolization Failure in Blunt Splenic Trauma
Supplemental Material for Predicting Angioembolization Failure in Blunt Splenic Trauma by Melissa A. Kendall, Tyler Zander, Emily A. Grimsley, Rachel L. Wolansky, Rajavi Parikh, Joseph Sujka, Paul Kuo, Jose J. Diaz in The American Surgeon™
Footnotes
Acknowledgments
The authors thank the American College of Surgeons Committee on Trauma for providing the TQIP PUF Admission Years 2018-2022 database. The content reproduced from the PUF remains the full and exclusive copyrighted property of the American College of Surgeons. The American College of Surgeons is not responsible for any claims arising from works based on the original data, text, tables, or figures.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Research reported in this publication was supported by the Ruth L. Kirschstein Institutional National Research Service Award of the National Institutes of Health under award number T32GM144274. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Declaration of conflicting interests
Dr Tyler Zander receives funding through the Ruth L. Kirschstein Institutional National Research Service Award of the National Institutes of Health (T32GM144274). Drs. Melissa A. Kendall, Emily A. Grimsley, Rachel L. Wolansky, Rajavi Parikh, Joseph Sujka, Paul C. Kuo, and Jose J. Diaz have nothing to disclose.
Supplemental Material
Supplemental material for this article is available online.
References
