Title: Performance and predictive accuracy of machine learning prognostic models in traumatic spinal cord injury: A systematic review and meta-analysis
Abstract:
Traumatic spinal cord injury (SCI) is a leading cause of long-term neurological disability, profoundly affecting function and quality of life. Advances in artificial intelligence, particularly machine learning (ML), have enabled prognostic models that estimate recovery, functional independence, and complication risk. Despite their increasing use, the predictive accuracy and clinical applicability of ML models for traumatic SCI have not been fully evaluated. We systematically searched PubMed, Cochrane Library, and CINAHL-EBSCO for peer-reviewed studies published in the past five years that applied ML to outcome prediction in adult patients with traumatic SCI in the United States. Eligible models included random forests, support vector machines, neural networks, gradient boosting, deep learning, and ML-enhanced logistic regression. Outcomes of interest were neurological recovery (AIS, SCIM, FIM, gait), functional independence, complications, reoperation risk, quality of life, and mortality. Studies of non-traumatic SCI, animal studies, case reports, reviews, protocols, and studies lacking performance metrics were excluded. Three reviewers independently conducted screening and extraction using Covidence. In accordance with PRISMA guidelines, 454 records were screened and 32 studies were included, representing over 200,000 patients across registry, ICU, and rehabilitation datasets. The most commonly predicted outcomes were mortality, neurological recovery, and hospital/ICU outcomes. Tree-based and ensemble methods (random forest, XGBoost, LightGBM, CatBoost) were frequently applied, often paired with SHAP analysis or decision-tree thresholds for interpretability. ML models generally outperformed conventional regression benchmarks, with mortality prediction AUCs of 0.84–0.89, functional outcome R² up to 0.88, and discharge disposition AUCs of 0.80–0.87. Key predictors consistently included admission AIS grade, motor scores, age, injury level, and ICU vital signs.
Overall, ML prognostic models show strong predictive accuracy and potential as clinically interpretable tools for risk stratification and outcome prediction in traumatic SCI. However, reliance on retrospective data, heterogeneity in outcome definitions, and limited external validation constrain immediate clinical application. Future research should prioritize prospective validation, standardized outcomes, and integration into bedside decision-support systems to advance precision-oriented, patient-centered SCI care.
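
For readers less familiar with the discrimination metric reported above, the AUC can be read as the probability that a model assigns a higher predicted risk to a randomly chosen patient who experienced the outcome than to a randomly chosen patient who did not. The following is a minimal sketch of that pairwise definition only. The function name `auc_from_scores` and the toy labels and scores are illustrative assumptions, not data or code from any included study.

```python
def auc_from_scores(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 0/1 outcomes (1 = event, e.g. in-hospital death).
    scores: iterable of predicted risks, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # event patient ranked above non-event patient
            elif p == n:
                wins += 0.5   # ties count as half a concordant pair
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption for illustration only).
toy_labels = [0, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]
toy_auc = auc_from_scores(toy_labels, toy_scores)
print(f"Toy AUC: {toy_auc:.3f}")
```

On this reading, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives some context for the 0.80–0.89 range summarized in the abstract.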