Enterprise AI Analysis
Does Provider Identity at Triage Improve Machine Learning Prediction of Hospital Admission? A Comparative Analysis of Ten Supervised Classifiers with SHAP Explainability
This study rigorously evaluated whether incorporating provider identity improves machine learning prediction of hospital admission from emergency department (ED) triage data. Across ten diverse supervised classifiers, provider identity offered negligible incremental predictive value beyond standard triage variables. The top-performing model, CatBoost, achieved an AUC of 0.8906. SHAP explainability revealed that ESI level, respiratory rate, temperature, complaint category, and age were the dominant predictors, with clinically intuitive directional effects. The observed provider admission rate variation (24.3% to 39.9%) was primarily attributed to patient case-mix differences rather than independent practice patterns. This suggests that robust admission prediction models can be developed using triage data alone, without requiring provider identity, enhancing trust and interpretability in clinical decision support.
Executive Impact
Unlock the potential of data-driven insights to optimize resource allocation and improve patient flow in your emergency department.
Deep Analysis & Enterprise Applications
This section details the study's design, cohort construction, feature engineering, and the machine learning models employed. It highlights the use of a temporal holdout test set to ensure realistic prospective performance evaluation, and the application of SHAP for model explainability.
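A temporal holdout can be sketched as a split on arrival date, so that every test visit occurs after every training visit. The field names (`arrival`, `admitted`), the toy records, and the cutoff date below are illustrative assumptions and do not come from the study.

```python
from datetime import date

def temporal_holdout(visits, cutoff):
    """Split visits into train (before cutoff) and test (on or after cutoff)."""
    train = [v for v in visits if v["arrival"] < cutoff]
    test = [v for v in visits if v["arrival"] >= cutoff]
    return train, test

# Toy records; field names and dates are hypothetical.
visits = [
    {"arrival": date(2023, 1, 5), "admitted": 0},
    {"arrival": date(2023, 3, 9), "admitted": 1},
    {"arrival": date(2023, 6, 2), "admitted": 0},
    {"arrival": date(2023, 9, 17), "admitted": 1},
]

train, test = temporal_holdout(visits, cutoff=date(2023, 6, 1))
print(len(train), len(test))
```

Unlike a random split, this arrangement never lets the model train on visits that post-date the ones it is evaluated on, which is closer to how a deployed model would be used.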
Study Methodology Workflow
| Preprocessing step | Other classifiers (target encoding) | CatBoost (native handling) |
|---|---|---|
| Missing Values | Training-set medians | Internal handling |
| Categorical Variables | Label-encoded | Native categoricals |
| Provider Feature | Target-encoded mean admission rate | Native categorical |
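The target-encoding column of the table can be sketched as follows: medians and per-provider admission rates are computed on the training set only, then applied to any later data. This is a minimal illustration, not the study's code. The column names (`provider`, `resp_rate`, `admitted`), the toy rows, and the fallback to the overall training admission rate for unseen providers are all assumptions.

```python
from collections import defaultdict
from statistics import median

def fit_preprocessing(train_rows, numeric_cols):
    """Learn imputation medians and provider admission rates from training data only."""
    medians = {
        c: median(r[c] for r in train_rows if r[c] is not None)
        for c in numeric_cols
    }
    totals = defaultdict(lambda: [0, 0])  # provider -> [admissions, visits]
    for r in train_rows:
        totals[r["provider"]][0] += r["admitted"]
        totals[r["provider"]][1] += 1
    provider_rate = {p: adm / n for p, (adm, n) in totals.items()}
    global_rate = sum(r["admitted"] for r in train_rows) / len(train_rows)
    return medians, provider_rate, global_rate

def transform(rows, numeric_cols, medians, provider_rate, global_rate):
    """Apply training-set medians and target encoding to new rows."""
    out = []
    for r in rows:
        new = dict(r)
        for c in numeric_cols:
            if new[c] is None:
                new[c] = medians[c]
        # Assumption: providers unseen in training fall back to the overall rate.
        new["provider_te"] = provider_rate.get(r["provider"], global_rate)
        out.append(new)
    return out

# Toy data; values are hypothetical.
train_rows = [
    {"provider": "A", "resp_rate": 16, "admitted": 0},
    {"provider": "A", "resp_rate": None, "admitted": 1},
    {"provider": "B", "resp_rate": 24, "admitted": 1},
    {"provider": "B", "resp_rate": 20, "admitted": 1},
]
test_rows = [{"provider": "C", "resp_rate": None, "admitted": 0}]

medians, provider_rate, global_rate = fit_preprocessing(train_rows, ["resp_rate"])
test_out = transform(test_rows, ["resp_rate"], medians, provider_rate, global_rate)
print(test_out[0])
```

Fitting the statistics on training rows alone avoids leaking test-set outcomes into the encoded provider feature.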
CatBoost was the top-performing model, with an AUC of 0.8906. Adding provider identity produced no meaningful improvement in prediction for any of the ten classifiers. SHAP analysis showed that ESI level and vital signs (respiratory rate and temperature) were the most influential predictors, followed by complaint category and age, consistent with clinical intuition.
| Rank | CatBoost | XGBoost |
|---|---|---|
| 1 | Respiratory Rate | ESI Level |
| 2 | ESI Level | Respiratory Rate |
| 3 | Temperature | Temperature |
| 4 | Complaint Category | Complaint Category |
| 5 | Age | Age |
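If the table lists features in order of SHAP importance, a common convention is to rank by mean absolute SHAP value across patients. The sketch below assumes that convention, which the source does not confirm. In practice the per-patient values would come from a SHAP explainer; here they are small made-up numbers, and the feature names are illustrative.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Order features by mean absolute SHAP value, largest first."""
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance, key=importance.get, reverse=True)

# Hypothetical SHAP values: one row per patient, one column per feature.
features = ["esi_level", "resp_rate", "age"]
shap_rows = [
    [-0.5, 0.25, 0.125],
    [0.5, -0.75, 0.125],
    [-0.25, 0.5, -0.125],
]

ranking = rank_by_mean_abs_shap(shap_rows, features)
print(ranking)
```

Taking the absolute value before averaging matters: a feature that pushes some patients toward admission and others away would otherwise cancel out and look unimportant.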
This section interprets the findings, attributing observed provider variation to case-mix differences rather than independent practice patterns. It underscores the importance of explainability for clinical adoption and discusses limitations, including the single-site nature of the study and the need for prospective validation.
Case-Mix vs. Practice Variation
The study found a 15.6-percentage-point spread in admission rates among high-volume providers (24.3% to 39.9%). However, target-encoded provider identity alone achieved an AUC of only 0.5346, barely above the 0.5 expected from random guessing, and it was almost uncorrelated (0.059) with the baseline model's errors. Together, these results indicate that provider identity contributed no unique predictive signal beyond what standard triage data already captured, and that the observed admission rate differences are primarily due to patient case-mix variation rather than independent provider practice patterns.
Key findings supporting this conclusion:
- Provider identity alone had an AUC of 0.5346.
- Provider identity showed negligible correlation (0.059) with baseline model errors.
- Provider assignment is often based on pod staffing, not unmeasured acuity.
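To make the "barely above random chance" reading concrete, AUC can be computed as the probability that a randomly chosen admitted patient receives a higher score than a randomly chosen discharged patient, with ties counted as one half. The sketch below uses a provider's encoded admission rate as the only score. The scores and labels are toy values, not study data; only the 24.3% and 39.9% figures come from the text.

```python
def auc(scores, labels):
    """Pairwise AUC: P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Spread between the reported lowest and highest provider admission rates.
spread = round(39.9 - 24.3, 1)  # percentage points

# Toy example: each visit is scored by its provider's encoded admission rate.
scores = [0.25, 0.25, 0.4, 0.4, 0.25, 0.4]
labels = [0, 1, 0, 1, 0, 1]
toy_auc = auc(scores, labels)

# A score that is identical for every visit carries no ranking information.
constant_auc = auc([0.3, 0.3, 0.3, 0.3], [0, 1, 0, 1])
print(spread, toy_auc, constant_auc)
```

An AUC of 0.5 means the score ranks admitted and discharged patients no better than a coin flip, so a value of 0.5346 reflects very little standalone signal despite the wide spread in raw rates.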
Quantify Your AI Advantage
Our AI solutions streamline emergency department triage and admission prediction, reducing wait times and optimizing resource allocation. By accurately identifying patients requiring admission earlier, hospitals can proactively manage bed requests and nursing staff, improving patient flow and reducing operational bottlenecks.
Your AI Implementation Roadmap
A structured approach ensures successful integration and maximum impact. Here’s a typical timeline for deploying our AI solution.
Data Integration & Pre-processing
Securely integrate EHR data, then clean and preprocess it for model training, focusing on triage variables and outcomes.
Model Training & Validation
Train and validate ensemble ML models like CatBoost on historical data, establishing baseline performance and generalizability.
SHAP Explainability & Clinical Review
Implement SHAP to provide transparent feature contributions, allowing clinicians to review and trust model reasoning.
Pilot Deployment & Real-time Evaluation
Pilot the model in a real-time clinical workflow, monitoring performance, user feedback, and patient outcomes.
Scaling & Continuous Improvement
Scale the solution across the department, integrating feedback for continuous model refinement and performance optimization.