Enterprise AI Analysis: Pediatric diabetes prediction using machine learning

Healthcare AI / Machine Learning

Pediatric diabetes prediction using machine learning

Diabetes is a chronic condition that affects a substantial portion of the global population and is linked to elevated mortality rates and a range of severe health complications. Despite its clinical importance, progress in diabetes research is often constrained by the limited availability of comprehensive datasets and robust predictive models. To address these challenges, researchers are increasingly turning to big data analytics and machine learning (ML) methodologies. This study presents the development of an ML-based system aimed at predicting the likelihood of diabetes and classifying its various types. A novel dataset, termed Diabetes Types Dataset, was constructed by integrating four heterogeneous dataset sources: paediatrics data from the Mansoura University Children Hospital repository, the Pima Indian Diabetes (PIMA) dataset, the Pone dataset, and a Gestational Diabetes dataset. The classification of diabetes types was approached as a multiclass problem using a suite of supervised ML algorithms, including Artificial Neural Networks (ANN), Logistic Regression, Naive Bayes, Decision Trees, Adaptive Boosting, Random Forests, Gradient Boosting, Support Vector Machines, and K-Nearest Neighbors. Model performance was evaluated using several metrics: Accuracy, Precision, Mean Squared Error, and Area Under the Receiver Operating Characteristic Curve. Among the models tested, the ANN classifier demonstrated the highest accuracy, achieving a peak performance of 99.98%. Further validation was conducted using an external dataset referred to as diabetes_prediction, which confirmed the model's robustness with consistent accuracy. Additionally, the proposed system was applied to a publicly available dataset, diabetes_Dataset, containing 34 features used to predict 12 distinct types of diabetes efficiently. 
The results suggest that this ML-driven approach can significantly enhance the ability of healthcare professionals to detect and classify diabetes types, thereby supporting early intervention and improved disease management.

Author(s): Abeer El-Sayyid El-Bashbishy & Hazem M. El-Bakry

Published: January 15, 2026

99.98% ANN Prediction Accuracy

This research introduces a novel machine learning system for the early and accurate prediction and classification of pediatric diabetes types, leveraging a new 'Diabetes Types Dataset' integrated from diverse sources. Utilizing supervised ML algorithms, notably Artificial Neural Networks, the system achieves a remarkable 99.98% accuracy, significantly advancing diagnostic capabilities and supporting proactive disease management in healthcare.

Transformative Impact Metrics

The proposed ML system targets early detection and classification of pediatric diabetes types. By integrating patient data from four sources and reaching 99.98% accuracy on the study's dataset, it could give healthcare professionals a tool for earlier intervention and help reduce the burden of severe complications.


Deep Analysis & Enterprise Applications


Enterprise Process Flow

Data Collection
Data Pre-processing
Train Phase
Test Phase
Model Optimization
Prediction
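
The six stages above can be sketched end to end in a few lines. This is a minimal illustration under stated assumptions, not the study's implementation: the data are synthetic, the three classes are made up, K-Nearest Neighbors stands in for the trained model, and every function name (fit_scaler, scale, knn_predict, accuracy, run_pipeline) is hypothetical. The sketch also tunes k on a separate validation split before the final test, which is one common way to arrange the "Model Optimization" step.

```python
import math
import random

def fit_scaler(rows):
    """Learn per-feature minimum and maximum from the training rows only."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]

def scale(rows, lows, highs):
    """Min-max scale rows using previously fitted bounds."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    votes = [train_y[i] for i in nearest]
    return max(sorted(set(votes)), key=votes.count)

def accuracy(train_X, train_y, X, y, k):
    """Fraction of rows in X whose predicted label matches y."""
    hits = sum(knn_predict(train_X, train_y, x, k) == label for x, label in zip(X, y))
    return hits / len(y)

def run_pipeline(seed=0):
    rng = random.Random(seed)

    # 1. Data collection: synthetic two-feature records for three made-up classes.
    centers = {0: (1.0, 1.0), 1: (5.0, 5.0), 2: (9.0, 1.0)}
    data = [([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)], label)
            for label, (cx, cy) in centers.items() for _ in range(40)]
    rng.shuffle(data)
    X = [features for features, _ in data]
    y = [label for _, label in data]

    # 2. Pre-processing: split 72/24/24, then fit the scaler on the training split only.
    train_X, val_X, test_X = X[:72], X[72:96], X[96:]
    train_y, val_y, test_y = y[:72], y[72:96], y[96:]
    lows, highs = fit_scaler(train_X)
    train_X, val_X, test_X = (scale(part, lows, highs) for part in (train_X, val_X, test_X))

    # 3 and 5. Train phase and model optimization: KNN keeps the training set,
    # and k is chosen by accuracy on the validation split.
    best_k = max((1, 3, 5, 7), key=lambda k: accuracy(train_X, train_y, val_X, val_y, k))

    # 4. Test phase: score the chosen model on data it has not seen.
    test_acc = accuracy(train_X, train_y, test_X, test_y, best_k)

    # 6. Prediction: classify a new record, scaled with the same bounds.
    new_record = scale([[5.2, 4.8]], lows, highs)[0]
    prediction = knn_predict(train_X, train_y, new_record, best_k)
    return best_k, test_acc, prediction

best_k, test_acc, prediction = run_pipeline()
```

Fitting the scaler on the training split alone keeps information from the validation and test records out of the pre-processing step.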

Data Integration & Preprocessing Pipeline: The study's foundation is a novel 'Diabetes Types Dataset' (DTD) constructed from four heterogeneous sources (Pediatrics, PIMA, Pone, Gestational Diabetes). A robust preprocessing pipeline, including MICE-based imputation for missing values and SMOTE for class balancing, ensures data quality and fairness, crucial for multi-class classification.
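
As a rough illustration of the class-balancing step, the core idea of SMOTE is to create synthetic minority-class records by interpolating between an existing minority sample and one of its nearest minority neighbors. The sketch below is a simplified, standard-library version of that idea, not the study's code and not the imbalanced-learn implementation; the function name smote_oversample, its parameters, and the sample values are all assumptions made for illustration. MICE imputation is not shown here.

```python
import math
import random

def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    randomly chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` among the other minority samples.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Hypothetical minority-class records with two numeric features.
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
new_points = smote_oversample(minority, n_new=6)
```

Because each synthetic point is a convex combination of two real minority samples, it always stays inside the region already covered by that class.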

Machine Learning Model Comparison

Model   Accuracy   Precision   Recall    F1 Score
ANN     99.98%     99.98%      99.98%    99.98%
RF      99.94%     99.94%      99.94%    99.94%
GB      99.94%     99.94%      99.94%    99.94%
SVM     99.42%     99.42%      99.42%    99.42%
KNN     99.39%     99.39%      99.39%    99.39%
DT      99.75%     99.75%      99.75%    99.75%
AB      98.95%     98.92%      98.92%    98.92%
LR      99.20%     99.20%      99.20%    99.20%
NB      97.66%     97.59%      97.59%    97.59%

Machine Learning Model Comparison: Nine supervised ML algorithms (ANN, LR, NB, DT, AB, RF, GB, SVM, KNN) were evaluated for multi-class classification. ANN achieved the highest accuracy at 99.98%, narrowly ahead of RF and GB (both 99.94%), and the study reports that external validation confirmed its robustness.
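
For readers who want to see how the four reported metrics relate to one another in a multi-class setting, the sketch below derives them from a confusion matrix. It assumes macro averaging (an unweighted mean over classes); the source does not state which averaging the study used. The labels and predictions are invented for illustration, and the function names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def macro_metrics(matrix):
    """Accuracy plus macro-averaged precision, recall and F1."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    precisions, recalls, f1s = [], [], []
    for i in range(n):
        tp = matrix[i][i]
        predicted = sum(matrix[r][i] for r in range(n))  # column sum
        actual = sum(matrix[i])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }

# Invented example with the four classes named in the study.
labels = ["type1", "type2", "gestational", "normal"]
y_true = ["type1", "type1", "type2", "type2", "gestational", "normal", "normal", "normal"]
y_pred = ["type1", "type2", "type2", "type2", "gestational", "normal", "normal", "type1"]

metrics = macro_metrics(confusion_matrix(y_true, y_pred, labels))
```

In this toy example accuracy is 0.75 while macro F1 is 0.775, which shows that the four numbers need not coincide even though most rows of the table above report them as equal.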

Most Influential Feature for Prediction: Diagnosis

Key Feature Contribution to Prediction: SHAP analysis revealed the most influential features for diabetes prediction. Diagnosis, Age, Blood Pressure, Insulin, and Number of Pregnancies were identified as critical variables, providing transparency and interpretability to the model's decisions, crucial for clinical trust.
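
SHAP attributions are built on Shapley values, which credit each feature with its average marginal contribution across all orderings in which features could be added. The sketch below computes exact Shapley values for a tiny, hypothetical scoring function. It does not use the SHAP library and does not reproduce the study's model; the weights, the interaction term, and the function names are assumptions chosen only to make the arithmetic easy to follow.

```python
from itertools import permutations

def shapley_values(value_fn, features):
    """Exact Shapley values: the average marginal contribution of each
    feature over every ordering of the features."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            totals[f] += value_fn(frozenset(present)) - before
    return {f: t / len(orderings) for f, t in totals.items()}

# Hypothetical weights for a toy risk score (not taken from the study).
WEIGHTS = {"age": 0.1, "blood_pressure": 0.2, "insulin": 0.4}

def toy_score(present):
    """Score when only the features in `present` are known."""
    score = sum(WEIGHTS[f] for f in present)
    if "age" in present and "insulin" in present:
        score += 0.2  # made-up interaction between age and insulin
    return score

phi = shapley_values(toy_score, list(WEIGHTS))
```

Here the 0.2 interaction is split evenly between age and insulin, giving attributions of 0.2, 0.2 and 0.5 for age, blood pressure and insulin. These sum to the full score of 0.9. Exact enumeration grows factorially with the number of features, which is why practical tools rely on approximations.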

Impact on Pediatric Healthcare

A major challenge in pediatric healthcare is the early and accurate diagnosis of various diabetes types due to their diverse manifestations and subtle initial symptoms in younger patients. Traditional diagnostic methods can be time-consuming and may lead to delayed interventions, which can have significant long-term health consequences for children. This ML-driven approach addresses these challenges head-on.

  • Enables early, precise identification of diabetes types in pediatric patients.
  • Supports proactive management strategies and personalized treatment plans.
  • Reduces the risk of severe, long-term complications associated with delayed diagnosis.
  • Provides a scalable and robust diagnostic tool for diverse healthcare settings.

Impact on Pediatric Healthcare: The system's ability to accurately predict and classify pediatric diabetes types (Type 1, Type 2, Gestational, Normal) empowers healthcare professionals with timely, data-driven insights. This facilitates early diagnosis and supports proactive disease management, potentially reducing long-term health complications for children.


Your AI Implementation Roadmap

Our phased approach ensures a seamless integration of AI into your enterprise, maximizing impact with minimal disruption.

Data Preparation & Model Training

Integrate diverse pediatric datasets, apply advanced preprocessing (MICE, SMOTE), and train the ANN model on the DTD for high accuracy.

Clinical Validation & Integration

Conduct rigorous clinical validation using external datasets and integrate the ML system into existing EMR/EHR workflows within healthcare facilities.

Deployment & Monitoring

Deploy the validated AI system, establish continuous monitoring for performance, and provide training for healthcare professionals on its usage and interpretation.

Continuous Improvement & Expansion

Iteratively refine the model with new data, explore enhancements like genetic algorithms or medical imaging integration, and expand its application to other diseases.

Ready to Innovate with AI?

Unlock the full potential of machine learning for advanced diagnostics and improved patient care. Book a personalized strategy session with our AI experts.
