
Enterprise AI Analysis

Artificial Intelligence Models for Forecasting Mosquito-Borne Viral Diseases in Human Populations: A Global Systematic Review and Comparative Performance Analysis

This review synthesizes the predictive performance, methodological quality, and operational readiness of AI/ML models for forecasting mosquito-borne viral diseases. It highlights the potential of tree-ensemble approaches for short-term, fine-scale forecasts but also exposes significant variability, risk of bias, and a critical need for standardized reporting and rigorous external validation to ensure real-world applicability.

Published: 7 January 2026 | DOI: 10.3390/make8010015

Executive Impact & Key Findings

Leverage AI to enhance public health surveillance with robust forecasting models, enabling proactive intervention and resource allocation for mosquito-borne viral diseases.

98 studies analyzed in total, with dengue the most frequently studied disease
63 of 98 studies rated at high risk of bias (PROBAST)
Tree-ensemble classification performance: AUC 0.84–0.99
Short-term, fine-scale regression errors: commonly ≤1 (RMSE/MAE)
Long-horizon, national-scale regression errors: frequently >1000 (RMSE/MAE)

Deep Analysis & Enterprise Applications

Each module below unpacks a specific finding from the research with an enterprise focus.

63 of 98 Studies at High Risk of Bias (PROBAST)

The PROBAST assessment indicated that a majority of studies were at high risk of bias due to lack of transparent reporting, inconsistent data handling, and insufficient safeguards against overfitting. This highlights a critical need for more rigorous methodological practices in AI/ML forecasting research.
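One concrete safeguard against the overfitting noted above is strictly temporal validation, where every test fold lies after its training fold. The sketch below illustrates this with scikit-learn's TimeSeriesSplit; the data, features, and model choice are synthetic placeholders, not the pipeline of any reviewed study.

```python
# Minimal sketch of leakage-free temporal validation (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(0)
X = rng.normal(size=(260, 4))       # e.g. 5 years of weekly climate features
y = rng.poisson(lam=20, size=260)   # e.g. weekly case counts (synthetic)

tscv = TimeSeriesSplit(n_splits=5)  # each test fold follows its training fold in time
fold_mae = []
for train_idx, test_idx in tscv.split(X):
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    fold_mae.append(mean_absolute_error(y[test_idx], model.predict(X[test_idx])))

print(f"MAE per fold: {np.round(fold_mae, 2)}")
```

Randomly shuffling the rows instead would mix future weeks into training and overstate skill, a common source of the optimism that risk-of-bias assessments flag.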

Global Distribution of Research Efforts

Asia (62 studies)
South America (23 studies)
North America (6 studies)
Europe (2 studies)
Africa (1 study)
Oceania (1 study)

Most research on AI/ML for mosquito-borne viral disease forecasting originates from regions with higher disease burden, such as Asia and South America. This geographical distribution reflects the urgent need for predictive tools in these areas.

Regression Error: Scale-Dependent Performance

Regression metrics revealed marked heterogeneity across studies, largely driven by differences in spatial scale, temporal resolution, and underlying case magnitude. Low errors occurred in fine-resolution forecasts (e.g., weekly city-level), whereas larger errors were associated with national-level or high-incidence settings.

For RMSE and MAE, values in the 'very small' (≤1) category were common for short-term, fine-scale forecasts (e.g., weekly city level). In contrast, 'large' errors (>1000) were primarily observed in national-scale predictions using classical ML and time-series/statistical models, highlighting a critical limitation for broader applications.
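To make the scale-dependence concrete, the toy example below applies the same 10% relative error to a low-incidence city series and a high-incidence national series; the absolute MAE differs by orders of magnitude while the normalised error is identical. All numbers are synthetic.

```python
# Why raw error magnitudes are scale-dependent: the same 10% relative error
# yields a 'very small' MAE at city scale and a 'large' MAE at national scale.
import numpy as np
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(1)
city_cases = rng.poisson(lam=8, size=52).astype(float)           # weekly city-level counts
national_cases = rng.poisson(lam=40_000, size=52).astype(float)  # weekly national counts

for name, y in [("city", city_cases), ("national", national_cases)]:
    y_pred = y * (1 + rng.normal(0, 0.10, size=y.shape))         # ~10% relative error
    mae = mean_absolute_error(y, y_pred)
    print(f"{name:8s}  MAE = {mae:10.2f}   MAE / mean = {mae / y.mean():.3f}")
```

Normalising by mean incidence, or reporting MAPE alongside MAE, is one way to make errors comparable across spatial scales.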

Priorities for Real-World Implementation

Standardised reporting across studies: promotes transparency and comparability of models.
Rigorous external validation: ensures models generalize to independent datasets and real-world conditions.
Context-specific calibration: adapts models to local epidemiological patterns and resource settings for reliable performance.
To bridge the gap between methodological innovation and practical deployment, future AI models need transparent reporting, robust validation using independent datasets, and careful calibration to ensure reliability in diverse public health contexts. Addressing algorithmic inequity by training on diverse, locally generated data is also critical.
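"Calibration" in the review's sense covers broad, context-specific model tuning; one narrow, machine-learning facet of it, aligning predicted outbreak probabilities with observed local frequencies, can be sketched as follows. The data, model, and settings are illustrative assumptions only.

```python
# Probability calibration sketch: an isotonic-calibrated forest usually
# produces probabilities that better match observed outbreak frequencies.
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
X = rng.normal(size=(800, 6))
y = ((X[:, 0] + X[:, 1] + rng.normal(0, 1, 800)) > 1).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, shuffle=False)

raw = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
calibrated = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=200, random_state=0), method="isotonic", cv=5
).fit(X_tr, y_tr)

for name, model in [("raw", raw), ("calibrated", calibrated)]:
    score = brier_score_loss(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name:10s} Brier score = {score:.3f}")  # lower is better calibrated
```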

Classical ML Regression: Variability at Similar Scales

Classical ML models showed substantial variability in regression performance. For instance, MAE values on the same 'cases-monthly-city level' scale ranged from 4.57 to 200.68, highlighting inconsistent performance even within identical spatiotemporal settings. RMSE values also spanned a wide range, from 0.04 to 9296.35, depending on model and scale.

R² values ranged from 0.18 to 0.99, and Pearson's r from 0.50 to 0.91, often declining with longer forecast horizons. This indicates that classical ML can be effective but requires careful calibration to specific contexts and has limitations for long-term or broader forecasts.
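Because R² and Pearson's r are often quoted interchangeably, a short sketch of the distinction helps: r measures linear association only, while R² also penalises systematic bias, so a forecaster can score a high r and a poor (even negative) R² at the same time. The arrays below are synthetic.

```python
# Pearson's r ignores systematic bias; R-squared penalises it.
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import r2_score

rng = np.random.default_rng(2)
y_true = rng.poisson(lam=50, size=100).astype(float)
y_pred = y_true + 10 + rng.normal(0, 5, size=100)  # well-correlated but biased upward

print(f"Pearson r = {pearsonr(y_true, y_pred)[0]:.2f}")  # high: strong association
print(f"R^2       = {r2_score(y_true, y_pred):.2f}")     # poor, can even be negative
```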

Tree-Ensemble Classification Performance Highlights

AUC: 0.84–0.99 (consistently high)
Sensitivity: 0.64–0.99
Specificity: 0.73–0.98
F1-score: 0.72–0.98 (most ≥0.90)
Tree-ensemble models, particularly Random Forest and XGBoost, consistently demonstrated high predictive ability and balanced performance across various classification metrics, indicating their robustness for outbreak detection.
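For readers who want a starting point, the sketch below shows the general shape of such a tree-ensemble outbreak classifier, scored with AUC and F1 as in the table above. The features, outbreak labels, and hyperparameters are synthetic stand-ins, not those of any reviewed study.

```python
# Minimal outbreak/no-outbreak classifier in the spirit of the reviewed
# tree-ensemble studies (synthetic data and labels).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 6))                          # e.g. lagged climate + case features
logits = X @ rng.normal(size=6) - 0.5
y = (logits + rng.normal(0, 1, 500) > 0).astype(int)   # 1 = outbreak week (synthetic)

# shuffle=False keeps the evaluation window after the training window in time
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, shuffle=False)

clf = RandomForestClassifier(n_estimators=300, class_weight="balanced", random_state=0)
clf.fit(X_tr, y_tr)

proba = clf.predict_proba(X_te)[:, 1]
print(f"AUC = {roc_auc_score(y_te, proba):.2f}")
print(f"F1  = {f1_score(y_te, clf.predict(X_te)):.2f}")
```

An XGBoost classifier would slot into the same fit/predict_proba interface.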

Deep Learning Classification Performance Variability

AUC: 0.98 ± 0.01 (MobileNetV3Small)
Sensitivity: 0.00–0.97 (wide range)
Accuracy: 0.26–1.00 (wide range)
F1-score: 0.00–0.98 (wide range)
Deep learning models exhibited wide variability in performance, heavily dependent on architectural choices, forecast horizon, and data imbalance, particularly evident in city-specific LSTM forecasts with extreme class imbalance.
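Class imbalance of the kind described here is usually countered at the loss level. The PyTorch sketch below weights the rare positive (outbreak) class by the negative-to-positive ratio; the architecture, shapes, and data are assumptions for illustration.

```python
# Sketch of an LSTM outbreak classifier that counteracts extreme class
# imbalance with a positive-class weight (all data synthetic).
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(400, 12, 5)            # 400 samples, 12-week windows, 5 features
y = (torch.rand(400) < 0.05).float()   # ~5% outbreak weeks: heavy imbalance

class OutbreakLSTM(nn.Module):
    def __init__(self, n_features=5, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)
    def forward(self, x):
        _, (h, _) = self.lstm(x)
        return self.head(h[-1]).squeeze(-1)  # raw logits

model = OutbreakLSTM()
# Weight positives by the negative/positive ratio so rare outbreaks still drive the loss.
pos_weight = (y == 0).sum() / (y == 1).sum().clamp(min=1)
loss_fn = nn.BCEWithLogitsLoss(pos_weight=pos_weight)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(20):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
print(f"final training loss: {loss.item():.3f}")
```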

Hybrid and Superensemble Models

Hybrid and superensemble models combine multiple AI algorithms to leverage their strengths. While showing high AUC values (0.93–0.97 for stronger ensembles), their performance can vary widely depending on the underlying base learners and the complexity of the integrated framework. Regression MAE values spanned from 0.17 to over 50,000 cases, with RMSE values ranging from 0.02 to over 20,000, underscoring significant scale-dependence and the need for robust evaluation.
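A minimal version of the stacking idea, offered as a sketch rather than any reviewed study's framework, blends two heterogeneous base learners through a logistic meta-learner trained on out-of-fold predictions:

```python
# Minimal stacked ('super') ensemble on synthetic data; the base learners
# and meta-learner are illustrative choices.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
X = rng.normal(size=(600, 8))
y = ((X[:, 0] * X[:, 1] + X[:, 2] + rng.normal(0, 0.5, 600)) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, shuffle=False)

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(),  # meta-learner blends base predictions
    cv=5,                                  # out-of-fold predictions avoid leakage
)
stack.fit(X_tr, y_tr)
print(f"stacked AUC = {roc_auc_score(y_te, stack.predict_proba(X_te)[:, 1]):.2f}")
```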

Time-Series, Statistical, and Mechanistic Baselines

Time-series and statistical models like ARIMA and Prophet served as baselines, with AUC values reported around 0.78 for temporal averages. Mechanistic models, such as SIR + EAKF, were less frequently evaluated with standard regression metrics but focused on timing and peak prediction errors. Both categories often exhibited higher MAPE values (up to 94.84% for NNAR) and were highly sensitive to data quality and forecast horizon.
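As a baseline of this kind, the sketch below fits an ARIMA model to a synthetic seasonal case series and scores a 12-week forecast with MAPE; the (2, 1, 1) order and the series itself are assumptions, not values from the review.

```python
# ARIMA baseline scored with MAPE on a synthetic seasonal case series.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(5)
t = np.arange(156)                                     # three years of weekly data
cases = 50 + 30 * np.sin(2 * np.pi * t / 52) + rng.normal(0, 5, 156)

train, test = cases[:-12], cases[-12:]                 # hold out the last 12 weeks
fit = ARIMA(train, order=(2, 1, 1)).fit()
forecast = fit.forecast(steps=12)

mape = np.mean(np.abs((test - forecast) / test)) * 100
print(f"12-week MAPE = {mape:.1f}%")
```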

Calculate Your Potential ROI with AI-Powered Forecasting

Estimate the operational efficiency gains and cost savings your organization could achieve by implementing advanced AI models for disease forecasting, reducing manual effort and improving public health response.


Your AI Implementation Roadmap

A phased approach to integrate robust AI forecasting into your public health surveillance, ensuring reliable predictions and efficient resource management.

Phase 1: Data Integration & Preprocessing

Establish standardized data pipelines for diverse data sources (epidemiological, climatic, demographic). Implement robust procedures for handling missing values and data imbalance, crucial for model reliability.
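A minimal sketch of this phase, with hypothetical column names and synthetic values, merges case and climate series on a weekly index and imputes only short covariate gaps:

```python
# Phase 1 sketch: join epidemiological and climate sources on a weekly index,
# then fill short gaps. Column names and values are hypothetical.
import numpy as np
import pandas as pd

weeks = pd.date_range("2023-01-01", periods=8, freq="W")
cases = pd.DataFrame({"cases": [12, 15, 9, 20, 25, 31, 28, 22]}, index=weeks)
climate = pd.DataFrame(
    {"rainfall_mm": [80, np.nan, 95, 110, np.nan, np.nan, 60, 75],
     "mean_temp_c": [27.1, 27.8, 28.0, np.nan, 29.2, 29.0, 28.4, 27.9]},
    index=weeks,
)

df = cases.join(climate, how="inner").sort_index()

# Interpolate short gaps in covariates only; never impute the target,
# so synthetic case counts cannot leak into training.
df[["rainfall_mm", "mean_temp_c"]] = df[["rainfall_mm", "mean_temp_c"]].interpolate(limit=1)
df = df.dropna()  # drop weeks whose gaps were too long to fill

print(df)
```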

Phase 2: Model Selection & Training

Identify and train appropriate AI/ML models based on your specific forecasting needs. Prioritize tree-ensemble models for classification tasks due to their consistent performance, and carefully select regression models considering spatial and temporal scales.
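In practice this phase often reduces to a temporal cross-validation bake-off between candidate model families. The sketch below compares a linear baseline with a random forest on synthetic data; the candidates and scoring choice are illustrative assumptions.

```python
# Phase 2 sketch: compare candidate regressors under temporal cross-validation.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

rng = np.random.default_rng(8)
X = rng.normal(size=(300, 5))
y = np.maximum(0, 30 + X @ np.array([4.0, -2.0, 3.0, 0.5, 1.0]) + rng.normal(0, 3, 300))

candidates = {
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
}
tscv = TimeSeriesSplit(n_splits=5)
for name, model in candidates.items():
    scores = -cross_val_score(model, X, y, cv=tscv, scoring="neg_mean_absolute_error")
    print(f"{name:14s} MAE = {scores.mean():.2f} ± {scores.std():.2f}")
```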

Phase 3: Validation & Calibration

Conduct rigorous external validation using independent datasets to assess generalizability. Perform context-specific calibration to fine-tune models to local epidemiological patterns and environmental factors, enhancing real-world accuracy.
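The essence of external validation can be sketched by fitting on one region and scoring, unchanged, on another. The two synthetic regions below share a climate-case relationship but differ in baseline incidence; the gap between the internal and external scores is the quantity of interest.

```python
# Phase 3 sketch: external validation on a held-out region (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(6)

def make_region(shift):
    # Two regions share a climate-case relationship but differ in baseline incidence.
    X = rng.normal(size=(200, 4))
    y = np.maximum(0, 20 + shift + X @ np.array([5.0, 3.0, -2.0, 1.0]) + rng.normal(0, 2, 200))
    return X, y

X_a, y_a = make_region(shift=0)    # development region
X_b, y_b = make_region(shift=15)   # held-out external region

model = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_a, y_a)
# The internal score (on training data) is optimistic; the external score is the honest one.
print(f"internal MAE: {mean_absolute_error(y_a, model.predict(X_a)):.2f}")
print(f"external MAE: {mean_absolute_error(y_b, model.predict(X_b)):.2f}")
```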

Phase 4: Operational Deployment & Monitoring

Integrate validated AI models into existing public health surveillance and early-warning systems. Implement continuous monitoring of model performance and regularly retrain models with new data to maintain predictive accuracy over time.
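One lightweight way to operationalise the monitoring step is a rolling-error tripwire that flags the model for retraining once live forecast error drifts past a tolerance. The window, tolerance, and baseline below are deployment-specific assumptions.

```python
# Phase 4 sketch: flag drift when rolling MAE exceeds a multiple of the
# validation-time MAE. Threshold and window are illustrative.
import numpy as np
import pandas as pd

def needs_retraining(y_true, y_pred, window=8, tolerance=1.5, baseline_mae=5.0):
    """Flag drift when the rolling MAE exceeds tolerance x the validation MAE."""
    errors = pd.Series(np.abs(np.asarray(y_true) - np.asarray(y_pred)))
    rolling_mae = errors.rolling(window).mean().iloc[-1]
    return bool(rolling_mae > tolerance * baseline_mae), rolling_mae

# Example: recent observed vs forecast weekly case counts (synthetic).
observed = [22, 25, 31, 40, 55, 70, 90, 120]
forecast = [21, 24, 28, 33, 39, 45, 52, 60]   # model increasingly underpredicts
flag, mae = needs_retraining(observed, forecast)
print(f"rolling MAE = {mae:.1f}, retrain = {flag}")
```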

Ready to Transform Your Public Health Surveillance?

Book a personalized consultation with our AI experts to design a tailored strategy for integrating advanced forecasting models into your operations.
