AI Impact Analysis
Risk-based predictive modelling for audit verification: evidence from EU-funded programmes
This study proposes a machine learning framework to support risk-based verification of expenditure declarations in European Structural and Investment Funds, reflecting the current regulatory emphasis on proportional and data-driven audit strategies. Quantitatively, the problem is formulated as an imbalanced three-class classification task with ordered outcomes on high-dimensional administrative data; the ordinal structure is exploited ex post in evaluation and error interpretation. The framework classifies expense documents as validated, partially validated, or not validated, and provides audit authorities with interpretable probability estimates for each case. A predictive model was trained and validated on more than ninety thousand expense documents from the Italian Regional Operational Programme co-funded by the European Regional Development Fund (2014–2020). Methodological challenges—ordered outcomes, severe class imbalance, and mixed-type features—were addressed through targeted preprocessing and the CatBoost gradient-boosting algorithm. The model achieved satisfactory predictive performance, offering probabilistic outputs aligned with the ordered structure of audit outcomes. Variable-importance analysis confirmed the relevance of both financial and administrative variables in predicting irregularities. The framework is designed with operational integration in mind and could underpin risk-based sampling in expenditure verification, subject to further validation across time, programmes, and beneficiary structures. Departing from a literature that largely focuses on binary classification or fraud detection, the study addresses the under-studied challenge of multi-class prediction in public expenditure control and provides an interpretable prototype decision-support tool. 
The model could support public authorities in prioritizing controls and allocating resources more efficiently, contributing to the modernization of European Union fund management and promoting data-driven, proportionate oversight—conditional on governance arrangements and external validation.
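The ex post use of the ordinal structure in evaluation can be illustrated with a distance-weighted error. This is a minimal sketch of one plausible metric, not necessarily the one used in the study; the integer coding of the classes (0, 1, 2) and the function names are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical integer coding of the three ordered audit outcomes.
CLASSES = ["validated", "partially validated", "not validated"]


def confusion_counts(y_true, y_pred):
    """Count (true, predicted) pairs over integer-coded labels."""
    return Counter(zip(y_true, y_pred))


def mean_ordinal_error(y_true, y_pred):
    """Average absolute distance between true and predicted class index.

    Confusing 'validated' with 'not validated' costs 2, while confusing
    adjacent classes costs 1, so the class ordering shows up in the score.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels, invented for the example.
y_true = [0, 0, 1, 2, 2, 0]
y_pred = [0, 1, 1, 0, 2, 0]
counts = confusion_counts(y_true, y_pred)
error = mean_ordinal_error(y_true, y_pred)
```

A plain accuracy score would treat both mistakes in the toy data as equal; the distance-weighted version penalises the two-step error (not validated predicted as validated) twice as much as the adjacent one.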
Deep Analysis & Enterprise Applications
Improved Efficiency & Compliance
Documents auto-validated under calibrated policy
Enterprise Process Flow
| Feature | Traditional Approach | AI-Powered Approach |
|---|---|---|
| Verification Scope | Exhaustive (100%) checks | Risk-based (targeted sampling) |
| Resource Allocation | Uniform, often inefficient | Optimized, high-risk focus |
| Irregularity Detection | Reactive, manual | Proactive, predictive |
| Decision Basis | Heuristic, expert judgment | Data-driven probability estimates |
Case Study: Italian Regional Operational Programme
The model was trained and validated on over 90,000 expense documents from the Italian Regional Operational Programme co-funded by the European Regional Development Fund (2014–2020). This real-world application demonstrated the framework's ability to handle high-dimensional administrative data with ordered, imbalanced outcomes, achieving satisfactory predictive performance in a critical public finance context. The results indicate that auditing 40% of cases could reduce the undetected irregular amount to 0.66%, a substantial improvement over traditional methods.
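To make the audit-coverage figure concrete, the sketch below computes the share of irregular expenditure that escapes detection when only the riskiest fraction of documents is audited. It assumes that audited documents have their irregular amounts fully detected and that the share is expressed relative to total declared expenditure; the page does not state the denominator, so both points are assumptions, and all names and figures are hypothetical.

```python
def undetected_irregular_share(risk, irregular, declared, audit_fraction):
    """Irregular amount left in unaudited documents, as a share of the
    total declared amount, when the top `audit_fraction` by risk is audited."""
    if not (len(risk) == len(irregular) == len(declared)) or not risk:
        raise ValueError("inputs must be non-empty and of equal length")
    if not 0.0 <= audit_fraction <= 1.0:
        raise ValueError("audit_fraction must lie in [0, 1]")
    # Number of audited documents, rounded to the nearest whole document.
    n_audit = round(audit_fraction * len(risk))
    # Document indices ordered from highest to lowest predicted risk.
    order = sorted(range(len(risk)), key=lambda i: risk[i], reverse=True)
    missed = sum(irregular[i] for i in order[n_audit:])
    return missed / sum(declared)


# Toy data, invented for the example (amounts in euro).
risk = [0.90, 0.10, 0.70, 0.20, 0.05]
irregular = [500.0, 0.0, 300.0, 20.0, 0.0]
declared = [1000.0, 2000.0, 800.0, 1200.0, 3000.0]

share_at_40 = undetected_irregular_share(risk, irregular, declared, 0.4)
share_no_audit = undetected_irregular_share(risk, irregular, declared, 0.0)
```

Evaluating the function over a grid of audit fractions gives a coverage curve from which an audit authority could read off the budget needed to stay below a tolerated residual error.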
Your Implementation Roadmap
A structured approach to integrating AI into your public financial management processes.
Phase 1: Data Integration & Preprocessing
Integrate administrative micro-data into the ML framework, ensuring data quality, handling missing values, and preparing mixed-type features for model training.
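As a rough illustration of this phase, the sketch below cleans mixed-type records before training. It assumes records arrive as dictionaries and uses hypothetical column names (`amount`, `expense_type`); the imputation choices (median plus a missingness flag for numerics, an explicit token for categoricals) are illustrative assumptions, not the study's documented preprocessing.

```python
from statistics import median


def prepare_features(rows, numeric_cols, categorical_cols):
    """Return cleaned copies of `rows` (a list of dicts).

    - Missing categoricals become the explicit token "MISSING".
    - Missing numerics are imputed with the column median, and a
      companion `<col>_was_missing` flag keeps the information.
    """
    medians = {}
    for col in numeric_cols:
        observed = [r[col] for r in rows if r.get(col) is not None]
        medians[col] = median(observed) if observed else 0.0

    cleaned = []
    for r in rows:
        out = dict(r)
        for col in numeric_cols:
            missing = r.get(col) is None
            out[col + "_was_missing"] = int(missing)
            out[col] = medians[col] if missing else float(r[col])
        for col in categorical_cols:
            value = r.get(col)
            out[col] = "MISSING" if value in (None, "") else str(value)
        cleaned.append(out)
    return cleaned


# Toy records, invented for the example.
rows = [
    {"amount": 1200.0, "expense_type": "equipment"},
    {"amount": None, "expense_type": "staff"},
    {"amount": 800.0, "expense_type": None},
]
cleaned = prepare_features(rows, ["amount"], ["expense_type"])
```

In a real pipeline the medians would be estimated on the training split only and reused for validation and production data, so that no information leaks across the split.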
Phase 2: Model Development & Validation
Train a CatBoost classifier with targeted imbalance handling (undersampling), perform leakage-safe validation (beneficiary-grouped, time-aware splits), and calibrate probability outputs.
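The leakage-safe split and the undersampling step can be sketched independently of the classifier. The code below is one possible reading of "beneficiary-grouped, time-aware" validation, assuming ISO-formatted date strings and hypothetical field names (`beneficiary`, `date`, `label`); the study's exact split rule and sampling ratio are not given on this page, and the model fit itself is omitted.

```python
import random
from collections import defaultdict


def grouped_time_split(docs, cutoff):
    """Train on documents dated before `cutoff`; test on later documents
    whose beneficiary never appears in the training set."""
    train = [d for d in docs if d["date"] < cutoff]
    seen = {d["beneficiary"] for d in train}
    test = [d for d in docs
            if d["date"] >= cutoff and d["beneficiary"] not in seen]
    return train, test


def undersample(train, label_key="label", seed=0):
    """Randomly downsample every class to the size of the rarest class."""
    by_class = defaultdict(list)
    for d in train:
        by_class[d[label_key]].append(d)
    target = min(len(v) for v in by_class.values())
    rng = random.Random(seed)
    balanced = []
    for label in sorted(by_class):
        balanced.extend(rng.sample(by_class[label], target))
    return balanced


# Toy documents, invented for the example (0/1/2 = ordered outcomes).
docs = [
    {"beneficiary": "A", "date": "2016-03-01", "label": 0},
    {"beneficiary": "A", "date": "2016-09-01", "label": 0},
    {"beneficiary": "B", "date": "2017-01-15", "label": 0},
    {"beneficiary": "B", "date": "2017-06-01", "label": 1},
    {"beneficiary": "C", "date": "2017-08-01", "label": 2},
    {"beneficiary": "A", "date": "2019-02-01", "label": 1},
    {"beneficiary": "D", "date": "2019-05-01", "label": 0},
    {"beneficiary": "E", "date": "2019-07-01", "label": 2},
]
train, test = grouped_time_split(docs, cutoff="2018-01-01")
balanced_train = undersample(train)
```

Undersampling is applied to the training portion only, so the test set keeps the natural class mix. Because balancing changes the class proportions the model sees, its raw probabilities no longer reflect real-world frequencies, which is one reason a separate calibration step is needed.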
Phase 3: Operational Policy & Decision Rules
Define threshold-based auto-validation policies and ranking-based audit budgets, translating probabilistic risk scores into actionable decisions aligned with governance requirements.
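A minimal sketch of how such decision rules might be encoded follows. The threshold value, the severity weights (1 for partially validated, 2 for not validated), and the action labels are hypothetical choices for illustration; in practice they would be set by the audit authority's governance rules.

```python
def assign_actions(probs, auto_validate_threshold, audit_budget):
    """Map calibrated class probabilities to actions.

    probs: one (p_validated, p_partial, p_not_validated) tuple per document.
    Documents with p_validated >= threshold are auto-validated; of the
    rest, the `audit_budget` riskiest go to audit and the remainder to
    standard manual review.
    """
    actions = ["manual_review"] * len(probs)
    candidates = []
    for i, (p_val, p_part, p_not) in enumerate(probs):
        if p_val >= auto_validate_threshold:
            actions[i] = "auto_validate"
        else:
            # Expected severity: partial counts 1, not validated counts 2.
            candidates.append((p_part + 2.0 * p_not, i))
    # Highest expected severity first.
    candidates.sort(reverse=True)
    for _, i in candidates[:audit_budget]:
        actions[i] = "audit"
    return actions


# Toy probabilities, invented for the example.
probs = [
    (0.97, 0.02, 0.01),
    (0.50, 0.30, 0.20),
    (0.10, 0.20, 0.70),
    (0.80, 0.15, 0.05),
    (0.96, 0.03, 0.01),
]
actions = assign_actions(probs, auto_validate_threshold=0.95, audit_budget=2)
```

Keeping the threshold and the budget as explicit parameters makes the policy auditable: both can be documented, reviewed, and changed without retraining the model.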
Phase 4: Continuous Monitoring & Retraining
Establish a protocol for ongoing performance monitoring, concept drift assessment, and periodic model retraining to adapt to evolving administrative environments and beneficiary behavior.
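One simple way to monitor drift is to compare the distribution of recent risk scores with a reference period. The sketch below uses the Population Stability Index (PSI), a technique chosen here for illustration and not named in the source; the bin count, the smoothing constant, and the alert threshold are assumptions.

```python
import math


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between two samples of scores in [0, 1], using equal-width bins."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            # Clamp so that a score of exactly 1.0 falls in the last bin.
            idx = min(int(v * n_bins), n_bins - 1)
            counts[idx] += 1
        # Floor empty bins at eps to keep the logarithm finite.
        return [max(c / len(values), eps) for c in counts]

    ref = bin_shares(reference)
    cur = bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Hypothetical alert level; the actual value is a governance decision.
ALERT_THRESHOLD = 0.2

# Toy score samples, invented for the example.
reference_scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.60, 0.80]
current_scores = [0.35, 0.45, 0.55, 0.60, 0.70, 0.80, 0.90, 0.95]

psi_same = population_stability_index(reference_scores, reference_scores)
psi_shift = population_stability_index(reference_scores, current_scores)
needs_review = psi_shift > ALERT_THRESHOLD
```

A score-distribution check of this kind only signals that inputs or outputs have moved; confirming a loss of predictive performance still requires comparing predictions with realised verification outcomes once they become available.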