Artificial Intelligence Research
ComputAgeBench: Epigenetic Aging Clocks Benchmark
Authors: Dmitrii Kriukov, Evgeniy Efimov, Ekaterina Kuzmina, Anastasiia Dudkovskaia, Ekaterina E. Khrameeva, Dmitry V. Dylov
Published: August 2025 (KDD '25)
Executive Impact Summary
ComputAgeBench introduces the first comprehensive framework for benchmarking epigenetic aging clocks, addressing the need for standardized validation in longevity research. It compares biological age predictors on datasets covering pre-defined aging-accelerating conditions (AACs), so that candidate biomarkers of health and aging can be validated on a common basis.
Deep Analysis & Enterprise Applications
Epigenetic Aging Clock Validation Flow
Robust AAC/ADC Selection Criteria
To ensure clinical relevance and data quality, aging-accelerating conditions (AACs) for ComputAgeBench are selected based on three criteria: 1) Decreased Life Expectancy: Must lead to reduced lifespan even with treatment. 2) Chronic Nature: Must last long enough to drive observable DNA methylation (DNAm) changes. 3) Systemic Manifestation: Must affect DNAm in blood, saliva, or buccal cells.
High-Quality Data Aggregation Guidelines
The benchmark integrates datasets that meet five guidelines: 1) Open Access: Pre-processed data readily available. 2) Sample Source: Limited to blood, saliva, and buccal cells. 3) Annotated Ages: Samples typically aged 18-90 (excluding progeroid syndromes). 4) Data Type: Illumina Infinium BeadChip (27K/450K/850K). 5) Sufficient Samples: Minimum 10 samples per dataset, 5 AAC samples per dataset, 10 AAC samples total.
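The per-dataset guidelines above can be sketched as a simple eligibility filter. This is a minimal illustration, not the benchmark's own code: the `DatasetInfo` record, its field names, and the `passes_dataset_checks` helper are hypothetical, and the "10 AAC samples total" rule is left out because it spans more than one dataset.

```python
from dataclasses import dataclass

# Labels taken from the guidelines above.
ALLOWED_TISSUES = {"blood", "saliva", "buccal"}
ALLOWED_PLATFORMS = {"27K", "450K", "850K"}


@dataclass
class DatasetInfo:
    """Hypothetical summary record for one candidate dataset."""
    name: str
    tissue: str
    platform: str
    open_access: bool
    has_annotated_ages: bool
    n_samples: int
    n_aac_samples: int


def passes_dataset_checks(ds: DatasetInfo) -> bool:
    """Apply the per-dataset guidelines (thresholds as stated in the text)."""
    return (
        ds.open_access
        and ds.has_annotated_ages
        and ds.tissue in ALLOWED_TISSUES
        and ds.platform in ALLOWED_PLATFORMS
        and ds.n_samples >= 10      # minimum samples per dataset
        and ds.n_aac_samples >= 5   # minimum AAC samples per dataset
    )
```

A dataset with 40 blood samples on the 450K array, 12 of them from an AAC cohort, would pass; the same dataset with only 8 samples would not.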
Comprehensive Clock Evaluation: Four-Task Framework
ComputAgeBench evaluates aging clocks on four tasks, where aging acceleration A is the clock's predicted age minus chronological age: 1) Relative Aging Acceleration Prediction (AA2): Distinguishes AAC samples from healthy controls using a two-sample Welch's t-test. 2) Absolute Aging Acceleration Prediction (AA1): Tests for positive aging acceleration in AAC cohorts using a one-sample Student's t-test. 3) Chronological Age Prediction Accuracy: Measures median absolute error (Med(|A|)) on healthy controls. 4) Systematic Chronological Age Prediction Bias: Evaluates median aging acceleration (Med(A)) on healthy controls to detect covariate shifts.
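As a rough illustration of tasks 3 and 4, the two healthy-control metrics can be computed from predicted and chronological ages as follows. This is a sketch under the assumption that aging acceleration is predicted age minus chronological age; the function names and the toy ages are made up for the example.

```python
from statistics import median


def aging_acceleration(predicted_ages, chronological_ages):
    """Per-sample aging acceleration: predicted minus chronological age."""
    return [p - c for p, c in zip(predicted_ages, chronological_ages)]


def control_metrics(predicted_ages, chronological_ages):
    """Return Med(|A|) (accuracy) and Med(A) (systematic bias) for controls."""
    acc = aging_acceleration(predicted_ages, chronological_ages)
    return {
        "med_abs_error": median([abs(a) for a in acc]),
        "med_bias": median(acc),
    }


# Toy healthy-control cohort (illustrative numbers only).
predicted = [52, 40, 67, 30]
chronological = [50, 45, 60, 31]
metrics = control_metrics(predicted, chronological)
```

A Med(A) far from zero means the clock systematically over- or under-predicts age in healthy controls, which matters when interpreting tests that have no control group.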
The second-generation clock PhenoAgeV2 demonstrates the highest cumulative benchmarking score, excelling in distinguishing individuals with aging-accelerating conditions from healthy cohorts. It scored 20 in AA2 and 9 in AA1, with a Med(|A|) of 7.6 years and a small bias of -2.6 years.
AA2 Task: Distinguishing AACs
The AA2 task rigorously tests a clock's ability to differentiate between healthy and AAC samples. PhenoAgeV2 leads with 20/42 successful detections, followed by GrimAgeV1/V2 (14/42). Notably, all clocks struggled with cardiovascular and metabolic diseases, suggesting an implicit training bias towards immune system-related conditions, especially HIV.
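For readers unfamiliar with the test behind AA2, here is a minimal sketch of the Welch t statistic and its Welch-Satterthwaite degrees of freedom, computed on aging acceleration values from an AAC cohort and from controls. The function name and the toy values are illustrative assumptions. Turning the statistic into a p-value needs the t-distribution CDF, for example via `scipy.stats.ttest_ind` with `equal_var=False`; the alternative hypothesis and significance threshold used by the benchmark are not stated here.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(aac_acc, control_acc):
    """Welch's t statistic and degrees of freedom (unequal variances)."""
    n1, n2 = len(aac_acc), len(control_acc)
    # Squared standard errors of each group mean (sample variance / n).
    se1 = variance(aac_acc) / n1
    se2 = variance(control_acc) / n2
    t = (mean(aac_acc) - mean(control_acc)) / sqrt(se1 + se2)
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df


# Toy aging acceleration values in years (illustrative only).
aac = [4.0, 6.0, 5.0, 7.0, 3.0]
controls = [0.0, 1.0, -1.0, 2.0, -2.0]
t_stat, dof = welch_t(aac, controls)
```

Because both groups are scored by the same clock, a constant offset in its predictions cancels in the difference of means, which is one reason a test against controls is the stricter of the two checks.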
AA1 Task: Predicting AACs Without Controls
The AA1 task assesses a clock's ability to predict positive aging acceleration in AAC cohorts without a control group. GrimAgeV2 and Zhang19_EN perform well (20/24 and 19/24, respectively), but often exhibit significant prediction bias (Med(A)), so some of their detections may reflect a systematic offset instead of a real effect. HorvathV1/V2 and VidalBralo show lower bias, making their AA1 performance more reliable.
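The interaction between AA1 and prediction bias can be seen in a small sketch of the one-sample t statistic. The helper name and the toy values are assumptions for illustration: adding a constant offset to every prediction raises the statistic even though the cohort itself has not changed.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(acc, mu0=0.0):
    """One-sample t statistic for mean aging acceleration against mu0."""
    n = len(acc)
    return (mean(acc) - mu0) / (stdev(acc) / sqrt(n))


# Toy cohort with no real acceleration (mean of exactly zero).
unbiased = [-2.0, 1.0, 0.0, 2.0, -1.0]
# Same cohort scored by a hypothetical clock that over-predicts by 5 years.
biased = [a + 5.0 for a in unbiased]

t_unbiased = one_sample_t(unbiased)
t_biased = one_sample_t(biased)
```

Here `t_unbiased` is zero while `t_biased` is large and positive, so a clock with a positive Med(A) on healthy controls can pass AA1 on offset alone. This is why AA1 counts are best read alongside the bias metric.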
Addressing the Biomarker Paradox & Validation Challenges
Biological age, as a latent variable, lacks a direct ground truth for training and validation. The "biomarker paradox" highlights that clocks highly correlated with chronological age often fail to predict mortality. ComputAgeBench tackles this by testing whether clocks can distinguish AACs, a more biologically relevant validation strategy when direct mortality or disease onset data are unavailable. The framework thus moves evaluation beyond simple chronological age prediction.
Catalyzing Future Aging Research
This benchmark is designed to bring aging biology and machine learning closer, facilitating reliable biomarker discovery. Future work could include: 1) Expanding to more diverse biological samples and conditions. 2) Integrating multi-omics data for enhanced prediction. 3) Developing new clock models explicitly trained to minimize bias across diverse AACs. 4) Continuous collaboration between communities to refine benchmarking metrics.