VILLAIN at AVerImaTeC: Verifying Image–Text Claims via Multi-Agent Collaboration
VILLAIN is a multimodal fact-checking system that uses prompt-based multi-agent collaboration with vision-language models (VLMs) to verify image-text claims. It achieved top performance in the AVerImaTeC shared task.
Unlocking Advanced Multimodal Fact-Checking
VILLAIN demonstrates how multi-agent collaboration with VLMs can achieve superior performance in verifying complex image-text claims, setting a new benchmark for automated fact-checking systems.
Deep Analysis & Enterprise Applications
AVerImaTeC Shared Task Focus
Real-world Image-Text Claims: Automated verification using external evidence
Case Study: The Need for Multimodal Fact-Checking
Challenge: The proliferation of misleading multimodal content necessitates robust automated verification systems capable of handling both images and text.
Solution: VILLAIN addresses this by employing a multi-agent system with VLM capabilities to process and verify image-text claims.
Impact: Achieving state-of-the-art performance in real-world scenarios, VILLAIN sets a new standard for accuracy and reliability in multimodal fact-checking.
Enterprise Process Flow
| Feature | VILLAIN | Baseline |
|---|---|---|
| Text Embedding Model | | |
| Text Reranking Model | | |
| Visual Embedding Model | | |
| Evidence Enrichment | | |
Impact of Evidence Analysis Agents
+0.040: Evid-Eval score improvement for Gemini-2.5-Flash with agents
Number of QA Pairs Generated
20: Total high-impact Question-Answer pairs
Veracity Score on Test Set
0.546: VILLAIN's top performance across all metrics
| Feature | VILLAIN (HUMANE) | ADA-AGGR |
|---|---|---|
| Veracity Score | | |
| Q-Eval Score | | |
| Evid-Eval Score | | |
| Justification Score | | |
Case Study: Ablation on Knowledge Store Filling
Challenge: Many original evidence entries had empty URL2text fields or generic content, limiting the factual information available for verification.
Solution: Implemented an automated URL content extraction pipeline using Playwright to fill missing text fields and filter irrelevant content.
Impact: Consistently improved Evid-Eval scores across all models (e.g., +0.03 for Gemini-2.5-Flash), demonstrating the value of richer, cleaner evidence.
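The knowledge store filling step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the entry layout (a dict with `url` and a string `url2text`), the `GENERIC_MARKERS` list, and the `MIN_CHARS` threshold are all hypothetical, since the source does not describe its filter rules. The fetcher is passed in as a function so the filling logic can be tried without a browser; `fetch_with_playwright` shows one way it might be backed by Playwright's synchronous API.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical markers of generic pages; the source does not list its filter rules.
GENERIC_MARKERS = ("enable javascript", "access denied", "page not found")
MIN_CHARS = 40  # assumed threshold below which text is treated as unusable


def is_generic(text: str) -> bool:
    """Return True when text is empty, very short, or looks like an error page."""
    cleaned = " ".join(text.split())
    if len(cleaned) < MIN_CHARS:
        return True
    lowered = cleaned.lower()
    return any(marker in lowered for marker in GENERIC_MARKERS)


def fetch_with_playwright(url: str, timeout_ms: int = 15000) -> str:
    """Fetch the visible body text of a page (requires the playwright package)."""
    from playwright.sync_api import sync_playwright  # imported lazily on purpose

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            page.goto(url, timeout=timeout_ms, wait_until="domcontentloaded")
            return page.inner_text("body")
        finally:
            browser.close()


def fill_knowledge_store(
    entries: Iterable[Dict[str, str]],
    fetch_text: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Fill empty or generic `url2text` fields and drop entries that stay generic."""
    filled = []
    for entry in entries:
        text = entry.get("url2text") or ""
        if is_generic(text):
            try:
                text = fetch_text(entry["url"])
            except Exception:
                text = ""  # a failed fetch leaves the entry without usable text
        if not is_generic(text):
            filled.append({**entry, "url2text": " ".join(text.split())})
    return filled
```

Calling `fill_knowledge_store(entries, fetch_with_playwright)` would run the sketch against live pages; entries whose text is already usable are kept without a fetch.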
Gemini-2.5-Pro Performance
Superior: Outperforms other models across most evaluation metrics
Case Study: Effectiveness of Multi-Agent Collaboration
Challenge: Verifying complex multimodal claims requires robust evidence processing, analysis, and reasoning across different modalities.
Solution: VILLAIN's multi-agent pipeline leverages VLM agents for modality-specific and cross-modal analysis, iterative QA generation, and final verdict prediction with justification.
Impact: Experimental results confirm VILLAIN's consistent outperformance, highlighting the effectiveness of multi-agent collaboration and iterative reasoning for multimodal fact-checking.
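The stages named in this case study (modality-specific analysis, cross-modal analysis, iterative QA generation, and a final verdict with justification) can be sketched as a simple chain of prompted calls. This is an illustrative outline only: the role names, prompt wording, the `max_qa` parameter, and the `LABEL | justification` reply format are assumptions, and the VLM is modelled as a plain function from a role and a prompt to text rather than any particular model API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

# A VLM call is modelled as a function (role, prompt) -> text (an assumption).
VLM = Callable[[str, str], str]


@dataclass
class Verdict:
    label: str
    justification: str
    qa_pairs: List[Tuple[str, str]] = field(default_factory=list)


def verify_claim(
    claim_text: str,
    image_ref: str,
    evidence: Sequence[str],
    vlm: VLM,
    max_qa: int = 3,
) -> Verdict:
    """Run a hypothetical analyse -> question/answer -> judge chain for one claim."""
    # Modality-specific analysis: one agent per modality.
    text_notes = vlm("text_analyst", f"Claim: {claim_text}\nEvidence: {list(evidence)}")
    image_notes = vlm("image_analyst", f"Claim image: {image_ref}\nEvidence: {list(evidence)}")

    # Cross-modal analysis combines the two sets of notes.
    cross_notes = vlm(
        "cross_modal_analyst",
        f"Text notes: {text_notes}\nImage notes: {image_notes}",
    )

    # Iterative QA generation: each new question sees the pairs produced so far.
    qa_pairs: List[Tuple[str, str]] = []
    for _ in range(max_qa):
        history = "\n".join(f"Q: {q} A: {a}" for q, a in qa_pairs)
        question = vlm(
            "questioner",
            f"Claim: {claim_text}\nNotes: {cross_notes}\nAsked so far:\n{history}",
        )
        answer = vlm("answerer", f"Question: {question}\nNotes: {cross_notes}")
        qa_pairs.append((question, answer))

    # Final verdict with justification, parsed from an assumed 'LABEL | text' reply.
    summary = "\n".join(f"Q: {q} A: {a}" for q, a in qa_pairs)
    raw = vlm(
        "judge",
        f"Claim: {claim_text}\nQA:\n{summary}\nReply as 'LABEL | justification'.",
    )
    label, _, justification = raw.partition("|")
    return Verdict(label.strip(), justification.strip(), qa_pairs)
```

Passing the model in as a function keeps the control flow separate from any specific VLM client, so the same chain could be run with different backends or with a stub during testing.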
Your AI Implementation Roadmap
A phased approach to integrate AI seamlessly into your enterprise operations, ensuring maximum impact with minimal disruption.
Phase 1: Discovery & Strategy
Comprehensive analysis of existing workflows, identification of AI opportunities, and development of a tailored AI strategy.
Phase 2: Pilot & Prototyping
Deployment of AI prototypes in a controlled environment, performance evaluation, and iterative refinement based on feedback.
Phase 3: Integration & Scaling
Seamless integration of AI solutions into core enterprise systems and phased rollout across relevant departments.
Phase 4: Optimization & Monitoring
Continuous monitoring of AI system performance, ongoing optimization, and adaptation to evolving business needs.