Enterprise AI Analysis: VILLAIN at AVerImaTeC: Verifying Image–Text Claims via Multi-Agent Collaboration

VILLAIN is a multimodal fact-checking system that uses prompt-based multi-agent collaboration among vision-language models (VLMs) to verify image-text claims, achieving top performance in the AVerImaTeC shared task.

Unlocking Advanced Multimodal Fact-Checking

VILLAIN demonstrates how multi-agent collaboration with VLMs can verify complex image-text claims, ranking first on the AVerImaTeC shared-task leaderboard and providing a strong reference point for automated multimodal fact-checking.

0.546 Veracity Score (Test Set)
1st Leaderboard Rank
0.890 Q-Eval Score
0.536 Evid-Eval Score

Deep Analysis & Enterprise Applications

Select a topic to dive deeper, then explore the specific findings from the research, rebuilt as interactive, enterprise-focused modules.

AVerImaTeC Shared Task Focus

Real-world image-text claims
Automated verification using external evidence

Case Study: The Need for Multimodal Fact-Checking

Challenge: The proliferation of misleading multimodal content necessitates robust automated verification systems capable of handling both images and text.

Solution: VILLAIN addresses this by employing a multi-agent system with VLM capabilities to process and verify image-text claims.

Impact: Achieving state-of-the-art performance in real-world scenarios, VILLAIN sets a new standard for accuracy and reliability in multimodal fact-checking.

Enterprise Process Flow: Evidence Retrieval & Enrichment

Knowledge Store K_txt
Knowledge Store K_img
Knowledge Store K_1
Textual/Visual Evidence Retrieval
Evidence Enrichment (URL content filling)

Evidence Retrieval Mechanism Comparison

Feature | VILLAIN | Baseline
Text Embedding Model | mxbai-embed-large-v1 | Generic embeddings
Text Reranking Model | mxbai-rerank-large-v1 | No reranking
Visual Embedding Model | Ops-MM-embedding-v1-7B | No dedicated VLM
Evidence Enrichment | URL content filling via Playwright | No enrichment
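For engineering teams, the retrieval stage described above maps onto a standard embed-then-rerank pattern. The following is a minimal sketch, assuming the mxbai text models are loaded through the sentence-transformers library; the top-k values and the function name are illustrative choices, not details taken from the VILLAIN paper.

```python
# Minimal sketch of the embed-then-rerank text retrieval step, assuming the
# mxbai models are loaded via the sentence-transformers library. Top-k values
# and the function name are illustrative, not taken from the VILLAIN paper.
from sentence_transformers import SentenceTransformer, CrossEncoder, util

embedder = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")
reranker = CrossEncoder("mixedbread-ai/mxbai-rerank-large-v1")

def retrieve_text_evidence(claim: str, knowledge_store: list[str],
                           k_embed: int = 50, k_final: int = 10) -> list[str]:
    """Dense retrieval over a claim's knowledge store, then cross-encoder reranking."""
    # 1) Embed the claim and candidate texts; keep the top-k by cosine similarity.
    claim_emb = embedder.encode([claim], convert_to_tensor=True)
    doc_embs = embedder.encode(knowledge_store, convert_to_tensor=True)
    hits = util.semantic_search(claim_emb, doc_embs, top_k=k_embed)[0]
    candidates = [knowledge_store[h["corpus_id"]] for h in hits]

    # 2) Rerank the shortlisted candidates and keep the strongest evidence.
    scores = reranker.predict([(claim, doc) for doc in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)
    return [doc for doc, _ in ranked[:k_final]]
```

Image evidence would follow the same retrieve-and-rank structure, with Ops-MM-embedding-v1-7B supplying the visual embeddings.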

Enterprise Process Flow: Evidence Analysis Agents

Claim
Retrieved Evidence (Textual, Image, Cross-modal)
Text-Text Agent (A_TT)
Image-Text Agent (A_IT)
Cross-Modal Agent (A_CM)
Analysis Outputs (O_TT, O_IT, O_CM)

Impact of Evidence Analysis Agents

+0.040 Evid-Eval score improvement for Gemini-2.5-Flash with agents
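As a concrete illustration of the agent layer, the sketch below treats each analysis agent as a modality-specific prompt sent to the same underlying VLM. The generic `vlm` callable, the `Evidence` dataclass, and the prompt wording are assumptions made for illustration; VILLAIN's actual prompts and model wiring are not reproduced on this page.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# `vlm` is any callable that accepts a text prompt plus a list of image paths/URLs
# and returns the model's text reply (e.g. a thin wrapper around a hosted VLM).
VLM = Callable[[str, Sequence[str]], str]

@dataclass
class Evidence:
    texts: list[str]    # retrieved textual evidence
    images: list[str]   # retrieved image evidence (paths or URLs)

TT_PROMPT = ("Compare the claim against the textual evidence and summarize agreements "
             "and contradictions.\nClaim: {claim}\nEvidence:\n{texts}")
IT_PROMPT = ("Compare the claim text against the attached evidence images and describe "
             "what they show.\nClaim: {claim}")
CM_PROMPT = ("Jointly reason over the claim image, claim text, and all evidence; note any "
             "cross-modal inconsistencies.\nClaim: {claim}\nEvidence:\n{texts}")

def run_analysis_agents(vlm: VLM, claim: str, claim_image: str, ev: Evidence) -> dict[str, str]:
    """Run the three evidence-analysis agents and collect their outputs."""
    texts = "\n".join(f"- {t}" for t in ev.texts)
    o_tt = vlm(TT_PROMPT.format(claim=claim, texts=texts), [])                           # A_TT -> O_TT
    o_it = vlm(IT_PROMPT.format(claim=claim), ev.images)                                 # A_IT -> O_IT
    o_cm = vlm(CM_PROMPT.format(claim=claim, texts=texts), [claim_image, *ev.images])    # A_CM -> O_CM
    return {"O_TT": o_tt, "O_IT": o_it, "O_CM": o_cm}
```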

Enterprise Process Flow: Iterative QA Generation

Claim, Analysis Outputs
QA Generation Agent A_QA (VLM)
Generate 5 QA pairs (iteration i)
Append to existing QA pairs
Loop (4 iterations for 20 total QA pairs)

Number of QA Pairs Generated

20 Question-answer pairs generated per claim (5 per iteration over 4 iterations)
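The iterative QA loop can be expressed in a few lines. The sketch below assumes the same generic `vlm` callable as in the agent sketch above; the prompt text and the line-based parsing are illustrative placeholders rather than the paper's exact implementation.

```python
from typing import Callable, Sequence

VLM = Callable[[str, Sequence[str]], str]

QA_PROMPT = (
    "You are a fact-checking question generator.\n"
    "Claim: {claim}\n"
    "Analyses:\n{analyses}\n"
    "Existing QA pairs:\n{existing}\n"
    "Generate 5 NEW question-answer pairs, grounded in the evidence, that help verify "
    "the claim. Avoid repeating existing questions."
)

def generate_qa_pairs(vlm: VLM, claim: str, analyses: dict[str, str],
                      iterations: int = 4, pairs_per_iter: int = 5) -> list[str]:
    """Iteratively generate QA pairs, conditioning each round on the pairs so far."""
    qa_pairs: list[str] = []
    analysis_text = "\n".join(f"{k}: {v}" for k, v in analyses.items())
    for _ in range(iterations):
        reply = vlm(QA_PROMPT.format(claim=claim, analyses=analysis_text,
                                     existing="\n".join(qa_pairs) or "(none)"), [])
        # Naive parsing: keep non-empty lines as QA pairs (real parsing would be stricter).
        new_pairs = [line.strip() for line in reply.splitlines() if line.strip()]
        qa_pairs.extend(new_pairs[:pairs_per_iter])
    return qa_pairs  # up to iterations * pairs_per_iter = 20 pairs
```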

Enterprise Process Flow: Verdict Prediction

Claim
Generated QA Pairs (Q)
Verdict Prediction Agent (A_V)
Select Top-10 Relevant QA Pairs (Q*)
Predict Verdict (v)
Generate Justification (j)
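A hedged sketch of this final stage follows: select the ten most relevant QA pairs, then ask the VLM for a verdict and a short justification. The label set, prompt wording, and output format are assumptions made for illustration, not the shared task's exact specification.

```python
from typing import Callable, Sequence

VLM = Callable[[str, Sequence[str]], str]

SELECT_PROMPT = (
    "Claim: {claim}\nQA pairs:\n{qa}\n"
    "Return the indices (comma-separated) of the 10 QA pairs most relevant to verifying the claim."
)
VERDICT_PROMPT = (
    "Claim: {claim}\nSelected QA pairs:\n{qa}\n"
    "Predict the verdict (e.g. Supported / Refuted / Not Enough Evidence / Conflicting) "
    "and write a short justification. Format: 'verdict | justification'."
)

def predict_verdict(vlm: VLM, claim: str, claim_image: str, qa_pairs: list[str]) -> tuple[str, str]:
    """Select the top-10 QA pairs, then predict verdict v and justification j."""
    numbered = "\n".join(f"{i}. {qa}" for i, qa in enumerate(qa_pairs))
    idx_reply = vlm(SELECT_PROMPT.format(claim=claim, qa=numbered), [claim_image])
    idx = [int(tok) for tok in idx_reply.replace(",", " ").split() if tok.isdigit()][:10]
    selected = "\n".join(qa_pairs[i] for i in idx if i < len(qa_pairs))
    reply = vlm(VERDICT_PROMPT.format(claim=claim, qa=selected), [claim_image])
    verdict, _, justification = reply.partition("|")
    return verdict.strip(), justification.strip()
```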

Veracity Score on Test Set

0.546 First-place veracity score; VILLAIN also leads on every other evaluated metric

Leaderboard Performance Comparison (Test Set)

Metric | VILLAIN (HUMANE) | ADA-AGGR
Veracity Score | 0.546 (1st) | 0.537
Q-Eval Score | 0.890 (1st) | 0.370
Evid-Eval Score | 0.536 (1st) | 0.463
Justification Score | 0.556 (1st) | 0.433

Case Study: Knowledge Store Filling (Ablation)

Challenge: Many original evidence entries had empty URL2text fields or generic content, limiting the factual information available for verification.

Solution: Implemented an automated URL content extraction pipeline using Playwright to fill missing text fields and filter irrelevant content.

Impact: Consistently improved Evid-Eval scores across all models (e.g., +0.03 for Gemini-2.5-Flash), demonstrating the value of richer, cleaner evidence.
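The enrichment step lends itself to a short Playwright sketch. The version below assumes Playwright's synchronous Python API and substitutes a crude word-count check for the pipeline's actual relevance filtering, which is not documented on this page; the dictionary key casing is likewise illustrative.

```python
from playwright.sync_api import sync_playwright

def fill_url2text(evidence_entries: list[dict], timeout_ms: int = 15000) -> list[dict]:
    """Populate empty 'url2text' fields by rendering the source page and extracting its text."""
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        for entry in evidence_entries:
            if entry.get("url2text"):          # already has usable text
                continue
            try:
                page.goto(entry["url"], timeout=timeout_ms)
                text = page.inner_text("body")
                # Crude relevance filter: drop near-empty or boilerplate-only pages.
                if len(text.split()) > 30:
                    entry["url2text"] = text
            except Exception:
                continue                       # unreachable or blocked pages stay empty
        browser.close()
    return evidence_entries
```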

Gemini-2.5-Pro Performance

Superior: outperforms other models across most evaluation metrics

Case Study: Effectiveness of Multi-Agent Collaboration

Challenge: Verifying complex multimodal claims requires robust evidence processing, analysis, and reasoning across different modalities.

Solution: VILLAIN's multi-agent pipeline leverages VLM agents for modality-specific and cross-modal analysis, iterative QA generation, and final verdict prediction with justification.

Impact: Experimental results confirm VILLAIN's consistent outperformance, highlighting the effectiveness of multi-agent collaboration and iterative reasoning for multimodal fact-checking.

Calculate Your Potential ROI

Estimate the potential cost savings and efficiency gains your organization could achieve with AI implementation.


Your AI Implementation Roadmap

A phased approach to integrate AI seamlessly into your enterprise operations, ensuring maximum impact with minimal disruption.

Phase 1: Discovery & Strategy

Comprehensive analysis of existing workflows, identification of AI opportunities, and development of a tailored AI strategy.

Phase 2: Pilot & Prototyping

Deployment of AI prototypes in a controlled environment, performance evaluation, and iterative refinement based on feedback.

Phase 3: Integration & Scaling

Seamless integration of AI solutions into core enterprise systems and phased rollout across relevant departments.

Phase 4: Optimization & Monitoring

Continuous monitoring of AI system performance, ongoing optimization, and adaptation to evolving business needs.

Ready to Transform Your Enterprise?

Partner with OwnYourAI to navigate the complexities of AI adoption and unlock unparalleled operational efficiency and innovation.

Ready to Get Started?

Book Your Free Consultation.
