AI Ethics & Safety
WHEN AGENTS PERSUADE: PROPAGANDA GENERATION AND MITIGATION IN LLMS
This research investigates the capability of Large Language Models (LLMs) to generate propagandistic content and evaluates methods for mitigating that behavior. Using domain-specific detection models, the study finds that LLMs readily produce propaganda employing rhetorical techniques such as loaded language, flag-waving, and appeals to fear. Notably, preference-based fine-tuning substantially reduces the models' propensity to generate manipulative content, with Odds Ratio Preference Optimization (ORPO) proving the most effective method evaluated.
Deep Analysis & Enterprise Applications
Propaganda Techniques: Human vs. LLM Usage
| Technique | Prevalence in Human Propaganda | Prevalence in LLM Output |
|---|---|---|
| Loaded Language | Moderate | High (emotional rhetoric) |
| Flag-Waving | Moderate | High (patriotic narratives) |
| Appeal to Fear | Moderate | High (fear-based manipulation) |
| Name-Calling | Moderate | Varied (GPT-4o near human levels; Llama/Mistral lower) |
| Exaggeration/Minimization | Moderate | High (hyperbolic content) |
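The comparison above relies on domain-specific detection models. As a rough illustration only, the sketch below shows how such screening might look with the Hugging Face `transformers` pipeline; the checkpoint name, labels, and threshold are placeholders, not the detector used in the research.

```python
# Minimal sketch of propaganda-technique detection with a multi-label
# classifier. The checkpoint name is a placeholder, not the domain-specific
# detector used in the research; label names are illustrative.
from transformers import pipeline

detector = pipeline(
    "text-classification",
    model="your-org/propaganda-technique-classifier",  # hypothetical checkpoint
    top_k=None,  # return a score for every technique label
)

article = "Only a true patriot would back this plan; anything less invites disaster."
scores = detector([article])[0]  # list of {"label", "score"} dicts

# Flag every technique whose score clears a chosen threshold.
flagged = [s["label"] for s in scores if s["score"] > 0.5]
print(flagged)  # e.g. ["flag_waving", "appeal_to_fear"]
```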
Mitigation Efficacy: ORPO's Impact
Our evaluation highlights ORPO as the most effective fine-tuning method. Compared with un-fine-tuned models, ORPO reduced the average number of propaganda techniques per article by a factor of 13.4. While prompt-level guardrails were easily overridden, baking 'no propaganda' into the model weights through ORPO proved highly robust: only 10% of outputs contained propaganda, versus 99% for untuned models.
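For context on why weight-level mitigation holds up, ORPO (Hong et al., 2024) folds the preference signal directly into training: it adds an odds-ratio penalty to the standard supervised fine-tuning (SFT) loss, so the model learns to favor the non-propagandistic completion $y_w$ over the propagandistic one $y_l$ as part of language modeling itself:

$$
\mathcal{L}_{\text{ORPO}} = \mathbb{E}_{(x,\,y_w,\,y_l)}\big[\mathcal{L}_{\text{SFT}} + \lambda\,\mathcal{L}_{\text{OR}}\big],
\qquad
\mathcal{L}_{\text{OR}} = -\log\sigma\!\left(\log\frac{\operatorname{odds}_\theta(y_w\mid x)}{\operatorname{odds}_\theta(y_l\mid x)}\right)
$$

where $\operatorname{odds}_\theta(y\mid x) = P_\theta(y\mid x)\,/\,(1 - P_\theta(y\mid x))$ and $\lambda$ weights the penalty against the SFT term.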
Your AI Implementation Roadmap
A strategic, phased approach to integrating AI responsibly, minimizing the risk of manipulative content while maximizing operational benefits.
Phase 1: Initial Assessment & Model Selection
Evaluate current LLM usage and identify areas susceptible to manipulative content. Select appropriate base models for fine-tuning.
Phase 2: Data Curation & Annotation
Gather and annotate domain-specific datasets for propaganda and rhetorical technique detection, and for preference alignment.
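As a minimal sketch, a preference-alignment record might look like the following; the field names follow the common "prompt"/"chosen"/"rejected" convention used by libraries such as TRL and are an assumption, not the study's actual annotation schema.

```python
# One hypothetical preference record for alignment training. Field names
# follow the common "prompt"/"chosen"/"rejected" convention; the study's
# actual schema may differ.
preference_record = {
    "prompt": "Write a news article about the new energy policy.",
    "chosen": (  # neutral, fact-based reporting
        "The government announced a new energy policy on Tuesday, citing "
        "projected cost savings and emissions targets."
    ),
    "rejected": (  # loaded language, flag-waving, appeal to fear
        "Patriots everywhere are cheering this heroic policy, while its "
        "treacherous critics would plunge the nation into darkness."
    ),
}
```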
Phase 3: Fine-Tuning & Mitigation Strategy
Apply supervised fine-tuning (SFT), Direct Preference Optimization (DPO), or ORPO to bake mitigation into the model weights, and complement the tuned model with prompt-level guardrails (see the sketch below).
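A minimal ORPO fine-tuning sketch with the TRL library is shown below, assuming preference pairs in the format from Phase 2. The base model, hyperparameters, and data are placeholders, not the study's setup, and the `tokenizer` keyword reflects TRL circa 0.9 (newer releases rename it `processing_class`).

```python
# Minimal ORPO fine-tuning sketch using TRL. Model, hyperparameters, and
# data are placeholders, not the study's configuration.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_name = "mistralai/Mistral-7B-v0.1"  # example base model
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Mistral ships without a pad token

# Preference pairs in the format sketched in Phase 2 (one shown inline).
train_dataset = Dataset.from_list([{
    "prompt": "Write a news article about the new energy policy.",
    "chosen": "The government announced a new energy policy on Tuesday ...",
    "rejected": "Patriots everywhere are cheering this heroic policy ...",
}])

config = ORPOConfig(
    output_dir="orpo-no-propaganda",
    beta=0.1,  # weight on the odds-ratio penalty (lambda in the ORPO paper)
    per_device_train_batch_size=1,
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    tokenizer=tokenizer,  # `processing_class` in newer TRL releases
)
trainer.train()
```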
Phase 4: Validation & Deployment
Rigorously test mitigated models using human and automated evaluations. Deploy models with continuous monitoring.
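The automated half of that validation can be as simple as the loop sketched below, which reports the two metrics the research tracks: the share of generated articles flagged as propagandistic and the average number of techniques per article. The `detect_techniques` callable is a hypothetical stand-in for a real detector such as the one sketched earlier.

```python
# Sketch of the automated validation metrics: share of generated articles
# flagged as propagandistic and mean techniques per article. The
# detect_techniques callable is a hypothetical stand-in for a real detector.
from typing import Callable, List

def evaluate_mitigation(
    articles: List[str],
    detect_techniques: Callable[[str], List[str]],
) -> dict:
    """Return the propaganda rate and average techniques per article."""
    per_article = [detect_techniques(a) for a in articles]
    flagged = sum(1 for techniques in per_article if techniques)
    return {
        "propaganda_rate": flagged / len(articles),  # e.g. 0.10 post-ORPO vs 0.99 untuned
        "avg_techniques_per_article": sum(map(len, per_article)) / len(articles),
    }
```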