ENTERPRISE AI ANALYSIS

Image-based Prompt Injection: Hijacking Multimodal LLMs through Visually Embedded Adversarial Instructions

This analysis delves into a novel black-box attack, Image-based Prompt Injection (IPI), which exploits multimodal large language models (MLLMs) by embedding adversarial instructions into natural images. We uncover the vulnerabilities, explore the trade-offs between attack success and stealth, and outline critical implications for enterprise AI security.

Schedule Your Strategy Session

Neha Nagaraja, School of Informatics, Computing, and Cyber Systems, Northern Arizona University

Zhilong Wang, Bytedance

Bo Zhang, Bytedance

Lan Zhang, School of Informatics, Computing, and Cyber Systems, Northern Arizona University

Pawan Patil, Bytedance

Abstract

Multimodal Large Language Models (MLLMs) integrate vision and text to power applications, but this integration introduces new vulnerabilities. We study Image-based Prompt Injection (IPI), a black-box attack in which adversarial instructions are embedded into natural images to override model behavior. Our end-to-end IPI pipeline incorporates segmentation-based region selection, adaptive font scaling, and background-aware rendering to conceal prompts from human perception while preserving model interpretability. Using the COCO dataset and GPT-4-turbo, we evaluate 12 adversarial prompt strategies and multiple embedding configurations. The results show that IPI can reliably manipulate the output of the model, with the most effective configuration achieving up to 64% attack success under stealth constraints. These findings highlight IPI as a practical threat in black-box settings and underscore the need for defenses against multimodal prompt injection.

Executive Impact: Key Findings

Our analysis highlights critical vulnerabilities and strategic insights for enterprise AI, focusing on the practical implications of image-based prompt injection.

0% Max Stealthy ASR Achieved

0+ Prompt Strategies Evaluated

0% Max ASR for Top Prompts

Deep Analysis & Enterprise Applications

Select a topic to dive deeper, then explore the specific findings from the research, rebuilt as interactive, enterprise-focused modules.

The IPI Threat & Modality Vulnerabilities

IPI Attack Methodology

Empirical Findings & Trade-offs

Strategic Implications & Defenses

Image-based Prompt Injection (IPI) represents a critical and novel security vulnerability in Multimodal Large Language Models (MLLMs). Unlike traditional text-based prompt injection, IPI embeds adversarial instructions directly into images, exploiting the visual channel of MLLMs. This black-box attack poses unique challenges due to its invisibility requirement and modality-specific perception.

64% Maximum Attack Success Rate under stealth constraints (Global Region-Averaged Coloring + Object-Aware Prefix)

The Evolving Landscape of Prompt Injection

Multimodal Large Language Models (MLLMs) integrate vision and text to power applications, but this integration introduces new vulnerabilities. We study Image-based Prompt Injection (IPI), a black-box attack in which adversarial instructions are embedded into natural images to override model behavior. Our end-to-end IPI pipeline incorporates segmentation-based region selection, adaptive font scaling, and background-aware rendering to conceal prompts from human perception while preserving model interpretability. These findings highlight IPI as a practical threat in black-box settings and underscore the need for defenses against multimodal prompt injection.

The research landscape has increasingly moved toward Multimodal Large Language Models (MLLMs), which extend beyond text to handle inputs such as images, audio, and video. Among these, vision has gained particular traction powering applications in image captioning, accessibility tools, autonomous perception, and agentic workflows. By 2025, visual modalities stand as the second most widely studied and deployed component across both academia and industry.

In contrast with textual prompt injection, image-based prompt injection exhibits distinctive characteristics: 1) Invisibility Requirement: IPI must embed adversarial instructions in a way that remains hidden from human detection yet interpretable by the model. 2) Modality-Specific Perception: MLLMs interpret embedded instructions through the visual channel, which is fundamentally different from how standard language models process purely textual prompts.

Our novel Image-based Prompt Injection (IPI) method employs a systematic, end-to-end pipeline to embed adversarial instructions into natural images. This ensures the embedded cues are interpreted by MLLMs as executable prompts while remaining minimally perceptible to human observers.

Enterprise Process Flow: IPI Pipeline

Adversarial Prompt Engineering

→

SAM-based Region Selection & Ranking

→

Background-Aware Prompt Embedding

→

Model Query & Attack Evaluation

Extensive experiments reveal the critical trade-offs between prompt visibility and attack effectiveness. We evaluated various parameters, including prompt wording, font size, spatial placement, and color strategies, to identify optimal configurations for stealthy and successful prompt injections.

Strategy	Description	Stealth	Attack Success Rate (ASR)	Notes
Background-Averaged Patch Coloring	Each character uses the average RGB of its local background patch with a brightness offset.	Moderate (local blending)	Low (peaking at 25%)	Local visual coherence. Limited contrast, inconsistent success.
Pixel-Level Blending	Each text pixel is individually blended with its corresponding background pixel using a small brightness offset.	High (seamless integration)	Very Low (max 10%)	Excessive blending obscures clarity. Model frequently fails to detect prompt.
Global Region-Averaged Coloring	All characters use a single uniform color from the average RGB of the entire injection region with a fixed brightness offset.	Moderate (natural blend in uniform regions)	High (up to 64%)	Best balance of stealth & interpretability. Enhanced with object-aware prefixing.

≥ 0.3 Minimum Font Size Scale for Reliable Injection (Below 0.20, success negligible)

The demonstrated feasibility of Image-based Prompt Injection has significant implications for the design and security of MLLM-driven enterprise systems. Understanding these vulnerabilities is crucial for developing robust mitigation strategies and ensuring responsible AI deployment.

Implications for Enterprise AI & Mitigation Strategies

Our attack is designed to be transferable across multimodal LLMs that combine vision and language inputs. Because it embeds instructions within visual elements rather than relying on model-specific parameters, the same principle can apply across different architectures, datasets, and real-world imagery. We therefore believe the technique is broadly generalizable to other models that interpret text within images, though its effectiveness may vary depending on each model's safety filters and input pre-processing pipelines.

While our focus was on demonstrating attack feasibility, the results also highlight a clear trade-off between visibility and stealth. Making the overlaid text more visually blended, for example, through background or pixel-level averaging, reduces perceptibility to human observers but can also decrease the model's ability to read and follow the embedded instructions. Conversely, using more visible text improves injection reliability but makes the manipulation easier to detect through human inspection. This tension defines a practical frontier for image-based prompt injection: attackers must trade human imperceptibility for reliability, and defenders can exploit that trade-off with modest sanitization or detection measures.

To mitigate image-based prompt injection, several defensive directions can be explored. Reinforcement learning and alignment tuning can help models learn to ignore visually embedded instructions by reinforcing safe response behavior. At inference time, system-level guardrails such as OCR-based detection, input sanitization, and moderation layers can screen images for hidden text or instruction patterns before they influence generation. A practical mitigation strategy is to replace raw visual inputs with sanitized, query-aware image descriptions, enabling the model to reason over safe textual summaries rather than potentially adversarial image content.

Calculate Your Potential AI ROI

Estimate the efficiency gains and cost savings your enterprise could realize by implementing advanced AI solutions, informed by the latest research.

Your Industry

Number of Employees Impacted by Manual Processes

Average Weekly Hours Spent on Repetitive Tasks per Employee

Average Hourly Cost per Employee ($)

Estimated Annual Savings $0

Annual Hours Reclaimed 0

Your AI Implementation Roadmap

A structured approach to integrating AI, from initial strategy to full-scale deployment and continuous optimization.

Phase 1: Discovery & Strategy

Comprehensive assessment of your current infrastructure, identification of key business challenges, and development of a tailored AI strategy to align with enterprise goals.

Phase 2: Pilot & Proof of Concept

Rapid prototyping and deployment of a focused AI pilot project to validate feasibility, measure initial impact, and refine the solution based on real-world feedback.

Phase 3: Scaled Deployment

Phased rollout of the AI solution across relevant departments, ensuring seamless integration with existing systems and robust performance monitoring.

Phase 4: Optimization & Future-Proofing

Continuous monitoring, performance tuning, and iterative enhancement of AI models. Strategic planning for future AI advancements and scaling opportunities.

Ready to Secure Your Enterprise AI?

Understand the unique vulnerabilities and strategic advantages AI brings. Schedule a consultation with our experts to discuss how to safeguard your systems and leverage cutting-edge research for innovation.

Book Your Expert Consultation

ENTERPRISE AI ANALYSIS

Image-based Prompt Injection: Hijacking Multimodal LLMs through Visually Embedded Adversarial Instructions

Abstract

Executive Impact: Key Findings

Deep Analysis & Enterprise Applications

The Evolving Landscape of Prompt Injection

Enterprise Process Flow: IPI Pipeline

Implications for Enterprise AI & Mitigation Strategies

Calculate Your Potential AI ROI

Your AI Implementation Roadmap

Phase 1: Discovery & Strategy

Phase 2: Pilot & Proof of Concept

Phase 3: Scaled Deployment

Phase 4: Optimization & Future-Proofing

Ready to Secure Your Enterprise AI?

Ready to Get Started?

Book Your Free Consultation.

Let's Discuss Your AI Strategy!

Lets Discuss Your Needs

Select Time Zone

Big Competitive Advantage With Ai

Learn More

Our Demos

Research Center

Jobs

Contact Us

1 888 985 3025

Solutions@OwnYourAi.com

Get Your Ai