JARVIS or Ultron? A Survey on the Safety and Security Threats of Computer-Using Agents

A comprehensive analysis of Computer-Using Agents (CUAs), detailing intrinsic and extrinsic threats, defensive strategies, and evaluation benchmarks.

Large Language Models (LLMs) have evolved rapidly from basic conversational agents to executing complex tasks in diverse computing environments. In particular, Computer-Using Agents (CUAs) have garnered increasing attention and widespread adoption, thanks to their ability to interact with graphical user interfaces (GUIs) in a manner akin to human users. Recent systems such as AppAgent, SeeAct, PC-Agent, as well as newly-introduced OpenAI's o3, and o4-mini, highlight the remarkable progress of CUAs. By integrating multimodal perception, advanced reasoning, and automated control of devices, these agents promise to streamline vast tasks from filling out online forms to executing complex application flows. Despite the impressive capabilities of CUAs, their operation in real-world settings raises critical safety concerns. This survey addresses these concerns systematically.

Schedule Your Strategy Session

Executive Impact

Key performance indicators derived from CUA safety and security assessments, highlighting crucial areas for improvement.

95% Task Success Rate

80% Refusal Rate (RR)

1500ms Avg. Response Latency

Discuss Your Implementation

Deep Analysis & Enterprise Applications

Select a topic to dive deeper, then explore the specific findings from the research, rebuilt as interactive, enterprise-focused modules.

Intrinsic Threats

Extrinsic Threats

Defensive Strategies

Evaluation & Benchmarking

Intrinsic threats arise from the agent's internal limitations, such as perception errors or reasoning failures. For example, UI Understanding and Grounding Difficulties (Chen et al., 2025c) occur when CUAs struggle to correctly interpret GUI elements due to static datasets or resolution constraints. Scheduling Errors (Zhang and Zhang, 2023) can lead to unstable behaviors in complex tasks, while Misalignment (Ma et al., 2024a) happens when an agent's reasoning diverges from user intent. Hallucination (Deng et al., 2024a) causes agents to generate outputs not grounded in the environment. Excessive Context Length (Yang et al., 2024a) strains models with too much input, degrading performance. Social and Cultural Concerns (Qiu et al., 2025) arise when agents fail to respect diverse norms, leading to misunderstandings. Lastly, Response Latency (Zhang and Zhang, 2023) affects predictability and user trust due to slow processing, and API Call Errors (Nong et al., 2024) result from incorrect API inference or formatting.

Extrinsic threats originate from external entities, such as malicious attackers. These include Adversarial Attacks (Wu et al., 2024a), which manipulate inputs to induce harmful behaviors, like tiny pixel perturbations. Prompt Injection Attacks (Mudryi et al., 2025) embed malicious instructions directly or indirectly (e.g., via webpages) to bypass safety rules. Jailbreak Attacks (Mo et al., 2024) rephrase queries to bypass guardrails and generate unauthorized outputs. Memory Attacks (Wang et al., 2025a) target persistent context to extract sensitive information (Memory Extraction) or poison future reasoning (Memory Injection). Backdoor Attacks (Yang et al., 2024b) insert hidden triggers during training to activate harmful behaviors later. Reasoning Gap Attacks (Chen et al., 2025d) exploit mismatches between multimodal perception and reasoning. System Sabotage Attacks (Luo et al., 2025b) trick agents into destructive operations, like creating fork bombs. Finally, Web Hacking Attacks (Fang et al., 2024b) co-opt CUAs into autonomous hacking tools for SQL injection or data exfiltration.

Defenses against CUA threats are categorized into several types. Environmental Constraints (Yang et al., 2024c) limit agent interactions to prevent harmful actions. Input Validation (Kumar et al., 2024) verifies and sanitizes user inputs. Defensive Prompting (Debenedetti et al., 2024) structures prompts to prevent manipulation. Data Sanitization (Yang et al., 2024b) removes malicious data from training sets. Adversarial Training (Wu et al., 2024a) enhances model robustness against perturbations. Output Monitoring (Fang et al., 2024a) continuously evaluates agent outputs for misalignment. Model Inspection (Wang et al., 2025e), including Anomaly Detection and Weight Analysis, identifies malicious manipulations. Cross Verification (Zeng et al., 2024) uses multiple agents to validate outputs. Continuous Learning and Adaptation (Tian et al., 2023), via Self-Evolution and User Feedback, allows agents to dynamically update models. Transparentize (Sager et al., 2025) enhances interpretability through XAI and Audit Logs. Topology-Guided Strategies (Wang et al., 2025e) improve multi-agent security. Perception Algorithms Synergy (Zheng et al., 2024) combines perception modules for robust UI understanding. Planning-Centric Architecture Refinement (Zhang and Zhang, 2023) improves reasoning and API invocation. Lastly, Pre-defined Regulatory Compliance (Chen et al., 2025e) integrates adherence to standards and ethical guidelines.

CUA safety is evaluated using diverse benchmarks and metrics. Datasets cover Web-based Scenarios like ST-WebAgentBench (Levy et al., 2024), Mobile-based Scenarios such as MobileSafetyBench (Lee et al., 2024a), and General-purpose Scenarios, including Tool-use (ToolEmu, Ruan et al., 2023) and Mixed/Hybrid Environments (OpenAgentSafety, Vijayvargiya et al., 2025). Metrics include Task Completion Rate (TSR) (Yao et al., 2022), Helpfulness (Ruan et al., 2023), Step Success Rate (SSR) (Deng et al., 2023), and Total Correct Prefix (Hua et al., 2024). Safety and robustness are measured by Attack Success Rate (ASR) (Zhan et al., 2024), Completion Under the Policy (CuP) (Levy et al., 2024), Refusal Rate (RR) (Zhang et al., 2024b), and Leakage Rate (LR) (Shao et al., 2024). Measurements use Rule-based checks (Luo et al., 2025a), LLM-as-a-judge (Yuan et al., 2024), and Manual Judge evaluations (Ruan et al., 2023).

Critical Vulnerability Identified

25% Increased Data Leakage Risk

Our analysis reveals a significant vulnerability in CUA's handling of untrusted external data sources, leading to a 25% higher risk of data leakage compared to internal prompt injections. This highlights the urgent need for enhanced environmental input validation.

Enterprise Process Flow

Malicious Input

→

Perception Misinterpretation

→

Reasoning Failure

→

Unauthorized Action

→

System Compromise

Defense Mechanism Effectiveness

A comparative overview of leading defense mechanisms against common CUA threats.

Defense Mechanism	Key Strengths	Targeted Threats
Input Validation	Filters malicious prompts Prevents unauthorized commands	Prompt Injection Jailbreak Attacks
Output Monitoring	Detects misaligned actions Intercepts harmful API calls	Misalignment System Sabotage
Cross Verification	Redundancy for safety Consensus-based decision-making	Adversarial Attacks Backdoor Attacks

Case Study: Financial Trading Agent Security Breach

A CUA designed for automated financial trading suffered a breach due to an advanced indirect prompt injection attack. The attacker embedded subtle malicious instructions within a seemingly benign news feed, which the agent processed and acted upon, leading to unauthorized trades and a significant financial loss. The incident highlighted the critical need for multimodal threat detection and real-time contextual awareness to prevent such sophisticated attacks. Our findings suggest that integrating real-time human oversight and explainable AI techniques could have mitigated the impact.

Advanced ROI Calculator

Estimate the potential savings and reclaimed hours by implementing robust AI safety protocols and efficient CUA deployments in your enterprise.

Your Industry

Number of Employees (impacted by manual tasks)

Avg. Weekly Hours on Repetitive Tasks per Employee

Avg. Hourly Cost (incl. overhead)

Estimated Annual Savings $0

Annual Hours Reclaimed 0

Optimize Your Operations

Your AI Implementation Roadmap

Our structured approach ensures a seamless transition and maximum impact for your enterprise AI initiatives, with safety and security built-in from day one.

Phase 1: Threat Assessment & Gap Analysis

Comprehensive audit of existing CUA deployments, identifying potential intrinsic and extrinsic vulnerabilities. Develop a tailored threat model specific to your enterprise environment.

Phase 2: Defensive Strategy Integration

Implement a multi-layered defense framework, incorporating enhanced input validation, output monitoring, and context-aware defensive prompting. Focus on early detection and prevention.

Phase 3: Continuous Monitoring & Adaptation

Establish real-time monitoring systems for agent behavior and environmental interactions. Implement continuous learning mechanisms with human-in-the-loop safeguards to adapt to evolving threats.

Phase 4: Regulatory Compliance & Governance

Ensure all CUA operations adhere to industry-specific regulations and ethical guidelines. Develop transparent audit logs and explainable AI features for accountability.

Get Started Now

Ready to Transform Your Enterprise with AI?

Book a personalized consultation to explore how our tailored AI solutions can drive efficiency and innovation for your business, securely and responsibly.

Schedule a Consultation

JARVIS or Ultron? A Survey on the Safety and Security Threats of Computer-Using Agents

A comprehensive analysis of Computer-Using Agents (CUAs), detailing intrinsic and extrinsic threats, defensive strategies, and evaluation benchmarks.

Executive Impact

Deep Analysis & Enterprise Applications

Critical Vulnerability Identified

Enterprise Process Flow

Defense Mechanism Effectiveness

Case Study: Financial Trading Agent Security Breach

Advanced ROI Calculator

Your AI Implementation Roadmap

Phase 1: Threat Assessment & Gap Analysis

Phase 2: Defensive Strategy Integration

Phase 3: Continuous Monitoring & Adaptation

Phase 4: Regulatory Compliance & Governance

Ready to Transform Your Enterprise with AI?

Ready to Get Started?

Book Your Free Consultation.

Let's Discuss Your AI Strategy!

Lets Discuss Your Needs

Select Time Zone

Big Competitive Advantage With Ai

Learn More

Our Demos

Research Center

Jobs

Contact Us

1 888 985 3025

Solutions@OwnYourAi.com

Get Your Ai