Enterprise AI Analysis

Reflecting with Two Voices: A Co-Adaptive Dual-Strategy Framework for LLM-Based Agent Decision Making

DuSAR (Dual-Strategy Agent with Reflecting) is a demonstration-free framework enabling a single frozen LLM to perform co-adaptive reasoning via two complementary strategies: a high-level holistic plan and a context-grounded local policy. It reduces per-step token consumption by 3-9× while maintaining strong performance, offering a new paradigm for efficient, robust AI agents.


Executive Impact: At a Glance

DuSAR achieves state-of-the-art results on ALFWorld and Mind2Web, outperforming retrieval-augmented baselines in success rate by 2.8× on ALFWorld and by more than 2× on Mind2Web, while cutting per-step token consumption by 3-9×.


Deep Analysis & Enterprise Applications

The findings from the research are grouped under four topics: Methodology, Experimental Validation, Efficiency & Generalization, and Ablation Studies.

Methodology

DuSAR is a demonstration-free, zero-shot framework that internalizes a dual-strategy reasoning paradigm within a frozen LLM. It comprises three tightly coupled modules (Holistic Reflecting, Local Reflecting, and Decision Reflecting) that together implement a closed-loop, co-adaptive decision-making process without expert demonstrations, in-context examples, or parameter updates.

Inspired by human metacognition, DuSAR decouples strategic planning from tactical execution through two complementary strategies: a Holistic Strategy that maintains a high-level plan for task decomposition and long-term coherence, and a Local Strategy that generates context-sensitive actions and evaluates immediate progress. The two strategies interact through a lightweight reflection mechanism that continuously assesses progress, revising the global plan when the agent is stuck and refining it after meaningful advancement.

Experimental Validation

DuSAR was evaluated on ALFWorld and Mind2Web using open-source LLMs from 7B to 70B parameters. It consistently outperforms state-of-the-art demonstration-driven agents such as Synapse and TRAD across model scales and task types. Notably, it remains robust on small models, where retrieval-based methods struggle, and generalizes better in cross-domain settings because its reasoning scaffold is internal rather than retrieved.

Efficiency & Generalization

A key finding is DuSAR's token efficiency: it reduces per-step token consumption by 3-9× compared to retrieval-based methods, which translates to lower latency and higher throughput. Combined with its demonstration-free design, this enables strong cross-domain generalization and adaptation to novel situations without static exemplars or external supervision. Because the framework generates and refines structured plan graphs in situ through environmental interaction, it is robust to unexpected states and errors.

Ablation Studies

Ablation studies confirm the necessity of dual-strategy coordination. Variants that use only the Holistic Strategy, only the Local Strategy, or a naive concatenation of the two perform significantly worse, which indicates that co-adaptive integration matters more than the mere presence of both components. Optionally adding expert demonstrations improves results further, particularly as high-level guidance for smaller models, showing that DuSAR is compatible with external knowledge.

37.1% Success Rate on ALFWorld with Llama3.1-70B (State-of-the-Art)

Enterprise Process Flow

1. Holistic Strategy Initialization
2. Observe Environment
3. Local Strategy Generation & Progress Assessment
4. Decision Reflecting & Action Selection
5. Execute Action & Receive Feedback
6. Holistic Strategy Refinement (if needed)
7. Loop until Task Completion
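
The process flow above can be sketched as a small control loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the LLM is modelled as a plain prompt-to-text function, and the prompt tags (HOLISTIC, LOCAL, DECIDE, REFLECT), the `AgentState` class, and the `env_step` callback are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Assumption: a frozen LLM is represented as a function from prompt to text.
LLM = Callable[[str], str]
# Assumption: the environment takes an action and returns (observation, done).
EnvStep = Callable[[str], Tuple[str, bool]]


@dataclass
class AgentState:
    task: str
    plan: str = ""  # holistic strategy (high-level plan)
    history: List[Tuple[str, str]] = field(default_factory=list)  # (action, observation)


def run_episode(llm: LLM, env_step: EnvStep, first_observation: str,
                task: str, max_steps: int = 20) -> AgentState:
    state = AgentState(task=task)
    # Step 1: initialize the holistic strategy once, before acting.
    state.plan = llm(f"HOLISTIC\nTask: {task}\nWrite a high-level plan.")
    obs = first_observation  # Step 2: observe the environment.
    for _ in range(max_steps):
        # Step 3: local strategy proposes an action and assesses progress.
        local = llm(f"LOCAL\nTask: {task}\nPlan: {state.plan}\nObservation: {obs}")
        # Step 4: decision reflecting reconciles plan and local proposal.
        action = llm(f"DECIDE\nPlan: {state.plan}\nLocal: {local}")
        # Step 5: execute the action and receive feedback.
        obs, done = env_step(action)
        state.history.append((action, obs))
        if done:  # Step 7: stop once the task is complete.
            break
        # Step 6: rewrite the holistic plan only when reflection asks for it.
        verdict = llm(f"REFLECT\nPlan: {state.plan}\nAction: {action}\nObservation: {obs}")
        if verdict.strip().upper().startswith("REVISE"):
            state.plan = llm(f"HOLISTIC\nTask: {task}\nOld plan: {state.plan}\nObservation: {obs}")
    return state
```

Passing the LLM and environment in as functions keeps the loop testable with stubs; a real deployment would replace them with an inference client and an ALFWorld or Mind2Web wrapper.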

Comparison of DuSAR with Baseline Methods
Method	Demonstrations	Planning Approach	Adaptation	Token Efficiency (tokens/step)
ReAct (Yao et al. 2023c)	✓ Fixed	Sequential reasoning-acting loop	Reactive self-correction	Medium (1.4k-1.9k)
Synapse (Zheng et al. 2024)	✓ Trajectory	Single-stage trajectory retrieval	Task-based exemplar selection	Moderate (1.5k-2.1k)
TRAD (Zhou et al. 2024)	✓ Step-wise	Two-stage retrieval (task + thought)	Thought-based dynamic refinement	Low (3.2k-3.6k)
DuSAR (Ours)	✗ None	Co-adaptive dual-strategy	Dynamic co-adaptation	High (335-564)
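
To relate the per-step token ranges in the comparison table to the reported 3-9× reduction, the short script below compares range midpoints. Using midpoints is an assumption made here for illustration; the source does not say how the 3-9× figure was computed, and the "k" values in the table are already rounded.

```python
# Per-step token ranges copied from the comparison table (tokens/step).
TOKENS_PER_STEP = {
    "ReAct": (1400, 1900),
    "Synapse": (1500, 2100),
    "TRAD": (3200, 3600),
    "DuSAR": (335, 564),
}


def midpoint(token_range):
    """Midpoint of a (low, high) range; an illustrative summary statistic."""
    low, high = token_range
    return (low + high) / 2


def reduction_vs_dusar(method):
    """How many times more tokens per step `method` uses than DuSAR."""
    return midpoint(TOKENS_PER_STEP[method]) / midpoint(TOKENS_PER_STEP["DuSAR"])


for name in ("ReAct", "Synapse", "TRAD"):
    print(f"{name}: {reduction_vs_dusar(name):.1f}x")
# ReAct: 3.7x
# Synapse: 4.0x
# TRAD: 7.6x
```

These midpoint ratios fall inside the 3-9× band quoted for DuSAR.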

Case Study: ALFWorld PutTwo Task - 'Put two soapbars in garbagecan'

We illustrate how each method handles this complex 6-step task requiring sequential object manipulation.

ReAct (Llama3.1-70B, 0% success rate): ReAct keeps no global plan, so each step is reasoned independently. After finding the first soapbar, it does not track that a second one is still needed and re-explores locations it has already visited. It fails on this task at every model scale.

Synapse (Llama3.1-70B, 0% success rate): The retrieved exemplar has only 3 steps, but PutTwo requires 6. Synapse cannot extend beyond the exemplar's length and attempts to complete the task prematurely. Its single-stage retrieval has no dynamic refinement for adapting to longer sequences.

TRAD (Llama3.1-70B, 0% success rate): Two-stage retrieval helps initially, but when thought-based retrieval finds no relevant traces for multi-object scenarios, TRAD falls back to mismatched exemplars. Its higher token cost (3.2k-3.6k/step) also strains the context window of smaller models.

DuSAR (Llama3.1-70B, 52.9% success rate): Co-adaptive refinement enables dynamic plan adjustment. When the first soapbar is found (s4 = 75), the Holistic Strategy is updated to track the second-object requirement explicitly, which prevents redundant exploration. Score milestones (25, 50, 75, 90, 100) provide fine-grained progress tracking across the 6-step sequence.
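
The milestone-based tracking described for DuSAR can be illustrated with a small decision rule. This is a sketch under assumptions, not the paper's procedure: the milestone values come from the case study, but the `reflect` function, its three outcomes, and the `patience` window for detecting a stalled agent are hypothetical.

```python
from typing import List

# Score milestones quoted in the case study.
MILESTONES = (25, 50, 75, 90, 100)


def milestones_reached(score: int) -> int:
    """Count how many milestones a progress score has met or passed."""
    return sum(1 for m in MILESTONES if score >= m)


def reflect(scores: List[int], patience: int = 3) -> str:
    """Decide what to do with the holistic plan given per-step scores.

    Returns 'refine' when the latest step crossed a new milestone,
    'revise' when the score has not risen over the last `patience` steps
    (the agent looks stuck), and 'keep' otherwise.
    `patience` is an assumed parameter, not a value from the source.
    """
    if len(scores) >= 2 and milestones_reached(scores[-1]) > milestones_reached(scores[-2]):
        return "refine"
    if len(scores) >= patience and scores[-1] <= scores[-patience]:
        return "revise"
    return "keep"


print(reflect([25, 50, 50, 75]))  # refine
print(reflect([25, 25, 25]))      # revise
print(reflect([25, 30]))          # keep
```

In the PutTwo example, reaching a score of 75 at step 4 would trigger the 'refine' branch, which is the point where the plan is updated to track the second soapbar.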


Your AI Implementation Roadmap

A typical enterprise AI adoption journey involves several key phases. Our approach ensures a structured, efficient, and successful transition.

Phase 1: Discovery & Strategy Alignment

Initial consultations to understand your enterprise's unique needs, current AI maturity, and strategic objectives. We define key performance indicators (KPIs) and tailor a DuSAR integration roadmap.

Phase 2: Pilot Program & Customization

Deploy DuSAR in a controlled pilot environment, customizing prompt templates and agent configurations to align with specific task domains and existing IT infrastructure. Initial performance benchmarks are established.

Phase 3: Iterative Refinement & Expansion

Based on pilot results, we iteratively refine DuSAR's strategies and parameters, incorporating feedback and expanding deployment to additional tasks or departments. Performance monitoring and optimization continue throughout this phase.

Phase 4: Full-Scale Integration & Training

Seamlessly integrate DuSAR into your enterprise's core workflows, providing comprehensive training for your teams on monitoring, managing, and extending the AI agents. Establish internal governance and best practices.

Phase 5: Continuous Optimization & Support

We provide ongoing support, regular performance reviews, and proactive optimization so that DuSAR continues to deliver value as your operational needs evolve. New features and advancements are evaluated as they become available.

