Enterprise AI Analysis

SWE-Fuse: Empowering Software Agents via Issue-free Trajectory Learning and Entropy-aware RLVR Training

Authors: Xin-Cheng Wen, Binbin Chen, Haoxuan Lan, Hang Yu, Peng Di, Cuiyun Gao

Large language models (LLMs) have transformed the software engineering landscape. Recently, numerous LLM-based agents have been developed to address real-world software issue fixing tasks. Despite their state-of-the-art performance, these agents face a significant challenge: insufficient high-quality issue descriptions. Real-world datasets often exhibit misalignments between issue descriptions and their corresponding solutions, introducing noise and ambiguity that mislead automated agents and limit their problem-solving effectiveness. We propose SWE-Fuse, an issue-description-aware training framework that fuses issue-description-guided and issue-free samples for training SWE agents. It consists of two key modules: (1) an issue-free-driven trajectory learning module, which mitigates potentially misleading issue descriptions while enabling the model to learn step-by-step debugging processes; and (2) an entropy-aware RLVR training module, which adaptively adjusts training dynamics through entropy-driven clipping. It applies relaxed clipping under high entropy to encourage exploration, and stricter clipping under low entropy to ensure training stability. We evaluate SWE-Fuse on the widely studied SWE-bench Verified benchmark to demonstrate its effectiveness in solving real-world software problems. Specifically, SWE-Fuse outperforms the best 8B and 32B baselines by 43.0% and 60.2% in solve rate, respectively. Furthermore, integrating SWE-Fuse with test-time scaling (TTS) enables further performance improvements, achieving solve rates of 49.8% and 65.2% under TTS@8 for the 8B and 32B models, respectively.

Revolutionizing Software Issue Resolution with Advanced AI Agents

SWE-Fuse introduces a framework that enhances the ability of software agents to fix real-world issues. It addresses the challenge of low-quality issue descriptions with two components, issue-free-driven trajectory learning and entropy-aware reinforcement learning, and achieves state-of-the-art solve rates. This lets AI agents perform complex debugging tasks more effectively, reducing reliance on perfect issue descriptions and accelerating software development cycles.

Deep Analysis & Enterprise Applications

RQ1: SOTA Comparison

RQ2: Trajectory Data Impact

RQ3: RLVR Effectiveness

State-of-the-Art Performance

SWE-Fuse demonstrates state-of-the-art performance among open-source models, outperforming baselines by 9.1% (8B models) and 11.7% (32B models) in solve rate on SWE-bench Verified. With Test-Time Scaling (TTS@8), solve rates further improve to 49.8% (8B) and 65.2% (32B). This competitive edge, even against models with significantly more parameters, highlights SWE-Fuse's effective trajectory learning and RLVR training.
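As a rough illustration of what TTS@8 means in practice, the sketch below samples several candidate patches for one issue and keeps the one a scorer ranks highest. The text here does not describe how SWE-Fuse selects among its candidates, so this is a generic best-of-n sketch under stated assumptions: `generate_patch`, `score_patch`, and the toy stand-ins are hypothetical names, not part of SWE-Fuse.

```python
import random
from typing import Callable, List, Tuple


def best_of_n(
    issue: str,
    generate_patch: Callable[[str, random.Random], str],
    score_patch: Callable[[str, str], float],
    n: int = 8,
    seed: int = 0,
) -> Tuple[str, float]:
    """Sample n candidate patches and return the highest-scoring one."""
    rng = random.Random(seed)
    candidates: List[str] = [generate_patch(issue, rng) for _ in range(n)]
    scored = [(score_patch(issue, patch), patch) for patch in candidates]
    best_score, best_patch = max(scored, key=lambda pair: pair[0])
    return best_patch, best_score


# Hypothetical stand-ins: a real agent rollout and a real verifier go here.
def toy_generate(issue: str, rng: random.Random) -> str:
    return f"patch-{rng.randint(0, 99)}"


def toy_score(issue: str, patch: str) -> float:
    return float(patch.split("-")[1])


if __name__ == "__main__":
    patch, score = best_of_n("example issue", toy_generate, toy_score, n=8)
    print(patch, score)
```

With `n=1` this reduces to a single rollout, which is the setting the non-TTS solve rates correspond to.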

Impact of Trajectory Data in SFT

The quantity and quality of trajectory data are crucial. Data scaling shows clear benefits, with resolve rates increasing from 13.5% (0 samples) to 39.0% (all 14k samples) for the Qwen3-8B model. An optimal balance of 25-50% issue-free trajectories achieves the best performance, leveraging general code modification patterns without losing task-specific context. Furthermore, rigorous filtering ensures SWE-Fuse relies on genuine problem-solving capabilities, not 'git hacking' shortcuts.
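The data preparation described above can be sketched as a filter followed by a ratio-controlled mix. This is a minimal sketch, not the SWE-Fuse pipeline: the trajectory schema (`commands`, `has_issue`), the marker list used to detect 'git hacking', and the function names are all assumptions made for illustration.

```python
import random
from typing import Dict, List

# Hypothetical markers: commands that could expose the reference fix
# through repository history instead of genuine debugging.
GIT_HACK_MARKERS = ("git log", "git show", "git reflog")


def uses_git_shortcut(trajectory: Dict) -> bool:
    """Return True if any command in the trajectory matches a marker."""
    return any(
        marker in command
        for command in trajectory["commands"]
        for marker in GIT_HACK_MARKERS
    )


def build_sft_mix(
    issue_guided: List[Dict],
    issue_free: List[Dict],
    issue_free_ratio: float = 0.3,
    seed: int = 0,
) -> List[Dict]:
    """Filter both pools, then mix so issue-free samples form the given share."""
    if not 0.0 <= issue_free_ratio < 1.0:
        raise ValueError("issue_free_ratio must be in [0, 1)")
    rng = random.Random(seed)
    guided = [t for t in issue_guided if not uses_git_shortcut(t)]
    free = [t for t in issue_free if not uses_git_shortcut(t)]
    # Solve n_free / (n_free + n_guided) = ratio for n_free.
    target_free = round(issue_free_ratio * len(guided) / (1.0 - issue_free_ratio))
    n_free = min(target_free, len(free))
    mix = guided + rng.sample(free, n_free)
    rng.shuffle(mix)
    return mix
```

The default ratio of 0.3 is simply a value inside the 25-50% range reported above; the best setting for a given dataset would need to be found empirically.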

Entropy-aware RLVR Training Module

The entropy-aware RLVR training module significantly contributes to stable convergence and higher ultimate performance. Models initialized with the cold-start SFT phase exhibit rapid performance gains and achieve superior final outcomes compared to training from scratch. This demonstrates RLVR's robustness and scalability across both 8B and 32B models, effectively biasing the policy towards productive action spaces and expanding capabilities through task-specific knowledge.
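The idea of entropy-driven clipping can be sketched with a PPO-style clipped objective whose clip range widens as policy entropy rises. The linear interpolation, the bounds `eps_low` and `eps_high`, and all function names below are assumptions chosen for illustration; the exact schedule used by SWE-Fuse is not given here.

```python
import math
from typing import Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_aware_epsilon(
    entropy: float,
    max_entropy: float,
    eps_low: float = 0.1,   # assumed strict clip range for low entropy
    eps_high: float = 0.3,  # assumed relaxed clip range for high entropy
) -> float:
    """Linearly interpolate the clip range from normalized entropy."""
    frac = min(max(entropy / max_entropy, 0.0), 1.0)
    return eps_low + frac * (eps_high - eps_low)


def clipped_surrogate(ratio: float, advantage: float, epsilon: float) -> float:
    """PPO-style clipped objective for a single token."""
    clipped_ratio = min(max(ratio, 1.0 - epsilon), 1.0 + epsilon)
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    max_h = math.log(4)  # maximum entropy for a 4-token vocabulary
    for probs in ([0.25, 0.25, 0.25, 0.25], [1.0, 0.0, 0.0, 0.0]):
        eps = entropy_aware_epsilon(token_entropy(probs), max_h)
        print(probs, round(eps, 3), round(clipped_surrogate(1.25, 1.0, eps), 3))
```

In this toy setting, a uniform (high-entropy) distribution gets the relaxed range, so a probability ratio of 1.25 passes through unclipped, while a fully peaked (low-entropy) distribution gets the strict range and the same ratio is clipped to 1.1.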

SWE-Fuse Enterprise Process Flow for Issue Resolution

Multi-step Trajectory Construction → Trajectory Data Filter → Issue-Free-driven SFT → Entropy-aware RLVR Training

SWE-Fuse Performance vs. Leading AI Agents (SWE-bench Verified)

Feature	SWE-Fuse (32B + TTS@8)	Top Closed-Source (e.g., Qwen3-Coder-480B)	Typical Open-Source Baseline (e.g., Skywork-SWE-32B + TTS@8)
Solve Rate (%)	65.2%	67.0%	47.0%
Learning Approach	Issue-free Trajectory Learning + Entropy-aware RLVR	Proprietary, Large-scale pre-training	SFT (often without RL)
Data Robustness	Handles Insufficient/Noisy Issue Descriptions	Relies on High-Quality Data	Sensitive to Data Quality
Model Size	32B	480B - 1T+	7B - 32B
Open-Source Framework	Yes	No	Yes
Key Advantage	Achieves near closed-source performance with smaller models and robust learning	Market leader for large-scale, generalist code generation	Provides foundational capabilities for software agents

Case Study: Resolving `astropy-13236` with SWE-Fuse

Problem: The `astropy-13236` issue involved complex changes related to structured arrays being converted to `NdarrayMixin` in a `Table` context, requiring a `FutureWarning` and a behavior change in version 5.2.

SWE-Fuse Solution: SWE-Fuse successfully addressed both requirements by formulating a multi-round plan, creating a reproduction script, modifying the `astropy/table/table.py` file, and verifying the `FutureWarning` and the correct column type. It produced a verifiable, complete fix by round 34, using its self-generated reproduction script to ensure precise modifications and behavioral changes in long-context scenarios.
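To make the role of a reproduction script concrete, the sketch below shows the general pattern of checking that a `FutureWarning` is emitted alongside the expected return value. `add_structured_column` is a hypothetical stand-in written for this example; it is not astropy code and not the script SWE-Fuse generated.

```python
import warnings


def add_structured_column(data, auto_convert=True):
    """Stand-in for library code that warns about an upcoming behavior change."""
    if auto_convert:
        warnings.warn(
            "automatic conversion will change in a future release",
            FutureWarning,
            stacklevel=2,
        )
        return ("converted", data)
    return ("column", data)


def reproduces_warning(auto_convert: bool = True) -> bool:
    """Return True if the call emits at least one FutureWarning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # do not let filters hide the warning
        add_structured_column([1, 2, 3], auto_convert=auto_convert)
    return any(issubclass(w.category, FutureWarning) for w in caught)


if __name__ == "__main__":
    print("warning emitted:", reproduces_warning())
    print("warning emitted without conversion:", reproduces_warning(False))
```

Running a check like this before and after editing the target file gives the agent a concrete signal that both the warning and the resulting column type match the intended behavior.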

Competitor Challenge (Claude-4-Sonnet): Claude-4-Sonnet's patch was incomplete. It implemented only the `FutureWarning` and omitted the behavior change for version 5.2, so the submitted patch was incorrect. It struggled with the multi-turn interaction and with aligning its changes precisely to the pull request objectives.

Key Takeaway: SWE-Fuse's issue-free-driven trajectory learning and multi-turn interaction capabilities enable it to systematically debug and implement complex fixes, even when competitors struggle with the full scope of the problem.

Your Accelerated Implementation Roadmap

We've distilled SWE-Fuse's robust methodology into a clear, phased roadmap designed for rapid enterprise integration and measurable impact.

Phase 01: Strategic Assessment & Customization

Begin with a deep dive into your existing software development workflows and issue resolution processes. We identify key integration points and customize the SWE-Fuse framework to align with your specific codebase and development environment, leveraging our issue-free trajectory learning for optimal data preparation.

Phase 02: Pilot Deployment & Agent Training

Deploy SWE-Fuse agents on a designated pilot project. Our team assists in initial training using your curated historical issue data and issue-free samples, ensuring the model rapidly acquires step-by-step debugging capabilities and adapts to your enterprise-specific problem patterns.

Phase 03: Performance Optimization with RLVR

Implement the entropy-aware RLVR training module to continuously fine-tune agent performance. This phase focuses on adaptive training dynamics and targeted reward signals to maximize issue resolution rates and stability, ensuring your agents are robust against diverse and complex real-world challenges.

Phase 04: Scaled Integration & Continuous Improvement

Expand SWE-Fuse's deployment across more projects and teams. Establish feedback loops and monitoring mechanisms to capture new issue patterns and agent interactions, facilitating continuous learning and iterative enhancements to maintain state-of-the-art software issue resolution capabilities.

Ready to Empower Your Software Agents?

Don't let complex software issues hinder your development cycles. Unlock the full potential of AI-driven issue resolution with SWE-Fuse. Schedule a personalized consultation with our experts to explore how our framework can integrate seamlessly into your enterprise, optimize your workflows, and deliver tangible ROI.
