Enterprise AI Analysis
SWE-Fuse: Empowering Software Agents via Issue-free Trajectory Learning and Entropy-aware RLVR Training
Authors: Xin-Cheng Wen, Binbin Chen, Haoxuan Lan, Hang Yu, Peng Di, Cuiyun Gao
Publication: arXiv:2603.07927v1
Large language models (LLMs) have transformed the software engineering landscape. Recently, numerous LLM-based agents have been developed to address real-world software issue fixing tasks. Despite their state-of-the-art performance, these agents face a significant challenge: insufficient high-quality issue descriptions. Real-world datasets often exhibit misalignments between issue descriptions and their corresponding solutions, introducing noise and ambiguity that mislead automated agents and limit their problem-solving effectiveness. We propose SWE-Fuse, an issue-description-aware training framework that fuses issue-description-guided and issue-free samples for training SWE agents. It consists of two key modules: (1) an issue-free-driven trajectory learning module, which mitigates potentially misleading issue descriptions while enabling the model to learn step-by-step debugging processes; and (2) an entropy-aware RLVR training module, which adaptively adjusts training dynamics through entropy-driven clipping. The latter applies relaxed clipping under high entropy to encourage exploration and stricter clipping under low entropy to ensure training stability. We evaluate SWE-Fuse on the widely studied SWE-bench Verified benchmark to demonstrate its effectiveness in solving real-world software problems. Specifically, SWE-Fuse outperforms the best 8B and 32B baselines by 43.0% and 60.2% in solve rate, respectively. Furthermore, integrating SWE-Fuse with test-time scaling (TTS) enables further performance improvements, achieving solve rates of 49.8% and 65.2% under TTS@8 for the 8B and 32B models, respectively.
Revolutionizing Software Issue Resolution with Advanced AI Agents
SWE-Fuse is a training framework that improves how well software agents fix real-world issues. It addresses the problem of low-quality issue descriptions through issue-free-driven trajectory learning and entropy-aware reinforcement learning, and it achieves state-of-the-art solve rates among open-source models on SWE-bench Verified. Agents trained this way can carry out complex debugging tasks with less reliance on accurate issue descriptions, which can shorten software development cycles.
Deep Analysis & Enterprise Applications
State-of-the-Art Performance
SWE-Fuse demonstrates state-of-the-art performance among open-source models, outperforming baselines by 9.1% (8B models) and 11.7% (32B models) in solve rate on SWE-bench Verified. With Test-Time Scaling (TTS@8), solve rates further improve to 49.8% (8B) and 65.2% (32B). This competitive edge, even against models with significantly more parameters, highlights SWE-Fuse's effective trajectory learning and RLVR training.
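The idea behind TTS@8 can be illustrated with a best-of-N selection sketch: sample several candidate patches per issue and keep the one a scorer ranks highest. This is a minimal sketch under stated assumptions, not the selection procedure used by SWE-Fuse, which this summary does not describe. The names `best_of_n`, `solve_rate`, and the toy scores are hypothetical.

```python
from typing import Callable, List, Sequence


def best_of_n(candidates: Sequence[str], score: Callable[[str], float]) -> str:
    """Return the highest-scoring candidate patch (ties keep the earliest)."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=score)


def solve_rate(resolved: List[bool]) -> float:
    """Percentage of issues whose selected patch passed verification."""
    return 100.0 * sum(resolved) / len(resolved)


# Hypothetical scorer: a made-up quality score per candidate patch.
# In practice the scorer might be a learned verifier or a test run.
toy_scores = {"patch_a": 0.25, "patch_b": 1.0, "patch_c": 0.5}

chosen = best_of_n(list(toy_scores), toy_scores.get)
print(chosen)  # prints patch_b
```

With N = 8, the same function would be called on eight sampled patches per issue; the quality of the scorer determines how much of the gain from extra samples is actually realized.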
Impact of Trajectory Data in SFT
The quantity and quality of trajectory data are crucial. Data scaling shows clear benefits, with resolve rates increasing from 13.5% (0 samples) to 39.0% (all 14k samples) for the Qwen3-8B model. An optimal balance of 25-50% issue-free trajectories achieves the best performance, leveraging general code modification patterns without losing task-specific context. Furthermore, rigorous filtering ensures SWE-Fuse relies on genuine problem-solving capabilities, not 'git hacking' shortcuts.
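One way to picture the data preparation described above is a two-step routine: drop trajectories that read repository history, then mix issue-guided and issue-free samples at a target ratio. The sketch below is illustrative only. The `Trajectory` record, the list of history commands, and the sampling scheme are assumptions, not details taken from the paper.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Trajectory:
    commands: List[str]
    issue: Optional[str] = None  # None marks an issue-free sample


# Assumed heuristic: commands that inspect history could leak the reference fix.
HISTORY_COMMANDS = ("git log", "git show", "git reflog")


def uses_git_history(traj: Trajectory) -> bool:
    """True if any command in the trajectory starts with a history command."""
    return any(cmd.strip().startswith(HISTORY_COMMANDS) for cmd in traj.commands)


def mix_trajectories(
    guided: List[Trajectory],
    issue_free: List[Trajectory],
    issue_free_ratio: float,
    seed: int = 0,
) -> List[Trajectory]:
    """Keep all guided samples and add issue-free ones to reach the target share."""
    if not 0.0 <= issue_free_ratio < 1.0:
        raise ValueError("issue_free_ratio must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_free / (n_guided + n_free) = ratio for n_free.
    n_free = round(issue_free_ratio * len(guided) / (1.0 - issue_free_ratio))
    n_free = min(n_free, len(issue_free))
    mixed = list(guided) + rng.sample(issue_free, n_free)
    rng.shuffle(mixed)
    return mixed


guided = [Trajectory(["pytest -x"], issue=f"bug {i}") for i in range(6)]
free = [Trajectory(["python repro.py"]) for _ in range(9)]
free.append(Trajectory(["git log --oneline"]))  # shortcut-style trajectory

clean_free = [t for t in free if not uses_git_history(t)]
train_set = mix_trajectories(guided, clean_free, issue_free_ratio=0.25)
```

Here a ratio of 0.25 sits at the lower end of the 25-50% range reported above; sweeping `issue_free_ratio` would be the natural way to reproduce that kind of ablation on your own data.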
Entropy-aware RLVR Training Module
The entropy-aware RLVR training module significantly contributes to stable convergence and higher ultimate performance. Models initialized with the cold-start SFT phase exhibit rapid performance gains and achieve superior final outcomes compared to training from scratch. This demonstrates RLVR's robustness and scalability across both 8B and 32B models, effectively biasing the policy towards productive action spaces and expanding capabilities through task-specific knowledge.
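The clipping behavior described for this module (a wider clip range when the policy is uncertain, a narrower one when it is confident) can be sketched with a PPO-style clipped objective whose clip range depends on token entropy. The exact schedule used by SWE-Fuse is not given in this summary, so the linear interpolation and the values of `eps_low` and `eps_high` below are assumptions chosen purely for illustration.

```python
import math
from typing import Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_aware_epsilon(
    entropy: float,
    max_entropy: float,
    eps_low: float = 0.1,   # assumed strict clip range for low entropy
    eps_high: float = 0.3,  # assumed relaxed clip range for high entropy
) -> float:
    """Interpolate the clip range linearly between eps_low and eps_high."""
    frac = min(max(entropy / max_entropy, 0.0), 1.0)
    return eps_low + (eps_high - eps_low) * frac


def clipped_surrogate(ratio: float, advantage: float, eps: float) -> float:
    """PPO-style clipped objective for a single token."""
    clipped_ratio = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped_ratio * advantage)


max_h = math.log(4)  # maximum entropy for a 4-token toy vocabulary
eps_uniform = entropy_aware_epsilon(token_entropy([0.25] * 4), max_h)
eps_peaked = entropy_aware_epsilon(token_entropy([0.97, 0.01, 0.01, 0.01]), max_h)

# Same probability ratio and advantage, different clip ranges.
loose = clipped_surrogate(ratio=1.5, advantage=1.0, eps=eps_uniform)
tight = clipped_surrogate(ratio=1.5, advantage=1.0, eps=eps_peaked)
```

For the same policy ratio, the high-entropy token keeps more of its update (`loose` is larger than `tight`), which is the exploration-versus-stability trade-off the module is described as managing.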
SWE-Fuse Enterprise Process Flow for Issue Resolution
| Feature | SWE-Fuse (32B + TTS@8) | Top Closed-Source (e.g., Qwen3-Coder-480B) | Typical Open-Source Baseline (e.g., Skywork-SWE-32B + TTS@8) |
|---|---|---|---|
| Solve Rate (%) | 65.2 | 67.0 | 47.0 |
| Learning Approach | | | |
| Data Robustness | | | |
| Model Size | 32B | 480B - 1T+ | 7B - 32B |
| Open-Source Framework | | | |
| Key Advantage | | | |
Case Study: Resolving `astropy-13236` with SWE-Fuse
Problem: The `astropy-13236` issue involved complex changes related to structured arrays being converted to `NdarrayMixin` in a `Table` context, requiring a `FutureWarning` and a behavior change in version 5.2.
SWE-Fuse Solution: SWE-Fuse successfully addressed both requirements by formulating a multi-round plan, creating a reproduction script, modifying the `astropy/table/table.py` file, and verifying the `FutureWarning` and the correct column type. It produced a verifiable, complete fix by round 34, using its self-generated reproduction script to ensure precise modifications and behavioral changes in long-context scenarios.
Competitor Challenge (Claude-4-Sonnet): Claude-4-Sonnet's fix was incomplete. It implemented only the `FutureWarning` and omitted the behavior change for version 5.2, so the submitted patch was incorrect. It struggled with multi-turn interaction and with aligning its changes precisely with the pull request objectives.
Key Takeaway: SWE-Fuse's issue-free-driven trajectory learning and multi-turn interaction capabilities enable it to systematically debug and implement complex fixes, even when competitors struggle with the full scope of the problem.
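The reproduction-script step in this case study can be pictured with Python's standard `warnings` module: run the old code path, confirm that a `FutureWarning` is raised, then confirm the new behavior returns the expected type. The function `add_structured_column` below is a hypothetical stand-in written for this sketch. It is not astropy's API and not the script the agent generated.

```python
import warnings


def add_structured_column(data, *, legacy_behavior=True):
    """Hypothetical stand-in for the patched code path (not astropy's API)."""
    if legacy_behavior:
        warnings.warn(
            "structured arrays will be added as a plain column in a future version",
            FutureWarning,
            stacklevel=2,
        )
        return ("mixin", data)
    return ("column", data)


def reproduce():
    """Check both requirements: the warning fires, and the new path returns a column."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure the warning is not suppressed
        kind_now, _ = add_structured_column([(1, 2.0)])
    warned = any(issubclass(w.category, FutureWarning) for w in caught)
    kind_future, _ = add_structured_column([(1, 2.0)], legacy_behavior=False)
    return warned, kind_now, kind_future


result = reproduce()
print(result)  # prints (True, 'mixin', 'column')
```

A patch that only added the warning would pass the first check and fail the second, which is the gap described for the competing approach above.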
Your Accelerated Implementation Roadmap
We've distilled SWE-Fuse's robust methodology into a clear, phased roadmap designed for rapid enterprise integration and measurable impact.
Phase 01: Strategic Assessment & Customization
Begin with a deep dive into your existing software development workflows and issue resolution processes. We identify key integration points and customize the SWE-Fuse framework to align with your specific codebase and development environment, leveraging our issue-free trajectory learning for optimal data preparation.
Phase 02: Pilot Deployment & Agent Training
Deploy SWE-Fuse agents on a designated pilot project. Our team assists in initial training using your curated historical issue data and issue-free samples, ensuring the model rapidly acquires step-by-step debugging capabilities and adapts to your enterprise-specific problem patterns.
Phase 03: Performance Optimization with RLVR
Implement the entropy-aware RLVR training module to continuously fine-tune agent performance. This phase focuses on adaptive training dynamics and targeted reward signals to maximize issue resolution rates and stability, ensuring your agents are robust against diverse and complex real-world challenges.
Phase 04: Scaled Integration & Continuous Improvement
Expand SWE-Fuse's deployment across more projects and teams. Establish feedback loops and monitoring mechanisms to capture new issue patterns and agent interactions, facilitating continuous learning and iterative enhancements to maintain state-of-the-art software issue resolution capabilities.
Ready to Empower Your Software Agents?
Don't let complex software issues hinder your development cycles. Unlock the full potential of AI-driven issue resolution with SWE-Fuse. Schedule a personalized consultation with our experts to explore how our framework can integrate seamlessly into your enterprise, optimize your workflows, and deliver tangible ROI.