Skip to main content
Enterprise AI Analysis: Leanstral: Open-Source foundation for trustworthy vibe-coding

Enterprise AI Analysis

Leanstral: Open-Source foundation for trustworthy vibe-coding

AI agents have proven to be highly capable tools at code generation. Yet, as we push these models to high-stakes domains, ranging from frontier research mathematics to mission-critical software, we encounter a scaling bottleneck: the human review. The time and specialized expertise required to manually verify become the primary impedance of engineering velocity.

Executive Impact at a Glance

Leanstral delivers significant performance gains and cost efficiencies, redefining standards for verifiable code generation.

0 Outperformance vs. Sonnet
0 More Cost-Efficient vs. Opus
0 FLTEval Score (Pass@4)
0 Active Parameters

Deep Analysis & Enterprise Applications

Select a topic to dive deeper, then explore the specific findings from the research, rebuilt as interactive, enterprise-focused modules.

Introducing Leanstral: The First Open-Source Lean 4 Agent

We release Leanstral, the first open-source code agent designed for Lean 4. Lean 4 is a proof assistant capable of expressing complex mathematical objects and software specifications. Unlike existing proving systems, Leanstral is highly efficient (with 6B active parameters) and trained for operating in realistic formal repositories.

Open and Accessible: Leanstral weights are released under an Apache 2.0 license, available in agent mode within Mistral Vibe, and through a free API endpoint. A tech report detailing the training approach and a new evaluation suite FLTEval will also be released.

Efficient and Mighty: Leveraging a highly sparse architecture, Leanstral is optimized for proof engineering tasks. Parallel inference with Lean as a perfect verifier ensures both performance and cost-efficiency against existing closed-source competitors.

Upgradable via MCP: Leanstral supports arbitrary MCPs through vibe and was specifically trained for maximal performance with the frequently used lean-lsp-mcp.

Benchmarking Leanstral: Outperforming Competitors

To reflect usefulness in realistic proof engineering scenarios, we benchmark Leanstral for completing all formal proofs and correctly defining new mathematical concepts in each PR to the FLT project, instead of isolated mathematical problems. We compare Leanstral against leading coding agents (Claude Opus 4.6, Sonnet 4.6, Haiku 4.5) and open-source models (Qwen3.5 397B-A17B, Kimi-K2.5 1T-A32B, GLM5 744B-A40B).

Leanstral vs. OSS Models: Leanstral-120B-A6B demonstrates a significant efficiency advantage over its much larger open-source peers. While models like GLM5-744B-A40B and Kimi-K2.5-1T-32B struggle to scale, capping their FLTEval scores at approximately 16.6 and 20.1 respectively, Leanstral outperforms them both with just a single pass. Even Qwen3.5-397B-A17B, the strongest OSS competitor shown, requires 4 passes to reach a score of 25.4. In contrast, Leanstral achieves a superior score of 26.3 with half that investment (pass@2) and continues to scale linearly, reaching 29.3 at the same cost level.

Leanstral vs. Claude Family: Leanstral serves as a high-value alternative to the Claude suite, offering competitive performance at a fraction of the price: Leanstral pass@2 reaches a score of 26.3, beating Sonnet by 2.6 points, while costing only $36 to run, compared to Sonnet’s $549. At pass@16, Leanstral reaches a score of 31.9, comfortably beating Sonnet by 8 points. While Claude Opus 4.6 remains the leader in quality, it carries a staggering cost of $1,650, 92 times higher than running Leanstral.

Case Studies: Practical Applications of Leanstral

Leanstral proves its real-world utility in complex code migration and program reasoning scenarios.

Answering StackExchange Posts: Leanstral successfully diagnosed and fixed a Lean 4.29.0-rc6 compilation issue caused by a def alias blocking a rw tactic. It accurately recreated the failing environment, identified the definitional equality problem, and proposed switching def to abbrev for a transparent alias, restoring functionality and explaining the rationale clearly.

Reasoning About Programs: Leanstral successfully converted definitions from Rocq to Lean, implementing custom notation. Furthermore, it demonstrated the ability to translate Rocq statements to Lean and prove properties about programs in the newly defined language, even without prior proof examples.

Accessing Leanstral Today

Leanstral is available today for everyone to use, with multiple options for integration and deployment:

  • Zero-Setup in Mistral Vibe: Integrated directly into Mistral Vibe for immediate, zero-setup vibe coding and proving. Use /leanstall to activate, then Shift+Tab to select Leanstral, or use vibe --agent lean.
  • Labs API: Access the model via our free/near-free API endpoint labs-leanstral-2603. This endpoint is highly accessible for a limited period to gather realistic feedback and observability data for future models.
  • Own the Weights: Download the Apache 2.0 licensed model and run it on your own hardware, offering full control and customization.
92x More cost-efficient than Claude Opus for equivalent quality

Model Performance & Cost Comparison (Leanstral vs. Claude Family)

Model Cost ($) Score (FLTEval)
Haiku 184 23.0
Sonnet 549 23.7
Opus 1,650 39.6
Leanstral (Pass@1) 18 21.9
Leanstral (Pass@2) 36 26.3
Leanstral (Pass@4) 72 29.3
Leanstral (Pass@8) 145 31.0
Leanstral (Pass@16) 290 31.9

Case Study: Solving Complex Lean 4 Migration Issues

Challenge: A real-world Stack Exchange post detailed a script failing to compile in a new Lean 4 version due to a rw tactic not matching patterns involving a type alias (def T2 := List Bool).

Leanstral's Approach: The agent successfully built test code to recreate the failing environment and diagnosed the underlying issue: def created a rigid definition that actively blocked the rw tactic from recognizing the underlying structure.

Solution & Impact: Leanstral proposed a simple yet effective fix: replace def with abbrev. Because abbrev creates a transparent alias definitionally equal to the original type, the rw tactic could once again perfectly match the pattern. The solution was delivered with clear rationale, significantly reducing debugging time and specialized expertise required.

Case Study: Automating Program Reasoning & Verification

Challenge: The task was to convert program definitions from Rocq to Lean and then prove properties about these programs, starting only from Rocq statements without existing proofs.

Leanstral's Approach: Leanstral not only successfully translated the inductive definitions of commands (ceval) and states, but also implemented custom Lean notation for command evaluation (e.g., c " / " st " ⇒ " st').

Solution & Impact: For an example command like plus2, Leanstral was able to generate a theorem (plus2_spec) specifying its behavior (adding 2 to variable X) and then provide the full formal proof in Lean. This demonstrates Leanstral's capability to understand, translate, and formally verify program logic, significantly accelerating the process of building trustworthy software components.

Calculate Your Potential AI ROI

Estimate the significant time and cost savings your enterprise could achieve by integrating advanced AI solutions like Leanstral.

Input Your Business Parameters

Estimated Annual Cost Savings $0
Annual Hours Reclaimed 0

Your AI Implementation Roadmap

Our structured approach ensures a seamless and successful integration of AI, maximizing your ROI with minimal disruption.

Discovery & Strategy

In-depth analysis of current workflows, identifying key pain points and high-impact AI opportunities. Definition of clear objectives and success metrics.

Pilot & Proof of Concept

Rapid deployment of a focused AI solution to validate technical feasibility and demonstrate tangible value within a controlled environment.

Integration & Customization

Full-scale integration of AI solutions into existing enterprise systems, with tailored adjustments to fit unique operational requirements.

Training & Adoption

Comprehensive training programs for your teams, ensuring smooth adoption and proficiency with new AI tools and processes.

Optimization & Scaling

Continuous monitoring, performance tuning, and expansion of AI capabilities across departments to unlock further efficiencies and innovations.

Ready to Transform Your Enterprise?

Don't miss out on the competitive edge AI can provide. Let's discuss a tailored strategy for your organization.

Ready to Get Started?

Book Your Free Consultation.

Let's Discuss Your AI Strategy!

Lets Discuss Your Needs


AI Consultation Booking