An Empirical Study on Quantized LLMs for Legacy Code Modernization

REMODEL-LLM: Transforming C code to Java using LLMs

This paper investigates the efficacy of 19 small, quantized LLMs (under 20 billion parameters) for the C-to-Java translation task. It uses a novel hybrid pipeline that leverages Abstract Syntax Trees (ASTs) for semantic decomposition and employs a highly constrained, rule-based prompting strategy. The results show a clear multi-tiered performance divide among the models.

Executive Impact: Key Findings for Enterprises

Understanding the immediate implications for enterprise software development and strategic AI adoption.

13/20 Max Tests Passed

0% Tier 3 Success Rate

50%+ Tier 1 Pass Rate

Key Takeaway: Quantized LLMs show potential but hit hard ceilings on complex semantic transformations such as function pointers and sizeof. Rule-based prompting helps, but constructs that require a paradigm shift between C and Java remain challenging.

Deep Analysis & Enterprise Applications

Methodology

Explores the hybrid AST-driven pipeline and guardrail-driven prompting strategy.

Enterprise Process Flow

C Repository → C Code Analysis (AST) → Global Variables → Functions → Structures → LLM Translation → Java Repository → Compile and Compare Outputs
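The final step of the flow, comparing the output of the original C program with that of the translated Java program, could look roughly like the sketch below. The class name OutputComparator and the normalization rules (ignoring line-ending differences and trailing whitespace) are assumptions for illustration; the source does not describe how the study compared outputs.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper: not part of the REMODEL-LLM pipeline as published.
final class OutputComparator {

    // Normalize line endings and strip trailing whitespace so that
    // platform differences are not counted as translation failures.
    static String normalize(String output) {
        return Arrays.stream(output.split("\\R", -1))
                .map(String::stripTrailing)
                .collect(Collectors.joining("\n"))
                .stripTrailing();
    }

    // A translation passes a test when both programs print the same text.
    static boolean sameOutput(String cOutput, String javaOutput) {
        return normalize(cOutput).equals(normalize(javaOutput));
    }

    public static void main(String[] args) {
        // Captured stdout of the two programs would be passed in here.
        System.out.println(sameOutput("sum = 6\r\n", "sum = 6\n"));
    }
}
```

In a real harness the two strings would come from running the compiled C binary and the compiled Java class on the same input.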

Performance Tiers

Details the stark multi-tiered performance divide among the 19 models tested.

Tier	Description	Key Characteristics	Models
Tier 1 (Viable)	Passed >50% of tests. Handled complex rules, but failed on most advanced concepts.	Understood and acted on complex, rule-based instructions; failed on C-specific concepts (enums, sizeof, function pointers)	deepseek-coder-v2, codeqwen, phi4
Tier 2 (Flawed but Occasionally Successful)	Passed some tests (20-40%). Showed flawed semantic understanding and copied C-like syntax.	Produced runnable code, but with major semantic failures (wrong output, C-like syntax for malloc/printf/&); shallow syntactic understanding	mistral-nemo, mistral
Tier 3 (Complete Failure)	0% success rate. Unable to generate basic runnable Java boilerplate.	Failed at a fundamental level (ClassNotFoundError, missing main, non-code hallucinations); incapable of basic Java structure	llama3.1, gemma3, starcoder2, etc.
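As a rough illustration of how the tier boundaries in the table map onto test counts, the sketch below classifies a model from the number of tests it passed. The class and enum names are hypothetical, and treating everything between 0% and 50% as Tier 2 is an assumption: the table only reports 20-40% for the Tier 2 models it observed.

```java
// Hypothetical classifier based on the thresholds listed in the table above.
final class TierClassifier {

    enum Tier { TIER_1_VIABLE, TIER_2_FLAWED, TIER_3_FAILURE }

    static Tier classify(int passed, int total) {
        if (total <= 0 || passed < 0 || passed > total) {
            throw new IllegalArgumentException("invalid test counts");
        }
        if (passed == 0) {
            return Tier.TIER_3_FAILURE;      // 0% success rate
        }
        // Integer comparison for "more than 50%" avoids floating-point rounding.
        if (2 * passed > total) {
            return Tier.TIER_1_VIABLE;
        }
        return Tier.TIER_2_FLAWED;           // assumption: any non-zero rate up to 50%
    }

    public static void main(String[] args) {
        // 13 of 20 is the best result reported on this page (65%).
        System.out.println(classify(13, 20));
    }
}
```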

Key Findings

Highlights specific successes and universal failures.

Only 1 of the 19 models passed T3 (unions) and T8 (bitfields), a result attributed to superior rule application.
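To give a sense of why bitfields are hard to translate, the sketch below shows one common way to emulate a C bitfield struct in Java with masks and shifts. It is not the translation produced in the study, which this page does not reproduce; the struct, the class name BitfieldFlags, and the chosen bit layout are all assumptions (C leaves the actual layout to the implementation).

```java
// Hypothetical Java counterpart of:
//   struct Flags { unsigned int mode : 3; unsigned int enabled : 1; };
final class BitfieldFlags {
    private static final int MODE_MASK = 0b111;  // 3 bits at positions 0..2
    private static final int ENABLED_SHIFT = 3;  // 1 bit at position 3

    private int bits;                            // packed storage for both fields

    int getMode() {
        return bits & MODE_MASK;
    }

    void setMode(int mode) {
        // Values wider than 3 bits are truncated, as with an unsigned C bitfield.
        bits = (bits & ~MODE_MASK) | (mode & MODE_MASK);
    }

    boolean isEnabled() {
        return ((bits >>> ENABLED_SHIFT) & 1) == 1;
    }

    void setEnabled(boolean enabled) {
        if (enabled) {
            bits |= (1 << ENABLED_SHIFT);
        } else {
            bits &= ~(1 << ENABLED_SHIFT);
        }
    }

    public static void main(String[] args) {
        BitfieldFlags flags = new BitfieldFlags();
        flags.setMode(5);
        flags.setEnabled(true);
        System.out.println(flags.getMode() + " " + flags.isEnabled());
    }
}
```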

Case Study: Pointer Arithmetic (T1)

Highlight: deepseek-coder-v2's successful refactoring of C pointers to Java array indices.

Test Case 1 (T1) was a foundational test for C pointer arithmetic. The deepseek-coder-v2 model demonstrated perfect adherence to prompt rules, refactoring C pointer logic into Java array index logic using a separate index variable.
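The kind of refactoring described here, replacing a moving C pointer with a separate index variable, might look like the sketch below. The actual T1 test program and the model's output are not shown on this page, so the C snippet in the comment and the class name PointerToIndex are hypothetical.

```java
// Hypothetical C original:
//   int sum(const int *p, int n) {
//       int total = 0;
//       const int *end = p + n;
//       while (p < end) { total += *p; p++; }
//       return total;
//   }
final class PointerToIndex {

    static int sum(int[] values) {
        int total = 0;
        int index = 0;                    // stands in for the moving pointer p
        while (index < values.length) {   // p < end
            total += values[index];       // *p
            index++;                      // p++
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {1, 2, 3}));
    }
}
```

Java has no pointer arithmetic, so the array reference stays fixed and only the index advances.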

Case Study: goto Statement (T13) Failure

Highlight: mistral-nemo's syntactically invalid translation of C goto to Java 'continue'.

Test Case 13 (T13) presented a harder, C-specific control-flow problem that required refactoring a goto-based loop. mistral-nemo produced a syntactically invalid translation by naively mapping 'goto' to 'continue', demonstrating its inability to perform semantic restructuring.
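For contrast, a semantic restructuring of a goto-based loop could look like the sketch below. The T13 program itself is not reproduced on this page, so the C snippet in the comment and the class name GotoLoop are hypothetical. Java's 'continue' is only legal inside a loop and cannot jump to an arbitrary label, which is why a one-to-one keyword swap does not compile.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical C original:
//   int i = 0;
//   loop:
//       printf("%d\n", i);
//       i++;
//       if (i < limit) goto loop;
final class GotoLoop {

    static List<Integer> run(int limit) {
        List<Integer> visited = new ArrayList<>();
        int i = 0;
        // The backward jump becomes a do-while: the body runs once before
        // the condition is checked, exactly as in the goto version.
        do {
            visited.add(i);
            i++;
        } while (i < limit);
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(run(3));
    }
}
```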

Your AI-Powered Modernization Roadmap

A phased approach to integrate AI-powered code translation into your enterprise workflow.

Phase 1: Pilot & Assessment

Translate a small, critical C module to Java, assess translation quality, and refine AI prompting strategies.

Phase 2: Tooling Integration

Integrate the REMODEL-LLM pipeline with existing CI/CD, establish automated verification, and build internal expertise.

Phase 3: Iterative Modernization

Expand to larger C codebases, focusing on modules with high maintenance burden or security risks, with human-in-the-loop review.

Phase 4: Ecosystem & Optimization

Leverage Java ecosystem benefits, optimize translated code for performance, and set up continuous monitoring.
