Enterprise AI Analysis
Automating Fuzz Driver Generation for Deep Learning Libraries with Large Language Models
The widespread adoption of deep learning (DL) libraries has raised concerns about their reliability and security. While prior works leveraged large language models (LLMs) to generate test programs for DL library APIs, the hardcoded program behaviors and low code validity rates render them impractical for real-world testing. To address these challenges, we propose FD-FACTORY, a fully automated framework that leverages LLMs to generate fuzz drivers for DL API testing.
Key Outcomes for Enterprise Security
FD-FACTORY automates fuzz driver generation for deep learning libraries, reaching reported success rates of 73.67% on PyTorch and 65.33% on TensorFlow, well above the prior LLM-based approaches it was compared against.
Deep Analysis & Enterprise Applications
DL libraries, like conventional codebases, inevitably contain bugs, including incorrect functionality, numerical errors, and NaN production. More critically, their underlying C/C++ implementations increase the likelihood of memory-safety vulnerabilities such as overflows and null pointer dereferences. Prior LLM-based approaches to DL testing, such as TitanFuzz and FuzzGPT, generate one-time static programs, suffer from low success rates (~30%), and lack mechanisms for reuse and repair. This motivates a new approach that automates fuzz driver generation and validation and supports long-term, iterative fuzzing campaigns.
FD-FACTORY is a fully automated framework that leverages LLMs to generate reusable fuzz drivers for DL API testing. Inspired by industrial production lines, it decomposes the generation process into eight stages: Preparation, Initial Fuzz Driver Generation, Early Stop Checks, Verification, Issue Diagnosis, Decision Making, Repair Loop, and Deployment. Each stage uses dedicated agents or tools to improve construction efficiency and fuzz driver quality. Its key innovations include the following. It produces reusable fuzz drivers that can be integrated with industrial fuzzing engines. It automates constraint analysis by having LLMs parse API documentation. It decouples each driver into a target function, a data constructor, and an entry point, which reduces the complexity of each LLM task. Finally, it integrates static analysis, dynamic fuzzing, and issue diagnosis, together with an early stop module, into a complete automated verification pipeline.
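Enterprise Process Flow: FD-FACTORY Stages
Preparation → Initial Fuzz Driver Generation → Early Stop Checks → Verification → Issue Diagnosis → Decision Making → Repair Loop → Deployment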
| Framework | Method | PyTorch Success Rate (%) | TensorFlow Success Rate (%) |
|---|---|---|---|
| TitanFuzz | One-time Program | 38.20 | 30.67 |
| FuzzGPT | One-time Program | 27.69 | 17.41 |
| LLM4FDG-DL | Reusable Fuzz Drivers | 19.33 | 10.67 |
| FD-FACTORY | Reusable Fuzz Drivers | 73.67 | 65.33 |
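To make the comparison concrete, the percentage-point gaps between FD-FACTORY and each baseline can be computed directly from the table. The snippet below is only arithmetic on the figures shown above; the helper name `gaps_vs` is an illustrative choice.

```python
# Success rates (%) copied from the table: (PyTorch, TensorFlow).
rates = {
    "TitanFuzz": (38.20, 30.67),
    "FuzzGPT": (27.69, 17.41),
    "LLM4FDG-DL": (19.33, 10.67),
    "FD-FACTORY": (73.67, 65.33),
}


def gaps_vs(baseline, reference="FD-FACTORY"):
    """Percentage-point gap of `reference` over `baseline` per library."""
    ref_pt, ref_tf = rates[reference]
    base_pt, base_tf = rates[baseline]
    return round(ref_pt - base_pt, 2), round(ref_tf - base_tf, 2)


for name in rates:
    if name != "FD-FACTORY":
        pt_gap, tf_gap = gaps_vs(name)
        print(f"{name}: +{pt_gap:.2f} pts (PyTorch), +{tf_gap:.2f} pts (TensorFlow)")
```

On these figures, the gap over the strongest baseline (TitanFuzz) is roughly 35 percentage points on both libraries.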
Case Study: Division-by-Zero Vulnerability Discovery
FD-FACTORY successfully uncovered a division-by-zero vulnerability in torch.nn.AdaptiveLogSoftmaxWithLoss. The fuzzer's data_constructor generated an input where div_value was 0.0. This input propagated to the underlying C/C++ implementation, causing a runtime crash due to lack of input validation. This demonstrates the framework's ability to trigger real-world defects and highlights the importance of comprehensive fuzz driver design, especially robust data construction and validation.
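The failure pattern can be illustrated with a small stand-in. The source does not show the failing code, so the function below is a hypothetical, pure-Python simplification (not the PyTorch implementation): it divides a feature count by successive powers of `div_value` without validating it, and a second version adds the missing check.

```python
def projection_sizes(in_features, div_value, n_clusters):
    """Hypothetical stand-in: shrink a size by div_value**(i + 1) per cluster.

    No validation is performed, so div_value == 0.0 divides by zero.
    """
    return [int(in_features // (div_value ** (i + 1))) for i in range(n_clusters)]


def checked_projection_sizes(in_features, div_value, n_clusters):
    """Same computation, but rejects the problematic divisor up front."""
    if div_value == 0:
        raise ValueError("div_value must be non-zero")
    return projection_sizes(in_features, div_value, n_clusters)


if __name__ == "__main__":
    print(projection_sizes(64, 4.0, 2))   # a valid divisor works
    try:
        projection_sizes(64, 0.0, 2)      # the fuzzer-style input
    except ZeroDivisionError as exc:
        print("unvalidated input failed:", exc)
```

A data constructor that only emitted "reasonable" divisors would never reach this path, which is why including boundary values such as 0.0 in generated inputs matters.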
Your Automated Fuzzing Roadmap
A phased approach to integrating FD-FACTORY into your existing development and security workflows.
Constraint Analysis & Initial Generation
Leverage LLMs to parse API docs for constraints and generate the first fuzz driver draft, tailored to your DL libraries.
Automated Verification & Diagnosis
Apply early stop checks, static analysis (Ruff), and dynamic analysis (Atheris) to automatically identify syntax, semantic, and runtime issues in generated drivers.
Iterative Repair & Refinement
Utilize LLMs in a continuous repair loop to iteratively fix identified issues, improve code quality, and enhance fuzz driver effectiveness.
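The control flow of a bounded repair loop might look like the sketch below. The `verify` and `repair` callables are hypothetical placeholders (in practice `repair` would prompt an LLM with the reported issues), and the `max_rounds` cap is an assumption; the source does not state how many repair attempts are allowed.

```python
def repair_loop(source, verify, repair, max_rounds=3):
    """Alternate verification and repair until clean or out of attempts.

    Returns (driver_source_or_None, rounds_used). None means the driver
    was still failing after max_rounds repairs and should be discarded.
    """
    issues = verify(source)
    rounds = 0
    while issues and rounds < max_rounds:
        source = repair(source, issues)  # e.g. an LLM call with the issue list
        rounds += 1
        issues = verify(source)
    return (source if not issues else None), rounds


if __name__ == "__main__":
    # Toy stand-ins: a driver is "valid" once it contains the word "fixed".
    def toy_verify(src):
        return [] if "fixed" in src else ["needs fix"]

    def toy_repair(src, issues):
        return src + " fixed"

    print(repair_loop("draft", toy_verify, toy_repair))
```

Capping the number of rounds keeps a driver that cannot be repaired from consuming LLM calls indefinitely.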
Deployment & Long-term Fuzzing
Deploy validated, reusable fuzz drivers for continuous, resource-efficient vulnerability detection and coverage tracking in real-world scenarios.
Ready to Transform Your AI Security?
Automate fuzz driver generation, enhance coverage, and proactively detect vulnerabilities in your deep learning libraries with FD-FACTORY.