AI in Autonomous Driving
K-Gen: A Multimodal Language-Conditioned Approach for Interpretable Keypoint-Guided Trajectory Generation
Authors: Mingxuan Mu, Guo Yang, Lei Chen, Ping Wu, Jianxun Cui
Published Date: March 5, 2026
Generating realistic and diverse trajectories is a critical challenge in autonomous driving simulation. While Large Language Models (LLMs) show promise, existing methods often rely on structured data like vectorized maps, which fail to capture the rich, unstructured visual context of a scene. To address this, we propose K-Gen, an interpretable keypoint-guided multimodal framework that leverages Multimodal Large Language Models (MLLMs) to unify rasterized BEV map inputs with textual scene descriptions. Instead of directly predicting full trajectories, K-Gen generates interpretable keypoints along with reasoning that reflects agent intentions, which are subsequently refined into accurate trajectories by a refinement module. To further enhance keypoint generation, we apply T-DAPO, a trajectory-aware reinforcement fine-tuning algorithm. Experiments on WOMD and nuPlan demonstrate that K-Gen outperforms existing baselines, highlighting the effectiveness of combining multimodal reasoning with keypoint-guided trajectory generation.
Executive Impact: K-Gen in Action
K-Gen introduces a novel multimodal framework for interpretable, high-fidelity trajectory generation in autonomous driving. By integrating rasterized map inputs and separating strategic intent from motion execution, K-Gen provides a new paradigm for language-guided modeling. Extensive experiments on WOMD and nuPlan validate its superior accuracy, safety, and generalization in diverse autonomous driving scenarios.
Deep Analysis & Enterprise Applications
Multimodal Trajectory Generation
K-Gen unifies rasterized BEV map inputs with textual scene descriptions using MLLMs. This approach captures the rich, unstructured visual context of a scene, enabling a more flexible and faithful understanding of complex driving environments than traditional vectorized map inputs allow. The key innovation is moving beyond purely structured data: the model reads a rendered map image and a free-form text description together to form a holistic understanding of the scene.
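To make the contrast between vectorized and rasterized inputs concrete, here is a minimal sketch of how a lane polyline could be rendered into a bird's-eye-view occupancy grid. This is an illustration only: the function name `rasterize_polyline`, the grid size, the extent, and the sampling step are all assumptions, and the source does not describe how K-Gen renders its BEV maps.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in meters, ego vehicle at the origin


def rasterize_polyline(
    points: List[Point],
    size: int = 8,        # hypothetical grid resolution (cells per side)
    extent: float = 8.0,  # hypothetical side length of the covered area, in meters
    step: float = 0.25,   # hypothetical sampling distance along each segment
) -> List[List[int]]:
    """Mark every grid cell that a polyline passes through with 1."""
    grid = [[0] * size for _ in range(size)]
    res = extent / size   # meters per cell
    half = extent / 2.0   # the grid is centered on the ego vehicle
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        n = max(1, int(length / step))
        for i in range(n + 1):
            t = i / n
            x = x0 + t * (x1 - x0)
            y = y0 + t * (y1 - y0)
            col = math.floor((x + half) / res)
            row = math.floor((half - y) / res)  # image rows grow downward
            if 0 <= row < size and 0 <= col < size:
                grid[row][col] = 1
    return grid


# A straight lane running left to right, slightly above the ego vehicle.
lane = [(-3.5, 0.5), (3.5, 0.5)]
bev = rasterize_polyline(lane)
for row in bev:
    print("".join("#" if cell else "." for cell in row))
```

A real pipeline would typically render several semantic layers (lanes, crosswalks, agents) as image channels; the single binary layer here only shows the change of representation from coordinates to pixels.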
Interpretable Keypoint Generation
Keypoint-Guided Methodology
Instead of direct full trajectory prediction, K-Gen generates sparse, interpretable keypoints along with reasoning reflecting agent intentions. These keypoints are then refined into accurate full trajectories by a specialized refinement module. This two-step process enhances accuracy, stability, and interpretability, providing human-readable explanations of agent behavior.
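The two-step idea can be sketched in a few lines: sparse timed keypoints go in, a dense trajectory comes out. The sketch below uses plain linear interpolation as a stand-in for the refinement step. That is an assumption made for illustration; the source only says a refinement module is used and does not specify its form. The name `densify_keypoints` and the `(time, x, y)` keypoint layout are likewise hypothetical.

```python
from typing import List, Tuple

Keypoint = Tuple[float, float, float]  # (time_s, x_m, y_m), a hypothetical layout


def densify_keypoints(keypoints: List[Keypoint], dt: float = 0.5) -> List[Keypoint]:
    """Linearly interpolate sparse timed keypoints onto a fixed time grid.

    Stand-in for a refinement module; requires at least two keypoints
    with distinct timestamps.
    """
    if len(keypoints) < 2:
        raise ValueError("need at least two keypoints")
    kps = sorted(keypoints)
    t_start, t_end = kps[0][0], kps[-1][0]
    n_steps = int(round((t_end - t_start) / dt))
    dense: List[Keypoint] = []
    seg = 0  # index of the keypoint segment that contains the current time
    for i in range(n_steps + 1):
        t = t_start + i * dt
        while seg < len(kps) - 2 and t > kps[seg + 1][0]:
            seg += 1
        (t0, x0, y0), (t1, x1, y1) = kps[seg], kps[seg + 1]
        a = (t - t0) / (t1 - t0)  # fraction of the way through the segment
        dense.append((t, x0 + a * (x1 - x0), y0 + a * (y1 - y0)))
    return dense


# Three keypoints describing "drive forward, then turn left".
sparse = [(0.0, 0.0, 0.0), (2.0, 10.0, 0.0), (4.0, 10.0, 10.0)]
for t, x, y in densify_keypoints(sparse, dt=1.0):
    print(f"t={t:.1f}s  x={x:.1f}m  y={y:.1f}m")
```

Linear interpolation produces sharp corners at each keypoint, which is one reason a learned or spline-based refinement stage would be preferable for physically consistent motion.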
K-Gen's Trajectory Generation Flow
The system processes multimodal scene data, uses an MLLM for reasoning and sparse keypoint generation, and then refines the keypoints into complete, accurate trajectories. This modular design ensures both high-level intent capture and fine-grained motion control.
| Feature | Standard DAPO | T-DAPO (K-Gen) |
|---|---|---|
| Focus | General LLM Alignment | Trajectory Generation & Interpretability |
| Reward Signals | Preference-based | Trajectory-centric (Accuracy, CoT Length, Format Correctness) |
| Sample Weighting | Uniform | Challenging Samples Emphasis (top 30% mADE/mFDE) |
| Output Control | Coarse-grained | Fine-grained, physically consistent motion reconstruction |
T-DAPO (Trajectory-aware Decoupled Clip and Dynamic Sampling Policy Optimization) is a specialized reinforcement fine-tuning algorithm designed for trajectory generation. It incorporates trajectory-centric reward signals and emphasizes challenging samples, leading to higher fidelity and more interpretable outputs than standard DAPO.
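As a rough illustration of the two ideas named above (trajectory-centric rewards and emphasis on challenging samples), the sketch below combines an accuracy term, a chain-of-thought length term, and a format term into one scalar reward, and up-weights the 30% of samples with the highest error. The three reward components and the 30% figure come from the comparison table; everything else is assumed for illustration, including the function names, the weights, the exponential accuracy shaping, the saturating length term, and the boost factor. It is not the paper's formulation.

```python
import math
from typing import List


def composite_reward(
    ade: float,             # displacement error of the generated trajectory, in meters
    cot_tokens: int,        # length of the generated reasoning, in tokens
    format_ok: bool,        # whether the output parsed into valid keypoints
    w_acc: float = 1.0,     # hypothetical weights
    w_len: float = 0.2,
    w_fmt: float = 0.5,
    ade_scale: float = 2.0,     # hypothetical error scale, in meters
    target_tokens: int = 128,   # hypothetical reasoning length at which the term saturates
) -> float:
    """Combine accuracy, reasoning length, and format correctness into one reward."""
    accuracy = math.exp(-ade / ade_scale)            # 1.0 at zero error, decays with error
    length = min(cot_tokens / target_tokens, 1.0)    # saturates at the target length
    fmt = 1.0 if format_ok else 0.0
    return w_acc * accuracy + w_len * length + w_fmt * fmt


def hard_sample_weights(
    errors: List[float], top_fraction: float = 0.3, boost: float = 2.0
) -> List[float]:
    """Give the highest-error fraction of samples a larger training weight."""
    k = max(1, int(top_fraction * len(errors) + 0.5))
    hardest = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)[:k]
    weights = [1.0] * len(errors)
    for i in hardest:
        weights[i] = boost
    return weights


errors = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.4, 0.6, 0.5, 1.0]
print(hard_sample_weights(errors))
print(round(composite_reward(ade=1.0, cot_tokens=64, format_ok=True), 3))
```

In an actual reinforcement fine-tuning loop these weights would scale each sample's contribution to the policy loss; that step depends on the training framework and is left out here.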
Impact on Autonomous Driving
Customer: Changan Automobile Co., Ltd
Problem: Traditional autonomous driving simulations struggle with generating diverse, realistic, and interpretable traffic scenarios due to reliance on structured, abstract map data.
Solution: K-Gen's multimodal approach, combining visual maps with text descriptions, generates accurate and human-interpretable trajectories. Its keypoint-guided strategy and T-DAPO algorithm produce physically consistent and safe motions.
Results: Outperformed existing baselines in trajectory quality (mADE, mFDE, SCR) on the WOMD and nuPlan datasets. Achieved superior interpretability through Chain-of-Thought reasoning and attention heatmaps that highlight safety-critical regions. Together, these results support more robust and explainable autonomous driving systems.
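For readers unfamiliar with the displacement metrics cited in the results, the sketch below shows one common way to compute them. It assumes that mADE and mFDE denote the minimum average and final displacement errors over a set of candidate trajectories, as is typical in trajectory benchmarks; the source does not define the abbreviations, so treat that reading, and the function names, as assumptions. SCR is not covered because the source gives no definition to build on.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in meters


def ade(pred: List[Point], gt: List[Point]) -> float:
    """Average displacement error: mean Euclidean distance over all timesteps."""
    return sum(math.hypot(px - gx, py - gy) for (px, py), (gx, gy) in zip(pred, gt)) / len(gt)


def fde(pred: List[Point], gt: List[Point]) -> float:
    """Final displacement error: Euclidean distance at the last timestep."""
    (px, py), (gx, gy) = pred[-1], gt[-1]
    return math.hypot(px - gx, py - gy)


def min_ade_fde(candidates: List[List[Point]], gt: List[Point]) -> Tuple[float, float]:
    """Best (lowest) ADE and FDE across all candidate trajectories."""
    return min(ade(c, gt) for c in candidates), min(fde(c, gt) for c in candidates)


ground_truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
candidates = [
    [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)],  # constant 1 m lateral offset
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.6)],  # accurate until the final step
]
m_ade, m_fde = min_ade_fde(candidates, ground_truth)
print(f"mADE = {m_ade:.2f} m, mFDE = {m_fde:.2f} m")
```

Note that the minimum ADE and minimum FDE are taken independently, so they may come from different candidates.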
Your K-Gen Implementation Roadmap
Our phased approach ensures a seamless integration of K-Gen into your existing systems, maximizing impact while minimizing disruption. Each step is designed for clarity, efficiency, and measurable results.
Discovery & Strategy
Understand your current simulation workflows, identify key challenges, and define success metrics. Develop a tailored strategy for K-Gen's integration.
Data Integration & Model Fine-tuning
Integrate your specific map data and driving scenarios. Fine-tune K-Gen with your proprietary datasets using T-DAPO for optimal performance and domain adaptation.
Pilot Deployment & Validation
Deploy K-Gen in a controlled environment and validate the generated trajectories against real-world data and benchmarks. Gather feedback and iterate on refinements.
Full-Scale Rollout & Optimization
Integrate K-Gen into your full simulation pipeline. Provide ongoing support, monitoring, and continuous optimization to ensure sustained high performance and ROI.
Ready to Transform Your Autonomous Driving Simulations?
Don't let the complexities of trajectory generation hinder your progress. K-Gen offers a powerful, interpretable, and multimodal solution to accelerate your autonomous driving development. Schedule a free consultation with our AI experts to explore how K-Gen can specifically address your organization's needs.