AI in Autonomous Driving

K-Gen: A Multimodal Language-Conditioned Approach for Interpretable Keypoint-Guided Trajectory Generation

Authors: Mingxuan Mu, Guo Yang, Lei Chen, Ping Wu, Jianxun Cui

Published Date: March 5, 2026

Generating realistic and diverse trajectories is a critical challenge in autonomous driving simulation. While Large Language Models (LLMs) show promise, existing methods often rely on structured data like vectorized maps, which fail to capture the rich, unstructured visual context of a scene. To address this, we propose K-Gen, an interpretable keypoint-guided multimodal framework that leverages Multimodal Large Language Models (MLLMs) to unify rasterized BEV map inputs with textual scene descriptions. Instead of directly predicting full trajectories, K-Gen generates interpretable keypoints along with reasoning that reflects agent intentions, which are subsequently refined into accurate trajectories by a refinement module. To further enhance keypoint generation, we apply T-DAPO, a trajectory-aware reinforcement fine-tuning algorithm. Experiments on WOMD and nuPlan demonstrate that K-Gen outperforms existing baselines, highlighting the effectiveness of combining multimodal reasoning with keypoint-guided trajectory generation.


Executive Impact: K-Gen in Action

K-Gen introduces a novel multimodal framework for interpretable, high-fidelity trajectory generation in autonomous driving. By integrating rasterized map inputs and separating strategic intent from motion execution, K-Gen provides a new paradigm for language-guided modeling. Extensive experiments on WOMD and nuPlan validate its superior accuracy, safety, and generalization in diverse autonomous driving scenarios.


Deep Analysis & Enterprise Applications


Multimodal Trajectory Generation

K-Gen unifies rasterized BEV map inputs with textual scene descriptions using MLLMs. This approach captures the rich, unstructured visual context of a scene, enabling more flexible and faithful understanding of complex driving environments compared to traditional vectorized map inputs. The key innovation is moving beyond structured data to embrace raw visual and textual data for a holistic scene understanding.

Interpretable Keypoint Generation

Keypoint-Guided Methodology

Instead of direct full trajectory prediction, K-Gen generates sparse, interpretable keypoints along with reasoning reflecting agent intentions. These keypoints are then refined into accurate full trajectories by a specialized refinement module. This two-step process enhances accuracy, stability, and interpretability, providing human-readable explanations of agent behavior.
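To make the keypoint-to-trajectory step concrete, here is a minimal sketch. The source does not describe how the refinement module works internally, so plain linear interpolation is used as a stand-in. The function name `densify_keypoints` and the parameter `steps_per_segment` are hypothetical and are not part of K-Gen.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def densify_keypoints(keypoints: List[Point], steps_per_segment: int) -> List[Point]:
    """Turn sparse keypoints into a dense path by linear interpolation.

    Assumption: this is only a placeholder for K-Gen's refinement module,
    whose internals are not given in the source.
    """
    if steps_per_segment < 1:
        raise ValueError("steps_per_segment must be at least 1")
    if len(keypoints) < 2:
        return list(keypoints)

    trajectory = [keypoints[0]]
    # Walk each consecutive pair of keypoints and fill in the points between them.
    for (x0, y0), (x1, y1) in zip(keypoints, keypoints[1:]):
        for step in range(1, steps_per_segment + 1):
            t = step / steps_per_segment
            trajectory.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return trajectory


# Three illustrative keypoints: drive straight, then drift left.
keypoints = [(0.0, 0.0), (10.0, 0.0), (20.0, 5.0)]
dense = densify_keypoints(keypoints, steps_per_segment=5)
print(len(dense), dense[0], dense[-1])
```

The dense path passes through every keypoint, which reflects the idea that the keypoints carry the intent while a separate step fills in the motion between them.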

K-Gen's Trajectory Generation Flow

Scene Data (Map Images & Textual Inputs) → MLLM (Multimodal Reasoning & Keypoints) → Sparse Keypoints → Trajectory Refinement Module → Complete Trajectories

The system processes multimodal scene data, uses an MLLM for reasoning and sparse keypoint generation, and then refines the keypoints into complete, accurate trajectories. This modular design captures high-level intent in the keypoints and leaves fine-grained motion control to the refinement module.

T-DAPO: Enhanced Reinforcement Fine-Tuning

Feature	Standard DAPO	T-DAPO (K-Gen)
Focus	General LLM Alignment	Trajectory Generation & Interpretability
Reward Signals	Preference-based	Trajectory-centric (Accuracy, CoT Length, Format Correctness)
Sample Weighting	Uniform	Challenging Samples Emphasis (top 30% mADE/mFDE)
Output Control	Coarse-grained	Fine-grained, physically consistent motion reconstruction
Benefits	Improved alignment; stable training	High-fidelity trajectories; interpretable reasoning; enhanced safety and physical compliance

T-DAPO (Trajectory-aware Decoupled Clip and Dynamic Sampling Policy Optimization) is a specialized reinforcement fine-tuning algorithm designed for trajectory generation. It incorporates trajectory-centric reward signals and emphasizes challenging samples, leading to higher fidelity and more interpretable outputs than standard DAPO.
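The two ideas named above, trajectory-centric rewards and emphasis on challenging samples, can be sketched as follows. The source lists the reward components (accuracy, CoT length, format correctness) and the 30% figure, but not the formulas, so the weights, the scaling constants, and the reading of "top 30%" as the 30% of samples with the largest error are all assumptions. The names `trajectory_reward` and `hard_sample_indices` are hypothetical.

```python
from typing import List


def trajectory_reward(
    ade: float,
    cot_tokens: int,
    format_ok: bool,
    ade_scale: float = 2.0,        # assumed error scale, in metres
    target_cot_tokens: int = 128,  # assumed reasoning length that earns full credit
    w_acc: float = 1.0,            # assumed weights; the source gives none
    w_len: float = 0.2,
    w_fmt: float = 0.3,
) -> float:
    """Combine accuracy, reasoning length, and format checks into one scalar."""
    accuracy = 1.0 / (1.0 + ade / ade_scale)  # 1.0 at zero error, decays as error grows
    length = min(cot_tokens, target_cot_tokens) / target_cot_tokens
    fmt = 1.0 if format_ok else 0.0
    return w_acc * accuracy + w_len * length + w_fmt * fmt


def hard_sample_indices(errors: List[float], fraction: float = 0.3) -> List[int]:
    """Return indices of the samples with the largest error (assumed meaning of 'top 30%')."""
    if not errors:
        return []
    k = max(1, round(len(errors) * fraction))
    ranked = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    return sorted(ranked[:k])


# Illustrative per-sample errors, not results from the paper.
errors = [0.5, 2.1, 0.9, 3.4, 1.2, 0.3, 2.8, 0.7, 1.9, 0.4]
hard = hard_sample_indices(errors)
print(hard)
print(trajectory_reward(ade=0.0, cot_tokens=128, format_ok=True))
```

In a fine-tuning loop, the selected indices would mark the samples to emphasize, and the scalar reward would score each generated response. How T-DAPO actually applies either signal is not specified in the source.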

Impact on Autonomous Driving

Customer: Changan Automobile Co., Ltd

Problem: Traditional autonomous driving simulations struggle with generating diverse, realistic, and interpretable traffic scenarios due to reliance on structured, abstract map data.

Solution: K-Gen's multimodal approach, combining visual maps with text descriptions, generates accurate and human-interpretable trajectories. Its keypoint-guided strategy and T-DAPO algorithm produce physically consistent and safe motions.

Results: K-Gen outperformed existing baselines in trajectory quality (mADE, mFDE, SCR) on the WOMD and nuPlan datasets. It also offers greater interpretability through Chain-of-Thought reasoning and attention heatmaps that highlight safety-critical regions. Together, these properties support more robust and explainable autonomous driving systems.
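For readers unfamiliar with the displacement metrics, the sketch below computes the standard average displacement error (ADE) and final displacement error (FDE) for a single trajectory. The source does not define the "m" prefix in mADE and mFDE, so the aggregation shown here, a plain mean over agents, is an assumption. The helper names are hypothetical.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Trajectory = List[Point]


def ade(pred: Trajectory, truth: Trajectory) -> float:
    """Average Euclidean distance between matching timesteps."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    total = sum(math.hypot(px - tx, py - ty) for (px, py), (tx, ty) in zip(pred, truth))
    return total / len(pred)


def fde(pred: Trajectory, truth: Trajectory) -> float:
    """Euclidean distance between the final predicted and true positions."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    (px, py), (tx, ty) = pred[-1], truth[-1]
    return math.hypot(px - tx, py - ty)


def mean_over_agents(
    metric: Callable[[Trajectory, Trajectory], float],
    preds: List[Trajectory],
    truths: List[Trajectory],
) -> float:
    """Assumed aggregation: average a per-trajectory metric over all agents."""
    return sum(metric(p, t) for p, t in zip(preds, truths)) / len(preds)


# Illustrative data only: the prediction drifts 1 m further off at each step.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
print(ade(pred, truth), fde(pred, truth))
```

Lower values are better for both metrics. FDE is usually the larger of the two because errors tend to accumulate toward the end of the horizon.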


Your K-Gen Implementation Roadmap

Our phased approach ensures a seamless integration of K-Gen into your existing systems, maximizing impact while minimizing disruption. Each step is designed for clarity, efficiency, and measurable results.

Discovery & Strategy

Understand your current simulation workflows, identify key challenges, and define success metrics. Develop a tailored strategy for K-Gen's integration.

Data Integration & Model Fine-tuning

Integrate your specific map data and driving scenarios. Fine-tune K-Gen with your proprietary datasets using T-DAPO for optimal performance and domain adaptation.

Pilot Deployment & Validation

Deploy K-Gen in a controlled environment and validate the generated trajectories against real-world data and benchmarks. Gather feedback and iterate on refinements.

Full-Scale Rollout & Optimization

Integrate K-Gen into your full simulation pipeline. Provide ongoing support, monitoring, and continuous optimization to ensure sustained high performance and ROI.

Ready to Transform Your Autonomous Driving Simulations?

Don't let the complexities of trajectory generation hinder your progress. K-Gen offers a powerful, interpretable, and multimodal solution to accelerate your autonomous driving development. Schedule a free consultation with our AI experts to explore how K-Gen can specifically address your organization's needs.
