Multimodal Data Fusion for Tabular and Textual Data: Zero-Shot, Few-Shot, and Fine-Tuning of Generative Pre-Trained Transformer Models
In traffic safety analysis, previous research has often focused on tabular data or textual crash narratives in isolation, neglecting the potential benefits of a hybrid multimodal approach. This study introduces the Multimodal Data Fusion (MDF) framework, which fuses tabular data with textual narratives by leveraging advanced Large Language Models (LLMs), such as GPT-2, GPT-3.5, and GPT-4.5, using zero-shot (ZS), few-shot (FS), and fine-tuning (FT) learning strategies. We employed few-shot learning with GPT-4.5 to generate new labels for traffic crash analysis, such as driver fault, driver actions, and crash factors, alongside the existing label for severity. Our methodology was tested on crash data from the Missouri State Highway Patrol, demonstrating significant improvements in model performance. GPT-2 (fine-tuned) was used as the baseline model, against which more advanced models were evaluated. GPT-4.5 few-shot learning achieved 98.9% accuracy for crash severity prediction and 98.1% accuracy for driver fault classification. In crash factor extraction, GPT-4.5 few-shot achieved the highest Jaccard score (82.9%), surpassing GPT-3.5 and fine-tuned GPT-2 models. Similarly, in driver actions extraction, GPT-4.5 few-shot attained a Jaccard score of 73.1%, while fine-tuned GPT-2 closely followed with 72.2%, demonstrating that task-specific fine-tuning can achieve performance close to state-of-the-art models when adapted to domain-specific data. These findings highlight the superior performance of GPT-4.5 few-shot learning, particularly in classification and information extraction tasks, while also underscoring the effectiveness of fine-tuning on domain-specific datasets to bridge performance gaps with more advanced models. The MDF framework's success demonstrates its potential for broader applications beyond traffic crash analysis, particularly in domains where labeled data are scarce and predictive modeling is essential.
Executive Impact Summary
The Multimodal Data Fusion (MDF) framework advances traffic crash analysis by integrating tabular and textual data using Generative Pre-Trained Transformer (GPT) models. Through few-shot learning, it generates new labels ('driver fault,' 'driver actions,' and 'crash factors') that augment the existing severity classification. Tested on Missouri State Highway Patrol data, GPT-4.5 few-shot achieved 98.9% accuracy for crash severity prediction and 98.1% for driver fault classification. In information extraction, GPT-4.5 few-shot reached Jaccard scores of 82.9% for crash factors and 73.1% for driver actions, outperforming GPT-3.5 and the fine-tuned GPT-2 baseline. These results underscore the value of advanced LLMs and few-shot learning in complex, data-scarce domains, paving the way for more precise road safety interventions.
Deep Analysis & Enterprise Applications
Methodology
This section details the Multimodal Data Fusion (MDF) framework, which integrates structured tabular data with unstructured textual crash narratives using advanced Large Language Models (LLMs) such as GPT-2, GPT-3.5, and GPT-4.5. Key steps include data preprocessing, serialization of tabular data into text, and the generation of new labels (driver fault, driver actions, crash factors) using GPT-4.5's few-shot learning capabilities. The generated labels are validated by domain experts, and model performance is evaluated with Accuracy, Precision, Recall, F1-Score, and Jaccard Index across zero-shot, few-shot, and fine-tuning strategies. By combining both modalities, the approach addresses the limitations of analyzing tabular data or narratives in isolation and provides richer context for understanding traffic incidents.
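The serialization and few-shot steps described above can be sketched roughly as follows. This is a minimal illustration, not the study's actual pipeline: the column names, the "column is value" sentence template, the prompt layout, and the function names `serialize_row`, `fuse`, and `build_few_shot_prompt` are all assumptions made for the example, and no model is called.

```python
from typing import List, Mapping, Sequence, Tuple


def serialize_row(row: Mapping[str, object]) -> str:
    """Turn one tabular record into plain text (hypothetical template).

    Missing or empty values are skipped so they do not add noise.
    """
    parts: List[str] = []
    for column, value in row.items():
        if value is None or str(value).strip() == "":
            continue
        parts.append(f"{column.replace('_', ' ')} is {value}")
    return ". ".join(parts) + "." if parts else ""


def fuse(row: Mapping[str, object], narrative: str) -> str:
    """Combine the serialized table row with the free-text narrative."""
    return (
        f"Crash attributes: {serialize_row(row)}\n"
        f"Narrative: {narrative.strip()}"
    )


def build_few_shot_prompt(
    task: str, examples: Sequence[Tuple[str, str]], query: str
) -> str:
    """Assemble a few-shot prompt: task, labeled examples, then the query."""
    blocks = [task]
    for text, label in examples:
        blocks.append(f"{text}\nAnswer: {label}")
    blocks.append(f"{query}\nAnswer:")
    return "\n\n".join(blocks)


# Illustrative record; the fields and values are made up for this sketch.
row = {"weather": "Rain", "light_condition": "Dark", "speed_limit": 55, "road_type": None}
fused = fuse(row, " Vehicle 1 ran off the road and struck a guardrail. ")
prompt = build_few_shot_prompt(
    "Was the driver at fault? Answer Yes or No.",
    [("Crash attributes: weather is Clear.\nNarrative: Driver ran a red light.", "Yes")],
    fused,
)
print(prompt)
```

The resulting string would then be sent to whichever model is being evaluated; a zero-shot prompt is the same call with an empty `examples` list.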
Results
The evaluation showed that GPT-4.5 few-shot learning consistently outperformed the other models in crash severity and driver fault classification, achieving 98.9% and 98.1% accuracy, respectively. For multi-label information extraction, GPT-4.5 few-shot also led, with Jaccard scores of 82.9% for crash factors and 73.1% for driver actions. Fine-tuned GPT-2 performed comparably in severity prediction but struggled with nuanced tasks such as identifying 'not at fault' cases. Compared with unimodal models, the fused data approach improved predictive accuracy and provided deeper insight into crash dynamics, highlighting the role of narrative data in capturing contextual nuances.
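For readers unfamiliar with the Jaccard score used for the multi-label extraction tasks, the sketch below shows the standard set-based definition: the size of the intersection of predicted and true labels divided by the size of their union. Averaging per crash record and scoring two empty label sets as 1.0 are assumptions of this example; the source does not state how the reported scores were aggregated. The label strings are invented for illustration.

```python
from typing import Iterable, Sequence


def jaccard(predicted: Iterable[str], actual: Iterable[str]) -> float:
    """Jaccard index |P ∩ A| / |P ∪ A| for one record's label sets."""
    p, a = set(predicted), set(actual)
    if not p and not a:
        # Assumed convention: two empty label sets count as a perfect match.
        return 1.0
    return len(p & a) / len(p | a)


def mean_jaccard(
    predictions: Sequence[Iterable[str]], actuals: Sequence[Iterable[str]]
) -> float:
    """Average the per-record Jaccard index over a dataset."""
    if len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must have the same length")
    if not predictions:
        raise ValueError("at least one record is required")
    scores = [jaccard(p, a) for p, a in zip(predictions, actuals)]
    return sum(scores) / len(scores)


# Hypothetical crash-factor labels for two records.
predicted = [{"speeding", "wet road"}, {"vehicle defect"}]
actual = [{"speeding", "distraction"}, {"vehicle defect"}]
score = mean_jaccard(predicted, actual)
print(f"mean Jaccard: {score:.3f}")
```

In the first record one of three distinct labels matches (1/3) and the second record matches exactly (1.0), so the mean is 2/3. Unlike exact-match accuracy, this gives partial credit when a model recovers only some of a record's labels.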
Implications
The MDF framework offers substantial implications for road safety and urban mobility. By accurately identifying driver fault, actions, and crash factors, policymakers can develop targeted interventions and data-driven safety measures. The success of few-shot learning with advanced LLMs in data-scarce environments makes this framework highly adaptable for various domains beyond traffic safety, such as healthcare analytics and disaster response. The ability to extract nuanced insights from fused data can revolutionize transportation management and smart city initiatives, leading to more efficient traffic management and proactive accident prevention.
Enterprise Process Flow
| Name | Attributes | Advantages |
|---|---|---|
| GPT-4.5 Few-Shot | | |
| GPT-3.5 Few-Shot | | |
| GPT-2 Fine-Tuned (Baseline) | | |
Impact of Data Fusion on Driver Fault Prediction
Our case studies demonstrate how combining tabular and narrative data improves driver fault prediction and factor extraction. In 'Example 1: Fault Change Due to Combined Data', the tabular-only analysis predicted 'No' fault, while the narrative-only analysis suggested 'Yes'. The fused model, drawing on the full context (the driver swerved to avoid a collision with an unknown vehicle and hit a median barrier), accurately identified 'No' fault. Narrative data provide contextual details that tabular data alone often miss, leading to more accurate and reliable classifications for targeted road safety interventions. Another example from our dataset shows that only the fused data could identify a complex scenario involving both vehicle defects and multiple-vehicle collisions, ensuring comprehensive factor extraction.
Your AI Implementation Roadmap
Our phased approach ensures a smooth and effective integration of AI solutions tailored to your enterprise needs, from initial strategy to continuous optimization.
Phase 1: Discovery & Strategy Alignment
Comprehensive assessment of your current data infrastructure and business objectives. We identify key areas where AI can deliver maximum impact and define a clear strategic roadmap.
Phase 2: Data Engineering & Model Prototyping
Data collection, cleaning, and preparation. Development of initial AI models, including custom few-shot prompts and fine-tuning strategies tailored to your specific data and tasks.
Phase 3: Integration & Pilot Deployment
Seamless integration of the AI framework into your existing systems. Pilot deployment with a select team to gather feedback and refine the solution in a controlled environment.
Phase 4: Full-Scale Rollout & Training
Deployment of the AI solution across your enterprise, accompanied by comprehensive training for your teams to ensure effective adoption and utilization.
Phase 5: Performance Monitoring & Optimization
Continuous monitoring of AI model performance and business impact. Iterative refinement and optimization to adapt to evolving data patterns and maximize long-term ROI.