Enterprise AI Analysis: Multimodal Data Fusion for Tabular and Textual Data: Zero-Shot, Few-Shot, and Fine-Tuning of Generative Pre-Trained Transformer Models


In traffic safety analysis, previous research has often focused on tabular data or textual crash narratives in isolation, neglecting the potential benefits of a hybrid multimodal approach. This study introduces the Multimodal Data Fusion (MDF) framework, which fuses tabular data with textual narratives by leveraging advanced Large Language Models (LLMs), such as GPT-2, GPT-3.5, and GPT-4.5, using zero-shot (ZS), few-shot (FS), and fine-tuning (FT) learning strategies. We employed few-shot learning with GPT-4.5 to generate new labels for traffic crash analysis, such as driver fault, driver actions, and crash factors, alongside the existing label for severity. Our methodology was tested on crash data from the Missouri State Highway Patrol, demonstrating significant improvements in model performance. GPT-2 (fine-tuned) was used as the baseline model, against which more advanced models were evaluated. GPT-4.5 few-shot learning achieved 98.9% accuracy for crash severity prediction and 98.1% accuracy for driver fault classification. In crash factor extraction, GPT-4.5 few-shot achieved the highest Jaccard score (82.9%), surpassing GPT-3.5 and fine-tuned GPT-2 models. Similarly, in driver actions extraction, GPT-4.5 few-shot attained a Jaccard score of 73.1%, while fine-tuned GPT-2 closely followed with 72.2%, demonstrating that task-specific fine-tuning can achieve performance close to state-of-the-art models when adapted to domain-specific data. These findings highlight the superior performance of GPT-4.5 few-shot learning, particularly in classification and information extraction tasks, while also underscoring the effectiveness of fine-tuning on domain-specific datasets to bridge performance gaps with more advanced models. The MDF framework's success demonstrates its potential for broader applications beyond traffic crash analysis, particularly in domains where labeled data are scarce and predictive modeling is essential.

Executive Impact Summary

The Multimodal Data Fusion (MDF) framework significantly advances traffic crash analysis by integrating tabular and textual data using Generative Pre-Trained Transformer (GPT) models. This novel approach enables the generation of crucial new labels like 'driver fault,' 'driver actions,' and 'crash factors' through few-shot learning, augmenting existing severity classifications. Tested on Missouri State Highway Patrol data, the framework achieved remarkable accuracy: GPT-4.5 few-shot demonstrated 98.9% accuracy for crash severity prediction and 98.1% for driver fault classification. In information extraction, it scored 82.9% Jaccard for crash factors and 73.1% for driver actions, outperforming GPT-3.5 and the GPT-2 fine-tuned baseline. These results underscore the power of advanced LLMs and few-shot learning in complex, data-scarce domains, paving the way for more precise road safety interventions.

98.9% Severity Prediction Accuracy
98.1% Driver Fault Classification Accuracy
82.9% Crash Factor Extraction Jaccard Score
73.1% Driver Actions Extraction Jaccard Score

Deep Analysis & Enterprise Applications


Methodology

This section details the Multimodal Data Fusion (MDF) framework, which integrates structured tabular data with unstructured textual crash narratives using advanced Large Language Models (LLMs) such as GPT-2, GPT-3.5, and GPT-4.5. Key steps include data preprocessing, serialization of tabular data into text, and the generation of new labels (driver fault, driver actions, crash factors) using GPT-4.5's few-shot learning capabilities. The framework emphasizes validation by domain experts and evaluates model performance using metrics like Accuracy, Precision, Recall, F1-Score, and Jaccard Index across zero-shot, few-shot, and fine-tuning strategies. This holistic approach addresses the limitations of traditional analysis by providing a richer context for understanding traffic incidents.
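The serialization step can be pictured with a minimal sketch. The summary does not give the exact template used, so the "column is value" phrasing, the function name `serialize_crash_record`, and the column names below are all assumptions made for illustration.

```python
def serialize_crash_record(tabular: dict, narrative: str) -> str:
    """Turn one tabular row plus its narrative into a single text passage."""
    # Render each non-empty column as a "name is value" clause (assumed template).
    clauses = [
        f"{column.replace('_', ' ')} is {value}"
        for column, value in tabular.items()
        if value not in (None, "")
    ]
    table_text = "; ".join(clauses)
    return f"Crash attributes: {table_text}. Narrative: {narrative.strip()}"


# Hypothetical record: column names and values are invented for this example.
record = {
    "weather": "rain",
    "light_condition": "dark",
    "speed_limit": 55,
    "road_surface": None,  # missing values are skipped
}
fused = serialize_crash_record(
    record, " Vehicle 1 ran off the road and struck a guardrail. "
)
print(fused)
```

Skipping empty columns keeps the fused text short, which matters when several examples have to fit into one few-shot prompt.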

Results

The evaluation showed that GPT-4.5 few-shot learning outperformed the other models in crash severity and driver fault classification, achieving 98.9% and 98.1% accuracy, respectively. For multi-label information extraction, GPT-4.5 few-shot also led, with Jaccard scores of 82.9% for crash factors and 73.1% for driver actions. The fine-tuned GPT-2 baseline performed comparably in severity prediction but struggled with nuanced tasks such as identifying 'not at fault' cases. The fused-data approach improved predictive accuracy and provided deeper insight into crash dynamics than unimodal models, highlighting the role of narrative data in capturing contextual nuances.
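For readers unfamiliar with the Jaccard score used for the multi-label tasks, here is a minimal sketch. It assumes the score is computed per crash record and then averaged; the summary does not state the averaging scheme, and the label names are invented.

```python
def jaccard(predicted: set, actual: set) -> float:
    """Size of the intersection divided by size of the union."""
    union = predicted | actual
    if not union:
        return 1.0  # assumption: two empty label sets count as a perfect match
    return len(predicted & actual) / len(union)


def mean_jaccard(predictions: list, references: list) -> float:
    """Average the per-record Jaccard index over a dataset."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        raise ValueError("at least one record is required")
    scores = [jaccard(set(p), set(r)) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores)


# Hypothetical crash-factor labels for two records.
predicted = [{"speeding", "wet road"}, {"distraction"}]
actual = [{"speeding", "wet road", "alcohol"}, {"distraction"}]
score = mean_jaccard(predicted, actual)  # (2/3 + 1) / 2
```

Unlike exact-match accuracy, this gives partial credit when a model recovers some but not all of the factors attached to a crash.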

Implications

The MDF framework offers substantial implications for road safety and urban mobility. By accurately identifying driver fault, actions, and crash factors, policymakers can develop targeted interventions and data-driven safety measures. The success of few-shot learning with advanced LLMs in data-scarce environments makes this framework highly adaptable for various domains beyond traffic safety, such as healthcare analytics and disaster response. The ability to extract nuanced insights from fused data can revolutionize transportation management and smart city initiatives, leading to more efficient traffic management and proactive accident prevention.

98.9%
GPT-4.5 Few-Shot Accuracy for Crash Severity

Enterprise Process Flow

Cleanse Tabular Data
Serialize Data (Tabular + Narrative)
Generate Labels (GPT-4.5 Few-Shot)
Validate Labels (Expert Review)
Data Splitting
Model Conditioning & Application
Evaluate & Aggregate Insights
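The label-generation step in this flow relies on few-shot prompting. A minimal sketch of how such a prompt might be assembled follows; the wording, the `Crash:`/`Label:` layout, and the example records are assumptions, and no model API is called here.

```python
def build_few_shot_prompt(task: str, examples: list, query: str) -> str:
    """Assemble an instruction, labeled examples, and one unlabeled query."""
    parts = [task.strip(), ""]
    for text, label in examples:
        parts.append(f"Crash: {text}")
        parts.append(f"Label: {label}")
        parts.append("")  # blank line between examples
    parts.append(f"Crash: {query}")
    parts.append("Label:")  # left open for the model to complete
    return "\n".join(parts)


# Hypothetical labeled examples for the driver-fault task.
examples = [
    ("Driver 1 ran a red light and struck Vehicle 2.", "Yes"),
    ("Vehicle 1 was stopped at a signal and was rear-ended.", "No"),
]
prompt = build_few_shot_prompt(
    "Decide whether Driver 1 was at fault. Answer Yes or No.",
    examples,
    "Driver 1 swerved to avoid an unknown vehicle and hit the median barrier.",
)
print(prompt)
```

The resulting string would be sent to the chosen model, and the returned labels would then go to expert review as in the next step of the flow.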

Model Performance Comparison (F1-Scores)

GPT-4.5 Few-Shot
  Attributes:
  • High Accuracy
  • Multi-label Efficiency
  • Contextual Understanding
  Advantages:
  • Superior performance in classification and extraction tasks
  • Effective with minimal labeled data
  • Captures nuanced textual data
GPT-3.5 Few-Shot
  Attributes:
  • Good Accuracy
  • Adaptable
  Advantages:
  • Significant improvement over baseline
  • Handles diverse tasks
GPT-2 Fine-Tuned (Baseline)
  Attributes:
  • Decent Severity Prediction
  • Struggles with 'Not At Fault'
  Advantages:
  • Strong in binary severity classification (fatal/non-fatal)
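As a reminder of how the classification metrics named in this comparison relate to each other, the sketch below computes accuracy, precision, recall, and F1 for a binary task such as fatal versus non-fatal severity. The function name and the sample labels are illustrative and are not taken from the study.

```python
def binary_metrics(predicted: list, actual: list, positive: str) -> dict:
    """Accuracy, precision, recall, and F1 for one positive class."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(predicted, actual))
    tp = sum(p == positive and a == positive for p, a in pairs)
    fp = sum(p == positive and a != positive for p, a in pairs)
    fn = sum(p != positive and a == positive for p, a in pairs)
    correct = sum(p == a for p, a in pairs)
    # Guard each ratio against a zero denominator.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical predictions for four crashes.
predicted = ["fatal", "non-fatal", "fatal", "non-fatal"]
actual = ["fatal", "fatal", "fatal", "non-fatal"]
metrics = binary_metrics(predicted, actual, positive="fatal")
```

Because fatal crashes are typically rare, F1 on the fatal class can be more informative than overall accuracy for a comparison like this one.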

Impact of Data Fusion on Driver Fault Prediction

Our case studies demonstrate how combining tabular and narrative data improves driver fault prediction and factor extraction. For instance, in 'Example 1: Fault Change Due to Combined Data', tabular-only analysis predicted 'No' fault, while narrative-only analysis suggested 'Yes'. The fused model, drawing on the full context (the driver swerved to avoid a collision with an unknown vehicle and hit a median barrier), accurately identified 'No' fault. This shows that narrative data provide contextual details that tabular data alone often miss, leading to more reliable classifications for targeted road safety interventions. Another example from our dataset shows that only fused data could identify complex scenarios involving both vehicle defects and multiple-vehicle collisions, ensuring comprehensive factor extraction.


Your AI Implementation Roadmap

Our phased approach ensures a smooth and effective integration of AI solutions tailored to your enterprise needs, from initial strategy to continuous optimization.

Phase 1: Discovery & Strategy Alignment

Comprehensive assessment of your current data infrastructure and business objectives. We identify key areas where AI can deliver maximum impact and define a clear strategic roadmap.

Phase 2: Data Engineering & Model Prototyping

Data collection, cleaning, and preparation. Development of initial AI models, including custom few-shot prompts and fine-tuning strategies tailored to your specific data and tasks.

Phase 3: Integration & Pilot Deployment

Seamless integration of the AI framework into your existing systems. Pilot deployment with a select team to gather feedback and refine the solution in a controlled environment.

Phase 4: Full-Scale Rollout & Training

Deployment of the AI solution across your enterprise, accompanied by comprehensive training for your teams to ensure effective adoption and utilization.

Phase 5: Performance Monitoring & Optimization

Continuous monitoring of AI model performance and business impact. Iterative refinement and optimization to adapt to evolving data patterns and maximize long-term ROI.
