ORIGINAL ARTICLE
Large Language Models can Identify the Presence of MASH and Extract VCTE Measurements from Unstructured Documentation
Metabolic dysfunction-associated steatohepatitis (MASH) is a leading cause of cirrhosis. Vibration-Controlled Transient Elastography (VCTE) measurements are often captured in text-based reports and are not readily accessible for clinical research. Large language models (LLMs) show promise for curating information from unstructured documentation, but their efficiency for MASH and VCTE extraction is unclear. We used a cohort of 493 patients with compensated MASH cirrhosis. We compared the abilities of GPT-4o and Claude 3.5 Sonnet to identify the presence of MASH and to extract maximum VCTE stiffness and Controlled Attenuation Parameter (CAP) measurements from clinical documentation. We ran a cost analysis of the LLMs. As an exploratory analysis, we used LASSO-Cox to associate LLM-extracted features with death or decompensation. For identifying MASH in clinical notes, GPT-4o and Claude 3.5 achieved F1-scores of 90.5% and 80.0%, respectively. For identifying peak VCTE measurements, GPT-4o achieved accuracies of 99.3% for stiffness and 99.1% for CAP, while Claude 3.5 achieved 93.3% and 94.1%. LLM extraction of one variable required ~2,000 tokens per note, at a cost of ~$0.012/note for GPT-4o and ~$0.014/note for Claude 3.5. In LASSO-Cox regressions, VCTE stiffness (HR 1.03, 95% CI 1.01–1.05, p=0.016) and CAP score (HR 0.99, 95% CI 0.99–1.00, p=0.029) were statistically significant predictors of death or decompensation. LLMs can extract MASH presence and VCTE parameters from documentation with high accuracy and low cost. When incorporated into survival analyses, LLM-extracted variables are associated with important clinical outcomes. Given the growing availability of LLMs, liver disease researchers should incorporate these methods to facilitate real-world studies.
Deep Analysis & Enterprise Applications
| Metric | GPT-4o | Claude 3.5 Sonnet | Rule-Based NLP |
|---|---|---|---|
| MASH F1-score | 90.5% | 80.0% | 84.4% |
| VCTE Stiffness Accuracy | 99.3% | 93.3% | N/A |
| VCTE CAP Accuracy | 99.1% | 94.1% | N/A |
| Cost per Note (approx.) | ~$0.012 | ~$0.014 | Higher (development) |
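The per-note figures in the table can be turned into a rough budget estimate. The sketch below assumes one LLM call per note per extracted variable and that cost scales linearly with the number of calls; both are simplifying assumptions, and the function name `estimate_cost` and the 10,000-note workload are hypothetical.

```python
# Approximate cost (USD) of extracting ONE variable from ONE note,
# taken from the reported figures.
COST_PER_NOTE_USD = {
    "gpt-4o": 0.012,
    "claude-3.5-sonnet": 0.014,
}


def estimate_cost(model: str, n_notes: int, n_variables: int = 1) -> float:
    """Rough total cost, assuming one call per note per variable (an assumption)."""
    if n_notes < 0 or n_variables < 0:
        raise ValueError("n_notes and n_variables must be non-negative")
    return round(COST_PER_NOTE_USD[model] * n_notes * n_variables, 2)


if __name__ == "__main__":
    # Hypothetical workload: 10,000 notes, three variables
    # (MASH presence, peak stiffness, peak CAP).
    for model in COST_PER_NOTE_USD:
        print(model, estimate_cost(model, n_notes=10_000, n_variables=3))
```

Under these assumptions the hypothetical workload comes to about $360 with GPT-4o and $420 with Claude 3.5 Sonnet. Actual spend depends on note length and current provider pricing.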
Impact on Liver Disease Research
LLMs can extract MASH presence and VCTE parameters from unstructured documentation with high accuracy and low cost. When incorporated into survival analyses, these LLM-extracted variables are associated with important clinical outcomes, facilitating real-world studies for population-level management of MASH and improved risk stratification.
Your AI Implementation Roadmap
A structured approach to integrating LLM-based data extraction into your enterprise workflows.
Discovery & Strategy
Assess existing data workflows, identify LLM integration points, and define clear objectives for MASH and VCTE data extraction.
Platform Integration
Securely integrate LLM APIs (e.g., GPT-4o, Claude 3.5 Sonnet) into a private, PHI-compliant AI platform similar to UCSF's.
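One way to keep the integration provider-agnostic is to hide the model behind a plain callable, so that GPT-4o, Claude 3.5 Sonnet, or a test stub can be swapped without touching the extraction logic. The following is a minimal sketch, not the study's implementation: the prompt wording, the JSON keys `stiffness_kpa` and `cap_db_m`, and the function `extract_vcte` are all illustrative assumptions. The `complete` argument stands in for whatever approved, PHI-compliant endpoint your platform exposes.

```python
import json
from typing import Callable, Optional

# Hypothetical prompt; the wording used in the study is not given here.
PROMPT_TEMPLATE = (
    "You are reviewing a clinical note.\n"
    "Report the highest VCTE liver stiffness (kPa) and the highest CAP (dB/m) "
    "mentioned in the note. Reply with JSON only, using the keys "
    '"stiffness_kpa" and "cap_db_m"; use null when a value is absent.\n\n'
    "Note:\n{note}"
)


def extract_vcte(note: str, complete: Callable[[str], str]) -> dict[str, Optional[float]]:
    """Send one note to an LLM via `complete` and parse the JSON reply.

    `complete` takes a prompt string and returns the model's text reply.
    Malformed or missing values are returned as None rather than guessed.
    """
    reply = complete(PROMPT_TEMPLATE.format(note=note))
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}

    result: dict[str, Optional[float]] = {}
    for key in ("stiffness_kpa", "cap_db_m"):
        value = data.get(key)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        result[key] = float(value) if is_number else None
    return result


if __name__ == "__main__":
    # Stub standing in for a real model call.
    def fake_llm(prompt: str) -> str:
        return '{"stiffness_kpa": 14.2, "cap_db_m": 310}'

    print(extract_vcte("FibroScan: 14.2 kPa, CAP 310 dB/m.", fake_llm))
    # prints {'stiffness_kpa': 14.2, 'cap_db_m': 310.0}
```

Returning `None` for anything that does not parse cleanly keeps failed extractions visible for later review instead of silently producing a number.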
Prompt Engineering & Validation
Develop and refine prompts for MASH identification and VCTE measurement extraction. Validate accuracy against gold-standard data.
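Validation against a gold standard typically reduces to two calculations: an F1-score for the binary MASH label and an exact-match accuracy for the numeric VCTE values. The sketch below shows both under simple assumptions (one label or value per note, a missing value represented as `None`); the function names and the toy inputs are illustrative and are not study data.

```python
from typing import Optional, Sequence


def f1_score(gold: Sequence[bool], predicted: Sequence[bool]) -> float:
    """F1 for a binary label (e.g. MASH present), positive class = True."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = sum(g and p for g, p in zip(gold, predicted))
    fp = sum((not g) and p for g, p in zip(gold, predicted))
    fn = sum(g and (not p) for g, p in zip(gold, predicted))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(
    gold: Sequence[Optional[float]],
    predicted: Sequence[Optional[float]],
    tolerance: float = 0.0,
) -> float:
    """Share of notes where the extracted value matches the gold value.

    A note with no measurement counts as correct only if both sides are None.
    """
    if len(gold) != len(predicted) or not gold:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = 0
    for g, p in zip(gold, predicted):
        if g is None or p is None:
            matches += g is None and p is None
        else:
            matches += abs(g - p) <= tolerance
    return matches / len(gold)


if __name__ == "__main__":
    # Toy inputs for illustration only.
    print(f1_score([True, True, True, False, False],
                   [True, True, False, True, False]))
    print(accuracy([14.2, None, 8.0, 21.5], [14.2, None, 8.5, 21.5]))
```

In the toy example the F1-score is 2/3 (two true positives, one false positive, one false negative) and the accuracy is 0.75 (three of four values match).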
Pilot Deployment & Refinement
Roll out LLM-based extraction in a pilot program, gather feedback, and fine-tune models for optimal performance and cost-efficiency.
Full-Scale Rollout & Monitoring
Implement across your enterprise. Establish continuous monitoring for data quality and model drift. Incorporate LLM-extracted data into research workflows.
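Continuous monitoring can be as simple as periodically re-checking a small, manually reviewed sample and comparing its agreement rate with the accuracy measured during validation. The sketch below illustrates that idea; the function `needs_review`, the 5-percentage-point threshold, and the sample sizes are hypothetical choices, and the 0.993 baseline is the GPT-4o stiffness accuracy reported above.

```python
from typing import Sequence


def needs_review(
    audit_matches: Sequence[bool],
    baseline_accuracy: float,
    max_drop: float = 0.05,
) -> bool:
    """Flag a possible quality drop in a manually audited sample.

    audit_matches: one entry per audited note, True if the LLM output
                   agreed with the human reviewer.
    Returns True when the audited agreement rate falls more than
    `max_drop` below the validation baseline.
    """
    if not audit_matches:
        raise ValueError("audit sample must not be empty")
    observed = sum(audit_matches) / len(audit_matches)
    return observed < baseline_accuracy - max_drop


if __name__ == "__main__":
    # Hypothetical audit: 18 of 20 sampled notes agreed with the reviewer.
    sample = [True] * 18 + [False] * 2
    print(needs_review(sample, baseline_accuracy=0.993))  # prints True
```

A small audit sample gives a noisy estimate, so a flag from this check is a prompt for a larger review rather than proof of model drift.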