Enterprise AI Analysis

A Data and Knowledge Cross-Level Fusion-Driven Learning Framework for Detecting Missing Diagnosis

Shaohui Liu, Xien Liu, Xinyue Fang, Chenwei Yan, Kaiyin Zhou, Xinxin You, Meiwei Li & Ji Wu

Received: 4 February 2025 | Accepted: 29 April 2026

This paper introduces DKFusion, a data and knowledge cross-level fusion-driven learning framework for automatically identifying missed diagnoses in Electronic Medical Records (EMRs). To address inaccurate documentation, incorrect Diagnosis-Related Group (DRG) assignments, and reduced reimbursements, DKFusion integrates diagnosis recall, contextual validation, and deduplication modules. Evaluated on real-world EMRs from six Chinese hospitals, the model significantly outperforms traditional and LLM-based baselines in F1 score and precision. It identifies potential missed diagnoses in 37.8% of EMRs, leading to altered DRG groupings in 9.0% of cases and affecting 3.2% of insurance reimbursement. DKFusion also supports human-AI collaboration modes that improve efficiency and precision in clinical workflows.


Quantifying the Impact on Healthcare Operations

DKFusion delivers tangible benefits across key healthcare metrics, from diagnostic accuracy to financial optimization and operational efficiency.

37.8% EMRs with Missed Diagnoses Identified

9.0% DRG Grouping Changes Triggered

3.2% Insurance Reimbursement Impacted

62.5% In-Domain F1 Score Achieved


2.3 min Average EMR Review Time with AI


Deep Analysis & Enterprise Applications


DKFusion's Superior Performance

DKFusion outperforms traditional baselines (BERT-based and expert-system approaches) and LLMs that use standard instruction prompting or general medical multi-agent frameworks, in both in-domain and out-of-domain settings. It achieved F1 scores of 62.5% on the in-domain set and 59.9% on the out-of-domain set. Against LLMs trained with targeted supervised fine-tuning (SFT), DKFusion remains competitive: it outperforms the second-best model, Baichuan-M2-SFT-Section, by 6.4% on in-domain tasks while maintaining comparable out-of-domain generalization (+0.1%). LLMs show considerable potential on this task when given targeted optimization, but DKFusion achieves greater efficiency and robustness through its fusion of knowledge and data, with less than 1% of the parameter count of Baichuan-M2-32B.

Quantifying DRG & Reimbursement Benefits

Under the DRG payment system, 9.03% of the cases in which DKFusion found a missed diagnosis would change DRG grouping, resulting in a 3.15% increase in medical insurance payments. This shows how strongly missed discharge diagnoses affect DRG-based payment. The cost impact of a missed diagnosis varies considerably, depending on the CHS-DRG settings and the original DRG grouping of the EMR. For groups with high original costs, a missed diagnosis may have a larger effect on the care process, so adding a CC/MCC (complication or comorbidity / major complication or comorbidity) changes costs more.

Optimizing Clinical Workflows with AI

We explored two modes of human-machine collaboration:

(1) Model-driven mode: The model leads and prioritizes precision, with human supervision to minimize prediction errors. This mode suits rapid evaluations and batch EMR processing. DKFusion-S offers higher precision and minimizes false alerts.

(2) Specialist-driven mode: Human specialists lead decision-making, with model support reducing workload while preserving accuracy. This mode suits EMR quality control and building complete diagnosis lists.

Simulated tests showed that using the model for recommendations and experts for verification reduces manual review time from approximately 24 minutes to 2.3 minutes per EMR, a roughly tenfold gain in efficiency. In the model-driven mode, the system achieves physician-comparable precision at an average of 6.8 seconds per EMR.
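The review-time figures above can be turned into a rough workload estimate. The sketch below is illustrative only: the function names and the 1,000-record batch are assumptions, and only the per-record times (24 minutes manual, 2.3 minutes specialist-driven, 6.8 seconds model-driven) come from the text.

```python
# Per-record review times reported in the text, in seconds.
MANUAL_S = 24 * 60        # about 24 minutes of unaided review
SPECIALIST_S = 2.3 * 60   # model recommendations plus expert verification
MODEL_S = 6.8             # model-driven mode (DKFusion-S)


def speedup(baseline_s: float, assisted_s: float) -> float:
    """Return how many times faster the assisted workflow is."""
    return baseline_s / assisted_s


def batch_hours(per_record_s: float, n_records: int) -> float:
    """Total review time in hours for a batch of records."""
    return per_record_s * n_records / 3600


if __name__ == "__main__":
    n = 1000  # hypothetical batch size, not a figure from the study
    print(f"specialist-driven speedup: {speedup(MANUAL_S, SPECIALIST_S):.1f}x")
    print(f"model-driven speedup: {speedup(MANUAL_S, MODEL_S):.0f}x")
    print(f"manual: {batch_hours(MANUAL_S, n):.0f} h, "
          f"specialist-driven: {batch_hours(SPECIALIST_S, n):.1f} h")
```

The speedup counts review time only. As the collaboration table shows, the faster modes also trade away recall, so time saved should not be read in isolation.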

Understanding Model Limitations & Future Directions

Our error analysis revealed two main categories: false positives (29.9%) and false negatives (47.6%). False positives occur when diagnoses are incorrectly identified in context or are already recorded. False negatives result from recall failures (e.g., diagnosis not in dictionary) or erroneous clinical associations (linking similar but distinct diagnoses). The study acknowledges limitations including restricted effective coverage of ICD-10 codes, empirical verification only in Chinese EMRs, challenges with long-tail errors, and focusing only on explicitly documented conditions. Future work will expand annotation scope, incorporate formal diagnostic criteria, and jointly optimize with related DRG tasks.

Enterprise Process Flow: DKFusion Framework

Diagnosis Recall → Contextual Validation → Diagnosis Deduplication

DKFusion leverages a three-step pipeline to identify and validate missed diagnoses, ensuring accuracy and efficiency in complex EMR data.
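To make the flow of the three steps concrete, here is a minimal sketch of a recall, validate, deduplicate pipeline. It is not the authors' implementation: DKFusion's modules are learned models, whereas this example swaps in plain dictionary matching and a negation-cue rule as toy stand-ins. Every name here (`Candidate`, `recall`, `validate`, `deduplicate`, `find_missed`, `NEGATION_CUES`) and all sample data are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass

# Toy stand-in for contextual validation; the real module is a learned model.
NEGATION_CUES = ("no evidence of", "ruled out", "denies")


@dataclass(frozen=True)
class Candidate:
    diagnosis: str  # normalized diagnosis name
    evidence: str   # EMR sentence that mentions it


def recall(sentences: list[str], dictionary: dict[str, str]) -> list[Candidate]:
    """Step 1: propose candidates by matching dictionary terms in the text."""
    found = []
    for sentence in sentences:
        lowered = sentence.lower()
        for term, diagnosis in dictionary.items():
            if term in lowered:
                found.append(Candidate(diagnosis, sentence))
    return found


def validate(candidates: list[Candidate]) -> list[Candidate]:
    """Step 2: keep candidates whose sentence does not negate them."""
    return [c for c in candidates
            if not any(cue in c.evidence.lower() for cue in NEGATION_CUES)]


def deduplicate(candidates: list[Candidate], recorded: set[str]) -> list[Candidate]:
    """Step 3: drop diagnoses already recorded and repeated candidates."""
    seen = set(recorded)
    unique = []
    for c in candidates:
        if c.diagnosis not in seen:
            seen.add(c.diagnosis)
            unique.append(c)
    return unique


def find_missed(sentences: list[str], dictionary: dict[str, str],
                recorded: set[str]) -> list[Candidate]:
    """Run the three steps in order and return candidate missed diagnoses."""
    return deduplicate(validate(recall(sentences, dictionary)), recorded)


if __name__ == "__main__":
    # Hypothetical dictionary, EMR sentences, and discharge diagnosis list.
    dictionary = {"pneumocephalus": "postoperative pneumocephalus",
                  "hypertension": "hypertension",
                  "pneumonia": "pneumonia"}
    sentences = ["CT after resection shows pneumocephalus.",
                 "History of hypertension.",
                 "Pneumonia was ruled out.",
                 "Follow-up CT: pneumocephalus decreasing."]
    recorded = {"hypertension"}
    for c in find_missed(sentences, dictionary, recorded):
        print(c.diagnosis, "<-", c.evidence)
```

Keeping the evidence sentence on each candidate mirrors the idea of showing reviewers where in the EMR a suggested diagnosis was mentioned.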

2.3 Minutes Average EMR Review Time with AI Support (down from 24 minutes manually)

Human-Machine Collaboration Performance (In-Domain Test Set)

Method	Precision	Recall	F1	Inference Time per Record
Doctor	80.8%	77.6%	79.1%	1440.0 seconds
DKFusion followed by doctor (Specialist-Driven)	97.9%	53.9%	69.5%	138.0 seconds
DKFusion-S (Model-Driven)	81.2%	17.1%	28.2%	6.8 seconds

This table shows that integrating DKFusion into human review sharply cuts review time and raises precision, at the cost of lower recall than unaided doctor review.
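As a consistency check, the F1 column can be recomputed from the precision and recall columns, since F1 is their harmonic mean. The sketch below copies the three rows from the table; the function and variable names are illustrative. Because the inputs are already rounded to one decimal place, recomputed values may differ from the reported ones by up to about 0.1 percentage point.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) copied from the table above.
ROWS = {
    "Doctor": (80.8, 77.6, 79.1),
    "DKFusion followed by doctor": (97.9, 53.9, 69.5),
    "DKFusion-S": (81.2, 17.1, 28.2),
}

if __name__ == "__main__":
    for method, (p, r, reported) in ROWS.items():
        computed = f1_score(p, r)
        print(f"{method}: computed F1 = {computed:.2f}%, reported = {reported}%")
```

The harmonic mean explains why DKFusion-S scores a low F1 despite precision above 80%: its recall of 17.1% dominates the result.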

Real-World Impact: Detecting a Missed Diagnosis

This case study shows the effect of catching a missed diagnosis with DKFusion. A patient with malignant intracranial tumors developed postoperative intracranial pneumatocoele after tumor resection. The original EMR omitted this condition, which counts as a CC diagnosis, from the discharge diagnosis list. DKFusion flagged the omission, and the DRG was revised from BR25: Cerebral ischemic disorder without complications or comorbidities to BR21: Cerebral ischemic disorder with severe complications and comorbidities. The correction prevented a loss of 18,849 RMB (approximately $2,600 USD) that inaccurate grouping and payment would have caused. The system supports its finding by highlighting where the condition is mentioned in the 'Special examinations' and 'Diagnosis and treatment process' sections of the EMR, so a physician can confirm it quickly.

Your AI Implementation Roadmap

A structured approach to integrating advanced AI, ensuring seamless adoption and maximum value.

Phase 1: Data & Knowledge Fusion

Establish robust data pipelines and integrate domain-specific knowledge to create a comprehensive foundation for AI models. This phase includes constructing diagnostic dictionaries and leveraging existing ICD knowledge with EMR data.

Phase 2: Model Training & Evaluation

Develop and rigorously test AI models using a combination of supervised and contrastive learning, ensuring high performance across various clinical scenarios. This involves fine-tuning modules like diagnosis recall, contextual validation, and deduplication.

Phase 3: Human-AI Collaboration Integration

Implement and optimize human-machine collaboration workflows, allowing clinicians to efficiently review and validate AI-generated insights. This includes designing model-driven and specialist-driven modes for optimal efficiency.

Phase 4: Real-World Deployment & Impact Analysis

Deploy the AI solution within existing hospital information systems and continuously monitor its impact on diagnostic accuracy, DRG assignments, and insurance reimbursement. Regular evaluation ensures ongoing benefits and refinement.

Ready to Transform Your Healthcare Operations?

Book a personalized strategy session with our AI experts to explore how DKFusion can benefit your institution.

