Enterprise AI Analysis
Healthcare-Focused Turkish Medical LLM: Training on Real Patient-Doctor Question-Answer Data for Enhanced Medical Insight
M. ALI BAYRAM, BANU DIRI, SAVAS YILDIRIM
Yildiz Technical University, Istanbul, Turkey
Published: ACM Trans. Asian Low-Resour. Lang. Inf. Process., Vol. 24, No. 11, Article 129 (November 2025).
This study introduces a specialized Turkish Medical LLM fine-tuned on 167,732 real patient-doctor question-answer pairs sourced from a trusted medical platform. Built on LLAMA 3, the fine-tuning process relied on Low-Rank Adaptation (LoRA) and mitigated catastrophic forgetting through spherical linear interpolation (Slerp) merging. Evaluation through similarity scores, GPT-3.5 assessments, and expert reviews indicates a significant improvement in the model's ability to generate medically accurate responses. This Turkish Medical LLM demonstrates the potential to support medical decision-making and patient interaction in Turkish healthcare settings, offering an essential resource for enhancing AI inclusivity across languages.
Executive Impact: Unlocking Clinical Precision
A Turkish Medical LLM trained on real-world patient-doctor interactions marks a significant advance for healthcare AI, delivering measurably better accuracy and cultural relevance than general-purpose models in Turkish clinical question answering.
Deep Analysis & Enterprise Applications
Each module below explores a specific finding from the research, rebuilt as an interactive, enterprise-focused deep dive.
Understanding Turkish Medical Language
Turkish, as an agglutinative language with rich morphological complexity, presents unique challenges for Large Language Models. General-purpose models often struggle to address the nuanced grammar, syntax, and lexicon of medical Turkish, leading to inaccurate or inappropriate responses. This limitation hinders healthcare professionals and patients from accessing high-quality, culturally relevant AI tools. Developing a Turkish-specific medical LLM is crucial for equitable healthcare delivery and enhancing AI inclusivity across languages.
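As a hedged aside (not an experiment from the paper), the tokenization effect behind this difficulty is easy to demonstrate: a subword vocabulary trained mostly on other languages tends to shatter a single inflected Turkish word into many fragments. The tokenizer checkpoint below is an assumption:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")  # assumed checkpoint

# One agglutinated word carrying plural, possessive, and case suffixes:
# hastalık (disease) + lar (plural) + ımız (our) + dan (from) = "from our diseases"
word = "hastalıklarımızdan"
print(tokenizer.tokenize(word))  # typically several subword pieces rather than one token
```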
Real-World Patient-Doctor Interactions
The specialized Turkish Medical LLM was trained on 167,732 real patient-doctor question-answer pairs sourced from doktorsitesi.com, a widely used platform in Turkey. This dataset ensures the model captures the authentic linguistic and cultural subtleties specific to the Turkish medical context, differentiating it from models trained on synthetic or translated data. The dataset covers a diverse range of medical fields and includes contributions from 84,907 specialists and 66,231 professors, reflecting a broad spectrum of medical expertise.
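As a minimal sketch (not the authors' released pipeline), question-answer pairs of this kind are usually reshaped into prompt/completion records before supervised fine-tuning. The file name, field names, and Turkish prompt template below are illustrative assumptions:

```python
import json

# Hypothetical instruction template: answer the patient's question as a doctor would.
PROMPT_TEMPLATE = (
    "Aşağıda bir hastanın sorusu yer almaktadır. Bir doktor gibi yanıtlayın.\n\n"
    "### Soru:\n{question}\n\n### Yanıt:\n"
)

def to_instruction_pairs(path: str) -> list[dict]:
    """Convert raw patient-doctor Q&A records (one JSON object per line) into SFT pairs."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            item = json.loads(line)
            pairs.append({
                "prompt": PROMPT_TEMPLATE.format(question=item["question"].strip()),
                "completion": item["answer"].strip(),
            })
    return pairs

if __name__ == "__main__":
    pairs = to_instruction_pairs("patient_doctor_qa.jsonl")  # hypothetical file name
    print(f"{len(pairs)} training pairs prepared")
```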
Fine-tuning LLAMA 3 with LoRA and Slerp Merging
The base model selected for this project was LLAMA 3 (8B) due to its open-source availability and strong performance. The fine-tuning process utilized Low-Rank Adaptation (LoRA) with rank 8 to efficiently update model weights. A critical challenge of catastrophic forgetting, where general language understanding diminishes during specialization, was addressed using an innovative Slerp model merge approach. This technique blends the weights of the fine-tuned model with its base model to retain general language capabilities alongside specialized medical expertise, improving the model's overall language versatility from 19 to 53 out of 100 on Turkish LLM benchmarks.
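As a minimal sketch of this setup (assuming the Hugging Face transformers and peft libraries, which the paper does not necessarily name), a rank-8 LoRA adapter can be attached to the base model as follows; the target modules, alpha, and dropout values are assumptions:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Load the open-weight base model (the 8B variant referenced above).
base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

lora_cfg = LoraConfig(
    r=8,                                   # low-rank dimension, as reported in the study
    lora_alpha=16,                         # assumed scaling factor
    target_modules=["q_proj", "v_proj"],   # assumed attention projections to adapt
    lora_dropout=0.05,                     # assumed dropout on the adapter path
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)     # only the small LoRA matrices are trainable
model.print_trainable_parameters()
```

Slerp merging itself can be sketched as spherical interpolation between corresponding parameter tensors of the base and fine-tuned checkpoints. The per-tensor treatment and the midpoint factor t=0.5 below are illustrative assumptions; dedicated merge tooling typically exposes layer-wise interpolation schedules:

```python
import torch

def slerp(w_base: torch.Tensor, w_ft: torch.Tensor, t: float = 0.5) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors of the same shape."""
    a, b = w_base.flatten().float(), w_ft.flatten().float()
    dot = torch.clamp(torch.dot(a / a.norm(), b / b.norm()), -1.0, 1.0)
    omega = torch.acos(dot)                          # angle between the two weight vectors
    if omega.abs() < 1e-6:                           # nearly parallel: fall back to lerp
        merged = (1 - t) * a + t * b
    else:
        merged = (torch.sin((1 - t) * omega) * a + torch.sin(t * omega) * b) / torch.sin(omega)
    return merged.reshape(w_base.shape).to(w_base.dtype)

def merge_checkpoints(base_sd: dict, ft_sd: dict, t: float = 0.5) -> dict:
    """Blend a base and a fine-tuned state dict parameter by parameter."""
    return {name: slerp(base_sd[name], ft_sd[name], t) for name in base_sd}
```

The intuition, hedged, is that interpolating along the sphere rather than averaging weights linearly preserves more of the base model's general-language behavior while keeping the medical specialization, which is consistent with the benchmark recovery from 19 to 53 reported above.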
Robust Evaluation and Performance
The fine-tuned Turkish Medical LLM achieved an average cosine similarity of 0.5151 to expert answers, an 11.33% improvement over the base LLAMA 3 (0.4627). GPT-3.5 assessments further validated its quality, assigning an average rating of 7.132, within 7.85% of the average human expert score of 7.692. Expert physicians rated the fine-tuned model's responses at 2.59 on a 1-5 scale, significantly outperforming the base model's 2.00 and indicating stronger medical relevance, clarity, and comprehensiveness.
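A minimal sketch of the similarity-based part of this evaluation, assuming a multilingual sentence-embedding encoder (the exact encoder used in the study is not specified here, so the checkpoint name is an assumption):

```python
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")  # assumed encoder

def answer_similarity(model_answer: str, expert_answer: str) -> float:
    """Cosine similarity between a model's answer and the expert reference answer."""
    embeddings = encoder.encode([model_answer, expert_answer], convert_to_tensor=True)
    return util.cos_sim(embeddings[0], embeddings[1]).item()

# Averaging such scores over a held-out set of patient questions produces the kind of
# aggregate similarity figures quoted above (0.5151 vs. 0.4627).
score = answer_similarity(
    "Belirtileriniz gastriti düşündürüyor; bir gastroenteroloji uzmanına başvurun.",
    "Şikayetleriniz gastrit ile uyumlu olabilir, bir uzmana görünmenizi öneririm.",
)
print(f"cosine similarity: {score:.4f}")
```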
Interpreting Culturally Embedded Phrases
A key qualitative finding was the model's ability to correctly interpret culturally embedded Turkish medical expressions. While general models like LLaMA 3 and MedAlpaca misinterpreted metaphorical phrases (e.g., "Boğazıma bir şey battı" - Something stabbed my throat) due to literal translations or geographic misinterpretations, the Turkish Medical LLM accurately understood these as symptoms requiring medical attention (e.g., foreign object in throat) and provided medically sound recommendations. This highlights its essential role in patient-facing applications within the Turkish healthcare context.
Roadmap for Advanced Healthcare AI
Future directions include enhancing the model's adaptability through real-time feedback mechanisms and ensuring it retains linguistic versatility while specializing in medical contexts. Research into cross-language transfer learning could leverage insights from English-language medical LLMs to support Turkish models where data is sparse. Critical ethical considerations, such as patient confidentiality and preventing overreliance on AI for medical decision-making, will guide responsible deployment. The goal is a model that complements, rather than replaces, human expertise, fostering empathetic and patient-centered AI applications.
Enterprise Process Flow
| Prompt (TR/EN) | Model | Review of Model's Response |
|---|---|---|
| Mideme bir şey oturdu gibi, ne yapmalıyım? (It feels like something has settled in my stomach, what should I do?) | LLAMA 3 | Misinterpreted as emotional confusion; suggested breathing exercises, mindfulness. |
| | MedAlpaca | Took the phrase literally: "I do not have the ability to physically pick up or hold objects." |
| | Turkish Medical LLM | Correctly interpreted the symptoms as possible gastritis or an ulcer; recommended a healthy diet, herbal teas, probiotics, and consulting a gastroenterologist. |
| Boğazıma bir şey battı, ne yapmalıyım? (Something stabbed my throat, what should I do?) | LLAMA 3 | Interpreted "Boğazıma" as the Bosphorus Strait; suggested contacting maritime authorities. |
| | MedAlpaca | Refused medical advice, citing lack of diagnostic capabilities. |
| | Turkish Medical LLM | Correctly understood a possible medical emergency involving a foreign object in the throat; urged immediate consultation with a doctor. |
Case Study: Precision in Turkish Medical Context
One of the most compelling demonstrations of our specialized Turkish Medical LLM's capability is its nuanced interpretation of culturally embedded medical expressions. General-purpose models often falter, defaulting to literal or irrelevant responses. For instance, when presented with "Mideme bir şey oturdu gibi, ne yapmalıyım?" (literally: It feels like something has settled in my stomach, what should I do?), LLAMA 3 misinterpreted it as emotional stress, while MedAlpaca offered a literal, non-medical response about physical inability to pick up objects.
Our fine-tuned Turkish Medical LLM, however, correctly interpreted this idiomatic expression as indicative of gastrointestinal discomfort, such as gastritis or an ulcer. It then provided medically sound advice, including dietary recommendations and the suggestion to consult a gastroenterologist. This crucial difference underscores the model's ability to bridge linguistic and cultural gaps in healthcare communication, ensuring patients receive accurate and contextually appropriate guidance.
Calculate Your Potential AI Impact
Estimate the efficiency gains and cost savings your enterprise could achieve by integrating specialized AI solutions.
Your AI Implementation Roadmap
A strategic outline for integrating advanced AI into your enterprise, ensuring a smooth transition and maximum impact.
Phase 1: Discovery & Strategy
Conduct in-depth analysis of current workflows, identify key pain points, and define AI objectives tailored to your specific enterprise needs. This includes data assessment and initial model selection.
Phase 2: Data Preparation & Fine-tuning
Gather, clean, and pre-process proprietary enterprise data. Implement advanced fine-tuning techniques, such as LoRA and Slerp merging, to specialize models while preserving core capabilities and mitigating catastrophic forgetting.
Phase 3: Model Deployment & Integration
Deploy the specialized AI model into your existing infrastructure. This involves seamless API integration, robust testing, and ensuring compatibility with current systems and security protocols.
Phase 4: Performance Monitoring & Iteration
Continuously monitor AI model performance, gather user feedback, and implement iterative improvements. This phase focuses on adaptive fine-tuning, ethical governance, and scaling solutions.
Ready to Transform Your Enterprise with AI?
Book a personalized consultation with our AI strategists to explore how these insights can be applied to your organization.