Enterprise AI Analysis
Yes-MT's Submission to the Low-Resource Indic Language Translation Shared Task in WMT 2024
This analysis explores the innovative approaches and findings from Yes-MT's participation in WMT 2024, focusing on leveraging LLMs and fine-tuning techniques to address the challenges of low-resource Indic language translation.
Executive Impact Summary
Our findings highlight the significant potential of Large Language Models (LLMs), particularly when fine-tuned with techniques like LoRA, in enhancing translation quality even under low-resource conditions. Contrastive submissions utilizing fine-tuned LLMs demonstrated substantial improvements over primary systems trained from scratch. This demonstrates a clear path to unlocking new capabilities for diverse language support.
Deep Analysis & Enterprise Applications
Select a topic to dive deeper, then explore the specific findings from the research, rebuilt as interactive, enterprise-focused modules.
Experimentation Workflow
Our methodology systematically explored various translation approaches, from training Transformer models from scratch to fine-tuning state-of-the-art LLMs using efficient adaptation techniques.
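To make the efficient-adaptation step concrete, the sketch below shows a minimal LoRA setup with the Hugging Face peft library; the base checkpoint, rank, and target modules are illustrative assumptions, not the exact Yes-MT configuration.

```python
# A minimal LoRA sketch, assuming the transformers + peft libraries and an
# instruction-tuned Llama 3 checkpoint; all hyperparameters are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# LoRA injects small trainable low-rank matrices into the attention projections,
# so only a fraction of the parameters are updated during fine-tuning.
lora_config = LoraConfig(
    r=16,                                  # rank of the low-rank update matrices
    lora_alpha=32,                         # scaling factor applied to the update
    target_modules=["q_proj", "v_proj"],   # which projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total weights
```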
Enterprise Process Flow
Core Model Performance (ChrF Scores)
Comparison of ChrF scores for various models under different training setups (monolingual, multilingual, and zero-shot) for English-to-Indic translation.
| Model | Training Type | en-as | en-kha | en-mz | en-mni |
|---|---|---|---|---|---|
| Transformers | Multilingual | 16.06 | 19.67 | 5.49 | 20.60 |
| IndicBart | Monolingual | 6.4 | 11.2 | 25.1 | 8.8 |
| IndicBart | Multilingual | 6.5 | 11.4 | 25.3 | 9.1 |
| mT5-small | Monolingual | 14.3 | 12.9 | 31.4 | 19.2 |
| mT5-small | Multilingual | 15.6 | 13.6 | 32.3 | 23.9 |
| IndicTrans2-2B | Zero-shot | 49.2 | - | - | 44.9 |
| IndicTrans2-2B | Zero-shot | 49.5 | - | - | 45.3 |
| IndicTrans2-200M | Zero-shot | 47.27 | - | - | 49.12 |
| IndicTrans2-200M | Multilingual | 47.27 | - | - | 49.12 |
Note: '-' indicates data not available for that language pair/model configuration.
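For reference, ChrF figures like those above can be computed with the sacrebleu package along the lines of this minimal sketch; the sentences shown are placeholders, not the WMT test data.

```python
# A minimal ChrF evaluation sketch, assuming the sacrebleu package.
import sacrebleu

hypotheses = ["system translation for sentence one", "system translation two"]
references = [["reference translation for sentence one", "reference translation two"]]

# ChrF measures character n-gram overlap between hypotheses and references,
# which is more forgiving of morphological variation than word-level BLEU.
chrf = sacrebleu.corpus_chrf(hypotheses, references)
print(f"ChrF: {chrf.score:.2f}")
```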
Key Performance Takeaways
Our analysis revealed distinct performance trends, showcasing the benefits of multilingual training and the transformative potential of fine-tuning LLMs.
Case Study: LLM Fine-tuning Success
For Assamese and Manipuri, IndicTrans2 fine-tuned with LoRA achieved the highest ChrF scores among all models. Similarly, for Mizo and Khasi, Llama3 fine-tuned via LoRA and SFT significantly outperformed other systems. These results underscore the effectiveness of leveraging pre-trained LLMs and efficient fine-tuning methods for low-resource translation tasks.
For instance, Llama3-8B-instruct achieved ChrF scores of 31.68 (en-as), 35.26 (en-kha), 37.73 (en-mz), and 44.51 (en-mni) after 2 epochs of fine-tuning, demonstrating substantial gains over baseline models.
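Below is a hedged sketch of the kind of LoRA-based supervised fine-tuning (SFT) run described here, using the trl library's SFTTrainer; the prompt template, dataset records, and hyperparameters are assumptions for illustration, not the exact Yes-MT recipe.

```python
# A sketch of instruction-style SFT on parallel data with trl's SFTTrainer.
# Dataset records, prompt wording, and hyperparameters are illustrative.
from datasets import Dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

def to_text(example):
    # Each sentence pair becomes one training string in the "text" field SFTTrainer expects.
    return {
        "text": (
            f"Translate the following English sentence into {example['lang']}.\n"
            f"English: {example['src']}\n"
            f"Translation: {example['tgt']}"
        )
    }

train_dataset = Dataset.from_list([
    {"src": "The weather is pleasant today.", "tgt": "<Manipuri reference>", "lang": "Manipuri"},
    # ... remaining parallel sentences ...
]).map(to_text)

trainer = SFTTrainer(
    model="meta-llama/Meta-Llama-3-8B-Instruct",   # assumed base checkpoint
    train_dataset=train_dataset,
    args=SFTConfig(output_dir="llama3-sft-mt", num_train_epochs=2),
    peft_config=LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"),
)
trainer.train()
```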
LLM Performance Across Shot Types (ChrF Scores)
Detailed ChrF scores for various LLMs, evaluating their zero-shot and few-shot translation capabilities as well as the impact of fine-tuning.
| Model | Inference | en-as | en-kha | en-mz | en-mni |
|---|---|---|---|---|---|
| Llama3-8B-8192 | Zero Shot | 18.56 | 14.92 | 15.57 | 13.45 |
| Llama3-70B-8192 | Zero Shot | 27.54 | 18.57 | 20.62 | 15.53 |
| mixtral-8x7B-32768 | Zero Shot | 6.79 | 15.45 | 16.57 | 2.65 |
| Llama3-8B-instruct | Zero Shot | 26.13 | 8.38 | 18.06 | 15.29 |
| Llama3-8B-instruct | 1 Epoch | 29.82 | 33.19 | 32.72 | 37.85 |
| Llama3-8B-instruct | 2 Epoch | 31.68 | 35.26 | 37.73 | 44.51 |
| Llama3.1-8B-instruct | Zero Shot | 22.93 | 12.03 | 15.23 | 14.47 |
| Llama3.1-8B-instruct | 3 Shot | 23.26 | 13.66 | 18.89 | 15.30 |
| Llama3.1-8B-instruct | 5 Shot | 23.48 | 15.11 | 18.77 | 15.29 |
| Llama3.1-8B-instruct | 10 Shot | 23.89 | 16.03 | 19.39 | 15.43 |
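The few-shot rows above come from prompts that prepend worked translation pairs before the sentence to be translated; a minimal sketch of that construction is shown below, with placeholder demonstration pairs rather than the actual prompt used in the submission.

```python
# A minimal few-shot prompt builder; demonstration pairs are placeholders.
def build_few_shot_prompt(examples, source_sentence, tgt_lang="Mizo"):
    """Prepend k worked source/target pairs before the sentence to translate."""
    lines = [f"Translate English to {tgt_lang}."]
    for src, tgt in examples:
        lines.append(f"English: {src}\n{tgt_lang}: {tgt}")
    lines.append(f"English: {source_sentence}\n{tgt_lang}:")
    return "\n\n".join(lines)

demos = [
    ("Good morning.", "<Mizo reference 1>"),
    ("Where is the market?", "<Mizo reference 2>"),
    ("Thank you very much.", "<Mizo reference 3>"),
]
prompt = build_few_shot_prompt(demos, "The river is near the village.")
print(prompt)  # 3-shot prompt; 5- and 10-shot simply add more demonstration pairs
```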
Overcoming Challenges & Future Directions
Low-resource language translation presents unique challenges, particularly regarding data scarcity and the generalization of models. Addressing these issues is crucial for robust enterprise-grade AI solutions.
A significant challenge observed was the generation of structured output. LLM models sometimes wrapped translations in extraneous text, complicating extraction. This issue was most prevalent in zero-shot settings, with 66.8% of outputs containing additional text. However, few-shot prompting significantly reduced this inconsistency to 0.18% in 10-shot scenarios. This highlights the need for careful prompt engineering or fine-tuning to ensure clean and structured outputs, especially in low-resource contexts.
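One lightweight mitigation is a post-processing step that strips the wrapper text before evaluation; the sketch below illustrates the idea with assumed patterns, not the exact cleaning rules used in the submission.

```python
# A hedged sketch of cleaning LLM output that wraps the translation in extra
# text (e.g. "Here is the translation: ..."); the patterns are illustrative.
import re

def extract_translation(raw_output: str) -> str:
    """Strip common preambles and quotation marks around the translated text."""
    text = raw_output.strip()
    # Drop a leading explanation such as "Here is the translation:" if present.
    text = re.sub(r"^.*?translation[^:]*:\s*", "", text, flags=re.IGNORECASE | re.DOTALL)
    # Keep only the first line and remove surrounding quotes.
    return text.splitlines()[0].strip().strip('"').strip("'")

print(extract_translation('Here is the Khasi translation: "<Khasi sentence>"'))
```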
Furthermore, discrepancies in performance between different test sets suggest potential translation bias in datasets. This underscores the importance of diverse and varied datasets to improve model robustness and generalization across new data distributions.
Strategic Roadmap for Enhanced Translation
Our future work will focus on integrating diverse data sources and refining LLM interaction strategies to build more reliable and adaptable systems.
Integrate Monolingual Data & Augmentation
Explore techniques like back-translation and other data augmentation to enrich limited datasets and improve model understanding (see the back-translation sketch after this roadmap).
Refine Prompt Engineering for LLMs
Develop advanced prompt strategies to ensure consistently structured and concise outputs from LLMs, minimizing extraneous text.
Address Potential Test Data Biases
Focus on creating more reliable translation systems by carefully analyzing and mitigating biases present in test datasets.
Deploy & Monitor Fine-tuned LLM Systems
Implement robust fine-tuned LLM solutions for production, with continuous monitoring and iterative improvements based on real-world performance.
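As referenced in the first roadmap item above, back-translation turns monolingual target-language text into synthetic parallel data by translating it back into English with an existing model; the sketch below assumes an off-the-shelf NLLB checkpoint purely for illustration.

```python
# A hedged back-translation sketch; model choice and sentences are assumptions.
from transformers import pipeline

# Assume a reverse-direction translation model (target language -> English).
back_translator = pipeline(
    "translation",
    model="facebook/nllb-200-distilled-600M",
    src_lang="asm_Beng",   # Assamese (Bengali script) in NLLB's language codes
    tgt_lang="eng_Latn",
)

monolingual_targets = ["<monolingual Assamese sentence 1>", "<monolingual Assamese sentence 2>"]

# Each (synthetic English, authentic target) pair can be added to the training set.
synthetic_pairs = [
    (back_translator(sentence)[0]["translation_text"], sentence)
    for sentence in monolingual_targets
]
```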
Calculate Your Potential ROI
Estimate the efficiency gains and cost savings your enterprise could realize by implementing advanced AI translation solutions.
Your Path to Advanced AI Translation
Our proven implementation roadmap ensures a smooth transition to enhanced translation capabilities, tailored for your specific low-resource language needs.
Phase 1: Discovery & Strategy
Initial consultation to understand your unique language pairs, data landscape, and specific translation challenges. We'll define clear objectives and a tailored AI strategy.
Phase 2: Data Preparation & Model Selection
Gathering and preprocessing your existing bilingual and monolingual data. Selection and customization of optimal pre-trained models (e.g., Llama 3, IndicTrans2) for fine-tuning.
Phase 3: Fine-Tuning & Optimization
Application of LoRA and SFT techniques to fine-tune selected LLMs on your proprietary data, ensuring high-quality, domain-specific translation. Iterative optimization for performance.
Phase 4: Integration & Deployment
Seamless integration of the fine-tuned translation system into your existing workflows and platforms. Deployment with robust monitoring and ongoing support to ensure optimal operation.
Ready to Transform Your Translation Workflow?
Connect with our AI specialists to explore how custom, fine-tuned LLM solutions can elevate your low-resource language translation capabilities.