Enterprise AI Analysis
PERSIAN-PHI: EFFICIENT CROSS-LINGUAL ADAPTATION OF COMPACT LLMS VIA CURRICULUM LEARNING
The democratization of AI is currently hindered by the immense computational costs required to train Large Language Models (LLMs) for low-resource languages. This paper presents Persian-Phi, a 3.8B parameter model that challenges the assumption that robust multilingual capabilities require massive model sizes or multilingual baselines. We demonstrate how Microsoft's Phi-3 Mini (originally a monolingual English model) can be effectively adapted to Persian through a novel, resource-efficient curriculum learning pipeline. This approach employs a unique "warm-up" stage using bilingual narratives (Tiny Stories) to align embeddings prior to the heavier training stages, followed by continual pretraining and instruction tuning via Parameter-Efficient Fine-Tuning (PEFT). Despite its compact size, Persian-Phi achieves competitive results on the Open Persian LLM Leaderboard. Our findings provide a validated, scalable framework for extending the reach of state-of-the-art LLMs to underrepresented languages with minimal hardware resources. The model is publicly available at Persian-Phi.
Executive Impact: Key Takeaways
Persian-Phi's novel approach demonstrates how compact, monolingual models can be efficiently adapted to low-resource languages, providing a scalable and cost-effective pathway for global AI adoption.
Deep Analysis & Enterprise Applications
Enterprise Process Flow
The adaptation began with an extended tokenizer, followed by a warm-up phase to align new Persian tokens. Deep language understanding was built via continual pre-training on filtered Persian corpora, and finally, instruction tuning refined conversational abilities.
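To make the ordering of these stages concrete, the sketch below expresses the curriculum as a simple Python stage configuration. It is a hypothetical outline for illustration only: the stage names follow the paper, but the dataset descriptions and the lists of trainable components are paraphrased assumptions, not reported hyperparameters.

```python
# Hypothetical outline of the three-stage curriculum described above.
# Stage names follow the paper; data descriptions and trainable-component
# lists are illustrative assumptions, not the authors' exact settings.
CURRICULUM = [
    {
        "stage": "warm_up",
        "goal": "align newly added Persian token embeddings",
        "data": "bilingual Tiny Stories (English story -> Persian translation)",
        "trainable": ["new token embeddings", "low-rank adapters"],
    },
    {
        "stage": "continual_pretraining",
        "goal": "build deep Persian language understanding and fluency",
        "data": "filtered Persian corpora (e.g., TLPC, Persian Wikipedia)",
        "trainable": ["higher-rank LoRA", "embedding layer", "LM head"],
    },
    {
        "stage": "instruction_tuning",
        "goal": "refine instruction-following and conversational ability",
        "data": "mixed Persian and English instruction-response pairs",
        "trainable": ["LoRA adapters"],
    },
]

for step in CURRICULUM:
    print(f"{step['stage']}: {step['goal']}")
```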
| Model | #Params (B) | Part MC | ARC Easy | ARC Challenge | MMLU Pro | AUT MC |
|---|---|---|---|---|---|---|
| Ours (Persian-Phi) | 3.85 | 30.56 | 64.65 | 51.00 | 17.18 | 43.98 |
| PartAI Dorna2-8B | 8.03 | 35.52 | 75.28 | 53.52 | 24.1 | 53.45 |
| Meta-LLaMA3.1-8B | 8.03 | 36.68 | 78.4 | 60.4 | 21 | 54.24 |
| Gemma-2-2b-it | 2.61 | 31.12 | 71.26 | 57.72 | 16.23 | 49.9 |
| PersianMind-v1.0 | 6.82 | 29.27 | 58.91 | 48.32 | 15.51 | 45.36 |
| Maral-7B-alpha-1 | 7.24 | 26.67 | 44.54 | 32.88 | 15.99 | 36.09 |
| Phi-3-mini-4k-instruct (Baseline) | 3.82 | 27.37 | 36.78 | 36.78 | 17.89 | 35.1 |
Persian-Phi achieves competitive results despite its compact size, outperforming several larger models in the comparison, including the 7B-class PersianMind and Maral. Dorna2 (8B parameters) remains state-of-the-art, but at roughly half its parameter count Persian-Phi reaches about 80% of its aggregate performance.
Persian-Phi introduces a unique curriculum learning pipeline: warm-up with bilingual Tiny Stories for embedding alignment, followed by continual pre-training on filtered Persian corpora, and instruction tuning using PEFT. This phased approach enables efficient adaptation of a monolingual model to a new language while preserving its original capabilities.
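The warm-up stage hinges on framing bilingual Tiny Stories as a translation-style task. The snippet below is a minimal sketch of how such warm-up examples could be rendered as plain training text; the prompt template, function name, and example sentences are assumptions for illustration, since the paper only specifies that English-Persian story pairs are used for embedding alignment.

```python
# Minimal sketch of building a bilingual warm-up example from a Tiny Stories
# pair. The template and field names are illustrative assumptions.
def build_warmup_example(english_story: str, persian_story: str) -> str:
    """Render one English -> Persian translation example as plain text."""
    return (
        "Translate the following story into Persian.\n"
        f"English: {english_story}\n"
        f"Persian: {persian_story}"
    )

pair = {
    "en": "Once upon a time, a little cat found a red ball.",
    "fa": "روزی روزگاری، یک گربه کوچک یک توپ قرمز پیدا کرد.",
}
print(build_warmup_example(pair["en"], pair["fa"]))
```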
Strategic Advantages for Enterprise
The Persian-Phi project demonstrates a powerful paradigm for extending advanced AI capabilities to languages like Persian, traditionally underserved due to data scarcity and computational costs. By starting with a compact, high-capability English model (Microsoft's Phi-3 Mini) and strategically adapting it, we offer a resource-efficient and scalable solution. This challenges the notion that robust multilingual support requires massive models or training from scratch, making cutting-edge LLMs accessible with minimal hardware. For enterprises, this means faster time-to-market for AI solutions in new language markets and significant cost savings on development and infrastructure.
Your Implementation Roadmap
Our structured approach ensures a seamless integration of advanced AI capabilities into your enterprise, maximizing impact while minimizing disruption.
Phase 1: Tokenizer Extension & Warm-up (Est. 2-3 weeks)
Expanded the tokenizer with Persian-specific tokens and initialized the new embeddings through bilingual translation tasks (Tiny Stories) to ensure smooth cross-lingual alignment and prevent catastrophic forgetting.
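A minimal sketch of the tokenizer-extension step using the Hugging Face transformers API is shown below. The base checkpoint is the public Phi-3 Mini model; the specific Persian tokens added are placeholders, not the paper's actual vocabulary extension.

```python
# Sketch of Phase 1 tokenizer extension with Hugging Face transformers.
# The Persian tokens listed are placeholders for illustration only.
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Candidate Persian tokens (placeholders, not the real extension list).
new_persian_tokens = ["سلام", "کتاب", "دانشگاه"]
num_added = tokenizer.add_tokens(new_persian_tokens)

# Grow the embedding matrix and LM head so the new token IDs have trainable rows.
model.resize_token_embeddings(len(tokenizer))
print(f"Added {num_added} tokens; new vocabulary size: {len(tokenizer)}")
```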
Phase 2: Continual Pre-training (Est. 4-6 weeks)
Applied intensive pre-training on a large, high-quality filtered Persian corpus (TLPC, Wikipedia) to build deep language understanding and fluency. Utilized higher-rank LoRA and full embedding/head tuning.
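The following is a sketch of what a Phase 2 PEFT setup could look like: higher-rank LoRA on the projection layers plus fully trainable embeddings and LM head, as described above. The rank, alpha, module names, and checkpoint path are assumptions for illustration rather than the paper's reported configuration.

```python
# Sketch of a Phase 2 PEFT configuration: higher-rank LoRA plus full
# embedding/head tuning. Hyperparameters and paths are placeholders.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Reload the vocabulary-extended checkpoint saved after Phase 1 (placeholder path).
model = AutoModelForCausalLM.from_pretrained("./persian-phi-vocab-extended")

pretrain_lora = LoraConfig(
    r=64,                      # higher rank for the heavy pre-training stage (assumed value)
    lora_alpha=128,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["qkv_proj", "o_proj", "gate_up_proj", "down_proj"],
    modules_to_save=["embed_tokens", "lm_head"],  # train embeddings and head in full
)

peft_model = get_peft_model(model, pretrain_lora)
peft_model.print_trainable_parameters()
```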
Phase 3: Supervised Fine-Tuning (SFT) (Est. 2-3 weeks)
Refined the model's instruction-following and conversational abilities using a mixed dataset of Persian and English instruction-response pairs, employing LoRA to balance proficiency in both languages.
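One practical detail in this phase is assembling the mixed-language instruction data so English ability is retained while Persian proficiency is built. The sketch below shows one way to interleave two instruction datasets and format them with the model's chat template; the file names, field names, and mixing ratio are placeholders, not the paper's actual sources.

```python
# Sketch of assembling mixed Persian/English SFT data for Phase 3.
# Dataset files, field names, and the 80/20 mix are illustrative assumptions.
from datasets import load_dataset, interleave_datasets
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

persian_sft = load_dataset("json", data_files="persian_instructions.jsonl")["train"]
english_sft = load_dataset("json", data_files="english_instructions.jsonl")["train"]

# Weighted interleaving keeps some English exposure during Persian fine-tuning.
mixed = interleave_datasets([persian_sft, english_sft], probabilities=[0.8, 0.2], seed=42)

def to_chat_text(example):
    messages = [
        {"role": "user", "content": example["instruction"]},
        {"role": "assistant", "content": example["response"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

sft_dataset = mixed.map(to_chat_text)
```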
Phase 4: Integration & Deployment (Est. 1-2 weeks)
Merged LoRA weights and prepared the final Persian-Phi model for public release and enterprise integration, ensuring robustness and performance.
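Merging folds the trained adapters back into the base weights so the released model is a single standard checkpoint with no PEFT dependency at inference time. Below is a minimal sketch using the peft merge API; all paths are placeholders.

```python
# Sketch of Phase 4: merge trained LoRA adapters into the base weights and
# save a single deployable checkpoint. Paths are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("./persian-phi-vocab-extended")
tokenizer = AutoTokenizer.from_pretrained("./persian-phi-vocab-extended")

adapted = PeftModel.from_pretrained(base, "./persian-phi-sft-adapter")
merged = adapted.merge_and_unload()  # fold adapter weights into the base model

merged.save_pretrained("./persian-phi-final")
tokenizer.save_pretrained("./persian-phi-final")
```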
Ready to Transform Your Enterprise with AI?
Unlock the full potential of language models for your specific needs, even in low-resource environments. Contact us today to explore tailored solutions.