Enterprise AI Analysis: A novel cross-modal alignment learning framework for Dongba single-character dataset construction
A Novel Cross-Modal Alignment Learning Framework for Dongba Single-Character Dataset Construction
The Dongba script is an ancient and unique pictographic writing system created by the Naxi people of China. Existing datasets for Dongba character recognition are constructed through manual imitation or data augmentation, so their samples differ markedly in features from the authentic characters found in ancient manuscripts, which greatly limits real-world application. To address this, we propose a novel dataset construction method for Dongba characters based on cross-modal alignment learning. Combining it with dynamic anchor expansion retrieval and multi-granularity hybrid iterative training, we construct an authentic Dongba single-character dataset, Dongba_1512, comprising 1,512 categories and 705,058 samples. Extensive experiments demonstrate the effectiveness of both the proposed construction method and the Dongba_1512 dataset, which support digital research on Dongba manuscripts and transfer well to other ancient scripts.
Authors: Junyao Xing¹, Xiaojun Bi²,³ & Weizheng Qiao²,³
Published: April 2026
Executive Impact
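This research presents a framework for the digital preservation and intelligent analysis of ancient scripts. It yields Dongba_1512, an authentic single-character dataset of 705,058 samples across 1,512 categories, and in the reported comparison classifiers trained on authentic data reach roughly 96% Top-1 accuracy.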
Deep Analysis & Enterprise Applications
Cross-Modal Alignment Learning
Our method eliminates reliance on massive single-character annotations by leveraging parallel corpora of Dongba manuscript sentence images paired with Chinese translations. Cross-modal alignment models are trained on these pairs to learn semantic correspondences between image and text, and to extract unified representations of individual characters within a semantically aligned feature space.
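To make the idea of a semantically aligned feature space concrete, here is a minimal sketch of a symmetric contrastive objective of the kind popularized by CLIP. The summary does not state which loss the authors use, so this objective is an assumption, and the function names, the temperature value, and the toy embeddings are all hypothetical. Row k of the image embeddings is assumed to be paired with row k of the text embeddings.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_logits(image_embs, text_embs, temperature=0.07):
    """Pairwise cosine similarities between images and texts, divided by a temperature."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts] for i in imgs]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row k is column k."""
    total = 0.0
    for k, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum - row[k]
    return total / len(logits)

def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Average of the image-to-text and text-to-image losses (assumed objective)."""
    logits = cosine_logits(image_embs, text_embs, temperature)
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(transposed))

# Toy example: two sentence images and two translations (hypothetical 2-D embeddings).
aligned = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy case the loss is near zero when each image embedding matches its own translation and large when the pairs are swapped, which is the pressure that pulls matching image and text representations together.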
Dynamic Anchor Expansion & Iterative Training
A dynamic anchor expansion image retrieval method, combined with multi-granularity hybrid iterative training, progressively discovers new characters and enhances the model's capability to capture local details. This process continually incorporates fine-grained character descriptions into training data, ensuring comprehensive dataset growth.
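The retrieval step can be pictured with the following sketch, which grows a set of anchor images round by round. It is only one plausible reading of "dynamic anchor expansion": the summary gives no algorithmic details, so the similarity measure, the threshold, the stopping rule, and every name here (`expand_anchors`, `embeddings`, `threshold`, `max_rounds`) are assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def expand_anchors(seed_ids, embeddings, threshold=0.9, max_rounds=10):
    """Iteratively grow an anchor set for one character category.

    In each round, every candidate that is similar enough to at least one
    current anchor is promoted to an anchor, so later rounds can reach
    variants that were too far from the original seeds.
    """
    anchors = set(seed_ids)
    for _ in range(max_rounds):
        newly_found = {
            cid
            for cid, emb in embeddings.items()
            if cid not in anchors
            and any(cosine(emb, embeddings[a]) >= threshold for a in anchors)
        }
        if not newly_found:  # nothing new was retrieved, so stop early
            break
        anchors |= newly_found
    return anchors

# Hypothetical crops: "b" resembles "a", "c" resembles "b", "d" is unrelated.
crops = {
    "a": [1.0, 0.0],
    "b": [0.95, 0.31],
    "c": [0.8, 0.6],
    "d": [0.0, 1.0],
}
found = expand_anchors({"a"}, crops)
one_round = expand_anchors({"a"}, crops, max_rounds=1)
```

Here the crop "c" is not similar enough to the seed "a" to be retrieved directly, but it is reached in the second round through "b". A fixed anchor set would miss it, which is the motivation for expanding anchors dynamically.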
Authentic & Large-Scale Data
The Dongba_1512 dataset comprises 705,058 samples across 1,512 categories, extracted directly from authentic Dongba historical manuscripts. This collection ensures accurate representation of the script’s unique morphological features and supports robust model training for recognition and OCR tasks.
Transferability to Other Ancient Scripts
Experiments demonstrate that our method is readily extensible and effective for constructing single-character datasets for other low-resource ancient scripts, such as Shui and Yi. This highlights the framework's general utility in paleographic analysis and digital preservation.
Dataset Construction Process
| Model | Top-1 Acc (%) (Authentic Data) | Top-1 Acc (%) (Synthetic Data) |
|---|---|---|
| DenseNet169 | 96.76 | 21.31 |
| EfficientNetB0 | 96.23 | 30.79 |
| RepVGG | 95.99 | 15.81 |
| ResNet50 | 96.26 | 21.52 |

Conclusion: Models trained on authentic Dongba data consistently achieve high accuracy, while models trained on synthetic data show catastrophic performance degradation, highlighting the irreplaceable value of authentic samples.
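The size of the gap in the table can be checked with a few lines of arithmetic. The accuracy values below are copied from the table; the variable and function names are illustrative only.

```python
# Top-1 accuracy (%) from the table: (authentic data, synthetic data).
results = {
    "DenseNet169": (96.76, 21.31),
    "EfficientNetB0": (96.23, 30.79),
    "RepVGG": (95.99, 15.81),
    "ResNet50": (96.26, 21.52),
}

def accuracy_gaps(table):
    """Percentage-point difference between the two columns for each model."""
    return {model: authentic - synthetic for model, (authentic, synthetic) in table.items()}

def column_means(table):
    """Mean accuracy of the authentic column and of the synthetic column."""
    n = len(table)
    mean_authentic = sum(a for a, _ in table.values()) / n
    mean_synthetic = sum(s for _, s in table.values()) / n
    return mean_authentic, mean_synthetic

gaps = accuracy_gaps(results)
mean_authentic, mean_synthetic = column_means(results)

for model, gap in gaps.items():
    print(f"{model}: {gap:.2f} percentage points")
print(f"mean authentic {mean_authentic:.2f}, mean synthetic {mean_synthetic:.2f}")
```

Across the four models, the authentic column averages about 96.3% and the synthetic column about 22.4%. The per-model gap ranges from 65.44 points (EfficientNetB0) to 80.18 points (RepVGG).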
Case Study: Cross-Script Generality: Shui and Yi Scripts
Our method demonstrates broader applicability beyond Dongba, validated through experiments on Shui and Yi ancient scripts. For Shui script, the system achieved 51.49% accuracy, and for Yi script, 70.92% accuracy. This indicates the framework's effectiveness even in low-resource scenarios with limited parallel data (Shui had only 334 pairs). The cross-modal alignment, by learning semantic representations and ignoring morphological noise, proves robust across diverse ancient writing systems, offering a new paradigm for paleographic analysis.
Your Implementation Roadmap
A structured approach to integrate our AI solutions into your enterprise, ensuring seamless transition and maximized benefits.
Phase 1: Foundation Setup
Establish baseline models with foundational parallel corpora and initial image retrieval for Dongba characters.
Phase 2: Iterative Refinement
Progressively expand training data with character-level annotations and fine-grained descriptions through multi-granularity hybrid iterative training.
Phase 3: Dataset Finalization & Validation
Complete Dongba_1512 dataset construction, perform extensive validation, and ensure transferability to other ancient scripts.
Phase 4: Integration & Deployment
Integrate the Dongba_1512 dataset into digital preservation efforts and scholarly research platforms for intelligent analysis.