MOFMeld: A Structure-Language Fusion Framework for MOF Property Prediction in Carbon Capture
By Huajie You et al. – Published: April 21, 2026 – DOI: 10.1038/s44387-026-00106-1
MOFMeld is a structure-language fusion framework for accelerating MOF discovery for carbon capture. It integrates a literature-grounded LLM (MOFLLAMA) with crystal-aware structural embeddings to provide a scalable and transparent pathway for efficient screening. It achieves competitive or superior accuracy on key MOF properties (pore-limiting diameter, largest cavity diameter, surface area, void fraction, and CO2 uptake), even when trained on substantially less data than traditional GNN baselines. The learned embeddings organize structure-property relationships coherently, which aids interpretability, and a MOF knowledge graph supports factual, traceable reasoning.
Unlock Advanced Materials Discovery with AI
Efficient carbon capture demands high-performance sorbents. Metal-organic frameworks (MOFs) are promising, but their discovery is hindered by slow, data-limited conventional methods. MOFMeld addresses this by integrating literature-derived knowledge with crystal-aware structural intelligence, accelerating MOF screening for industrial applications.
Deep Analysis & Enterprise Applications
MOFMeld Architecture
MOFMeld integrates a literature-grounded Large Language Model (MOFLLAMA) with crystal-aware structural embeddings via a lightweight bridge module. This fusion enables structure-conditioned question answering and property prediction for MOFs.
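To make the idea of a bridge module concrete, here is a minimal sketch in plain Python. It is not the MOFMeld implementation: the linear projection, the function names (`make_projection`, `bridge`, `build_llm_input`), the dimensions, and the random weights are all assumptions chosen for illustration. It only shows the general pattern of mapping one structural embedding to a few pseudo-token vectors in the language model's embedding space and prepending them to the embedded question.

```python
import random
from typing import List

Vector = List[float]
Matrix = List[Vector]


def make_projection(in_dim: int, out_dim: int, n_tokens: int, seed: int = 0) -> List[Matrix]:
    """Create one (out_dim x in_dim) weight matrix per pseudo-token.

    Hypothetical stand-in: a real bridge would learn these weights during training.
    """
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 0.02) for _ in range(in_dim)] for _ in range(out_dim)]
        for _ in range(n_tokens)
    ]


def bridge(structure_embedding: Vector, weights: List[Matrix]) -> List[Vector]:
    """Project a structural embedding into n_tokens vectors of LLM embedding size."""
    tokens = []
    for matrix in weights:
        tokens.append(
            [sum(w * x for w, x in zip(row, structure_embedding)) for row in matrix]
        )
    return tokens


def build_llm_input(structure_tokens: List[Vector], text_tokens: List[Vector]) -> List[Vector]:
    """Prepend the structure pseudo-tokens to the embedded text question."""
    return structure_tokens + text_tokens


# Toy sizes (assumptions): 4-dim structure embedding, 6-dim LLM embedding, 2 pseudo-tokens.
weights = make_projection(in_dim=4, out_dim=6, n_tokens=2)
structure_tokens = bridge([0.1, -0.3, 0.7, 0.2], weights)
question_tokens = [[0.0] * 6 for _ in range(3)]  # placeholder for an embedded question
llm_input = build_llm_input(structure_tokens, question_tokens)
```

In this sketch the language model would see the structure as extra leading positions in its input sequence, which is one common way to condition question answering on non-text data.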
Literature-Grounding with MOFLLAMA
MOFLLAMA, adapted from LLaMA-3.1-8B-Instruct via supervised fine-tuning on ~20,000 MOF QA pairs, leverages a MOF knowledge graph (MOFLLaMA-KG) to ensure factual, traceable reasoning. This provides domain-specific understanding beyond generic LLMs.
| Feature | ChatGPT | MOFLLAMA (with KG Grounding) |
|---|---|---|
| Scope | General overview, basic properties | Experimentally actionable details (synthesis time, structural characteristics, electrochemical window, variants, source citations) |
| Factual Accuracy | General, potentially generic | Specific, traceable, literature-derived evidence |
| Utility | Informative, but lacks actionable specifics | Practical for research, provides provenance |
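The traceability contrast in the table can be illustrated with a small sketch of knowledge-graph grounding. Everything here is hypothetical: the `Triple` record, the `retrieve` and `build_grounded_prompt` helpers, and the placeholder entries (`MOF-A`, `placeholder-ref-1`, and so on) are not taken from MOFLLaMA-KG. The point is only that each retrieved fact carries its source, so an answer can cite where it came from.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str
    source: str  # provenance, kept with every fact for traceability


# Placeholder facts for illustration only; not real literature data.
KG = [
    Triple("MOF-A", "has_metal_node", "Zn", "placeholder-ref-1"),
    Triple("MOF-A", "synthesis_time", "24 h", "placeholder-ref-2"),
    Triple("MOF-B", "has_metal_node", "Cu", "placeholder-ref-3"),
]


def retrieve(kg: List[Triple], subject: str) -> List[Triple]:
    """Return every triple about the given MOF (case-insensitive match)."""
    return [t for t in kg if t.subject.lower() == subject.lower()]


def build_grounded_prompt(question: str, facts: List[Triple]) -> str:
    """Build a prompt that restricts the model to cited facts."""
    if not facts:
        return (
            f"Question: {question}\n"
            "No supporting facts found; answer that the information is unavailable."
        )
    lines = [f"- {t.subject} {t.predicate} {t.obj} [source: {t.source}]" for t in facts]
    return (
        "Facts:\n" + "\n".join(lines)
        + f"\nQuestion: {question}\n"
        + "Answer using only the facts above and cite sources."
    )


prompt = build_grounded_prompt("How long does MOF-A take to synthesize?", retrieve(KG, "MOF-A"))
```

Keeping the source field on each triple, rather than in a separate lookup, means provenance cannot be lost between retrieval and prompt construction.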
Predictive Performance & Interpretability
MOFMeld encodes structural information from CIF files, aligns it to the language space, and achieves competitive or superior accuracy for key properties: pore-limiting diameter (PLD), largest cavity diameter (LCD), surface area, void fraction, and CO2 uptake. UMAP analysis of the learned embeddings reveals a coherent organization of structure-property relationships.
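As a small illustration of the kind of structural input a CIF file carries, the sketch below reads the six standard lattice parameters and computes the unit-cell volume with the general triclinic formula. It is not the MOFMeld encoder, which is not described here; the helper names (`read_cell`, `cell_volume`) and the example CIF text are assumptions, and a real pipeline would use a full CIF parser and atom-level information.

```python
import math
import re

# Standard CIF data names for the unit cell.
TAGS = [
    "_cell_length_a", "_cell_length_b", "_cell_length_c",
    "_cell_angle_alpha", "_cell_angle_beta", "_cell_angle_gamma",
]

# Made-up cubic cell for illustration only.
CIF_TEXT = """\
data_example
_cell_length_a    10.0
_cell_length_b    10.0
_cell_length_c    10.0
_cell_angle_alpha 90.0
_cell_angle_beta  90.0
_cell_angle_gamma 90.0
"""


def read_cell(cif_text: str) -> dict:
    """Extract lattice parameters; a trailing uncertainty such as '(4)' is ignored."""
    cell = {}
    for tag in TAGS:
        match = re.search(rf"^{re.escape(tag)}\s+([-+0-9.eE]+)", cif_text, flags=re.MULTILINE)
        if match is None:
            raise ValueError(f"missing CIF tag: {tag}")
        cell[tag] = float(match.group(1))
    return cell


def cell_volume(cell: dict) -> float:
    """Unit-cell volume in cubic angstroms for lengths in angstroms and angles in degrees."""
    a, b, c = (cell[t] for t in TAGS[:3])
    ca, cb, cg = (math.cos(math.radians(cell[t])) for t in TAGS[3:])
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


volume = cell_volume(read_cell(CIF_TEXT))
```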
MOFMeld's Screening of CORE-MOF 2024
MOFMeld was applied to predict properties for the CORE-MOF 2024 database and identify top candidates for CO2 uptake. Among the screened candidates, 36 showed grand canonical Monte Carlo (GCMC) CO2 uptake of at least 8 mmol g⁻¹. This demonstrates MOFMeld's practical utility in enriching candidate pools with high-uptake structures, despite some degradation when transferring to experimental MOFs.
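The screening logic described above, ranking by predicted uptake and then confirming the shortlist with GCMC, can be sketched as follows. The candidate identifiers and all uptake numbers are invented for illustration and are not results from the paper; `shortlist` and `confirmed_hits` are hypothetical helper names. Only the 8 mmol g⁻¹ threshold comes from the text.

```python
from typing import Dict, List

UPTAKE_THRESHOLD = 8.0  # mmol/g, the GCMC cutoff quoted in the text


def shortlist(predicted: Dict[str, float], top_k: int) -> List[str]:
    """Return the top_k candidate IDs ranked by predicted CO2 uptake, highest first."""
    return sorted(predicted, key=predicted.get, reverse=True)[:top_k]


def confirmed_hits(
    candidate_ids: List[str],
    gcmc_uptake: Dict[str, float],
    threshold: float = UPTAKE_THRESHOLD,
) -> List[str]:
    """Keep shortlisted candidates whose simulated uptake meets the threshold.

    Candidates without a GCMC value are treated as unconfirmed.
    """
    return [
        cid for cid in candidate_ids
        if cid in gcmc_uptake and gcmc_uptake[cid] >= threshold
    ]


# Illustrative numbers only (mmol/g).
predicted = {"cand-1": 9.1, "cand-2": 4.2, "cand-3": 8.4, "cand-4": 7.9}
gcmc = {"cand-1": 8.6, "cand-3": 7.2, "cand-4": 8.1}

top = shortlist(predicted, top_k=3)
hits = confirmed_hits(top, gcmc)
hit_rate = len(hits) / len(top)
```

Running the expensive simulation only on the shortlist is what makes the model useful as a pre-filter: the hit rate on the shortlist, compared with the hit rate on a random sample, measures how much the pool was enriched.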
Your AI Implementation Roadmap
A structured approach to integrate MOFMeld into your materials discovery pipeline.
Phase 1: Initial Setup & Data Ingestion (2-4 Weeks)
Establish an automated literature ingestion pipeline. Integrate MOF-specific publications, create the MOFLLaMA-KG, and prepare initial MOF structure datasets (hMOF, QMOF).
Phase 2: Model Training & Alignment (4-8 Weeks)
Apply supervised fine-tuning to LLaMA-3.1-8B-Instruct to create MOFLLAMA. Pretrain MOF-Bridge for structure-text alignment using multi-objective training.
Phase 3: Property Prediction & Validation (3-6 Weeks)
Fine-tune MOF-Bridge on geometric and adsorption properties. Evaluate MOFMeld against GNN baselines on held-out datasets and conduct interpretability analyses (UMAP, attention).
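A held-out comparison against a baseline typically comes down to a few regression metrics. The sketch below computes mean absolute error and the coefficient of determination in plain Python; the choice of these two metrics, the function names, and the toy numbers are assumptions, since the evaluation protocol is not detailed here.

```python
from typing import Dict, List


def mae(y_true: List[float], y_pred: List[float]) -> float:
    """Mean absolute error between reference and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: List[float], y_pred: List[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all reference values are equal")
    return 1 - ss_res / ss_tot


def compare(y_true: List[float], predictions: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Score each named model on the same held-out targets."""
    return {
        name: {"mae": mae(y_true, y_pred), "r2": r2(y_true, y_pred)}
        for name, y_pred in predictions.items()
    }


# Toy values for illustration only; not results from the paper.
held_out = [1.0, 2.0, 3.0]
scores = compare(held_out, {"model": [1.0, 2.0, 4.0], "baseline": [2.0, 2.0, 2.0]})
```

Scoring every model on the identical held-out split, as `compare` does, keeps the comparison fair when models were trained on different amounts of data.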
Phase 4: Deployment & Continuous Improvement (Ongoing)
Integrate MOFMeld into screening workflows. Keep the knowledge base current through the automated literature pipeline, and explore broader MOF corpora, structure-aware retrieval-augmented generation (RAG), and more diverse training data.