Enterprise AI Analysis
THOR: A Versatile Foundation Model for Earth Observation Climate and Society Applications
Current Earth observation foundation models are architecturally rigid: they struggle with heterogeneous sensors and are constrained to fixed patch sizes. This limits their deployment in real-world scenarios requiring flexible compute-accuracy trade-offs. We propose THOR, a "compute-adaptive" foundation model that solves both input heterogeneity and deployment rigidity. THOR is the first architecture to unify data from Copernicus Sentinel-1, -2, and -3 (OLCI & SLSTR) satellites, processing their native 10 m to 1000 m resolutions in a single model. We pre-train THOR with a novel randomized patch and input image size strategy. This allows a single set of pre-trained weights to be deployed at inference with any patch size, enabling a dynamic trade-off between computational cost and feature resolution without retraining. We pre-train THOR on THOR Pretrain, a new large-scale multi-sensor dataset, and demonstrate state-of-the-art performance on downstream benchmarks, particularly in data-limited regimes such as the PANGAEA 10% split, validating that THOR's flexible feature generation excels for diverse climate and society applications.
THOR's Impact at a Glance
Key performance indicators and innovations that position THOR as a leader in Earth Observation AI.
Deep Analysis & Enterprise Applications
Select a topic to dive deeper, then explore the specific findings from the research, rebuilt as interactive, enterprise-focused modules.
Flexible Multi-sensor Architecture
THOR features a modified Vision Transformer (ViT) backbone designed to handle both input heterogeneity and deployment rigidity. It uses a separate patch projection layer for each input band, supporting any subset of bands during fine-tuning. Bands that share a Ground Sampling Distance (GSD) can be grouped arbitrarily; the model allocates more patches to higher-resolution data and fewer to coarser-resolution data, keeping the overall token sequence length efficient.
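The grouping-and-allocation idea above can be sketched in a few lines. This is a simplified illustration, not THOR's implementation: the band names, tile extent, and patch sizes below are hypothetical, and only the bookkeeping (group by GSD, derive token counts from resolution) reflects the description.

```python
from collections import defaultdict

# Hypothetical band metadata: (band name, ground sampling distance in metres).
BANDS = [
    ("S2_B02", 10), ("S2_B03", 10), ("S2_B04", 10),  # Sentinel-2 visible
    ("S2_B11", 20),                                   # Sentinel-2 SWIR
    ("S1_VV", 10), ("S1_VH", 10),                     # Sentinel-1 SAR
    ("S3_OLCI_Oa08", 300),                            # Sentinel-3 OLCI
    ("S3_SLSTR_S8", 1000),                            # Sentinel-3 SLSTR
]

def group_by_gsd(bands):
    """Group an arbitrary band subset by shared GSD, mirroring how THOR's
    per-band projections allow any subset of bands at fine-tuning time."""
    groups = defaultdict(list)
    for name, gsd in bands:
        groups[gsd].append(name)
    return dict(groups)

def tokens_per_group(extent_m, gsd, patch_px):
    """Patch count for one GSD group covering a square tile of `extent_m`
    metres: finer GSD means more pixels, hence more tokens."""
    pixels = extent_m // gsd
    side = pixels // patch_px
    return side * side

groups = group_by_gsd(BANDS)
# On a 10,240 m tile: a 10 m group with 16-px patches gets (1024//16)^2 = 4096
# tokens, while a 1000 m group with 2-px patches gets only (10//2)^2 = 25.
```

The key design point is that token budget follows resolution: coarse Sentinel-3 bands contribute a handful of tokens while fine Sentinel-2 bands dominate the sequence.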
Multi-modal Pre-training Framework
THOR is pre-trained using an extended Masked Autoencoder (MAE) framework combined with novel multi-modal prediction tasks. The loss formulation includes pixel-level reconstruction with a flexible ViT MAE loss, a patch-level guided soft contrastive loss leveraging land cover products, pixel-level map prediction for land cover and elevation, and image-level prediction for ERA5-Land climate variables and Sentinel-1 SAR tasks. Combining these objectives yields representations that remain robust and general across sensors and downstream tasks.
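Structurally, the pre-training objective is a weighted sum over those four loss families. The sketch below is a minimal stand-in: the weights are invented for illustration, and every term is stubbed as an MSE (the actual contrastive and map-prediction losses differ).

```python
import numpy as np

def mse(pred, target):
    return float(np.mean((np.asarray(pred) - np.asarray(target)) ** 2))

# Hypothetical task weights; the source does not publish these numbers.
WEIGHTS = {"mae_recon": 1.0, "soft_contrastive": 0.5,
           "map_pred": 0.5, "image_pred": 0.25}

def total_pretrain_loss(outputs, targets, weights=WEIGHTS):
    """Weighted sum over THOR's four loss families, each stubbed as MSE:
    pixel-level MAE reconstruction, patch-level guided soft contrastive,
    pixel-level land-cover/elevation map prediction, and image-level
    ERA5-Land / SAR prediction."""
    return sum(w * mse(outputs[k], targets[k]) for k, w in weights.items())

zeros = {k: np.zeros(4) for k in WEIGHTS}
ones = {k: np.ones(4) for k in WEIGHTS}
# Perfect predictions give zero loss; unit error on every task gives the
# sum of the weights (2.25 with the values above).
```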
Unifying Diverse Earth Observation Data
THOR is the first architecture to unify data from Copernicus Sentinel-1 (SAR), Sentinel-2 (MSI), and Sentinel-3 (OLCI & SLSTR) satellites, processing their native 10 m to 1000 m resolutions in a single model. The THOR Pretrain dataset, a 22 TB multi-sensor dataset, was curated with a novel stratified sampling strategy to ensure geographic and thematic diversity, overcoming biases towards common land covers and actively over-sampling rare classes. This rich dataset includes DEM and ERA5-Land climate variables, aligned spatio-temporally.
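One common way to over-sample rare classes is inverse-frequency weighting. The snippet below is a simplified stand-in for THOR Pretrain's stratified sampling, whose exact scheme is not given here; the class labels are illustrative.

```python
from collections import Counter

def stratified_weights(labels):
    """Inverse-frequency sampling weights: rare land-cover classes are
    drawn more often, common ones less. Weights average to 1 so the
    expected epoch size is unchanged."""
    counts = Counter(labels)
    n_classes = len(counts)
    total = len(labels)
    return {c: total / (n_classes * n) for c, n in counts.items()}

tiles = ["forest"] * 8 + ["glacier"] * 2
w = stratified_weights(tiles)
# Glacier tiles are sampled 4x as often as forest tiles, countering the
# bias towards the common land cover.
```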
Dynamic Compute-Adaptive Inference
By incorporating a randomized patch size and input image size during pre-training, THOR becomes "compute-adaptive." A single set of weights can be deployed with various patch sizes and input image sizes at inference time, allowing a dynamic trade-off between computational cost and feature resolution without retraining. Smaller patches produce denser, higher-resolution token sequences for fine-grained tasks, enabling simpler and more data-efficient decoders, which is crucial when training data is limited.
Enterprise Process Flow: THOR Pre-training Steps
| Feature | THOR | Conventional EO FMs |
|---|---|---|
| Input Heterogeneity | Unifies Sentinel-1, -2, and -3 at native 10 m to 1000 m resolutions in a single model | Struggle with heterogeneous sensors; typically tied to one sensor or a fixed band set |
| Deployment Versatility | One set of weights runs at any patch size, trading compute for feature resolution without retraining | Constrained to fixed patch sizes; changing the trade-off requires retraining |
| Performance in Data-Limited Regimes | State-of-the-art on benchmarks such as the PANGAEA 10% split, with dense features enabling simple decoders | Performance degrades with limited labels; rely on larger task-specific decoders |
Use Case: Snow Cover Fraction Regression
THOR demonstrated superior performance on a data-scarce climate task: snow cover fraction regression from Sentinel-3 SLSTR data. By using smaller patches at inference (4x4 instead of 16x16), THOR-B reduced RMSE from 14.0 to 9.90, confirming that dense token sequences are beneficial even for coarse-resolution data. A simple linear decoder with 4x4 patches matched a much larger UperNet (9.88 RMSE), validating THOR's data efficiency and its potential for minimal-adaptation deployment.
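The "linear decoder on dense tokens" idea can be illustrated with a least-squares probe over frozen features. Everything below is synthetic: the feature matrix, tile sizes, and targets are invented to show the mechanics, not THOR's actual features or SLSTR data.

```python
import numpy as np

rng = np.random.default_rng(0)

# With 4x4 patches, a 64-px tile yields a 16x16 token grid (256 tokens)
# instead of the 4x4 grid (16 tokens) produced by 16x16 patches.
dense_tokens = (64 // 4) ** 2
n_tiles, dim = 32, 8

# Synthetic stand-ins for frozen THOR token features and per-token
# snow-cover-fraction targets (linear by construction).
X = rng.normal(size=(n_tiles * dense_tokens, dim))
w_true = rng.normal(size=dim)
y = X @ w_true

# A linear probe (least squares) standing in for a heavy UperNet decoder.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
rmse = float(np.sqrt(np.mean((X @ w_hat - y) ** 2)))
```

Because the probe has only `dim` parameters per output, it needs far fewer labels to fit than a multi-stage decoder, which is the data-efficiency argument the result above supports.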
Calculate Your Potential ROI with THOR
See how THOR's capabilities can translate into tangible efficiencies and cost savings for your enterprise.
Your Path to AI-Powered EO: Implementation Roadmap
A structured approach to integrating THOR's advanced capabilities into your enterprise workflows.
Discovery & Strategy
Initial consultation to understand your specific EO challenges, data landscape, and strategic objectives. We identify key use cases for THOR and define success metrics.
Pilot & Integration Planning
Develop a tailored pilot program leveraging THOR's flexible architecture with a subset of your data. This phase includes API integration planning, data security, and compliance alignment.
Deployment & Customization
Full-scale deployment of THOR, optimized for your infrastructure. Includes fine-tuning the model with your proprietary data and custom decoder development for specific, high-resolution tasks.
Optimization & Scaling
Continuous monitoring of THOR's performance, iterative optimization based on feedback, and scaling solutions to new geographic areas or additional sensor modalities as your needs evolve.
Ready to Transform Your Earth Observation Capabilities?
Book a consultation with our expert team to explore how THOR can be custom-fitted to your enterprise's unique needs and drive unprecedented efficiency.