---
license: mit
---
|
|
|
# AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities (ArXiv 2024) |
|
|
|
[Guillaume Astruc](https://gastruc.github.io/), [Nicolas Gonthier](https://ngonthier.github.io/), [Clement Mallet](https://www.umr-lastig.fr/clement-mallet/), [Loic Landrieu](https://loiclandrieu.com/) |
|
|
|
|
|
Official models for [_AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities_](https://arxiv.org/pdf/2404.08351.pdf) |
|
|
|
## Abstract |
|
|
|
We introduce AnySat, a JEPA-based multimodal Earth Observation model trained simultaneously on diverse datasets with different scales, resolutions (spatial, spectral, temporal), and modality combinations.
|
|
|
|
|
For more details and results, please check out our [github](https://github.com/gastruc/AnySat) and [project page](https://gastruc.github.io/projects/omnisat.html). |
|
|
|
<p align="center"> |
|
<img src="https://github.com/gastruc/OmniSat/assets/1902679/9fc20951-1cac-4891-b67f-53ed5e0675ad" width="800" height="400"> |
|
</p> |
|
|
|
## Datasets |
|
|
|
|
|
| Dataset name | Modalities | Labels | Link |
| ------------ | ------------------------------------------ | ------------------- | ---- |
| PASTIS-HD    | **SPOT 6-7 (1m)** + S1/S2 (30-140 / year)  | Crop mapping (0.2m) | [huggingface](https://huggingface.co/datasets/IGNF/PASTIS-HD) or [zenodo](https://zenodo.org/records/10908628) |
| TreeSatAI-TS | Aerial (0.2m) + **S1/S2 (10-70 / year)**   | Forestry (60m)      | [huggingface](https://huggingface.co/datasets/IGNF/TreeSatAI-Time-Series) |
| FLAIR        | Aerial (0.2m) + S2 (20-114 / year)         | Land cover (0.2m)   | [huggingface](https://huggingface.co/datasets/IGNF/FLAIR) |
|
|
|
|
|
<p align="center"> |
|
<img src="https://github.com/user-attachments/assets/18acbb19-6c90-4c9a-be05-0af24ded2052" width="800" height="400"> |
|
</p> |
|
|
|
## Inference 🔥
|
|
|
To load one of our pretrained models, run:
|
|
|
```python
from models.huggingface import AnySat

# Load pretrained weights; "small" and "tiny" variants are also available
model = AnySat(size="base", pretrained=True)
```
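For pure feature extraction, you will typically switch the model to evaluation mode first. A minimal sketch, assuming the returned object behaves like a standard PyTorch `nn.Module`:

```python
# Illustrative usage: evaluation mode, so layers such as dropout behave
# deterministically when the model is used as a feature extractor
model = model.eval()
```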
|
|
|
To get features from a single observation or a batch of observations, provide the model with a dictionary whose keys are taken from the following list:
|
- "aerial": Single date tensor (Bx4xHxW) with 4 channels (RGB NiR), 0.2m resolution |
|
- "aerial-flair": Single date tensor (Bx5xHxW) with 5 channels (RGB NiR Elevation), 0.2m resolution |
|
- "spot": Single date tensor (Bx3xHxW) with 3 channels (RGB), 1m resolution |
|
- "naip": Single date tensor (Bx4xHxW) with 3 channels (RGB), 1.25m resolution |
|
- "s2": Time series tensor (BxTx10xHxW) with 10 channels (B2 B3 B4 B5 B6 B7 B8 B8a B11 B12), 10m resolution |
|
- "s1-asc": Time series tensor (BxTx2xHxW) with 2 channels (VV VH), 10m resolution |
|
- "s1": Time series tensor (BxTx3xHxW) with 3 channels (VV VH Ratio), 10m resolution |
|
- "alos": Time series tensor (BxTx3xHxW) with 3 channels (HH HV Ratio), 30m resolution |
|
- "l7": Time series tensor (BxTx6xHxW) with 6 channels (B1 B2 B3 B4 B5 B7), 30m resolution |
|
- "l8": Time series tensor (BxTx11xHxW) with 11 channels (B8 B1 B2 B3 B4 B5 B6 B7 B9 B10 B11), rescaled to 10m resolution |
|
- "modis": Time series tensor (BxTx7xHxW) with 7 channels (B1 B2 B3 B4 B5 B6 B7), 250m resolution |
|
|
|
Time series keys require an additional "{key}_dates" tensor (for example "s2_dates") of size BxT whose values are integers representing the day of the year of each acquisition, as in the sketch below.
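As an illustration, here is a minimal sketch of such a dictionary filled with random values, combining an aerial image and a Sentinel-2 time series. The batch size, number of dates, and tile size (here 100m x 100m) are arbitrary choices; only the key names, channel counts, and resolutions follow the list above:

```python
import torch

B = 2   # arbitrary batch size
T = 12  # arbitrary number of Sentinel-2 acquisitions

data = {
    # Single-date aerial image: B x 4 x H x W (RGB NiR) at 0.2m, i.e. 500 x 500 pixels for 100m
    "aerial": torch.randn(B, 4, 500, 500),
    # Sentinel-2 time series: B x T x 10 x H x W at 10m, i.e. 10 x 10 pixels for 100m
    "s2": torch.randn(B, T, 10, 10, 10),
    # Day of year of each Sentinel-2 acquisition, integers of shape B x T
    "s2_dates": torch.randint(1, 366, (B, T)),
}
```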
|
Then you have to choose the scale at which you want to produce features. The scale argument is in meters and corresponds to the desired patch size.
|
The output is the concatenation of a class token and a flattened feature map in which each feature encodes a scale x scale zone.
|
Then, you can run: |
|
|
|
```python
features = model(data, scale=scale)
```
|
|
|
And then you can apply those features to the desired downstream task! |
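For example, a simple linear probe on the class token could look like the sketch below. The feature dimension D, the number of classes, and the scale value are placeholders, and we assume the class token is the first token of the output, as described above:

```python
import torch
import torch.nn as nn

D = 768            # assumed feature dimension; check the configuration of the chosen model size
num_classes = 10   # placeholder for your downstream task

probe = nn.Linear(D, num_classes)

with torch.no_grad():
    features = model(data, scale=20)  # B x (1 + N) x D: class token then flattened feature map
cls_token = features[:, 0]            # B x D, assuming the class token comes first
logits = probe(cls_token)             # B x num_classes
```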
|
|
|
If you want to get a feature map at the density of a specific modality, you can specify:
|
|
|
```python
# modality is the name of the desired modality, e.g. "aerial"
features = model(data, scale=scale, keep_subpatch=True, modality_keep=modality)
```
|
|
|
Note that the returned features will then be of size 2*D. If several modalities share the desired resolution, pick the most informative one (or modify the code to also concatenate the other modalities).
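As a sketch of how these denser features could feed a downstream head, assuming D and the number of classes are placeholders and that the 2*D subpatch features sit in the last dimension of the output (check the repository for the exact layout):

```python
import torch
import torch.nn as nn

D = 768            # assumed feature dimension, as above
num_classes = 10   # placeholder for your downstream task

# Per-location head over the 2*D subpatch features described above
dense_head = nn.Linear(2 * D, num_classes)

with torch.no_grad():
    dense_features = model(data, scale=20, keep_subpatch=True, modality_keep="aerial")
dense_logits = dense_head(dense_features)  # applied over the last (feature) dimension
```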
|
|
|
To reproduce results, add new modalities, or run more experiments, see the full code on [github](https://github.com/gastruc/AnySat).
|
|
|
## Citing 💫
|
|
|
```bibtex
@article{astruc2024anysat,
  title   = {AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities},
  author  = {Astruc, Guillaume and Gonthier, Nicolas and Mallet, Clement and Landrieu, Loic},
  journal = {arXiv preprint},
  year    = {2024}
}
```
|
|