File size: 10,473 Bytes
553e0d4
 
 
 
48b72b5
553e0d4
 
 
48b72b5
 
7b97687
1a95382
7b97687
553e0d4
48b72b5
553e0d4
48b72b5
 
 
553e0d4
689ccbb
1a95382
689ccbb
553e0d4
48b72b5
 
 
 
 
553e0d4
48b72b5
 
 
f50d6ca
48b72b5
 
553e0d4
 
 
48b72b5
 
 
 
553e0d4
48b72b5
 
 
 
 
553e0d4
5b510f9
 
 
 
 
48b72b5
 
 
5b510f9
 
 
48b72b5
5b510f9
48b72b5
 
 
 
 
 
 
 
 
 
 
 
 
 
5b510f9
48b72b5
553e0d4
48b72b5
 
 
 
 
 
 
 
 
553e0d4
48b72b5
553e0d4
48b72b5
 
 
 
e1a79b8
48b72b5
e1a79b8
48b72b5
e1a79b8
48b72b5
e1a79b8
48b72b5
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
e1a79b8
48b72b5
 
 
 
 
 
553e0d4
e1a79b8
48b72b5
553e0d4
48b72b5
 
 
689ccbb
48b72b5
 
689ccbb
48b72b5
 
 
689ccbb
48b72b5
553e0d4
48b72b5
553e0d4
48b72b5
1a95382
48b72b5
553e0d4
48b72b5
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2aeec08
48b72b5
 
1a95382
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
---
license: mit
---

# AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities

[Guillaume Astruc](https://gastruc.github.io/), [Nicolas Gonthier](https://ngonthier.github.io/), [Clement Mallet](https://www.umr-lastig.fr/clement-mallet/), [Loic Landrieu](https://loiclandrieu.com/)  

For more details and results, please check out our [github](https://github.com/gastruc/AnySat) and [project page](https://gastruc.github.io/projects/omnisat.html).

<p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/662b7fba68ed7bbf40bfb0df/m1IY9HfCD8NAeykZxqWSb.png" alt="AnySat Architecture" width="500">
</p>

# Abstract

**AnySat** is a versatile Earth Observation model designed to handle diverse data across resolutions, scales, and modalities. Using a **scale-adaptive joint embedding predictive architecture** (JEPA), AnySat can train in a self-supervised manner on highly heterogeneous datasets.

We train a single AnySat model on **GeoPlex**, a collection of 5 multimodal datasets spanning 11 sensors with varying characteristics. In fine-tuning or linear probing, AnySat achieves SOTA or near-SOTA performance for land cover segmentation, crop type classification, change detection, tree species identification, and flood mapping.

<p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/662b7fba68ed7bbf40bfb0df/mENiGjg5gfKH27vqr8xuB.png" alt="AnySat Teaser" width="500">
</p>

# Key Features
- 🌍 **Versatile Model**:Handles diverse datasets with resolutions spanning **3–11 channels**, tiles ranging from **0.3 to 2600 hectares**, and any combination of **11 sensors**.
- 🚀 **Simple to Use**: Install and download AnySat with a single line of code, select your desired modalities and patch size, and immediately generate rich features.
- 🦋 **Flexible Task Adaptation**: Supports fine-tuning and linear probing for tasks like **tile-wise classification** and **semantic segmentation**.
- 🧑‍🎓 **Multi-dataset Training**: Trains a single model across multiple datasets with varying characteristics.


# 🚀  Quickstart

Check out our [demo notebook](https://github.com/gastruc/AnySat/blob/main/demo.ipynb) or [huggingface page](https://huggingface.co/gastruc/anysat) for more details.

## Install and load Anysat

```python

import torch

AnySat = torch.hub.load('gastruc/anysat', 'anysat', pretrained=True, flash_attn=False)

```
Set `flash_attn=True` if you have [flash-attn](https://pypi.org/project/flash-attn/) module installed. It is not required and only impacts memory and speed.

## Format your data

Arrange your data in a dictionary with any of the following keys:

| Dataset       | Description                       | Tensor Size                                          | Channels                                  | Resolution |
|---------------|-----------------------------------|-----------------------------------------|-------------------------------------------|------------|
| aerial        | Single date tensor |Bx4xHxW                                              | RGB, NiR                                  | 0.2m       |
| aerial-flair  | Single date tensor |Bx5xHxW                                              | RGB, NiR, Elevation                       | 0.2m       |
| spot          | Single date tensor |Bx3xHxW                                              | RGB                                       | 1m         |
| naip          | Single date tensor |Bx4xHxW                                              | RGB                                       | 1.25m      |
| s2            | Time series tensor |BxTx10xHxW                                           | B2, B3, B4, B5, B6, B7, B8, B8a, B11, B12 | 10m        |
| s1-asc        | Time series tensor |BxTx2xHxW                                            | VV, VH                                    | 10m        |
| s1            | Time series tensor |BxTx3xHxW                                            | VV, VH, Ratio                             | 10m        |
| alos          | Time series tensor |BxTx3xHxW                                            | HH, HV, Ratio                             | 30m        |
| l7            | Time series tensor |BxTx6xHxW                                            | B1, B2, B3, B4, B5, B7                    | 30m        |
| l8            | Time series tensor |BxTx11xHxW                                           | B8, B1, B2, B3, B4, B5, B6, B7, B9, B10, B11 | 10m     |
| modis         | Time series tensor |BxTx7xHxW                                            | B1, B2, B3, B4, B5, B6, B7                | 250m       |
   
Note that time series requires a `_dates` companion tensor containing the day of the year: 01/01 = 0, 31/12=364.

**Example Input** for a tile of 60x60m and a batch size of B:

```python
data = {
    "aerial": ... #Tensor of size [B, 4, 300, 300] : 4 channels, 300x300 pixels at 20cm res
    "spot": ... #Tensor of size [B, 3, 60, 60]: 3 channels, 60x60 pixels at 1m res
    "s2": ... #Tensor of size [B, 12, 10, 6, 6] : 12 dates, 10 channels, 6x6 pixels at 10m res
    "s2_dates":  ... #Tensor of size [B, 12] : 12 dates
}
```
Ensure that the spatial extent of each modality multiplied by its resolution is consistent.

## Extract Features

Decide on:
- **Patch size** (in m, must be a multiple of 10): adjust according to the scale of your tiles and GPU memory. In general, avoid having more than 1024 patches per tile.
- **Output type**: Choose between:
  - `'tile'`: Single vector per tile
  - `'patch'`: A vector per patch
  - `'dense'`: A vector per sub-patch
  - `'all'`: a tuple with all three outputs
 
The sub patches are `1x1` pixels for time series and `10x10` pixels for VHR images. If using `output='dense'`, specify the `output_modality`.

Example use:
```python
features = AnySat(data, scale=10, output='tile') #tensor of size [D,]
features = AnySat(data, scale=10, output='patch') #tensor of size [D,6,6]
features = AnySat(data, scale=20, output='patch') #tensor of size [D,3,3]
features = AnySat(data, scale=20, output='dense', output_modality='aerial') #tensor of size [D,30,30]
```
**Explanation for the size of the dense map:** `d=10` for 'aerial' which has a 0.2m resolution, the sub-patches are 2x2 m.

# Advanced Installation

## Install from source

```bash
# clone project
git clone https://github.com/gastruc/anysat
cd anysat

# [OPTIONAL] create conda environment
conda create -n anysat python=3.9
conda activate anysat

# install requirements
pip install -r requirements.txt

# Create data folder where you can put your datasets
mkdir data
# Create logs folder
mkdir logs
```

## Run Locally

To load the model locally, you can use the following code:
```python

from hubconf import AnySat

AnySat = AnySat.from_pretrained('base', flash_attn=False) #Set flash_attn=True if you have flash-attn module installed
#For now, only base is available.
#device = "cuda" If you want to run on GPU default is cpu
```

Every experience of the paper has its config file. Feel free to explore `configs/exp` folder.

```bash
# Run AnySat pretraining on GeoPlex
python src/train.py exp=GeoPlex_AnySAT

# Run AnySat finetuning on BraDD-S1TS
python src/train.py exp=BraDD_AnySAT_FT

# Run AnySat linear probing on BraDD-S1TS
python src/train.py exp=BraDD_AnySAT_LP
```

# Supported Datasets

Our implementation already supports 9 datasets:

<p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/662b7fba68ed7bbf40bfb0df/nGGz8kiDdeTJqIPSdrlSz.png" alt="AnySat Datasets" width="500">
</p>

## GeoPlex Datasets

1. **TreeSatAI-TS**
   - **Description**: Multimodal dataset for tree species identification.
   - **Extent**: 50,381 tiles covering 180 km² with multi-label annotations across 20 classes.
   - **Modalities**: VHR images (0.2 m), Sentinel-2 time series, Sentinel-1 time series.
   - **Tasks**: Tree species classification.

2. **PASTIS-HD**
   - **Description**: Crop mapping dataset with delineated agricultural parcels.
   - **Extent**: 2,433 tiles covering 3986 km² with annotations across 18 crop types.
   - **Modalities**: SPOT6/7 VHR imagery (1.5 m), Sentinel-2 time series, Sentinel-1 time series.
   - **Tasks**: Classification, semantic segmentation, panoptic segmentation.

3. **FLAIR**
   - **Description**: Land cover dataset combining VHR aerial imagery with Sentinel-2 time series.
   - **Extent**: 77,762 tiles covering 815 km² with annotations across 13 land cover classes.
   - **Modalities**: VHR images (0.2 m), Sentinel-2 time series.
   - **Tasks**: Land cover mapping.

4. **PLANTED**
   - **Description**: Global forest dataset for tree species identification.
   - **Extent**: 1,346,662 tiles covering 33,120 km² with annotations across 40 classes.
   - **Modalities**: Sentinel-2, Landsat-7, MODIS, Sentinel-1, ALOS-2.
   - **Tasks**: Tree species classification.

5. **S2NAIP-URBAN**
   - **Description**: Urban dataset with high-resolution imagery and time series data.
   - **Extent**: 515,270 tiles covering 211,063 km² with NAIP, Sentinel-2, Sentinel-1, and Landsat-8/9 data.
   - **Modalities**: NAIP (1.25 m), Sentinel-2 time series, Sentinel-1 time series, Landsat-8/9.
   - **Tasks**: Pretraining only (no official labels).

## External Evaluation Datasets

1. **BraDD-S1TS**
   - **Description**: Change detection dataset for deforestation in the Amazon rainforest.
   - **Extent**: 13,234 tiles with Sentinel-1 time series.
   - **Tasks**: Change detection (deforestation segmentation).

2. **SICKLE**
   - **Description**: Multimodal crop mapping dataset from India.
   - **Extent**: 34,848 tiles with Sentinel-1, Sentinel-2, and Landsat-8 time series.
   - **Tasks**: Crop type classification (paddy/non-paddy).

3. **TimeSen2Crop**
   - **Description**: Crop mapping dataset from Slovenia.
   - **Extent**: 1,212,224 single-pixel Sentinel-2 time series.
   - **Tasks**: Crop type classification.
  
4. **Sen1Flood11**
   - **Description**: Flood mapping dataset with global scope.
   - **Extent**: 4.8K Sentinel-1/2 time series.
   - **Tasks**: Flood classification (flooded/ not flooded).
  
# Reference

Please use the following bibtex:
```bibtex
@article{astruc2024anysat,
  title={{AnySat: An Earth} Observation Model for Any Resolutions, Scales, and Modalities},
  author={Astruc, Guillaume and Gonthier, Nicolas and Mallet, Clement and Landrieu, Loic},
  journal={arXiv preprint arXiv:2412.14123},
  year={2024}
}
```