Update README.md
# M3LEO: A Multi-Modal Multi-Label Earth Observation Dataset
This repository contains information about the multi-modal, multi-label, wide-area Earth Observation (EO) datasets collated during the [2023 Frontier Development Lab](https://fdleurope.org/fdl-europe-2023).
It contains 17.5 TB of co-aligned, machine-learning-ready data tiles, spanning 9 EO datasets and 6 geographic regions.
For a preview of our work, check out the [M3LEO miniset](https://huggingface.co/M3LEO-miniset).
## Tile Definitions

Each data tile covers an area of 4480m x 4480m (448x448 pixels at 10m/pixel) and is labelled with a unique identifier based on its location.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6304c06eeb6d777a838eab63/1HlDk7tyHKT-SdA4PU2sQ.png)
## Areas of Interest

Our areas of interest (AOIs) span China, CONUS, Europe, the Middle East, Pakin, and South America. For each region, we randomly sample 5000 data tiles (3000 for training, and 1000 each for validation and testing) to create the M3LEO-miniset.
Each AOI has an associated `.geojson` file containing the geometry and identifier of each data tile.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6304c06eeb6d777a838eab63/PQGEWTDA7hl5pfh8nuCEZ.png)
## Train-Test-Validation Splits

For each geographic area, we provide `.csv` files with predefined train, test and validation splits, so that experiments are repeatable and comparable.
60% of tiles are allocated for training, 20% for validation, and 20% for testing.
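A split file could be grouped by subset along the following lines. This is a sketch under assumptions: the column names `identifier` and `split`, the split labels, and the sample rows are all hypothetical, since the layout of the `.csv` files is not described here.

```python
import csv
import io
from collections import defaultdict


def group_by_split(csv_text: str, id_column: str = "identifier",
                   split_column: str = "split") -> dict:
    """Group tile identifiers by split name (column names are hypothetical)."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row[split_column]].append(row[id_column])
    return dict(groups)


# Made-up rows that mirror the 60/20/20 allocation described above.
SAMPLE_CSV = """identifier,split
tile-0,train
tile-1,train
tile-2,train
tile-3,validation
tile-4,test
"""

splits = group_by_split(SAMPLE_CSV)
total = sum(len(ids) for ids in splits.values())
fractions = {name: len(ids) / total for name, ids in splits.items()}
print(fractions)
```

Checking the fractions this way is a quick sanity test that a downloaded split file matches the stated 60/20/20 allocation.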
## Temporal Coverage

M3LEO currently contains SAR and optical imagery from 2018 to 2020, and labelled datasets for 2020. Future iterations may extend the dataset to other years.
## Datasets

The M3LEO dataset spans 9 diverse EO data types, covering input EO imagery and associated labels.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6304c06eeb6d777a838eab63/J7N6n5Cs3IRRmqQ6NLM_H.png)
#### Synthetic Aperture Radar Datasets

- [`s1grd-2020`](https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S1_GRD): Sentinel-1 SAR GRD: C-band Synthetic Aperture Radar Ground Range Detected, three channels (vv, vh, vv/vh) at 10m resolution, taken as the seasonal median (4 seasons per year) for both ascending and descending modes.

- [`gssic`](https://asf.alaska.edu/datasets/derived/global-seasonal-sentinel-1-interferometric-coherence-and-backscatter-dataset/): Global Seasonal Sentinel-1 Interferometric Coherence and Backscatter dataset at around 90m resolution.

- [`gunw-dateinit_dateend`](https://asf.alaska.edu/data-sets/derived-data-sets/sentinel-1-interferograms/): ARIA Sentinel-1 Geocoded Unwrapped Interferograms at 90m resolution. Within the [dateinit, dateend] period, the date with the most interferometric pairs is selected as the first date.
#### Optical Imagery

- [`s2rgbm-2020`](https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_SR_HARMONIZED): Harmonized Sentinel-2 Level 2A, three channels (red, green, blue) monthly cloudless median at 10m resolution.
#### Labelled Datasets

- [`biomass-2020`](https://climate.esa.int/en/projects/biomass/): ESA CCI Above Ground Biomass annual maps at 90m resolution.

- [`esaworldcover-2020`](https://developers.google.com/earth-engine/datasets/catalog/ESA_WorldCover_v100): ESA World Cover land cover maps at 10m resolution.

- [`modis44b006veg`](https://developers.google.com/earth-engine/datasets/catalog/MODIS_006_MOD44B): MODIS Vegetation Continuous Field annual maps at 250m resolution.

- [`ghsbuilts-2020`](https://human-settlement.emergency.copernicus.eu/download.php?ds=bu): EU JRC Global Human Settlement Layer Built-up Surface at 100m resolution.
#### Digital Elevation Model

- [`srtmdem`](https://developers.google.com/earth-engine/datasets/catalog/CGIAR_SRTM90_V4): NASA SRTM digital elevation model at 30m resolution.
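Because the source datasets have different native resolutions, a 4480m tile corresponds to a different number of native pixels per side for each of them. The sketch below only works through that arithmetic using the resolutions quoted in the lists above. It makes no claim about how the released tiles are actually resampled or stored, and the dictionary and function names are illustrative.

```python
TILE_SIDE_M = 4480  # 448 pixels at 10 m/pixel, as stated under Tile Definitions

# Nominal resolutions in metres, copied from the dataset descriptions above.
NOMINAL_RESOLUTION_M = {
    "s1grd-2020": 10,
    "gssic": 90,  # described as "around 90m"
    "gunw-dateinit_dateend": 90,
    "s2rgbm-2020": 10,
    "biomass-2020": 90,
    "esaworldcover-2020": 10,
    "modis44b006veg": 250,
    "ghsbuilts-2020": 100,
    "srtmdem": 30,
}


def native_pixels_per_side(resolution_m: float, tile_side_m: float = TILE_SIDE_M) -> float:
    """Return how many native-resolution pixels span one tile side."""
    return tile_side_m / resolution_m


if __name__ == "__main__":
    for name, resolution in NOMINAL_RESOLUTION_M.items():
        pixels = native_pixels_per_side(resolution)
        print(f"{name}: ~{pixels:.1f} native pixels per tile side")
```

For example, a 10m product fills the full 448-pixel side, while a 250m product contributes fewer than 18 native pixels across the same distance.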
# Acknowledgements
This work has been enabled by [Frontier Development Lab Europe](https://fdleurope.org), a public-private partnership between the European Space Agency (ESA), Trillium Technologies, the University of Oxford and leaders in commercial AI, supported by Google Cloud and NVIDIA, developing open science for all humankind.