Update README.md
README.md
CHANGED
@@ -15,6 +15,10 @@ This repository contains information about the multi-modal multi-label, wide are
 It contains around 40 TB of co-aligned machine learning ready data tiles, spanning 9 EO datasets and 6 geographic regions. For ease of access, the dataset has been compressed as parquet files.
 For a smaller (uncompressed) version of our dataset, check out the [M3LEO miniset](https://huggingface.co/M3LEO-miniset).
 
+# Decompression
+If you need to decompress the files, please see the main README at [the github repo](https://github.com/spaceml-org/M3LEO/).</br>
+If you want to use them directly from the parquet files, the original .tif/.nc files were read into the rows as [binary file data sources](https://spark.apache.org/docs/3.5.3/sql-data-sources-binaryFile.html)
+
 ## Tile Definitions
 
 Each data tile covers an area of 4480m x 4480m (448x448 pixels at 10m/pixel) and is labelled with a unique identifier based on location.
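The added Decompression section says the original .tif/.nc files were stored in the parquet rows as binary file data. As a minimal sketch of what using such a row directly might look like, the snippet below writes one row's bytes back out as a regular file. It assumes the rows keep the Spark binaryFile column names `path` and `content`, which the README does not confirm. `dump_row` is a hypothetical helper, and the example row uses made-up values.

```python
import tempfile
from pathlib import Path, PurePosixPath


def dump_row(row, out_dir):
    """Write one binaryFile-style row back to disk as a regular file.

    Assumption: `row` is a mapping with the Spark binaryFile columns
    `path` (str) and `content` (bytes). Check the dataset's actual
    column names before relying on this.
    """
    # Keep only the final path component, e.g. "tile.tif" from "file:/data/tile.tif".
    name = PurePosixPath(row["path"]).name
    target = Path(out_dir) / name
    target.write_bytes(row["content"])
    return target


# Illustrative row with made-up values, not real dataset content.
example_row = {"path": "file:/data/example_tile.tif", "content": b"II*\x00demo"}
out_dir = tempfile.mkdtemp()
restored = dump_row(example_row, out_dir)
```

In practice the rows could be obtained with a parquet reader such as `pyarrow.parquet.read_table(...).to_pylist()`, and the restored files opened with whichever GeoTIFF or NetCDF tooling is already in use. For large parquet files, reading in batches is likely preferable to loading a whole table at once.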