---
license: other
license_name: adobe-license
license_link: LICENSE
datasets:
- Major-TOM/Core-S2L2A
- Major-TOM/Core-DEM
---

<h1 align="center">MESA: Text-Driven Terrain Generation Using Latent Diffusion and Global Copernicus Data</h1>
<p align="center"><a href="https://www.linkedin.com/in/paul-bp-cs/" target="_blank">Paul Borne--Pons</a>, <a href="https://mikonvergence.github.io/" target="_blank">Mikolaj Czerkawski</a>, <a href="https://research.adobe.com/person/rosalie-martin/" target="_blank">Rosalie Martin</a>,
<a href="https://research.adobe.com/person/romain-rouffet/" target="_blank">Romain Rouffet</a></p>

<p align="center"><a href="https://sites.google.com/view/morse2025" target="_blank">CVPR 2025 Workshop MORSE</a></p>

MESA is a novel generative model based on latent denoising diffusion that generates 2.5D representations of terrain conditioned on natural-language text prompts. It produces two co-registered modalities: an optical image and a depth map. The model is a fine-tune of [stable-diffusion-2-1](https://huggingface.co/stabilityai/stable-diffusion-2-1) and builds upon Hugging Face’s [Diffusers](https://github.com/huggingface/diffusers) library.

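To make "2.5D" and "co-registered" concrete: because the optical image and the depth map share the same pixel grid, each pixel can be read as one colored 3D point. The sketch below is only an illustration on toy nested lists. The function name `heightmap_to_vertices`, the `pixel_size` parameter, and the list-based input format are assumptions made for this example and do not reflect the pipeline's actual output types.

```python
from typing import List, Sequence, Tuple

Color = Tuple[int, int, int]
Vertex = Tuple[float, float, float, Color]


def heightmap_to_vertices(
    rgb: Sequence[Sequence[Color]],
    dem: Sequence[Sequence[float]],
    pixel_size: float = 1.0,
) -> List[Vertex]:
    """Pair each pixel's color with its height to form (x, y, z, color) points.

    Hypothetical helper: `rgb` and `dem` are H x W nested sequences that
    must share the same shape (that is what co-registration guarantees).
    """
    if len(rgb) != len(dem) or any(len(r) != len(d) for r, d in zip(rgb, dem)):
        raise ValueError("rgb and dem must share the same height and width")

    vertices: List[Vertex] = []
    for row, (rgb_row, dem_row) in enumerate(zip(rgb, dem)):
        for col, (color, height) in enumerate(zip(rgb_row, dem_row)):
            # x and y come from the pixel position, z from the depth map
            vertices.append((col * pixel_size, row * pixel_size, float(height), color))
    return vertices
```

With real model outputs, the same idea would apply after converting the image and depth map to arrays of matching shape.
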
## Model Description

- **Paper:** [MESA: Text-Driven Terrain Generation Using Latent Diffusion and Global Copernicus Data](https://arxiv.org/abs/2504.07210)
- **GitHub:** <https://github.com/PaulBorneP/MESA>
- **Project page:** <https://paulbornep.github.io/mesa-terrain/>

## Installation

```sh
# Clone the repository
git clone https://github.com/PaulBorneP/MESA
cd MESA

# Install dependencies (using Python 3.11.12)
pip install -r requirements.txt
```

## Model Download

```sh
# Download the model weights from the Hugging Face Hub into ./weights
mkdir -p weights
huggingface-cli download NewtNewt/MESA --local-dir ./weights
```

## Usage

```python
import torch

from MESA.pipeline_terrain import TerrainDiffusionPipeline

# Load the downloaded weights in half precision and move the pipeline to the GPU
pipe = TerrainDiffusionPipeline.from_pretrained("./weights", torch_dtype=torch.float16)
pipe.to("cuda")

# Generate a co-registered optical image and DEM from a text prompt
prompt = "A sentinel-2 image of montane forests and mountains in Mexico in August"
image, dem = pipe(prompt, num_inference_steps=50, guidance_scale=7.5)
```

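The example prompt follows a recognizable pattern: "A sentinel-2 image of", then a land-cover description, a region, and a month. A small helper can keep prompts consistent when generating many terrains. This is a minimal sketch: the function name `build_prompt` and its parameters are hypothetical, and the assumption that the model responds well to other values in the same pattern is inferred from the single example above, not confirmed by this card.

```python
def build_prompt(description: str, region: str, month: str) -> str:
    """Fill the prompt pattern used in the usage example.

    Hypothetical helper: the pattern is inferred from the example prompt
    and is an assumption, not a documented requirement of the model.
    """
    parts = [part.strip() for part in (description, region, month)]
    if not all(parts):
        raise ValueError("description, region and month must be non-empty")
    description, region, month = parts
    return f"A sentinel-2 image of {description} in {region} in {month}"


# Reproduces the prompt from the usage example
prompt = build_prompt("montane forests and mountains", "Mexico", "August")
```

The resulting string can be passed to `pipe(...)` exactly like the hand-written prompt.
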
## Citation

```bibtex
@inproceedings{mesa2025,
  title={MESA: Text-Driven Terrain Generation Using Latent Diffusion and Global Copernicus Data},
  author={Paul Borne--Pons and Mikolaj Czerkawski and Rosalie Martin and Romain Rouffet},
  year={2025},
  booktitle={MORSE Workshop at CVPR 2025},
  eprint={2504.07210},
  url={https://arxiv.org/abs/2504.07210},
}
```

## Acknowledgements

This model is the product of a collaboration between [Φ-lab, European Space Agency (ESA)](https://philab.esa.int/) and [Adobe Research (Paris, France)](https://research.adobe.com/careers/paris/).