---
license: other
license_name: adobe-license
license_link: LICENSE
datasets:
- Major-TOM/Core-S2L2A
- Major-TOM/Core-DEM
---

<h1 align="center">MESA: Text-Driven Terrain Generation Using Latent Diffusion and Global Copernicus Data </h1>
<p align="center"><a href="https://www.linkedin.com/in/paul-bp-cs/" target="_blank">Paul Borne--Pons</a>, <a href="https://mikonvergence.github.io/" target="_blank">Mikolaj Czerkawski</a>, <a href="https://research.adobe.com/person/rosalie-martin/" target="_blank">Rosalie Martin</a>,
<a href="https://research.adobe.com/person/romain-rouffet/" target="_blank">Romain Rouffet</a></p>

<p align="center"><a href="https://sites.google.com/view/morse2025" target="_blank">CVPR 2025 Workshop MORSE</a> </p>

MESA is a novel generative model based on latent denoising diffusion that generates 2.5D representations of terrain from natural-language text prompts. The model produces two co-registered modalities: optical imagery and depth maps. It is a fine-tune of [stable-diffusion-2-1](https://huggingface.co/stabilityai/stable-diffusion-2-1) and builds upon Hugging Face’s [Diffusers](https://github.com/huggingface/diffusers) library.

## Model Description
- **Paper:** [MESA: Text-Driven Terrain Generation Using Latent Diffusion and Global Copernicus Data](https://arxiv.org/abs/2504.07210)
- **Github:** <https://github.com/PaulBorneP/MESA>
- **Project page:** <https://paulbornep.github.io/mesa-terrain/>

## Installation
```sh
# Clone the repository
git clone https://github.com/PaulBorneP/MESA
cd MESA
# using python 3.11.12
pip install -r requirements.txt
```

## Model Download

```sh
mkdir weights
huggingface-cli download NewtNewt/MESA --local-dir ./weights
```
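The same download can also be scripted from Python. The sketch below is a hedged alternative to the CLI command above, not part of the MESA repository: it assumes the `huggingface_hub` package is installed, and the helper name `download_weights` is hypothetical. It uses `snapshot_download`, which fetches every file in the model repository into the given directory.

```python
from pathlib import Path

REPO_ID = "NewtNewt/MESA"  # model repository named in the CLI command above


def download_weights(local_dir: str = "./weights") -> str:
    """Download all files of the model repository into local_dir.

    Hypothetical helper; assumes huggingface_hub is installed.
    """
    # Imported lazily so this module can be loaded without the dependency.
    from huggingface_hub import snapshot_download

    Path(local_dir).mkdir(parents=True, exist_ok=True)
    # snapshot_download returns the path of the folder holding the files.
    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


if __name__ == "__main__":
    print(download_weights())
```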


## Usage
```python
import torch

from MESA.pipeline_terrain import TerrainDiffusionPipeline

# Load the weights downloaded to ./weights in half precision
pipe = TerrainDiffusionPipeline.from_pretrained("./weights", torch_dtype=torch.float16)
pipe.to("cuda")

# The pipeline returns the optical image and its co-registered elevation map (DEM)
prompt = "A sentinel-2 image of montane forests and mountains in Mexico in August"
image, dem = pipe(prompt, num_inference_steps=50, guidance_scale=7.5)
```
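The example prompt above names a sensor, a landscape, a country, and a month. When generating several terrains, it can be convenient to build prompts of that shape programmatically. The helper below is a minimal sketch: `build_prompt` is a hypothetical name, and the template is inferred from the single example prompt only, so other wordings may work just as well or better.

```python
def build_prompt(landscape: str, country: str, month: str) -> str:
    """Compose a prompt shaped like the usage example.

    Hypothetical helper: the template is an assumption inferred from
    the one example prompt in this README.
    """
    fields = [landscape.strip(), country.strip(), month.strip()]
    if not all(fields):
        raise ValueError("landscape, country and month must be non-empty")
    return "A sentinel-2 image of {} in {} in {}".format(*fields)


# Reproduces the prompt used in the usage example.
prompt = build_prompt("montane forests and mountains", "Mexico", "August")
print(prompt)
```

The resulting string can be passed to `pipe(...)` exactly as in the usage example.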

## Citation
```bibtex
@inproceedings{mesa2025,
  title={MESA: Text-Driven Terrain Generation Using Latent Diffusion and Global Copernicus Data},
  author={Paul Borne--Pons and Mikolaj Czerkawski and Rosalie Martin and Romain Rouffet},
  year={2025},
  booktitle={MORSE Workshop at CVPR 2025},
  eprint={2504.07210},
  url={https://arxiv.org/abs/2504.07210},
}
```

## Acknowledgements

This model is the product of a collaboration between [Φ-lab, European Space Agency (ESA)](https://philab.esa.int/) and [Adobe Research (Paris, France)](https://research.adobe.com/careers/paris/).