	---
	license: openrail
	datasets:
	- DarthReca/crisislandmark
	language:
	- en
	library_name: torchgeo
	tags:
	- remote-sensing
	- text-to-image-retrieval
	- multimodal
	- geospatial
	- SAR
	- multispectral
	- crisis-management
	- earth-observation
	- contrastive-learning
	---
	# CLOSP

	CLOSP (Contrastive Language Optical SAR Pretraining) is a multimodal architecture designed for text-to-image retrieval.
	It creates a unified embedding space for text, Sentinel-2 (MSI), and Sentinel-1 (SAR) data.
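
As a rough illustration of what retrieval in a shared embedding space looks like, the sketch below ranks a small gallery of image embeddings against a text-query embedding. It assumes cosine similarity as the score (this card does not state the metric). The image ids and the three-dimensional vectors are hypothetical toy values, not outputs of the actual encoders.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_emb, gallery, top_k=3):
    """Return the top_k (image_id, score) pairs, best match first.

    `gallery` maps an image id to its embedding. Because SAR and MSI
    embeddings live in the same space, both can sit in one gallery.
    """
    scored = [
        (image_id, cosine_similarity(query_emb, emb))
        for image_id, emb in gallery.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings standing in for encoder outputs.
gallery = {
    "s2_flood": [0.9, 0.1, 0.0],   # Sentinel-2 (MSI) sample
    "s1_flood": [0.8, 0.2, 0.1],   # Sentinel-1 (SAR) sample
    "s2_forest": [0.0, 1.0, 0.0],  # unrelated Sentinel-2 sample
}
text_query = [1.0, 0.0, 0.0]  # stand-in for an encoded text query

results = retrieve(text_query, gallery, top_k=2)
```

In practice the gallery would be precomputed once with the visual encoders, so that only the text query has to be encoded at search time.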

	This repository contains the visual encoders, each stored separately in PyTorch format.

	## Model Details
	The model uses three separate encoders: one for text, one for Sentinel-1 (SAR) data, and one for Sentinel-2 (MSI) data.
	During training, it uses a contrastive objective to align the textual embeddings with the corresponding visual embeddings (either SAR or MSI).
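
The contrastive alignment described above can be sketched as follows. This is a minimal illustration that assumes a symmetric, CLIP-style InfoNCE loss with a temperature of 0.07. Both the exact loss form and the temperature are assumptions, since this card does not specify them. The function names are hypothetical, and the code uses plain Python lists rather than the project's training code.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct column for row i is i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(text_emb, image_emb, temperature=0.07):
    """Average of the text-to-image and image-to-text losses.

    Row i of `text_emb` is assumed to describe row i of `image_emb`,
    which may come from either the SAR or the MSI encoder.
    """
    text = [l2_normalize(t) for t in text_emb]
    image = [l2_normalize(v) for v in image_emb]
    # Pairwise cosine similarities, scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(t, v)) / temperature for v in image]
        for t in text
    ]
    logits_transposed = [list(column) for column in zip(*logits)]
    return 0.5 * (
        cross_entropy_diagonal(logits)
        + cross_entropy_diagonal(logits_transposed)
    )


# Toy batch: matched pairs give a low loss, mismatched pairs a high one.
texts = [[1.0, 0.0], [0.0, 1.0]]
matched_images = [[1.0, 0.0], [0.0, 1.0]]
mismatched_images = [[0.0, 1.0], [1.0, 0.0]]

loss_matched = symmetric_contrastive_loss(texts, matched_images)
loss_mismatched = symmetric_contrastive_loss(texts, mismatched_images)
```

Minimizing a loss of this kind pulls each caption toward its paired image and pushes it away from the other images in the batch, which is what makes a single text query comparable against both SAR and MSI embeddings.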


	- Developed by: Daniele Rege Cambrin
	- Model type: CLOSP
	- Language(s) (NLP): English
	- License: OpenRAIL
	- Repository: [GitHub](https://github.com/DarthReca/closp)
	- Paper: [ArXiv](https://arxiv.org/abs/2507.10403)

	## Citation

	```bibtex
	@misc{cambrin2025texttoremotesensingimageretrievalrgbsources,
	title={Text-to-Remote-Sensing-Image Retrieval beyond RGB Sources},
	author={Daniele Rege Cambrin and Lorenzo Vaiani and Giuseppe Gallipoli and Luca Cagliero and Paolo Garza},
	year={2025},
	eprint={2507.10403},
	archivePrefix={arXiv},
	primaryClass={cs.CV},
	url={https://arxiv.org/abs/2507.10403},
	}
	```