arxiv:2311.00566

CROMA: Remote Sensing Representations with Contrastive Radar-Optical Masked Autoencoders

Published on Nov 1, 2023

Abstract

A vital and rapidly growing application, remote sensing offers vast yet sparsely labeled, spatially aligned multimodal data; this makes self-supervised learning algorithms invaluable. We present CROMA: a framework that combines contrastive and reconstruction self-supervised objectives to learn rich unimodal and multimodal representations. Our method separately encodes masked-out multispectral optical and synthetic aperture radar samples -- aligned in space and time -- and performs cross-modal contrastive learning. Another encoder fuses these sensors, producing joint multimodal encodings that are used to predict the masked patches via a lightweight decoder. We show that these objectives are complementary when leveraged on spatially aligned multimodal data. We also introduce X- and 2D-ALiBi, which spatially biases our cross- and self-attention matrices. These strategies improve representations and allow our models to effectively extrapolate to images up to 17.6x larger at test-time. CROMA outperforms the current SoTA multispectral model, evaluated on: four classification benchmarks -- finetuning (avg. 1.8%), linear (avg. 2.4%) and nonlinear (avg. 1.4%) probing, kNN classification (avg. 3.5%), and K-means clustering (avg. 8.4%); and three segmentation benchmarks (avg. 6.4%). CROMA's rich, optionally multimodal representations can be widely leveraged across remote sensing applications.
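
The abstract says that X- and 2D-ALiBi spatially bias the attention matrices but gives no formula. As a rough illustration only, the sketch below shows one plausible way a distance-based attention bias over a 2D patch grid could be built. It rests on several assumptions that the abstract does not confirm: that the bias is the negative Euclidean distance between patch positions, that it is scaled by a per-head slope, and that the slopes follow the geometric schedule of the original 1D ALiBi. The function names `patch_coords`, `head_slopes`, and `distance_bias` are hypothetical and do not come from the CROMA code.

```python
import math


def patch_coords(grid_size):
    """Row-major (row, col) coordinates of a grid_size x grid_size patch grid."""
    return [(r, c) for r in range(grid_size) for c in range(grid_size)]


def head_slopes(num_heads):
    """Per-head slopes 2^(-8 * i / n) for i = 1..n.

    Assumption: this reuses the geometric schedule of the original 1D ALiBi;
    the schedule CROMA actually uses is not stated in the abstract.
    """
    return [2.0 ** (-8.0 * (i + 1) / num_heads) for i in range(num_heads)]


def distance_bias(grid_size, num_heads):
    """Build a [num_heads][num_patches][num_patches] attention bias.

    Each entry is -slope * Euclidean distance between the query patch and the
    key patch (an assumed form). The bias would be added to the attention
    logits before the softmax, so distant patches receive a larger penalty.
    """
    coords = patch_coords(grid_size)
    return [
        [[-slope * math.dist(q, k) for k in coords] for q in coords]
        for slope in head_slopes(num_heads)
    ]


if __name__ == "__main__":
    bias = distance_bias(grid_size=3, num_heads=4)
    # Shape: 4 heads, 9 query patches, 9 key patches.
    print(len(bias), len(bias[0]), len(bias[0][0]))
```

Because a bias of this kind depends only on patch positions and has no learned parameters tied to a fixed grid size, it can be recomputed for any grid at test time, which fits the extrapolation to larger images that the abstract reports. When the optical and radar patches are spatially aligned, the same distance matrix could in principle serve cross-attention between the two modalities as well; whether X-ALiBi works this way is also an assumption here.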
