RDD: Robust Feature Detector and Descriptor using Deformable Transformer (CVPR 2025)
Gonglin Chen · Tianwen Fu · Haiwei Chen · Wenbin Teng · Hanyuan Xiao · Yajie Zhao
Table of Contents
- Updates
- Installation
- Usage
- Evaluation
- Training
- Citation
- License
- Acknowledgements
Updates
SfM reconstruction through COLMAP added. We provide a ready-to-use notebook with a simple example. Code adapted from hloc.
Training code and new weights released.
We have updated the training code compared to what was described in the paper. In the original setup, RDD was trained on the MegaDepth and Air-to-Ground datasets with all images resized to the training resolution. In this release, we retrained RDD on MegaDepth only, using a combination of resizing and cropping, a strategy used by ALIKE. This change significantly improves robustness.
| | MegaDepth-1500 | | | MegaDepth-View | | | Air-to-Ground | | |
|---|---|---|---|---|---|---|---|---|---|
| | AUC 5° | AUC 10° | AUC 20° | AUC 5° | AUC 10° | AUC 20° | AUC 5° | AUC 10° | AUC 20° |
| RDD-v2 | 52.4 | 68.5 | 80.1 | 52.0 | 67.1 | 78.2 | 45.8 | 58.6 | 71.0 |
| RDD-v1 | 48.2 | 65.2 | 78.3 | 38.3 | 53.1 | 65.6 | 41.4 | 56.0 | 67.8 |
| RDD-v2+LG | 53.3 | 69.8 | 82.0 | 59.0 | 74.2 | 84.0 | 54.8 | 69.0 | 79.1 |
| RDD-v1+LG | 52.3 | 68.9 | 81.8 | 54.2 | 69.3 | 80.3 | 55.1 | 68.9 | 78.9 |
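The resize-and-crop augmentation mentioned in the update above can be sketched roughly as follows. This is only an illustration of the idea, not the exact training pipeline; the default crop size, scale range, and interpolation mode below are assumptions.

```python
import random

import cv2
import numpy as np


def resize_and_crop(image: np.ndarray, train_res: int = 480, scale_range=(1.0, 1.5)) -> np.ndarray:
    """Resize so the shorter side is at least the training resolution, then take a random crop."""
    h, w = image.shape[:2]
    # Scale factor that maps the shorter side to train_res * (random factor in scale_range)
    scale = train_res * random.uniform(*scale_range) / min(h, w)
    resized = cv2.resize(image, (int(round(w * scale)), int(round(h * scale))),
                         interpolation=cv2.INTER_AREA)
    # Random crop back down to train_res x train_res
    rh, rw = resized.shape[:2]
    y0 = random.randint(0, rh - train_res)
    x0 = random.randint(0, rw - train_res)
    return resized[y0:y0 + train_res, x0:x0 + train_res]
```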
Installation
```bash
git clone --recursive https://github.com/xtcpete/rdd
cd RDD

# Create conda env
conda create -n rdd python=3.10 pip
conda activate rdd

# Install CUDA
conda install -c nvidia/label/cuda-11.8.0 cuda-toolkit

# Install torch
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu118

# Install all dependencies
pip install -r requirements.txt

# Compile custom operations
cd ./RDD/models/ops
pip install -e .
```
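A quick way to sanity-check the build (assuming the custom ops follow the Deformable DETR convention of installing a `MultiScaleDeformableAttention` extension; the module name is an assumption, adjust it if this repo's setup differs):

```python
import torch

# Confirm the CUDA build of PyTorch is visible
print(torch.__version__, torch.version.cuda, torch.cuda.is_available())

# The custom ops are assumed to install a CUDA extension named
# MultiScaleDeformableAttention, as in Deformable DETR.
try:
    import MultiScaleDeformableAttention  # noqa: F401
    print("Custom deformable attention ops compiled successfully.")
except ImportError as err:
    print("Custom ops not found:", err)
```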
We provide download links to:
- the MegaDepth-1500 test set
- the MegaDepth-View test set
- the Air-to-Ground test set
- two pretrained models: RDD and a LightGlue model trained for matching RDD features
Create the `data` folder and unzip the downloaded test data into it. Create the `weights` folder, add the downloaded weights, and you are ready to go.
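As an optional check (a hypothetical helper, not part of the repo), you can confirm both folders are in place before running the benchmarks:

```python
from pathlib import Path

# The benchmarks expect test data under ./data and model weights under ./weights
for folder in (Path("data"), Path("weights")):
    status = "found" if folder.is_dir() else "missing"
    print(f"{folder}/ {status}")
```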
Usage
For your convenience, we provide a ready-to-use notebook for some examples.
Inference
```python
import torch

from RDD.RDD import build

RDD_model = build()
output = RDD_model.extract(torch.randn(1, 3, 480, 640))
```
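To run extraction on a real image rather than random noise, something along the following lines should work; the example path, the BGR-to-RGB conversion, and the [0, 1] scaling are assumptions about the expected preprocessing, so check the provided notebook for the exact steps.

```python
import cv2
import torch

from RDD.RDD import build

RDD_model = build()

# Load an image and convert it to a 1x3xHxW float tensor in [0, 1]
image = cv2.imread("assets/example.jpg")  # hypothetical path
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
tensor = torch.from_numpy(image).float().permute(2, 0, 1).unsqueeze(0) / 255.0

output = RDD_model.extract(tensor)
print(type(output))  # inspect the returned keypoint/descriptor structure
```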
Evaluation
Please note that due to different GPU architectures and the stochastic nature of RANSAC, you may observe slightly different results; however, they should be very close to those reported in the paper. To reproduce the numbers in the paper, use the v1 weights instead.
Results can be visualized by passing the `--plot` argument.
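Run-to-run variance can be reduced somewhat by fixing the random seeds before evaluation. This is a general technique, not a flag of the benchmark scripts, and whether it removes all RANSAC nondeterminism depends on the estimator backend.

```python
import random

import cv2
import numpy as np
import torch


def seed_everything(seed: int = 0) -> None:
    """Seed the common RNGs used by the evaluation stack."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    cv2.setRNGSeed(seed)  # OpenCV's RNG, used by its RANSAC estimators
```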
MegaDepth-1500
```bash
# Sparse matching
python ./benchmarks/mega_1500.py

# Dense matching
python ./benchmarks/mega_1500.py --method dense

# LightGlue
python ./benchmarks/mega_1500.py --method lightglue
```
MegaDepth-View
```bash
# Sparse matching
python ./benchmarks/mega_view.py

# Dense matching
python ./benchmarks/mega_view.py --method dense

# LightGlue
python ./benchmarks/mega_view.py --method lightglue
```
Air-to-Ground
```bash
# Sparse matching
python ./benchmarks/air_ground.py

# Dense matching
python ./benchmarks/air_ground.py --method dense

# LightGlue
python ./benchmarks/air_ground.py --method lightglue
```
Training
- Download the MegaDepth dataset using download.sh and the megadepth_indices from LoFTR. The MegaDepth root folder should then look like the following:
```
./data/megadepth/megadepth_indices   # indices
./data/megadepth/depth_undistorted   # depth maps
./data/megadepth/Undistorted_SfM     # images and poses
./data/megadepth/scene_info          # indices for training LightGlue
```
- Then you can train RDD in two steps: the descriptor first

```bash
# distributed training with 8 gpus
python -m training.train --ckpt_save_path ./ckpt_descriptor --distributed --batch_size 32

# single gpu
python -m training.train --ckpt_save_path ./ckpt_descriptor
```

and then the detector

```bash
python -m training.train --ckpt_save_path ./ckpt_detector --weights ./ckpt_descriptor/RDD_best.pth --train_detector --training_res 480
```
We are working on re-collecting the Air-to-Ground dataset because of licensing issues.
Citation
```bibtex
@inproceedings{gonglin2025rdd,
  title     = {RDD: Robust Feature Detector and Descriptor using Deformable Transformer},
  author    = {Chen, Gonglin and Fu, Tianwen and Chen, Haiwei and Teng, Wenbin and Xiao, Hanyuan and Zhao, Yajie},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2025}
}
```
License
Acknowledgements
We thank these great repositories: ALIKE, LoFTR, DeDoDe, XFeat, LightGlue, Kornia, and Deformable DETR, and many other inspiring works in the community.
LightGlue is trained with Glue Factory.
Supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior/Interior Business Center (DOI/IBC) contract number 140D0423C0075. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DOI/IBC, or the U.S. Government. We would like to thank Yayue Chen for her help with visualization.