---
title: GeoCalib
app_file: gradio_app.py
sdk: gradio
sdk_version: 4.38.1
---
<p align="center">
<h1 align="center"><ins>GeoCalib</ins> 馃摳<br>Single-image Calibration with Geometric Optimization</h1>
<p align="center">
<a href="https://www.linkedin.com/in/alexander-veicht/">Alexander Veicht</a>
·
<a href="https://psarlin.com/">Paul-Edouard Sarlin</a>
·
<a href="https://www.linkedin.com/in/philipplindenberger/">Philipp Lindenberger</a>
·
<a href="https://www.microsoft.com/en-us/research/people/mapoll/">Marc Pollefeys</a>
</p>
<h2 align="center">
<p>ECCV 2024</p>
<a href="" align="center">Paper</a> | <!--TODO: update link-->
<a href="https://colab.research.google.com/drive/1oMzgPGppAPAIQxe-s7SRd_q8r7dVfnqo#scrollTo=etdzQZQzoo-K" align="center">Colab</a> |
<a href="https://huggingface.co/spaces/veichta/GeoCalib" align="center">Demo 馃</a>
</h2>
</p>
<p align="center">
<a href=""><img src="assets/teaser.gif" alt="example" width=80%></a> <!--TODO: update link-->
<br>
<em>
GeoCalib accurately estimates the camera intrinsics and gravity direction from a single image
<br>
by combining geometric optimization with deep learning.
</em>
</p>
##
GeoCalib is an algorithm for single-image calibration: it estimates the camera intrinsics and gravity direction from a single image only. By combining geometric optimization with deep learning, GeoCalib provides more flexible and accurate calibration than previous approaches. This repository hosts the [inference](#setup-and-demo), [evaluation](#evaluation), and [training](#training) code for GeoCalib, as well as instructions to download our training set [OpenPano](#openpano-dataset).
## Setup and demo
[Open In Colab](https://colab.research.google.com/drive/1oMzgPGppAPAIQxe-s7SRd_q8r7dVfnqo#scrollTo=etdzQZQzoo-K)
[Hugging Face Demo](https://huggingface.co/spaces/veichta/GeoCalib)
We provide a small inference package [`geocalib`](geocalib) that requires only minimal dependencies and Python >= 3.9. Either clone the repository and install it locally, or install directly from GitHub:
```bash
git clone https://github.com/cvg/GeoCalib.git && cd GeoCalib
python -m pip install -e .
# OR
python -m pip install -e "git+https://github.com/cvg/GeoCalib#egg=geocalib"
```
Here is a minimal usage example:
```python
import torch

from geocalib import GeoCalib
device = "cuda" if torch.cuda.is_available() else "cpu"
model = GeoCalib().to(device)
# load image as tensor in range [0, 1] with shape [C, H, W]
img = model.load_image("path/to/image.jpg").to(device)
result = model.calibrate(img)
print("camera:", result["camera"])
print("gravity:", result["gravity"])
```
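The result is a dictionary whose `camera` and `gravity` entries hold the estimated parameters. As a minimal sketch of how to read them out (the attribute names `K`, `f`, and `rp` are assumptions about the returned objects; check the package source if they differ):
```python
camera, gravity = result["camera"], result["gravity"]
print(camera.K)    # 3x3 intrinsics matrix (assumed attribute name)
print(camera.f)    # focal length in pixels (assumed attribute name)
print(gravity.rp)  # roll and pitch angles (assumed attribute name)
```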
When either the intrinsics or the gravity are already known, they can be provided:
```python
# known intrinsics:
result = model.calibrate(img, priors={"focal": focal_length_tensor})
# known gravity:
result = model.calibrate(img, priors={"gravity": gravity_direction_tensor})
```
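For reference, the prior tensors could be constructed as follows. This is a minimal sketch assuming the focal length is given in pixels and the gravity direction as a unit 3-vector in the camera frame; verify these conventions against the package before relying on them:
```python
import torch

# focal length in pixels (assumed convention)
focal_length_tensor = torch.tensor(500.0)
# gravity direction as a unit 3-vector in the camera frame (assumed convention)
gravity_direction_tensor = torch.tensor([0.0, 1.0, 0.0])
```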
The default model is optimized for pinhole images. To handle lens distortion, use the following:
```python
model = GeoCalib(weights="distorted") # default is "pinhole"
result = model.calibrate(img, camera_model="simple_radial") # or pinhole, simple_divisional
```
Check out our [demo notebook](demo.ipynb) for a full working example.
<details>
<summary><b>[Interactive demo for your webcam - click to expand]</b></summary>
Run the following command:
```bash
python -m geocalib.interactive_demo --camera_id 0
```
The demo will open a window showing the camera feed and the calibration results. If `--camera_id` is not provided, the demo will ask for the IP address of a [droidcam](https://droidcam.app) camera.
Controls:
>Toggle the different features using the following keys:
>
>- ```h```: Show the estimated horizon line
>- ```u```: Show the estimated up-vectors
>- ```l```: Show the estimated latitude heatmap
>- ```c```: Show the confidence heatmap for the up-vectors and latitudes
>- ```d```: Show the undistorted image (overrides the other features)
>- ```g```: Show a virtual grid of points
>- ```b```: Show a virtual box object
>
>Change the camera model using the following keys:
>
>- ```1```: Pinhole -> Simple and fast
>- ```2```: Simple Radial -> For small distortions
>- ```3```: Simple Divisional -> For large distortions
>
>Press ```q``` to quit the demo.
</details>
<details>
<summary><b>[Load GeoCalib with torch hub - click to expand]</b></summary>
```python
model = torch.hub.load("cvg/GeoCalib", "GeoCalib", trust_repo=True)
```
</details>
## Evaluation
The full evaluation and training code is provided in the single-image calibration library [`siclib`](siclib), which can be installed with:
```bash
python -m pip install -e siclib
```
Running the evaluation commands will write the results to `outputs/results/`.
### LaMAR
Running the evaluation commands will download the dataset to ```data/lamar2k```, which will take around 400 MB of disk space.
<details>
<summary>[Evaluate GeoCalib]</summary>
To evaluate GeoCalib trained on the OpenPano dataset, run:
```bash
python -m siclib.eval.lamar2k --conf geocalib-pinhole --tag geocalib --overwrite
```
</details>
<details>
<summary>[Evaluate DeepCalib]</summary>
To evaluate DeepCalib trained on the OpenPano dataset, run:
```bash
python -m siclib.eval.lamar2k --conf deepcalib --tag deepcalib --overwrite
```
</details>
<details>
<summary>[Evaluate Perspective Fields]</summary>
Coming soon!
</details>
<details>
<summary>[Evaluate UVP]</summary>
To evaluate UVP, install the [VP-Estimation-with-Prior-Gravity](https://github.com/cvg/VP-Estimation-with-Prior-Gravity) under ```third_party/VP-Estimation-with-Prior-Gravity```. Then run:
```bash
python -m siclib.eval.lamar2k --conf uvp --tag uvp --overwrite data.preprocessing.edge_divisible_by=null
```
</details>
<details>
<summary>[Evaluate your own model]</summary>
If you have trained your own model, you can evaluate it by running:
```bash
python -m siclib.eval.lamar2k --checkpoint <experiment name> --tag <eval name> --overwrite
```
</details>
<details>
<summary>[Results]</summary>
Here are the Area Under the Curve (AUC) values for the roll, pitch, and field of view (FoV) errors at 1/5/10 degrees for the different methods:
| Approach | Roll | Pitch | FoV |
| --------- | ------------------ | ------------------ | ------------------ |
| DeepCalib | 44.1 / 73.9 / 84.8 | 10.8 / 28.3 / 49.8 | 0.7 / 13.0 / 24.0 |
| ParamNet  | 51.7 / 77.0 / 86.0 | 27.0 / 52.7 / 70.2 | 2.8 / 6.8 / 14.3   |
| UVP | 72.7 / 81.8 / 85.7 | 42.3 / 59.9 / 69.4 | 15.6 / 30.6 / 43.5 |
| GeoCalib | 86.4 / 92.5 / 95.0 | 55.0 / 76.9 / 86.2 | 19.1 / 41.5 / 60.0 |
</details>
### MegaDepth
Running the evaluation commands will download the dataset to ```data/megadepth2k``` or ```data/megadepth2k-radial```, which will take around 2.1 GB and 1.47 GB of disk space, respectively.
<details>
<summary>[Evaluate GeoCalib]</summary>
To evaluate GeoCalib trained on the OpenPano dataset, run:
```bash
python -m siclib.eval.megadepth2k --conf geocalib-pinhole --tag geocalib --overwrite
```
To run the eval on the radial distorted images, run:
```bash
python -m siclib.eval.megadepth2k_radial --conf geocalib-pinhole --tag geocalib --overwrite model.camera_model=simple_radial
```
</details>
<details>
<summary>[Evaluate DeepCalib]</summary>
To evaluate DeepCalib trained on the OpenPano dataset, run:
```bash
python -m siclib.eval.megadepth2k --conf deepcalib --tag deepcalib --overwrite
```
</details>
<details>
<summary>[Evaluate Perspective Fields]</summary>
Coming soon!
</details>
<details>
<summary>[Evaluate UVP]</summary>
To evaluate UVP, install the [VP-Estimation-with-Prior-Gravity](https://github.com/cvg/VP-Estimation-with-Prior-Gravity) under ```third_party/VP-Estimation-with-Prior-Gravity```. Then run:
```bash
python -m siclib.eval.megadepth2k --conf uvp --tag uvp --overwrite data.preprocessing.edge_divisible_by=null
```
</details>
<details>
<summary>[Evaluate your own model]</summary>
If you have trained your own model, you can evaluate it by running:
```bash
python -m siclib.eval.megadepth2k --checkpoint <experiment name> --tag <eval name> --overwrite
```
</details>
<details>
<summary>[Results]</summary>
Here are the Area Under the Curve (AUC) values for the roll, pitch, and field of view (FoV) errors at 1/5/10 degrees for the different methods:
| Approach | Roll | Pitch | FoV |
| --------- | ------------------ | ------------------ | ------------------ |
| DeepCalib | 34.6 / 65.4 / 79.4 | 11.9 / 27.8 / 44.8 | 5.6 / 12.1 / 22.9 |
| ParamNet | 43.4 / 70.7 / 82.2 | 15.4 / 34.5 / 53.3 | 3.2 / 10.1 / 21.3 |
| UVP | 69.2 / 81.6 / 86.9 | 21.6 / 36.2 / 47.4 | 8.2 / 18.7 / 29.8 |
| GeoCalib | 82.6 / 90.6 / 94.0 | 32.4 / 53.3 / 67.5 | 13.6 / 31.7 / 48.2 |
</details>
### TartanAir
Running the evaluation commands will download the dataset to ```data/tartanair```, which will take around 1.85 GB of disk space.
<details>
<summary>[Evaluate GeoCalib]</summary>
To evaluate GeoCalib trained on the OpenPano dataset, run:
```bash
python -m siclib.eval.tartanair --conf geocalib-pinhole --tag geocalib --overwrite
```
</details>
<details>
<summary>[Evaluate DeepCalib]</summary>
To evaluate DeepCalib trained on the OpenPano dataset, run:
```bash
python -m siclib.eval.tartanair --conf deepcalib --tag deepcalib --overwrite
```
</details>
<details>
<summary>[Evaluate Perspective Fields]</summary>
Coming soon!
</details>
<details>
<summary>[Evaluate UVP]</summary>
To evaluate UVP, install the [VP-Estimation-with-Prior-Gravity](https://github.com/cvg/VP-Estimation-with-Prior-Gravity) under ```third_party/VP-Estimation-with-Prior-Gravity```. Then run:
```bash
python -m siclib.eval.tartanair --conf uvp --tag uvp --overwrite data.preprocessing.edge_divisible_by=null
```
</details>
<details>
<summary>[Evaluate your own model]</summary>
If you have trained your own model, you can evaluate it by running:
```bash
python -m siclib.eval.tartanair --checkpoint <experiment name> --tag <eval name> --overwrite
```
</details>
<details>
<summary>[Results]</summary>
Here are the Area Under the Curve (AUC) values for the roll, pitch, and field of view (FoV) errors at 1/5/10 degrees for the different methods:
| Approach | Roll | Pitch | FoV |
| --------- | ------------------ | ------------------ | ------------------ |
| DeepCalib | 24.7 / 55.4 / 71.5 | 16.3 / 38.8 / 58.5 | 1.5 / 8.8 / 27.2 |
| ParamNet | 34.5 / 59.2 / 73.9 | 19.4 / 42.0 / 60.3 | 6.0 / 16.8 / 31.6 |
| UVP | 52.1 / 64.8 / 71.9 | 36.2 / 48.8 / 58.6 | 15.8 / 25.8 / 35.7 |
| GeoCalib | 71.3 / 83.8 / 89.8 | 38.2 / 62.9 / 76.6 | 14.1 / 30.4 / 47.6 |
</details>
### Stanford2D3D
Before downloading and running the evaluation, you will need to agree to the [terms of use](https://docs.google.com/forms/d/e/1FAIpQLScFR0U8WEUtb7tgjOhhnl31OrkEs73-Y8bQwPeXgebqVKNMpQ/viewform?c=0&w=1) for the Stanford2D3D dataset.
Running the evaluation commands will download the dataset to ```data/stanford2d3d```, which will take around 885 MB of disk space.
<details>
<summary>[Evaluate GeoCalib]</summary>
To evaluate GeoCalib trained on the OpenPano dataset, run:
```bash
python -m siclib.eval.stanford2d3d --conf geocalib-pinhole --tag geocalib --overwrite
```
</details>
<details>
<summary>[Evaluate DeepCalib]</summary>
To evaluate DeepCalib trained on the OpenPano dataset, run:
```bash
python -m siclib.eval.stanford2d3d --conf deepcalib --tag deepcalib --overwrite
```
</details>
<details>
<summary>[Evaluate Perspective Fields]</summary>
Coming soon!
</details>
<details>
<summary>[Evaluate UVP]</summary>
To evaluate UVP, install the [VP-Estimation-with-Prior-Gravity](https://github.com/cvg/VP-Estimation-with-Prior-Gravity) under ```third_party/VP-Estimation-with-Prior-Gravity```. Then run:
```bash
python -m siclib.eval.stanford2d3d --conf uvp --tag uvp --overwrite data.preprocessing.edge_divisible_by=null
```
</details>
<details>
<summary>[Evaluate your own model]</summary>
If you have trained your own model, you can evaluate it by running:
```bash
python -m siclib.eval.stanford2d3d --checkpoint <experiment name> --tag <eval name> --overwrite
```
</details>
<details>
<summary>[Results]</summary>
Here are the Area Under the Curve (AUC) values for the roll, pitch, and field of view (FoV) errors at 1/5/10 degrees for the different methods:
| Approach | Roll | Pitch | FoV |
| --------- | ------------------ | ------------------ | ------------------ |
| DeepCalib | 33.8 / 63.9 / 79.2 | 21.6 / 46.9 / 65.7 | 8.1 / 20.6 / 37.6 |
| ParamNet | 44.6 / 73.9 / 84.8 | 29.2 / 56.7 / 73.1 | 5.8 / 14.3 / 27.8 |
| UVP | 65.3 / 74.6 / 79.1 | 51.2 / 63.0 / 69.2 | 22.2 / 39.5 / 51.3 |
| GeoCalib | 83.1 / 91.8 / 94.8 | 52.3 / 74.8 / 84.6 | 17.4 / 40.0 / 59.4 |
</details>
### Evaluation options
To provide priors during the evaluation, add one or more of the following flags:
```bash
python -m siclib.eval.<benchmark> --conf <config> \
--tag <tag> \
data.use_prior_focal=true \
data.use_prior_gravity=true \
data.use_prior_k1=true
```
<details>
<summary>[Visual inspection]</summary>
To visually inspect the results of the evaluation, you can run the following command:
```bash
python -m siclib.eval.inspect <benchmark> <one or multiple tags>
```
For example, to inspect the results of the evaluation of the GeoCalib model on the LaMAR dataset, you can run:
```bash
python -m siclib.eval.inspect lamar2k geocalib
```
</details>
## OpenPano Dataset
The OpenPano dataset is a new dataset for single-image calibration, containing about 2.8k panoramas from various sources, namely [HDRMAPS](https://hdrmaps.com/hdris/), [PolyHaven](https://polyhaven.com/hdris), and the [Laval Indoor HDR dataset](http://hdrdb.com/indoor/#presentation). While this dataset is smaller than previous ones, it is publicly available and provides a better balance between indoor and outdoor scenes.
<details>
<summary>[Downloading and preparing the dataset]</summary>
In order to assemble the training set, first download the Laval dataset following the instructions on [the corresponding project page](http://hdrdb.com/indoor/#presentation) and place the panoramas in ```data/indoorDatasetCalibrated```. Then, tonemap the HDR images using the following command:
```bash
python -m siclib.datasets.utils.tonemapping --hdr_dir data/indoorDatasetCalibrated --out_dir data/laval-tonemap
```
We provide a script to download the PolyHaven and HDRMAPS panoramas. It will create the folders ```data/openpano/panoramas/{split}``` containing the panoramas listed in the ```{split}_panos.txt``` files. To run the script, execute the following command:
```bash
python -m siclib.datasets.utils.download_openpano --name openpano --laval_dir data/laval-tonemap
```
Alternatively, you can download the PolyHaven and HDRMAPS panos from [here](https://cvg-data.inf.ethz.ch/GeoCalib_ECCV2024/).
After downloading the panoramas, you can create the training set by running the following command:
```bash
python -m siclib.datasets.create_dataset_from_pano --config-name openpano
```
The dataset creation can be sped up by using multiple workers and a GPU. To do so, add the following arguments to the command:
```bash
python -m siclib.datasets.create_dataset_from_pano --config-name openpano n_workers=10 device=cuda
```
This will create the training set in ```data/openpano/openpano``` with about 37k images for training, 2.1k for validation, and 2.1k for testing.
<details>
<summary>[Distorted OpenPano]</summary>
To create the OpenPano dataset with radial distortion, run the following command:
```bash
python -m siclib.datasets.create_dataset_from_pano --config-name openpano_radial
```
</details>
</details>
## Training
As with the evaluation, the training code is provided in the single-image calibration library [`siclib`](siclib), which can be installed with:
```bash
python -m pip install -e siclib
```
Once the [OpenPano Dataset](#openpano-dataset) has been downloaded and prepared, you can train GeoCalib on it.
First, download the pre-trained weights for the [MSCAN-B](https://cloud.tsinghua.edu.cn/d/c15b25a6745946618462/) backbone:
```bash
mkdir weights
wget "https://cloud.tsinghua.edu.cn/d/c15b25a6745946618462/files/?p=%2Fmscan_b.pth&dl=1" -O weights/mscan_b.pth
```
Then, start the training with the following command:
```bash
python -m siclib.train geocalib-pinhole-openpano --conf geocalib --distributed
```
Feel free to use any other experiment name. By default, the checkpoints will be written to ```outputs/training/```. The default batch size is 24, which requires two 4090 GPUs with 24 GB of VRAM each. Configurations are managed by [Hydra](https://hydra.cc/) and can be overwritten from the command line.
For example, to train GeoCalib on a single GPU with a batch size of 5, run:
```bash
python -m siclib.train geocalib-pinhole-openpano \
--conf geocalib \
data.train_batch_size=5 # for 1x 2080 GPU
```
Be aware that this can impact the overall performance; you might need to adjust the learning rate and the number of training steps accordingly.
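For instance, a lower learning rate can be set through a Hydra override. The exact key name is an assumption here; check the training configuration for the actual one:
```bash
python -m siclib.train geocalib-pinhole-openpano \
    --conf geocalib \
    data.train_batch_size=5 \
    train.lr=1e-5  # assumed config key
```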
If you want to log the training progress to [tensorboard](https://www.tensorflow.org/tensorboard) or [wandb](https://wandb.ai/), you can set the ```train.writer``` option:
```bash
python -m siclib.train geocalib-pinhole-openpano \
--conf geocalib \
--distributed \
train.writer=tensorboard
```
The model can then be evaluated using its experiment name:
```bash
python -m siclib.eval.<benchmark> --checkpoint geocalib-pinhole-openpano \
--tag geocalib-retrained
```
<details>
<summary>[Training DeepCalib]</summary>
To train DeepCalib on the OpenPano dataset, run:
```bash
python -m siclib.train deepcalib-openpano --conf deepcalib --distributed
```
Make sure that you have generated the [OpenPano Dataset](#openpano-dataset) with radial distortion, or add the flag ```data=openpano``` to the command to train on the pinhole images.
</details>
<details>
<summary>[Training Perspective Fields]</summary>
Coming soon!
</details>
## BibTeX citation
If you use any ideas from the paper or code from this repo, please consider citing:
```bibtex
@inproceedings{veicht2024geocalib,
author = {Alexander Veicht and
Paul-Edouard Sarlin and
Philipp Lindenberger and
Marc Pollefeys},
title = {{GeoCalib: Single-image Calibration with Geometric Optimization}},
booktitle = {ECCV},
year = {2024}
}
```
## License
The code is provided under the [Apache-2.0 License](LICENSE) while the weights of the trained model are provided under the [Creative Commons Attribution 4.0 International Public License](https://creativecommons.org/licenses/by/4.0/legalcode). Thanks to the authors of the [Laval Indoor HDR dataset](http://hdrdb.com/indoor/#presentation) for allowing this.