---
license: mit
tags:
- vision
pipeline_tag: depth-estimation
---

# ZoeDepth (fine-tuned on NYU and KITTI) 

ZoeDepth model fine-tuned on the NYU and KITTI datasets. It was introduced in the paper [ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth](https://arxiv.org/abs/2302.12288) by Bhat et al. and first released in [this repository](https://github.com/isl-org/ZoeDepth).

ZoeDepth extends the [DPT](https://huggingface.co/docs/transformers/en/model_doc/dpt) framework for metric (also called absolute) depth estimation, obtaining state-of-the-art results.

Disclaimer: The team releasing ZoeDepth did not write a model card for this model, so this model card has been written by the Hugging Face team.

## Model description

ZoeDepth adapts [DPT](https://huggingface.co/docs/transformers/en/model_doc/dpt), a model for relative depth estimation, to metric (also called absolute) depth estimation.

This means that the model predicts depth in physical units (meters), rather than values that are only meaningful up to an unknown scale and shift.

<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/zoedepth_architecture_bis.png"
alt="drawing" width="600"/>

<small> ZoeDepth architecture. Taken from the <a href="https://arxiv.org/abs/2302.12288">original paper.</a> </small>
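
To make the relative/metric distinction concrete, here is a toy sketch in plain Python. It assumes that relative depth is known only up to an unknown scale and shift, so reference measurements are needed before it can be read in meters. The values and the helper name `fit_scale_shift` are hypothetical, and this is not how ZoeDepth itself produces metric depth.

```python
# Toy illustration: align unitless relative depth to metric depth (meters)
# with a least-squares fit of metric ≈ scale * relative + shift.

def fit_scale_shift(relative, metric):
    """Return the (scale, shift) minimizing the squared alignment error."""
    n = len(relative)
    mean_r = sum(relative) / n
    mean_m = sum(metric) / n
    cov = sum((r - mean_r) * (m - mean_m) for r, m in zip(relative, metric))
    var = sum((r - mean_r) ** 2 for r in relative)
    scale = cov / var
    shift = mean_m - scale * mean_r
    return scale, shift

# Hypothetical data: relative predictions and reference distances in meters.
relative_depth = [0.1, 0.4, 0.7, 1.0]
metric_depth_m = [1.5, 3.0, 4.5, 6.0]

scale, shift = fit_scale_shift(relative_depth, metric_depth_m)
aligned_m = [scale * r + shift for r in relative_depth]
```

A metric model aims to make this alignment step unnecessary: its output is meant to be interpreted directly in meters, without per-image reference measurements.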

## Intended uses & limitations

You can use the raw model for zero-shot monocular depth estimation. See the [model hub](https://huggingface.co/models?search=Intel/zoedepth) for
other versions of the model.

### How to use

The easiest way to use the model is through the pipeline API, which abstracts away the complexity for the user:

```python
from transformers import pipeline
from PIL import Image
import requests

# load pipe
depth_estimator = pipeline(task="depth-estimation", model="Intel/zoedepth-nyu-kitti")

# load image
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)

# inference: the pipeline returns a dict; "depth" holds the depth map as a PIL image
outputs = depth_estimator(image)
depth = outputs["depth"]
```
For more code examples, see the [documentation](https://huggingface.co/docs/transformers/en/model_doc/zoedepth).
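
If you work with raw metric depth values instead of the ready-made image, you will usually need to rescale them before display. The following is a minimal sketch of that step. It uses a synthetic NumPy array in place of real model output, and the helper name `depth_to_uint8` is hypothetical, not part of the Transformers API.

```python
import numpy as np

def depth_to_uint8(depth_m):
    """Min-max scale a depth map (in meters) to 0..255, for display only."""
    depth_m = np.asarray(depth_m, dtype=np.float32)
    d_min, d_max = float(depth_m.min()), float(depth_m.max())
    if d_max == d_min:
        # Constant depth map: avoid division by zero.
        return np.zeros(depth_m.shape, dtype=np.uint8)
    scaled = (depth_m - d_min) / (d_max - d_min)
    return np.round(scaled * 255).astype(np.uint8)

# Synthetic stand-in for model output: a 2x3 depth map in meters.
depth_m = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
gray = depth_to_uint8(depth_m)
```

With real output, `depth_m` would presumably come from the pipeline's `predicted_depth` tensor converted to NumPy, and `Image.fromarray(gray)` would turn the result into a PIL image. Note that the rescaled array discards the metric scale, so keep the original values for any measurement.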

### BibTeX entry and citation info

```bibtex
@misc{bhat2023zoedepth,
      title={ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth}, 
      author={Shariq Farooq Bhat and Reiner Birkl and Diana Wofk and Peter Wonka and Matthias Müller},
      year={2023},
      eprint={2302.12288},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```