DEPTH
Monocular depth estimation is a computer vision task that predicts the depth of a scene from a single image: for each pixel, the model estimates how far the corresponding point in the scene is from the camera, using only one viewpoint.
Monocular depth estimation has various applications, including 3D reconstruction, augmented reality, autonomous driving, and robotics. It is a challenging task as it requires the model to understand the complex relationships between objects in the scene and the corresponding depth information, which can be affected by factors such as lighting conditions, occlusion, and texture.
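To make the idea of a depth map concrete, here is a minimal sketch of turning per-pixel depth values into an 8-bit grayscale grid for viewing. The `normalize_depth` helper and the toy values are hypothetical and only illustrate min-max scaling; this is not necessarily how the Depth Preprocessor builds its output image.

```python
def normalize_depth(depth):
    """Min-max scale a 2D grid of depth values to integers in [0, 255]."""
    flat = [value for row in depth for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # No depth variation at all: map every pixel to 0.
        return [[0 for _ in row] for row in depth]
    scale = 255.0 / (hi - lo)
    return [[round((value - lo) * scale) for value in row] for row in depth]


# Hypothetical 2x3 depth map (arbitrary units), one value per pixel.
toy_depth = [
    [1.0, 1.5, 2.0],
    [1.0, 3.0, 5.0],
]
gray = normalize_depth(toy_depth)
```

Whether larger values are drawn as brighter or darker pixels is a visualization convention and can differ between models.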
Usage
from image_gen_aux import DepthPreprocessor
from image_gen_aux.utils import load_image
# Load the input image from a URL.
input_image = load_image("https://huggingface.co/datasets/OzzyGT/testing-resources/resolve/main/depth/coffee_ship.png")
# Load the depth model and move it to the GPU.
depth_preprocessor = DepthPreprocessor.from_pretrained("depth-anything/Depth-Anything-V2-Large-hf").to("cuda")
# The preprocessor returns a list of images; take the first one.
image = depth_preprocessor(input_image)[0]
image.save("depth.png")
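The usage example hard-codes `"cuda"`. On a machine without a GPU, a fallback along the following lines may help. This is a sketch under assumptions: `pick_device` and `estimate_depth` are hypothetical helpers, and it assumes that the preprocessor's `.to()` accepts `"cpu"` in the same way it accepts `"cuda"`.

```python
def pick_device(cuda_available: bool) -> str:
    """Return "cuda" when a GPU is usable, otherwise fall back to "cpu"."""
    return "cuda" if cuda_available else "cpu"


def estimate_depth(image_url: str, repo_id: str, output_path: str = "depth.png") -> None:
    """Run depth estimation on one image and save the result (hypothetical helper)."""
    # Imports are local so that defining this function does not require
    # torch or image_gen_aux to be installed.
    import torch
    from image_gen_aux import DepthPreprocessor
    from image_gen_aux.utils import load_image

    device = pick_device(torch.cuda.is_available())
    # Assumption: .to("cpu") behaves like .to("cuda") for the preprocessor.
    depth_preprocessor = DepthPreprocessor.from_pretrained(repo_id).to(device)
    depth_image = depth_preprocessor(load_image(image_url))[0]
    depth_image.save(output_path)
```

Calling `estimate_depth(url, "depth-anything/Depth-Anything-V2-Small-hf")` would then run on whichever device is available; CPU inference is expected to be noticeably slower.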
Models
The Depth Preprocessor works with any depth estimation model supported by the transformers library, provided the model does not require a fixed input image size. We recommend the following models and verify that they work correctly:
Model | License | Project Page |
---|---|---|
Depth Anything V2 | CC-BY-NC-4.0 (Small: Apache-2.0) | https://depth-anything-v2.github.io/ |
ZoeDepth | MIT | https://github.com/isl-org/ZoeDepth |
Each model is available in several variations:
Depth Anything V2
Variation | Repo ID |
---|---|
Small | depth-anything/Depth-Anything-V2-Small-hf |
Base | depth-anything/Depth-Anything-V2-Base-hf |
Large | depth-anything/Depth-Anything-V2-Large-hf |
ZoeDepth
Variation | Repo ID |
---|---|
NYU | Intel/zoedepth-nyu |
KITTI | Intel/zoedepth-kitti |
NYU and KITTI | Intel/zoedepth-nyu-kitti |
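The repo IDs in the tables above can be collected into a small lookup so that a model and variation are chosen by name. This is only an illustrative sketch: `DEPTH_REPOS` and `repo_id_for` are hypothetical names and are not part of `image_gen_aux`.

```python
# Repo IDs copied from the tables above, keyed by (model, variation).
DEPTH_REPOS = {
    ("depth-anything-v2", "small"): "depth-anything/Depth-Anything-V2-Small-hf",
    ("depth-anything-v2", "base"): "depth-anything/Depth-Anything-V2-Base-hf",
    ("depth-anything-v2", "large"): "depth-anything/Depth-Anything-V2-Large-hf",
    ("zoedepth", "nyu"): "Intel/zoedepth-nyu",
    ("zoedepth", "kitti"): "Intel/zoedepth-kitti",
    ("zoedepth", "nyu-and-kitti"): "Intel/zoedepth-nyu-kitti",
}


def _normalize(name: str) -> str:
    """Lowercase a name and replace spaces with hyphens."""
    return name.strip().lower().replace(" ", "-")


def repo_id_for(model: str, variation: str) -> str:
    """Look up the repo ID for a model and variation as named in the tables."""
    key = (_normalize(model), _normalize(variation))
    if key not in DEPTH_REPOS:
        known = ", ".join(f"{m}/{v}" for m, v in sorted(DEPTH_REPOS))
        raise ValueError(f"Unknown model/variation {key}; known: {known}")
    return DEPTH_REPOS[key]
```

The returned string can be passed to `DepthPreprocessor.from_pretrained`, for example `repo_id_for("ZoeDepth", "NYU and KITTI")`.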