qitaoz committed (verified) · Commit 1b3d870 · 1 Parent(s): 23d12d9

Update README.md

Files changed (1):
  1. README.md +0 -142

README.md CHANGED
@@ -16,148 +16,6 @@ This repository contains the official implementation for **Sparse-view Pose Esti
 
  ### [Project Page](https://qitaozhao.github.io/SparseAGS) | [arXiv (Coming Soon)](https://qitaozhao.github.io/SparseAGS)
 
- ### News
-
- - 2024.12.02: Initial code release.
-
- ## Introduction
-
- **tl;dr** Given a set of unposed input images, **SparseAGS** jointly infers the corresponding camera poses and underlying 3D, allowing high-fidelity 3D inference in the wild.
-
- **Abstract.** Inferring the 3D structure underlying a set of multi-view images typically requires solving two co-dependent tasks -- accurate 3D reconstruction requires precise camera poses, and predicting camera poses relies on (implicitly or explicitly) modeling the underlying 3D. The classical framework of analysis by synthesis casts this inference as a joint optimization seeking to explain the observed pixels, and recent instantiations learn expressive 3D representations (e.g., Neural Fields) with gradient-descent-based pose refinement of initial pose estimates. However, given a sparse set of observed views, the observations may not provide sufficient direct evidence to obtain complete and accurate 3D. Moreover, large errors in pose estimation may not be easily corrected and can further degrade the inferred 3D. To allow robust 3D reconstruction and pose estimation in this challenging setup, we propose *SparseAGS*, a method that adapts this analysis-by-synthesis approach by: a) including novel-view-synthesis-based generative priors in conjunction with photometric objectives to improve the quality of the inferred 3D, and b) explicitly reasoning about outliers and using a discrete search with a continuous optimization-based strategy to correct them. We validate our framework across real-world and synthetic datasets in combination with several off-the-shelf pose estimation systems as initialization. We find that it significantly improves the base systems' pose accuracy while yielding high-quality 3D reconstructions that outperform the results from current multi-view reconstruction baselines.
-
- ![teaser](assets/teaser.gif)
-
- ## Install
-
- 1. Clone SparseAGS:
-
- ```bash
- git clone --recursive https://github.com/QitaoZhao/SparseAGS.git
- cd SparseAGS
- # if you have already cloned sparseags:
- # git submodule update --init --recursive
- ```
-
- 2. Create the environment and install packages:
-
- ```bash
- conda create -n sparseags python=3.9
- conda activate sparseags
-
- # enable nvcc
- conda install -c conda-forge cudatoolkit-dev
-
- ### torch
- # CUDA 11.7
- pip install torch==1.13.0+cu117 torchvision==0.14.0+cu117 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu117
-
- # CUDA 12.1
- pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
-
- pip install -r requirements.txt
-
- ### pytorch3D
- # CUDA 11.7
- conda install https://anaconda.org/pytorch3d/pytorch3d/0.7.5/download/linux-64/pytorch3d-0.7.5-py39_cu117_pyt1130.tar.bz2
-
- # CUDA 12.1
- conda install https://anaconda.org/pytorch3d/pytorch3d/0.7.8/download/linux-64/pytorch3d-0.7.8-py39_cu121_pyt210.tar.bz2
-
- # liegroups (minor modification to https://github.com/utiasSTARS/liegroups)
- pip install ./liegroups
-
- # simple-knn
- pip install ./simple-knn
-
- # a modified gaussian splatting that enables camera pose optimization
- pip install ./diff-gaussian-rasterization-camera
- ```
-
-
- Tested on:
-
- - Ubuntu 20.04 with torch 1.13 & CUDA 11.7 on an A5000 GPU.
- - Springdale Linux 8.6 with torch 2.1.0 & CUDA 12.1 on an A5000 GPU.
- - Red Hat Enterprise Linux 8.10 with torch 1.13 & CUDA 11.7 on a V100 GPU.
-
- Note: Look at this [issue](https://github.com/graphdeco-inria/gaussian-splatting/issues/993) or try `sudo apt-get install libglm-dev` if you encounter `fatal error: glm/glm.hpp: No such file or directory` when doing `pip install ./diff-gaussian-rasterization-camera`.
-
- 3. Download our 6-DoF Zero123 [checkpoint](https://drive.google.com/file/d/1JJ4wjaJ4IkUERRZYRrlNoQ-tXvftEYJD/view?usp=sharing) and place it in `SparseAGS/checkpoints`.
-
- ```bash
- mkdir checkpoints
- cd checkpoints/
- pip install gdown
- gdown "https://drive.google.com/uc?id=1JJ4wjaJ4IkUERRZYRrlNoQ-tXvftEYJD"
- cd ..
- ```
-
- ## Usage
-
- (1) **Gradio Demo** (recommended), where you can interactively upload your own images or use our preprocessed examples:
-
- ```bash
- # the first run may take longer
- python gradio_app.py
- ```
-
- (2) Use the command line:
-
- ```bash
- ### preprocess
- # background removal and recentering, save rgba at 256x256
- python process.py data/name.jpg
-
- # save at a larger resolution
- python process.py data/name.jpg --size 512
-
- # process all jpg images under a dir
- python process.py data
-
- ### sparse-view 3D reconstruction
- # here we have some preprocessed examples in 'data/demo', with dust3r pose initialization
- # the output will be saved in 'output/demo/{category}'
- # valid category-num_views options are {[toy, 4], [butter, 6], [jordan, 8], [robot, 8], [eagle, 8]}
-
- # run single 3D reconstruction (w/o outlier removal & correction)
- python run.py --category jordan --num_views 8
-
- # if the command above does not give you a nice 3D result, try enabling loop-based outlier removal & correction (which takes more time)
- python run.py --category jordan --num_views 8 --enable_loop
- ```
-
- Note: we include the `eagle` example specifically to showcase how our full method works (in our experiments, dust3r gives one bad pose for this example). For the other examples, a single 3D reconstruction should already produce reasonable 3D.
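If you want to run all of the preprocessed demo examples in one pass, a small driver script along the lines of the sketch below can loop over the valid `--category`/`--num_views` pairs. This script is not part of the repository; the pairs are copied from the usage comment above, and enabling `--enable_loop` only for `eagle` follows the note above.

```python
# run_all_demos.py -- illustrative sketch, not shipped with SparseAGS
import subprocess

# valid (category, num_views) pairs from the usage comment above
DEMO_EXAMPLES = [
    ("toy", 4),
    ("butter", 6),
    ("jordan", 8),
    ("robot", 8),
    ("eagle", 8),
]

for category, num_views in DEMO_EXAMPLES:
    cmd = ["python", "run.py", "--category", category, "--num_views", str(num_views)]
    # per the note above, the eagle example benefits from
    # loop-based outlier removal & correction
    if category == "eagle":
        cmd.append("--enable_loop")
    print("Running:", " ".join(cmd))
    subprocess.run(cmd, check=True)
```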
-
- ## Tips
-
- * The world & camera coordinate system is the same as OpenGL:
- ```
-     World            Camera
- 
-      +y              up  target
-      |               |  /
-      |               | /
-      |______+x       |/______right
-     /                /
-    /                /
-   /                /
-  +z               forward
- 
- elevation: in (-90, 90), from +y to -y is (-90, 90)
- azimuth: in (-180, 180), from +z to +x is (0, 90)
- ```
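To make the elevation/azimuth convention above concrete, here is a minimal numpy sketch of how a camera position in this world frame could be computed; the `orbit_camera_position` helper is hypothetical and not an API of this repository.

```python
import numpy as np

def orbit_camera_position(elevation_deg: float, azimuth_deg: float, radius: float = 1.0) -> np.ndarray:
    """Camera position under the convention above: +y is up,
    azimuth 0 looks from +z (azimuth 90 from +x), and sweeping
    elevation from -90 to 90 moves the camera from +y to -y."""
    elev = np.deg2rad(elevation_deg)
    azim = np.deg2rad(azimuth_deg)
    x = radius * np.cos(elev) * np.sin(azim)  # azimuth 90 deg -> +x
    y = -radius * np.sin(elev)                # elevation -90 deg -> +y (top)
    z = radius * np.cos(elev) * np.cos(azim)  # azimuth 0 deg -> +z
    return np.array([x, y, z])

# example: camera slightly above the object, 30 degrees around from +z
print(orbit_camera_position(elevation_deg=-20.0, azimuth_deg=30.0, radius=2.0))
```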
-
- ## Acknowledgments
-
- Our camera-pose-enabled rasterizer (`diff-gaussian-rasterization-camera`) is adapted from https://github.com/ashawkey/diff-gaussian-rasterization.
-
- This work is built on many amazing research works and open-source projects; thanks a lot to all the authors for sharing!
-
- - [gaussian-splatting](https://github.com/graphdeco-inria/gaussian-splatting) and [diff-gaussian-rasterization](https://github.com/graphdeco-inria/diff-gaussian-rasterization)
- - [threestudio](https://github.com/threestudio-project/threestudio)
- - [nvdiffrast](https://github.com/NVlabs/nvdiffrast)
- - [dearpygui](https://github.com/hoffstadt/DearPyGui)
-
  ## Citation
 
  ```
 