Spaces:
Running
Running
File size: 22,338 Bytes
7088d16 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 |
# Introduction
Implicitron is a PyTorch3D-based framework for new-view synthesis via modeling the neural-network based representations.
# License
Implicitron is distributed as part of PyTorch3D under the [BSD license](https://github.com/facebookresearch/pytorch3d/blob/main/LICENSE).
It includes code from the [NeRF](https://github.com/bmild/nerf), [SRN](http://github.com/vsitzmann/scene-representation-networks) and [IDR](http://github.com/lioryariv/idr) repos.
See [LICENSE-3RD-PARTY](https://github.com/facebookresearch/pytorch3d/blob/main/LICENSE-3RD-PARTY) for their licenses.
# Installation
There are three ways to set up Implicitron, depending on the flexibility level required.
If you only want to train or evaluate models as they are implemented changing only the parameters, you can just install the package.
Implicitron also provides a flexible API that supports user-defined plug-ins;
if you want to re-implement some of the components without changing the high-level pipeline, you need to create a custom launcher script.
The most flexible option, though, is cloning PyTorch3D repo and building it from sources, which allows changing the code in arbitrary ways.
Below, we descibe all three options in more details.
## [Option 1] Running an executable from the package
This option allows you to use the code as is without changing the implementations.
Only configuration can be changed (see [Configuration system](#configuration-system)).
For this setup, install the dependencies and PyTorch3D from conda following [the guide](https://github.com/facebookresearch/pytorch3d/blob/master/INSTALL.md#1-install-with-cuda-support-from-anaconda-cloud-on-linux-only). Then, install implicitron-specific dependencies:
```shell
pip install "hydra-core>=1.1" visdom lpips matplotlib accelerate
```
Runner executable is available as `pytorch3d_implicitron_runner` shell command.
See [Running](#running) section below for examples of training and evaluation commands.
## [Option 2] Supporting custom implementations
To plug in custom implementations, for example, of renderer or implicit-function protocols, you need to create your own runner script and import the plug-in implementations there.
First, install PyTorch3D and Implicitron dependencies as described in the previous section.
Then, implement the custom script; copying `pytorch3d/projects/implicitron_trainer` is a good place to start.
See [Custom plugins](#custom-plugins) for more information on how to import implementations and enable them in the configs.
## [Option 3] Cloning PyTorch3D repo
This is the most flexible way to set up Implicitron as it allows changing the code directly.
It allows modifying the high-level rendering pipeline or implementing yet-unsupported loss functions.
Please follow the instructions to [install PyTorch3D from a local clone](https://github.com/facebookresearch/pytorch3d/blob/main/INSTALL.md#2-install-from-a-local-clone).
Then, install Implicitron-specific dependencies:
```shell
pip install "hydra-core>=1.1" visdom lpips matplotlib accelerate
```
You are still encouraged to implement custom plugins as above where possible as it makes reusing the code easier.
The executable is located in `pytorch3d/projects/implicitron_trainer`.
> **_NOTE:_** Both `pytorch3d_implicitron_runner` and `pytorch3d_implicitron_visualizer`
executables (mentioned below) are not available when using local clone.
Instead users should use the python scripts `experiment.py` and `visualize_reconstruction.py` (see the [Running](Running) section below).
# Running
This section assumes that you use the executable provided by the installed package
(Option 1 / Option 2 in [#Installation](Installation) above),
i.e. `pytorch3d_implicitron_runner` and `pytorch3d_implicitron_visualizer` are available.
> **_NOTE:_** If the executables are not available (e.g. when using a local clone - Option 3 in [#Installation](Installation)),
users should directly use the `experiment.py` and `visualize_reconstruction.py` python scripts
which correspond to the executables as follows:
- `pytorch3d_implicitron_runner` corresponds to `<pytorch3d_root>/projects/implicitron_trainer/experiment.py`
- `pytorch3d_implicitron_visualizer` corresponds to `<pytorch3d_root>/projects/implicitron_trainer/visualize_reconstruction.py`
For instance, in order to directly execute training with the python script, users can call:
```shell
cd <pytorch3d_root>/projects/
python -m implicitron_trainer.experiment <args>`
```
If you have a custom `experiment.py` or `visualize_reconstruction.py` script
(as in the Option 2 [above](#Installation)), replace the executable with the path to your script.
## Training
To run training, pass a yaml config file, followed by a list of overridden arguments.
For example, to train NeRF on the first skateboard sequence from CO3D dataset, you can run:
```shell
dataset_args=data_source_ImplicitronDataSource_args.dataset_map_provider_JsonIndexDatasetMapProvider_args
pytorch3d_implicitron_runner --config-path ./configs/ --config-name repro_singleseq_nerf \
$dataset_args.dataset_root=<DATASET_ROOT> $dataset_args.category='skateboard' \
$dataset_args.test_restrict_sequence_id=0 test_when_finished=True exp_dir=<CHECKPOINT_DIR>
```
Here, `--config-path` points to the config path relative to `pytorch3d_implicitron_runner` location;
`--config-name` picks the config (in this case, `repro_singleseq_nerf.yaml`);
`test_when_finished` will launch evaluation script once training is finished.
Replace `<DATASET_ROOT>` with the location where the dataset in Implicitron format is stored
and `<CHECKPOINT_DIR>` with a directory where checkpoints will be dumped during training.
Other configuration parameters can be overridden in the same way.
See [Configuration system](#configuration-system) section for more information on this.
### Visdom logging
Note that the training script logs its progress to Visdom. Make sure to start a visdom server before the training commences:
```
python -m visdom.server
```
> In case a Visdom server is not started, the console will get flooded with `requests.exceptions.ConnectionError` errors signalling that a Visdom server is not available. Note that these errors <b>will NOT interrupt</b> the program and the training will still continue without issues.
## Evaluation
To run evaluation on the latest checkpoint after (or during) training, simply add `eval_only=True` to your training command.
E.g. for executing the evaluation on the NeRF skateboard sequence, you can run:
```shell
dataset_args=data_source_ImplicitronDataSource_args.dataset_map_provider_JsonIndexDatasetMapProvider_args
pytorch3d_implicitron_runner --config-path ./configs/ --config-name repro_singleseq_nerf \
$dataset_args.dataset_root=<CO3D_DATASET_ROOT> $dataset_args.category='skateboard' \
$dataset_args.test_restrict_sequence_id=0 exp_dir=<CHECKPOINT_DIR> eval_only=True
```
Evaluation prints the metrics to `stdout` and dumps them to a json file in `exp_dir`.
## Visualisation
The script produces a video of renders by a trained model assuming a pre-defined camera trajectory.
In order for it to work, `ffmpeg` needs to be installed:
```shell
conda install ffmpeg
```
Here is an example of calling the script:
```shell
pytorch3d_implicitron_visualizer exp_dir=<CHECKPOINT_DIR> \
visdom_show_preds=True n_eval_cameras=40 render_size="[64,64]" video_size="[256,256]"
```
The argument `n_eval_cameras` sets the number of renderring viewpoints sampled on a trajectory, which defaults to a circular fly-around;
`render_size` sets the size of a render passed to the model, which can be resized to `video_size` before writing.
Rendered videos of images, masks, and depth maps will be saved to `<CHECKPOINT_DIR>/video`.
# Configuration system
We use hydra and OmegaConf to parse the configs.
The config schema and default values are defined by the dataclasses implementing the modules.
More specifically, if a class derives from `Configurable`, its fields can be set in config yaml files or overridden in CLI.
For example, `GenericModel` has a field `render_image_width` with the default value 400.
If it is specified in the yaml config file or in CLI command, the new value will be used.
Configurables can form hierarchies.
For example, `GenericModel` has a field `raysampler: RaySampler`, which is also Configurable.
In the config, inner parameters can be propagated using `_args` postfix, e.g. to change `raysampler.n_pts_per_ray_training` (the number of sampled points per ray), the node `raysampler_args.n_pts_per_ray_training` should be specified.
### Top-level configuration class: `Experiment`
<b>The root of the hierarchy is defined by `Experiment` Configurable in `<pytorch3d_root>/projects/implicitron_trainer/experiment.py`.</b>
It has top-level fields like `seed`, which seeds the random number generator.
Additionally, it has non-leaf nodes like `model_factory_ImplicitronModelFactory_args.model_GenericModel_args`, which dispatches the config parameters to `GenericModel`.
Thus, changing the model parameters may be achieved in two ways: either by editing the config file, e.g.
```yaml
model_factory_ImplicitronModelFactory_args:
model_GenericModel_args:
render_image_width: 800
raysampler_args:
n_pts_per_ray_training: 128
```
or, equivalently, by adding the following to `pytorch3d_implicitron_runner` arguments:
```shell
model_args=model_factory_ImplicitronModelFactory_args.model_GenericModel_args
$model_args.render_image_width=800 $model_args.raysampler_args.n_pts_per_ray_training=128
```
See the documentation in `pytorch3d/implicitron/tools/config.py` for more details.
## Replaceable implementations
Sometimes changing the model parameters does not provide enough flexibility, and you want to provide a new implementation for a building block.
The configuration system also supports it!
Abstract classes like `BaseRenderer` derive from `ReplaceableBase` instead of `Configurable`.
This means that other Configurables can refer to them using the base type, while the specific implementation is chosen in the config using `_class_type`-postfixed node.
In that case, `_args` node name has to include the implementation type.
More specifically, to change renderer settings, the config will look like this:
```yaml
model_factory_ImplicitronModelFactory_args:
model_GenericModel_args:
renderer_class_type: LSTMRenderer
renderer_LSTMRenderer_args:
num_raymarch_steps: 10
hidden_size: 16
```
See the documentation in `pytorch3d/implicitron/tools/config.py` for more details on the configuration system.
## Custom plugins
If you have an idea for another implementation of a replaceable component, it can be plugged in without changing the core code.
For that, you need to set up Implicitron through option 2 or 3 above.
Let's say you want to implement a renderer that accumulates opacities similar to an X-ray machine.
First, create a module `x_ray_renderer.py` with a class deriving from `BaseRenderer`:
```python
from pytorch3d.implicitron.tools.config import registry
@registry.register
class XRayRenderer(BaseRenderer, torch.nn.Module):
n_pts_per_ray: int = 64
def __post_init__(self):
# custom initialization
def forward(
self,
ray_bundle,
implicit_functions=[],
evaluation_mode: EvaluationMode = EvaluationMode.EVALUATION,
**kwargs,
) -> RendererOutput:
...
```
Please note `@registry.register` decorator that registers the plug-in as an implementation of `Renderer`.
IMPORTANT: In order for it to run, the class (or its enclosing module) has to be imported in your launch script.
Additionally, this has to be done before parsing the root configuration class `ExperimentConfig`.
Simply add `import .x_ray_renderer` in the beginning of `experiment.py`.
After that, you should be able to change the config with:
```yaml
model_factory_ImplicitronModelFactory_args:
model_GenericModel_args:
renderer_class_type: XRayRenderer
renderer_XRayRenderer_args:
n_pts_per_ray: 128
```
to replace the implementation and potentially override the parameters.
# Code and config structure
The main object for this trainer loop is `Experiment`. It has four top-level replaceable components.
* `data_source`: This is a `DataSourceBase` which defaults to `ImplicitronDataSource`.
It constructs the data sets and dataloaders.
* `model_factory`: This is a `ModelFactoryBase` which defaults to `ImplicitronModelFactory`.
It constructs the model, which is usually an instance of `OverfitModel` (for NeRF-style training with overfitting to one scene) or `GenericModel` (that is able to generalize to multiple scenes by NeRFormer-style conditioning on other scene views), and can load its weights from a checkpoint.
* `optimizer_factory`: This is an `OptimizerFactoryBase` which defaults to `ImplicitronOptimizerFactory`.
It constructs the optimizer and can load its weights from a checkpoint.
* `training_loop`: This is a `TrainingLoopBase` which defaults to `ImplicitronTrainingLoop` and defines the main training loop.
As per above, the config structure is parsed automatically from the module hierarchy.
In particular, for ImplicitronModelFactory with generic model, model parameters are contained in the `model_factory_ImplicitronModelFactory_args.model_GenericModel_args` node, and dataset parameters in `data_source_ImplicitronDataSource_args` node.
Here is the class structure of GenericModel (single-line edges show aggregation, while double lines show available implementations):
```
model_GenericModel_args: GenericModel
β-- global_encoder_*_args: GlobalEncoderBase
β== SequenceAutodecoder
β-- autodecoder_args: Autodecoder
β== HarmonicTimeEncoder
β-- raysampler_*_args: RaySampler
β== AdaptiveRaysampler
β== NearFarRaysampler
β-- renderer_*_args: BaseRenderer
β== MultiPassEmissionAbsorptionRenderer
β== LSTMRenderer
β== SignedDistanceFunctionRenderer
β-- ray_tracer_args: RayTracing
β-- ray_normal_coloring_network_args: RayNormalColoringNetwork
β-- implicit_function_*_args: ImplicitFunctionBase
β== NeuralRadianceFieldImplicitFunction
β== SRNImplicitFunction
β-- raymarch_function_args: SRNRaymarchFunction
β-- pixel_generator_args: SRNPixelGenerator
β== SRNHyperNetImplicitFunction
β-- hypernet_args: SRNRaymarchHyperNet
β-- pixel_generator_args: SRNPixelGenerator
β== IdrFeatureField
β-- image_feature_extractor_*_args: FeatureExtractorBase
β== ResNetFeatureExtractor
β-- view_pooler_args: ViewPooler
β-- view_sampler_args: ViewSampler
β-- feature_aggregator_*_args: FeatureAggregatorBase
β== IdentityFeatureAggregator
β== AngleWeightedIdentityFeatureAggregator
β== AngleWeightedReductionFeatureAggregator
β== ReductionFeatureAggregator
```
Here is the class structure of OverfitModel:
```
model_OverfitModel_args: OverfitModel
β-- raysampler_*_args: RaySampler
β== AdaptiveRaysampler
β== NearFarRaysampler
β-- renderer_*_args: BaseRenderer
β== MultiPassEmissionAbsorptionRenderer
β== LSTMRenderer
β== SignedDistanceFunctionRenderer
β-- ray_tracer_args: RayTracing
β-- ray_normal_coloring_network_args: RayNormalColoringNetwork
β-- implicit_function_*_args: ImplicitFunctionBase
β== NeuralRadianceFieldImplicitFunction
β== SRNImplicitFunction
β-- raymarch_function_args: SRNRaymarchFunction
β-- pixel_generator_args: SRNPixelGenerator
β== SRNHyperNetImplicitFunction
β-- hypernet_args: SRNRaymarchHyperNet
β-- pixel_generator_args: SRNPixelGenerator
β== IdrFeatureField
β-- coarse_implicit_function_*_args: ImplicitFunctionBase
β== NeuralRadianceFieldImplicitFunction
β== SRNImplicitFunction
β-- raymarch_function_args: SRNRaymarchFunction
β-- pixel_generator_args: SRNPixelGenerator
β== SRNHyperNetImplicitFunction
β-- hypernet_args: SRNRaymarchHyperNet
β-- pixel_generator_args: SRNPixelGenerator
β== IdrFeatureField
```
OverfitModel has been introduced to create a simple class to disantagle Nerfs which the overfit pattern
from the GenericModel.
Please look at the annotations of the respective classes or functions for the lists of hyperparameters.
`tests/experiment.yaml` shows every possible option if you have no user-defined classes.
# Implementations of existing methods
We provide configuration files that implement several existing works.
<b>The configuration files live in `pytorch3d/projects/implicitron_trainer/configs`.</b>
## NeRF
The following config file corresponds to training of a vanilla NeRF on Blender Synthetic dataset
(see https://arxiv.org/abs/2003.08934 for details of the method):
`./configs/repro_singleseq_nerf_blender.yaml`
### Downloading Blender-Synthetic
Training requires the Blender Synthetic dataset.
To download the dataset, visit the [gdrive bucket](https://drive.google.com/file/d/18JxhpWD-4ZmuFKLzKlAw-w5PpzZxXOcG/view?usp=share_link)
and click Download.
Then unpack the downloaded .zip file to a folder which we call `<BLENDER_DATASET_ROOT_FOLDER>`.
### Launching NeRF training
In order to train NeRF on the "drums" scene, execute the following command:
```shell
export BLENDER_DATASET_ROOT="<BLENDER_DATASET_ROOT_FOLDER>" \
export BLENDER_SINGLESEQ_CLASS="drums" \
pytorch3d_implicitron_runner --config-path ./configs/ --config-name repro_singleseq_nerf_blender
```
Note that the training scene is selected by setting the environment variable `BLENDER_SINGLESEQ_CLASS`
appropriately (one of `"chair"`, `"drums"`, `"ficus"`, `"hotdog"`, `"lego"`, `"materials"`, `"mic"`, `"ship"`).
By default, the training outputs will be stored to `"./data/nerf_blender_repro/$BLENDER_SINGLESEQ_CLASS/"`
### Visualizing trained NeRF
```shell
pytorch3d_implicitron_visualizer exp_dir=<CHECKPOINT_DIR> \
visdom_show_preds=True n_eval_cameras=40 render_size="[64,64]" video_size="[256,256]"
```
where `<CHECKPOINT_DIR>` corresponds to the directory with the training outputs (defaults to `"./data/nerf_blender_repro/$BLENDER_SINGLESEQ_CLASS/"`).
The script will output a rendered video of the learned radiance field to `"./data/nerf_blender_repro/$BLENDER_SINGLESEQ_CLASS/"` (requires `ffmpeg`).
> **_NOTE:_** Recall that, if `pytorch3d_implicitron_runner`/`pytorch3d_implicitron_visualizer` are not available, replace the calls
with `cd <pytorch3d_root>/projects/; python -m implicitron_trainer.[experiment|visualize_reconstruction]`
## CO3D experiments
Common Objects in 3D (CO3D) is a large-scale dataset of videos of rigid objects grouped into 50 common categories.
Implicitron provides implementations and config files to reproduce the results from [the paper](https://arxiv.org/abs/2109.00512).
Please follow [the link](https://github.com/facebookresearch/co3d#automatic-batch-download) for the instructions to download the dataset.
In training and evaluation scripts, use the download location as `<DATASET_ROOT>`.
It is also possible to define environment variable `CO3D_DATASET_ROOT` instead of specifying it.
To reproduce the experiments from the paper, use the following configs.
For single-sequence experiments:
| Method | config file |
|-----------------|-------------------------------------|
| NeRF | repro_singleseq_nerf.yaml |
| NeRF + WCE | repro_singleseq_nerf_wce.yaml |
| NerFormer | repro_singleseq_nerformer.yaml |
| IDR | repro_singleseq_idr.yaml |
| SRN | repro_singleseq_srn_noharm.yaml |
| SRN + Ξ³ | repro_singleseq_srn.yaml |
| SRN + WCE | repro_singleseq_srn_wce_noharm.yaml |
| SRN + WCE + Ξ³ | repro_singleseq_srn_wce_noharm.yaml |
For multi-sequence autodecoder experiments (without generalization to new sequences):
| Method | config file |
|-----------------|--------------------------------------------|
| NeRF + AD | repro_multiseq_nerf_ad.yaml |
| SRN + AD | repro_multiseq_srn_ad_hypernet_noharm.yaml |
| SRN + Ξ³ + AD | repro_multiseq_srn_ad_hypernet.yaml |
For multi-sequence experiments (with generalization to new sequences):
| Method | config file |
|-----------------|--------------------------------------|
| NeRF + WCE | repro_multiseq_nerf_wce.yaml |
| NerFormer | repro_multiseq_nerformer.yaml |
| SRN + WCE | repro_multiseq_srn_wce_noharm.yaml |
| SRN + WCE + Ξ³ | repro_multiseq_srn_wce.yaml |
## CO3Dv2 experiments
The following config files implement training on the second version of CO3D, `CO3Dv2`.
In order to launch trainings, set the `CO3DV2_DATASET_ROOT` environment variable
to the root folder of the dataset (note that the name of the env. variable differs from the CO3Dv1 version).
Single-sequence experiments:
| Method | config file |
|-----------------|-------------------------------------|
| NeRF | repro_singleseq_v2_nerf.yaml |
| NerFormer | repro_singleseq_v2_nerformer.yaml |
| IDR | repro_singleseq_v2_idr.yaml |
| SRN | repro_singleseq_v2_srn_noharm.yaml |
Multi-sequence autodecoder experiments (without generalization to new sequences):
| Method | config file |
|-----------------|--------------------------------------------|
| NeRF + AD | repro_multiseq_v2_nerf_ad.yaml |
| SRN + Ξ³ + AD | repro_multiseq_v2_srn_ad_hypernet.yaml |
Multi-sequence experiments (with generalization to new sequences):
| Method | config file |
|-----------------|----------------------------------------|
| NeRF + WCE | repro_multiseq_v2_nerf_wce.yaml |
| NerFormer | repro_multiseq_v2_nerformer.yaml |
| SRN + WCE | repro_multiseq_v2_srn_wce_noharm.yaml |
| SRN + WCE + Ξ³ | repro_multiseq_v2_srn_wce.yaml |
|