Continuous Surface Embeddings for Dense Pose Estimation for Humans and Animals

Overview

The pipeline uses the Faster R-CNN with Feature Pyramid Network (FPN) meta-architecture outlined in Figure 1. For each detected object, the model predicts a coarse segmentation S (2 channels: foreground / background) and an embedding E (16 channels). At the same time, the embedder produces vertex embeddings Ê for the corresponding mesh. The universal positional embeddings E are then matched against the vertex embeddings Ê to derive a continuous surface embedding for each pixel.

Figure 1. DensePose continuous surface embeddings architecture based on Faster R-CNN with Feature Pyramid Network (FPN).
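
The matching step described in the overview can be illustrated with a minimal NumPy sketch. It assumes that matching means assigning each foreground pixel to the mesh vertex whose embedding is nearest in squared Euclidean distance; the overview does not state the exact rule, so treat this as one plausible reading. The function name `match_pixels_to_vertices` and the array layouts are hypothetical and are not part of the DensePose API.

```python
import numpy as np

def match_pixels_to_vertices(pixel_emb, vertex_emb, fg_mask):
    """Assign each foreground pixel the index of its nearest mesh vertex.

    Assumed (hypothetical) layouts:
      pixel_emb:  (D, H, W) embeddings E predicted for one detection
      vertex_emb: (V, D) vertex embeddings for the matching mesh
      fg_mask:    (H, W) boolean foreground mask derived from S
    Returns an (H, W) integer array; background pixels are set to -1.
    """
    d, h, w = pixel_emb.shape
    flat = pixel_emb.reshape(d, h * w).T  # (H*W, D), one row per pixel
    # Squared Euclidean distance between every pixel and every vertex,
    # expanded as |p|^2 - 2 p.v + |v|^2 to avoid a (H*W, V, D) temporary.
    dist = (
        (flat ** 2).sum(axis=1, keepdims=True)
        - 2.0 * flat @ vertex_emb.T
        + (vertex_emb ** 2).sum(axis=1)
    )
    nearest = dist.argmin(axis=1).reshape(h, w)
    return np.where(fg_mask, nearest, -1)
```

With 16-channel embeddings as in the overview, `pixel_emb` would have shape `(16, H, W)` and `vertex_emb` shape `(V, 16)`, where `V` is the number of mesh vertices.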

Datasets

For more details on datasets used for training and validation of continuous surface embeddings models, please refer to the DensePose Datasets page.

Model Zoo and Baselines

Human CSE Models

Continuous surface embeddings models for humans, trained using the protocols from Neverova et al., 2020.

Models trained with hard assignment loss ℒ:

| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | segm AP | dp. AP GPS | dp. AP GPSm | model id | download |
|---|---|---|---|---|---|---|---|---|---|---|
| R_50_FPN_s1x | s1x | 0.349 | 0.060 | 6.3 | 61.1 | 67.1 | 64.4 | 65.7 | 251155172 | model \| metrics |
| R_101_FPN_s1x | s1x | 0.461 | 0.071 | 7.4 | 62.3 | 67.2 | 64.7 | 65.8 | 251155500 | model \| metrics |
| R_50_FPN_DL_s1x | s1x | 0.399 | 0.061 | 7.0 | 60.8 | 67.8 | 65.5 | 66.4 | 251156349 | model \| metrics |
| R_101_FPN_DL_s1x | s1x | 0.504 | 0.074 | 8.3 | 61.5 | 68.0 | 65.6 | 66.6 | 251156606 | model \| metrics |

Models trained with soft assignment loss ℒσ:

| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | segm AP | dp. AP GPS | dp. AP GPSm | model id | download |
|---|---|---|---|---|---|---|---|---|---|---|
| R_50_FPN_soft_s1x | s1x | 0.357 | 0.057 | 9.7 | 61.3 | 66.9 | 64.3 | 65.4 | 250533982 | model \| metrics |
| R_101_FPN_soft_s1x | s1x | 0.464 | 0.071 | 10.5 | 62.1 | 67.3 | 64.5 | 66.0 | 250712522 | model \| metrics |
| R_50_FPN_DL_soft_s1x | s1x | 0.427 | 0.062 | 11.3 | 60.8 | 68.0 | 66.1 | 66.7 | 250713703 | model \| metrics |
| R_101_FPN_DL_soft_s1x | s1x | 0.483 | 0.071 | 12.2 | 61.5 | 68.2 | 66.2 | 67.1 | 250713061 | model \| metrics |
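
The two groups of human models differ only in the assignment loss. This page does not define the losses, so the following NumPy sketch is an illustration under stated assumptions rather than the training code: it treats the hard assignment loss as cross-entropy against the single annotated vertex, and the soft assignment loss as cross-entropy against a Gaussian-smoothed target built from geodesic distances on the mesh. The scoring by negative squared embedding distance, the parameter `sigma`, and all function names are hypothetical; see Neverova et al., 2020 for the actual definitions.

```python
import numpy as np

def log_softmax(scores):
    """Numerically stable log-softmax over the vertex axis (axis 1)."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def squared_distances(a, b):
    """Pairwise squared Euclidean distances between rows of a (N, D) and b (V, D)."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)

def hard_assignment_loss(pixel_emb, vertex_emb, gt_vertex):
    """Cross-entropy with a one-hot target at the annotated vertex (assumed form)."""
    logp = log_softmax(-squared_distances(pixel_emb, vertex_emb))
    return -logp[np.arange(len(gt_vertex)), gt_vertex].mean()

def soft_assignment_loss(pixel_emb, vertex_emb, gt_vertex, geodesic, sigma=1.0):
    """Cross-entropy with a Gaussian-smoothed target (assumed form).

    geodesic: (V, V) matrix of geodesic distances between mesh vertices.
    Vertices close to the annotated one on the surface receive part of
    the target mass instead of being penalized as plain errors.
    """
    logp = log_softmax(-squared_distances(pixel_emb, vertex_emb))
    target = np.exp(log_softmax(-geodesic[gt_vertex] ** 2 / (2.0 * sigma ** 2)))
    return -(target * logp).sum(axis=1).mean()
```

In this sketch the soft loss reduces to the hard loss as `sigma` approaches zero, because the smoothed target collapses onto the annotated vertex.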

Animal CSE Models

Models obtained by finetuning human CSE models on animal data from ds1_train (see the DensePose LVIS section for more details on the datasets) with soft assignment loss ℒσ:

| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | segm AP | dp. AP GPS | dp. AP GPSm | model id | download |
|---|---|---|---|---|---|---|---|---|---|---|
| R_50_FPN_soft_chimps_finetune_4k | 4K | 0.569 | 0.051 | 4.7 | 62.0 | 59.0 | 32.2 | 39.6 | 253146869 | model \| metrics |
| R_50_FPN_soft_animals_finetune_4k | 4K | 0.381 | 0.061 | 7.3 | 44.9 | 55.5 | 21.3 | 28.8 | 253145793 | model \| metrics |
| R_50_FPN_soft_animals_CA_finetune_4k | 4K | 0.412 | 0.059 | 7.1 | 53.4 | 59.5 | 25.4 | 33.4 | 253498611 | model \| metrics |

Acronyms:

CA: class agnostic training, where all annotated instances are mapped into a single category
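
As an illustration of class agnostic training, the sketch below maps every annotated instance to one category before training. It assumes COCO-style annotation dictionaries with a `category_id` key; the function name `to_class_agnostic` and the target id `1` are hypothetical choices for this example and are not taken from the DensePose codebase.

```python
def to_class_agnostic(annotations, category_id=1):
    """Return copies of the annotations with all instances mapped to one category.

    annotations: list of COCO-style dicts, each with a "category_id" key (assumed).
    The input list and its dicts are left unmodified.
    """
    return [{**ann, "category_id": category_id} for ann in annotations]


# Example with made-up annotation records.
annotations = [
    {"id": 10, "image_id": 1, "category_id": 3},
    {"id": 11, "image_id": 1, "category_id": 7},
]
agnostic = to_class_agnostic(annotations)
```

After this remapping the detector sees a single object category, so per-species labels no longer influence the box and mask heads.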

Models obtained by finetuning human CSE models on animal data from the ds2_train dataset with soft assignment loss ℒσ and, for some schedules, cycle losses. Please refer to the DensePose LVIS section for details on the dataset and to Neverova et al., 2021 for details on cycle losses.

| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | segm AP | dp. AP GPS | dp. AP GPSm | GErr | GPS | model id | download |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| R_50_FPN_soft_animals_I0_finetune_16k | 16k | 0.386 | 0.058 | 8.4 | 54.2 | 67.0 | 29.0 | 38.6 | 13.2 | 85.4 | 270727112 | model \| metrics |
| R_50_FPN_soft_animals_I0_finetune_m2m_16k | 16k | 0.508 | 0.056 | 12.2 | 54.1 | 67.3 | 28.6 | 38.4 | 12.5 | 87.6 | 270982215 | model \| metrics |
| R_50_FPN_soft_animals_I0_finetune_i2m_16k | 16k | 0.483 | 0.056 | 9.7 | 54.0 | 66.6 | 28.9 | 38.3 | 11.0 | 88.9 | 270727461 | model \| metrics |

References

If you use DensePose methods based on continuous surface embeddings, please cite the corresponding work using the following BibTeX entries:

Continuous surface embeddings:

@InProceedings{Neverova2020ContinuousSurfaceEmbeddings,
    title = {Continuous Surface Embeddings},
    author = {Neverova, Natalia and Novotny, David and Khalidov, Vasil and Szafraniec, Marc and Labatut, Patrick and Vedaldi, Andrea},
    booktitle = {Advances in Neural Information Processing Systems},
    year = {2020},
}

Cycle Losses:

@InProceedings{Neverova2021UniversalCanonicalMaps,
    title = {Discovering Relationships between Object Categories via Universal Canonical Maps},
    author = {Neverova, Natalia and Sanakoyeu, Artsiom and Novotny, David and Labatut, Patrick and Vedaldi, Andrea},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2021},
}