# Continuous Surface Embeddings for Dense Pose Estimation for Humans and Animals

## Overview

The pipeline uses the [Faster R-CNN](https://arxiv.org/abs/1506.01497) with [Feature Pyramid Network](https://arxiv.org/abs/1612.03144) meta-architecture outlined in Figure 1. For each detected object, the model predicts a coarse segmentation `S` (2 channels: foreground / background) and an embedding `E` (16 channels). At the same time, the embedder produces vertex embeddings `Ê` for the corresponding mesh. The universal positional embeddings `E` are then matched against the vertex embeddings `Ê` to derive a continuous surface embedding for each pixel.

Figure 1. DensePose continuous surface embeddings architecture based on Faster R-CNN with Feature Pyramid Network (FPN).
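
The matching step described above can be illustrated with a small dependency-free sketch. It is not the DensePose implementation: the function name `match_pixels_to_vertices`, the nested-list layout, the (background, foreground) channel order of `S`, and the use of squared Euclidean distance as the matching criterion are all assumptions made for illustration. The toy example uses 2-dimensional embeddings instead of the 16 channels the model predicts.

```python
def match_pixels_to_vertices(pixel_emb, seg_scores, vertex_emb):
    """Assign each foreground pixel to its closest mesh vertex (illustrative only).

    pixel_emb:  H x W x D nested lists, per-pixel embedding E
    seg_scores: H x W x 2 nested lists, coarse segmentation S,
                assumed to be ordered (background, foreground)
    vertex_emb: V x D nested lists, vertex embeddings Ê
    Returns an H x W grid holding a vertex index for foreground pixels
    and None for background pixels.
    """
    result = []
    for emb_row, seg_row in zip(pixel_emb, seg_scores):
        out_row = []
        for e, (bg, fg) in zip(emb_row, seg_row):
            if fg <= bg:
                # Background pixel: no surface point is assigned.
                out_row.append(None)
                continue
            # Squared Euclidean distance to every vertex embedding
            # (an assumed criterion for this sketch).
            dists = [sum((a - b) ** 2 for a, b in zip(e, v)) for v in vertex_emb]
            out_row.append(min(range(len(dists)), key=dists.__getitem__))
        result.append(out_row)
    return result


# Toy data: 3 vertices, a 2 x 2 image, D = 2 for brevity.
vertex_emb = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
pixel_emb = [[[0.9, 0.1], [0.1, 0.8]],
             [[0.0, 0.1], [0.2, 0.1]]]
seg_scores = [[[0.0, 1.0], [0.2, 0.9]],
              [[1.0, 0.0], [0.3, 0.7]]]

print(match_pixels_to_vertices(pixel_emb, seg_scores, vertex_emb))
# prints [[1, 2], [None, 0]]
```

A real pipeline would perform the same lookup with batched tensor operations; the nested loops here only make the per-pixel logic explicit.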

### Datasets

For more details on datasets used for training and validation of continuous surface embeddings models, please refer to the [DensePose Datasets](DENSEPOSE_DATASETS.md) page.

## Model Zoo and Baselines

### Human CSE Models

Continuous surface embeddings models for humans trained using the protocols from [Neverova et al, 2020](https://arxiv.org/abs/2011.12438).

Models trained with hard assignment loss ℒ:

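| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | segm AP | dp. AP GPS | dp. AP GPSm | model id | download |
|:-----|:--------:|:-------------------:|:---------------------:|:--------------:|:------:|:-------:|:----------:|:-----------:|:--------:|:--------:|
| R_50_FPN_s1x | s1x | 0.349 | 0.060 | 6.3 | 61.1 | 67.1 | 64.4 | 65.7 | 251155172 | model \| metrics |
| R_101_FPN_s1x | s1x | 0.461 | 0.071 | 7.4 | 62.3 | 67.2 | 64.7 | 65.8 | 251155500 | model \| metrics |
| R_50_FPN_DL_s1x | s1x | 0.399 | 0.061 | 7.0 | 60.8 | 67.8 | 65.5 | 66.4 | 251156349 | model \| metrics |
| R_101_FPN_DL_s1x | s1x | 0.504 | 0.074 | 8.3 | 61.5 | 68.0 | 65.6 | 66.6 | 251156606 | model \| metrics |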

Models trained with soft assignment loss ℒ_σ:

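| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | segm AP | dp. AP GPS | dp. AP GPSm | model id | download |
|:-----|:--------:|:-------------------:|:---------------------:|:--------------:|:------:|:-------:|:----------:|:-----------:|:--------:|:--------:|
| R_50_FPN_soft_s1x | s1x | 0.357 | 0.057 | 9.7 | 61.3 | 66.9 | 64.3 | 65.4 | 250533982 | model \| metrics |
| R_101_FPN_soft_s1x | s1x | 0.464 | 0.071 | 10.5 | 62.1 | 67.3 | 64.5 | 66.0 | 250712522 | model \| metrics |
| R_50_FPN_DL_soft_s1x | s1x | 0.427 | 0.062 | 11.3 | 60.8 | 68.0 | 66.1 | 66.7 | 250713703 | model \| metrics |
| R_101_FPN_DL_soft_s1x | s1x | 0.483 | 0.071 | 12.2 | 61.5 | 68.2 | 66.2 | 67.1 | 250713061 | model \| metrics |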

### Animal CSE Models

Models obtained by finetuning human CSE models on animal data from `ds1_train` (see the [DensePose LVIS](DENSEPOSE_DATASETS.md#continuous-surface-embeddings-annotations-3) section for more details on the datasets) with soft assignment loss ℒ_σ:

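| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | segm AP | dp. AP GPS | dp. AP GPSm | model id | download |
|:-----|:--------:|:-------------------:|:---------------------:|:--------------:|:------:|:-------:|:----------:|:-----------:|:--------:|:--------:|
| R_50_FPN_soft_chimps_finetune_4k | 4K | 0.569 | 0.051 | 4.7 | 62.0 | 59.0 | 32.2 | 39.6 | 253146869 | model \| metrics |
| R_50_FPN_soft_animals_finetune_4k | 4K | 0.381 | 0.061 | 7.3 | 44.9 | 55.5 | 21.3 | 28.8 | 253145793 | model \| metrics |
| R_50_FPN_soft_animals_CA_finetune_4k | 4K | 0.412 | 0.059 | 7.1 | 53.4 | 59.5 | 25.4 | 33.4 | 253498611 | model \| metrics |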

Acronyms:

`CA`: class-agnostic training, where all annotated instances are mapped into a single category

Models obtained by finetuning human CSE models on animal data from the `ds2_train` dataset with soft assignment loss ℒ_σ and, for some schedules, cycle losses. Please refer to the [DensePose LVIS](DENSEPOSE_DATASETS.md#continuous-surface-embeddings-annotations-3) section for details on the dataset and to [Neverova et al, 2021]() for details on cycle losses.

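| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | segm AP | dp. AP GPS | dp. AP GPSm | GErr | GPS | model id | download |
|:-----|:--------:|:-------------------:|:---------------------:|:--------------:|:------:|:-------:|:----------:|:-----------:|:----:|:---:|:--------:|:--------:|
| R_50_FPN_soft_animals_I0_finetune_16k | 16k | 0.386 | 0.058 | 8.4 | 54.2 | 67.0 | 29.0 | 38.6 | 13.2 | 85.4 | 270727112 | model \| metrics |
| R_50_FPN_soft_animals_I0_finetune_m2m_16k | 16k | 0.508 | 0.056 | 12.2 | 54.1 | 67.3 | 28.6 | 38.4 | 12.5 | 87.6 | 270982215 | model \| metrics |
| R_50_FPN_soft_animals_I0_finetune_i2m_16k | 16k | 0.483 | 0.056 | 9.7 | 54.0 | 66.6 | 28.9 | 38.3 | 11.0 | 88.9 | 270727461 | model \| metrics |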
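
Class-agnostic (`CA`) training, as used by one of the `ds1_train` models above, maps all annotated instances into a single category. A minimal sketch of such a remapping is given below. It assumes COCO-style annotation dictionaries with a `category_id` key; the helper name `to_class_agnostic` and the default target id `1` are hypothetical and are not taken from the DensePose code base.

```python
def to_class_agnostic(annotations, agnostic_category_id=1):
    """Return copies of the annotations with every instance mapped to one category.

    annotations: list of dicts, each assumed to carry a "category_id" key
                 (COCO-style layout, an assumption of this sketch).
    agnostic_category_id: hypothetical id of the single shared category.
    """
    # Copy each dict so the caller's annotations are left untouched.
    return [{**ann, "category_id": agnostic_category_id} for ann in annotations]


# Toy annotations with two different (made-up) category ids.
annotations = [
    {"id": 10, "category_id": 3, "bbox": [0, 0, 50, 80]},
    {"id": 11, "category_id": 7, "bbox": [20, 30, 40, 40]},
]

remapped = to_class_agnostic(annotations)
print(sorted({ann["category_id"] for ann in remapped}))
# prints [1]
```

All other annotation fields are carried over unchanged, so the remapped list can be used wherever the original one was expected.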

## References

If you use DensePose methods based on continuous surface embeddings, please cite the following BibTeX entries:

Continuous surface embeddings:

```
@InProceedings{Neverova2020ContinuousSurfaceEmbeddings,
    title = {Continuous Surface Embeddings},
    author = {Neverova, Natalia and Novotny, David and Khalidov, Vasil and Szafraniec, Marc and Labatut, Patrick and Vedaldi, Andrea},
    journal = {Advances in Neural Information Processing Systems},
    year = {2020},
}
```

Cycle Losses:

```
@InProceedings{Neverova2021UniversalCanonicalMaps,
    title = {Discovering Relationships between Object Categories via Universal Canonical Maps},
    author = {Neverova, Natalia and Sanakoyeu, Artsiom and Novotny, David and Labatut, Patrick and Vedaldi, Andrea},
    journal = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2021},
}
```