# Pixel-in-Pixel Net: Towards Efficient Facial Landmark Detection in the Wild
## Introduction
This is the code for the paper [Pixel-in-Pixel Net: Towards Efficient Facial Landmark Detection in the Wild](https://arxiv.org/abs/2003.03771). We propose a novel facial landmark detector, PIPNet, that is **fast**, **accurate**, and **robust**. PIPNet can be trained under two settings: (1) supervised learning; (2) generalizable semi-supervised learning (GSSL). With GSSL, PIPNet achieves better cross-domain generalization by utilizing massive amounts of unlabeled data across domains.
<img src="images/speed.png" alt="speed" width="640px">
Figure 1. Comparison to existing methods on speed-accuracy tradeoff, tested on WFLW full test set (closer to bottom-right corner is better).<br><br>
<img src="images/detection_heads.png" alt="det_heads" width="512px">
Figure 2. Comparison of different detection heads.<br>
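Conceptually, the pixel-in-pixel head predicts, for each landmark, a score map on a low-resolution feature map plus offset maps that localize the landmark within the best-scoring grid cell, so no upsampling layers are needed as in heatmap regression. The snippet below is a minimal sketch of that decoding step only, assuming a stride-32 backbone and a 256x256 input; the tensor shapes and function name are assumptions for illustration, not the repository's actual code.

```python
import torch

def decode_pip_outputs(score_maps, offset_x, offset_y, stride=32, input_size=256):
    """Illustrative sketch of pixel-in-pixel decoding (not the repo's API).

    score_maps: (num_landmarks, H, W) score for each coarse grid cell.
    offset_x, offset_y: (num_landmarks, H, W) offsets within the chosen cell.
    Returns landmark coordinates normalized to [0, 1].
    """
    num_lms, h, w = score_maps.shape
    flat = score_maps.view(num_lms, -1)
    idx = flat.argmax(dim=1)                      # best cell per landmark
    ys, xs = idx // w, idx % w                    # coarse cell indices (row, col)
    lm_ids = torch.arange(num_lms)
    dx = offset_x[lm_ids, ys, xs]                 # fine offset inside the cell
    dy = offset_y[lm_ids, ys, xs]
    x = (xs.float() + dx) * stride / input_size   # back to normalized image coords
    y = (ys.float() + dy) * stride / input_size
    return torch.stack([x, y], dim=1)
```

Reading each landmark off a coarse grid plus a sub-cell offset is, roughly, why the PIP head can be as light as direct regression while keeping the locality of heatmap-based heads.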
## Installation
1. Install Python 3 and PyTorch (v1.1 or later).
2. Clone this repository.
```Shell
git clone https://github.com/jhb86253817/PIPNet.git
```
3. Install the dependencies in requirements.txt.
```Shell
pip install -r requirements.txt
```
## Demo
1. We use a [modified version](https://github.com/jhb86253817/FaceBoxesV2) of [FaceBoxes](https://github.com/zisianw/FaceBoxes.PyTorch) as the face detector, so go to the folder `FaceBoxesV2/utils` and run `sh make.sh` to build the NMS extension.
2. Go back to the folder `PIPNet` and create two empty folders, `logs` and `snapshots`. You can download our trained PIPNet models from [here](https://drive.google.com/drive/folders/17OwDgJUfuc5_ymQ3QruD8pUnh5zHreP2?usp=sharing) and put them under `snapshots/DATA_NAME/EXPERIMENT_NAME/`.
3. Edit `run_demo.sh` to choose the config file and input source you want, then run `sh run_demo.sh`. We support image, video, and camera input. Some sample predictions are shown below, followed by a rough sketch of the inference pipeline.
* PIPNet-ResNet18 trained on WFLW, with image `images/1.jpg` as the input:
<img src="images/1_out_WFLW_model.jpg" alt="1_out_WFLW_model" width="400px">
* PIPNet-ResNet18 trained on WFLW, with a snippet from *Shaolin Soccer* as the input:
<img src="videos/shaolin_soccer.gif" alt="shaolin_soccer" width="400px">
* PIPNet-ResNet18 trained on WFLW, with video `videos/002.avi` as the input:
<img src="videos/002_out_WFLW_model.gif" alt="002_out_WFLW_model" width="512px">
* PIPNet-ResNet18 trained on 300W+CelebA (GSSL), with video `videos/007.avi` as the input:
<img src="videos/007_out_300W_CELEBA_model.gif" alt="007_out_300W_CELEBA_model" width="512px">
## Training
### Supervised Learning
Datasets: [300W](https://ibug.doc.ic.ac.uk/resources/facial-point-annotations/), [COFW](http://www.vision.caltech.edu/xpburgos/ICCV13/), [WFLW](https://wywu.github.io/projects/LAB/WFLW.html), [AFLW](https://www.tugraz.at/institute/icg/research/team-bischof/lrs/downloads/aflw/)
1. Download the datasets from their official sources, then put them under the folder `data`. The folder structure should look like this:
````
PIPNet
-- FaceBoxesV2
-- lib
-- experiments
-- logs
-- snapshots
-- data
   |-- data_300W
       |-- afw
       |-- helen
       |-- ibug
       |-- lfpw
   |-- COFW
       |-- COFW_train_color.mat
       |-- COFW_test_color.mat
   |-- WFLW
       |-- WFLW_images
       |-- WFLW_annotations
   |-- AFLW
       |-- flickr
       |-- AFLWinfo_release.mat
````
2. Go to the folder `lib` and preprocess a dataset by running `python preprocess.py DATA_NAME`. For example, to process 300W:
```
python preprocess.py data_300W
```
3. Go back to the folder `PIPNet` and edit `run_train.sh` to choose the config file you want. Then train the model by running:
```
sh run_train.sh
```
### Generalizable Semi-supervised Learning
Datasets:
* data_300W_COFW_WFLW: 300W + COFW-68 (unlabeled) + WFLW-68 (unlabeled)
* data_300W_CELEBA: 300W + CelebA (unlabeled)
1. Download 300W, COFW, and WFLW as in the supervised learning setting. Download the COFW-68 test annotations from [here](https://github.com/golnazghiasi/cofw68-benchmark). For 300W+CelebA, you also need to download the in-the-wild CelebA images from [here](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html) and the [face bounding boxes](https://drive.google.com/drive/folders/17OwDgJUfuc5_ymQ3QruD8pUnh5zHreP2?usp=sharing) we detected. The folder structure should look like this:
````
PIPNet
-- FaceBoxesV2
-- lib
-- experiments
-- logs
-- snapshots
-- data
   |-- data_300W
       |-- afw
       |-- helen
       |-- ibug
       |-- lfpw
   |-- COFW
       |-- COFW_train_color.mat
       |-- COFW_test_color.mat
   |-- WFLW
       |-- WFLW_images
       |-- WFLW_annotations
   |-- data_300W_COFW_WFLW
       |-- cofw68_test_annotations
       |-- cofw68_test_bboxes.mat
   |-- CELEBA
       |-- img_celeba
       |-- celeba_bboxes.txt
   |-- data_300W_CELEBA
       |-- cofw68_test_annotations
       |-- cofw68_test_bboxes.mat
````
2. Go to the folder `lib` and preprocess a dataset by running `python preprocess_gssl.py DATA_NAME`.
To process data_300W_COFW_WFLW, run
```
python preprocess_gssl.py data_300W_COFW_WFLW
```
To process data_300W_CELEBA, first run
```
python preprocess_gssl.py CELEBA
```
and then run
```
python preprocess_gssl.py data_300W_CELEBA
```
3. Go back to the folder `PIPNet` and edit `run_train.sh` to choose the config file you want. Then train the model by running:
```
sh run_train.sh
```
## Evaluation
1. Edit `run_test.sh` to choose the config file you want, then test the model by running the command below (a note on the reported metric follows it):
```
sh run_test.sh
```
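The test script reports landmark accuracy on the chosen dataset; the standard metric on these benchmarks is the normalized mean error (NME). Below is a minimal sketch of inter-ocular NME for illustration only; it is not the repository's exact evaluation code, and the eye-corner indices assume the 68-point 300W annotation (other datasets use different indices or normalizers).

```python
import numpy as np

def nme_inter_ocular(pred, gt, left_eye_idx=36, right_eye_idx=45):
    """Normalized mean error with inter-ocular normalization (illustrative).

    pred, gt: (num_landmarks, 2) arrays in pixel coordinates.
    The default eye-corner indices follow the 68-point 300W convention.
    """
    norm = np.linalg.norm(gt[left_eye_idx] - gt[right_eye_idx])
    return np.mean(np.linalg.norm(pred - gt, axis=1)) / norm
```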
## Citation
````
@article{JLS21,
title={Pixel-in-Pixel Net: Towards Efficient Facial Landmark Detection in the Wild},
author={Haibo Jin and Shengcai Liao and Ling Shao},
journal={International Journal of Computer Vision},
publisher={Springer Science and Business Media LLC},
ISSN={1573-1405},
url={http://dx.doi.org/10.1007/s11263-021-01521-4},
DOI={10.1007/s11263-021-01521-4},
year={2021},
month={Sep}
}
````
## Acknowledgement
We thank the following great works:
* [human-pose-estimation.pytorch](https://github.com/microsoft/human-pose-estimation.pytorch)
* [HRNet-Facial-Landmark-Detection](https://github.com/HRNet/HRNet-Facial-Landmark-Detection)