
Detectron2 model zoo's experimental settings and a few implementation details are different from Detectron.

The differences in implementation details are described in Compatibility with Other Libraries.

The differences in model zoo's experimental settings include:

* Use scale augmentation during training. This improves AP with lower training cost.
* Use L1 loss instead of smooth L1 loss for simplicity. This sometimes improves box AP but may affect other AP.
* Use `POOLER_SAMPLING_RATIO=0` instead of 2. This does not significantly affect AP.
* Use `ROIAlignV2`. This does not significantly affect AP.
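To make the loss change concrete, here is a minimal sketch of the two losses for a single regression residual. It assumes the common `beta` parameterization of smooth L1 (quadratic within `beta` of zero, linear outside); the function names are illustrative and are not taken from Detectron2's code.

```python
def l1_loss(residual: float) -> float:
    """Plain L1 loss for one regression residual."""
    return abs(residual)


def smooth_l1_loss(residual: float, beta: float) -> float:
    """Smooth L1 loss: quadratic for |residual| < beta, linear otherwise.

    With beta <= 0 the quadratic region vanishes and this is plain L1.
    """
    magnitude = abs(residual)
    if beta <= 0.0:
        return magnitude
    if magnitude < beta:
        return 0.5 * magnitude ** 2 / beta
    return magnitude - 0.5 * beta


if __name__ == "__main__":
    for r in (0.05, 0.5, 2.0):
        print(r, l1_loss(r), smooth_l1_loss(r, beta=1.0))
```

Under this parameterization, switching from smooth L1 to L1 amounts to setting `beta` to zero, which is why it can be expressed as a simple configuration change.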

In this directory, we provide a few configs that do not have the above changes. They mimic Detectron's behavior as closely as possible, and provide a fair comparison of accuracy and speed against Detectron.
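As a rough illustration of what undoing the changes listed above looks like, the sketch below collects Detectron-style settings as key/value overrides. The key names follow Detectron2's config naming as far as we know, and the smooth L1 beta values are assumptions; treat the YAML files in this directory as the authoritative settings. Only the box head is shown, and `to_opts_list` is a hypothetical helper.

```python
# Hypothetical override table. Verify every key and value against the
# YAML configs in this directory before relying on it.
DETECTRON1_STYLE_OVERRIDES = {
    # A single training scale, i.e. no scale augmentation.
    "INPUT.MIN_SIZE_TRAIN": (800,),
    # Smooth L1 instead of L1; these beta values are assumptions.
    "MODEL.RPN.SMOOTH_L1_BETA": 0.1111,
    "MODEL.ROI_BOX_HEAD.SMOOTH_L1_BETA": 1.0,
    # Sampling ratio 2 instead of 0, as stated in the list above.
    "MODEL.ROI_BOX_HEAD.POOLER_SAMPLING_RATIO": 2,
    # The original ROIAlign rather than ROIAlignV2.
    "MODEL.ROI_BOX_HEAD.POOLER_TYPE": "ROIAlign",
}


def to_opts_list(overrides):
    """Flatten {key: value} into [key1, value1, key2, value2, ...]."""
    opts = []
    for key, value in overrides.items():
        opts.extend([key, value])
    return opts


if __name__ == "__main__":
    print(to_opts_list(DETECTRON1_STYLE_OVERRIDES))
```

An alternating key/value list like this is the shape that yacs-style `merge_from_list` accepts, so it could be applied to a config object obtained from `detectron2.config.get_cfg()`, assuming the keys above match your Detectron2 version.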

| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | mask AP | kp. AP | model id | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Faster R-CNN | 1x | 0.219 | 0.038 | 3.1 | 36.9 | | | 137781054 | model \| metrics |
| Keypoint R-CNN | 1x | 0.313 | 0.071 | 5.0 | 53.1 | | 64.2 | 137781195 | model \| metrics |
| Mask R-CNN | 1x | 0.273 | 0.043 | 3.4 | 37.8 | 34.9 | | 137781281 | model \| metrics |

Comparisons:

* Faster R-CNN: Detectron's AP is 36.7, similar to ours.
* Keypoint R-CNN: Detectron's AP is box 53.6, keypoint 64.2. Fixing a bug in Detectron led to a drop in box AP, which can be recovered with some parameter tuning.
* Mask R-CNN: Detectron's AP is box 37.7, mask 33.9. Our mask AP is 1 point higher, due to a more correct implementation.
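The differences can be worked out directly from the numbers above. The short sketch below copies the AP values from the table and the comparison notes and prints ours minus Detectron's for each metric; the dictionary layout and the function name are illustrative only.

```python
# (ours, Detectron) AP pairs copied from the table and comparison notes above.
RESULTS = {
    "Faster R-CNN": {"box": (36.9, 36.7)},
    "Keypoint R-CNN": {"box": (53.1, 53.6), "keypoint": (64.2, 64.2)},
    "Mask R-CNN": {"box": (37.8, 37.7), "mask": (34.9, 33.9)},
}


def ap_deltas(results):
    """Return {model: {metric: ours - Detectron}}, rounded to one decimal."""
    return {
        model: {
            metric: round(ours - theirs, 1)
            for metric, (ours, theirs) in metrics.items()
        }
        for model, metrics in results.items()
    }


if __name__ == "__main__":
    for model, deltas in ap_deltas(RESULTS).items():
        print(model, deltas)
```

Positive values mean the configs in this directory score higher than Detectron; the only negative entry is the Keypoint R-CNN box AP.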

For speed comparison, see benchmarks.