Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
Introduction
@inproceedings{deeplabv3plus2018,
title={Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation},
author={Liang-Chieh Chen and Yukun Zhu and George Papandreou and Florian Schroff and Hartwig Adam},
booktitle={ECCV},
year={2018}
}
Results and models
Note: D-8/D-16 (written D8/D16 in the Backbone column) denotes the output stride 8/16 setting for the DeepLab series. MG-124 (written MG124) stands for multi-grid dilation in the last stage of ResNet.
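The backbone suffixes above can be illustrated with a minimal sketch. It assumes a standard ResNet whose stem downsamples by 4, and per-stage strides and dilations commonly used for dilated ResNets. The helper names (`dilated_resnet_stages`, `effective_output_stride`, `last_stage_dilations`, `feature_map_size`) are hypothetical and not part of any library. The multi-grid rule follows the DeepLabV3 paper, where each block's rate is the stage's unit rate times the grid value; the codebase behind these results may apply the grid differently.

```python
from math import ceil, prod

# Assumption: standard ResNet stem (stride-2 conv followed by stride-2 max-pool).
STEM_STRIDE = 4


def dilated_resnet_stages(output_stride):
    """Hypothetical helper: per-stage strides/dilations for a dilated ResNet."""
    if output_stride == 8:   # "D8": last two stages trade stride for dilation
        return {"strides": (1, 2, 1, 1), "dilations": (1, 1, 2, 4)}
    if output_stride == 16:  # "D16": only the last stage is dilated
        return {"strides": (1, 2, 2, 1), "dilations": (1, 1, 1, 2)}
    raise ValueError(f"unsupported output stride: {output_stride}")


def effective_output_stride(strides):
    """Total downsampling factor: stem stride times all stage strides."""
    return STEM_STRIDE * prod(strides)


def last_stage_dilations(unit_rate, multi_grid=(1, 2, 4)):
    """Multi-grid ("MG124") as defined in the DeepLabV3 paper:
    each block in the last stage uses unit_rate * grid value."""
    return tuple(unit_rate * g for g in multi_grid)


def feature_map_size(crop_size, output_stride):
    """Approximate backbone output size for a (height, width) crop."""
    return tuple(ceil(d / output_stride) for d in crop_size)


if __name__ == "__main__":
    for os_ in (8, 16):
        cfg = dilated_resnet_stages(os_)
        print(os_, cfg, effective_output_stride(cfg["strides"]))
    print(last_stage_dilations(unit_rate=2))
    print(feature_map_size((512, 1024), 8), feature_map_size((769, 769), 8))
```

A smaller output stride keeps a larger feature map, which is consistent with the D16 rows below using less memory and running faster than the D8 rows at the same crop size.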
Cityscapes
| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DeepLabV3+ | R-50-D8 | 512x1024 | 40000 | 7.5 | 3.94 | 79.61 | 81.01 | model \| log |
| DeepLabV3+ | R-101-D8 | 512x1024 | 40000 | 11 | 2.60 | 80.21 | 81.82 | model \| log |
| DeepLabV3+ | R-50-D8 | 769x769 | 40000 | 8.5 | 1.72 | 78.97 | 80.46 | model \| log |
| DeepLabV3+ | R-101-D8 | 769x769 | 40000 | 12.5 | 1.15 | 79.46 | 80.50 | model \| log |
| DeepLabV3+ | R-18-D8 | 512x1024 | 80000 | 2.2 | 14.27 | 76.89 | 78.76 | model \| log |
| DeepLabV3+ | R-50-D8 | 512x1024 | 80000 | - | - | 80.09 | 81.13 | model \| log |
| DeepLabV3+ | R-101-D8 | 512x1024 | 80000 | - | - | 80.97 | 82.03 | model \| log |
| DeepLabV3+ | R-18-D8 | 769x769 | 80000 | 2.5 | 5.74 | 76.26 | 77.91 | model \| log |
| DeepLabV3+ | R-50-D8 | 769x769 | 80000 | - | - | 79.83 | 81.48 | model \| log |
| DeepLabV3+ | R-101-D8 | 769x769 | 80000 | - | - | 80.98 | 82.18 | model \| log |
| DeepLabV3+ | R-101-D16-MG124 | 512x1024 | 40000 | 5.8 | 7.48 | 79.09 | 80.36 | model \| log |
| DeepLabV3+ | R-101-D16-MG124 | 512x1024 | 80000 | 9.9 | - | 79.90 | 81.33 | model \| log |
| DeepLabV3+ | R-18b-D8 | 512x1024 | 80000 | 2.1 | 14.95 | 75.87 | 77.52 | model \| log |
| DeepLabV3+ | R-50b-D8 | 512x1024 | 80000 | 7.4 | 3.94 | 80.28 | 81.44 | model \| log |
| DeepLabV3+ | R-101b-D8 | 512x1024 | 80000 | 10.9 | 2.60 | 80.16 | 81.41 | model \| log |
| DeepLabV3+ | R-18b-D8 | 769x769 | 80000 | 2.4 | 5.96 | 76.36 | 78.24 | model \| log |
| DeepLabV3+ | R-50b-D8 | 769x769 | 80000 | 8.4 | 1.72 | 79.41 | 80.56 | model \| log |
| DeepLabV3+ | R-101b-D8 | 769x769 | 80000 | 12.3 | 1.10 | 79.88 | 81.46 | model \| log |
ADE20K
| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DeepLabV3+ | R-50-D8 | 512x512 | 80000 | 10.6 | 21.01 | 42.72 | 43.75 | model \| log |
| DeepLabV3+ | R-101-D8 | 512x512 | 80000 | 14.1 | 14.16 | 44.60 | 46.06 | model \| log |
| DeepLabV3+ | R-50-D8 | 512x512 | 160000 | - | - | 43.95 | 44.93 | model \| log |
| DeepLabV3+ | R-101-D8 | 512x512 | 160000 | - | - | 45.47 | 46.35 | model \| log |
Pascal VOC 2012 + Aug
| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DeepLabV3+ | R-50-D8 | 512x512 | 20000 | 7.6 | 21 | 75.93 | 77.50 | model \| log |
| DeepLabV3+ | R-101-D8 | 512x512 | 20000 | 11 | 13.88 | 77.22 | 78.59 | model \| log |
| DeepLabV3+ | R-50-D8 | 512x512 | 40000 | - | - | 76.81 | 77.57 | model \| log |
| DeepLabV3+ | R-101-D8 | 512x512 | 40000 | - | - | 78.62 | 79.53 | model \| log |
Pascal Context
| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DeepLabV3+ | R-101-D8 | 480x480 | 40000 | - | 9.09 | 47.30 | 48.47 | model \| log |
| DeepLabV3+ | R-101-D8 | 480x480 | 80000 | - | - | 47.23 | 48.26 | model \| log |