Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation

Introduction

@inproceedings{deeplabv3plus2018,
  title={Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation},
  author={Liang-Chieh Chen and Yukun Zhu and George Papandreou and Florian Schroff and Hartwig Adam},
  booktitle={ECCV},
  year={2018}
}

Results and models

Note: D8/D16 in the backbone names correspond to the output stride 8/16 settings of the DeepLab series. MG124 stands for multi-grid dilation in the last stage of ResNet.
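
As a rough illustration of what the output stride setting means for the crop sizes listed below, the following sketch computes the approximate spatial size of the final backbone feature map, and the extent covered by a 3x3 atrous (dilated) kernel. The helper names `feature_map_size` and `effective_kernel_size` are hypothetical and not part of any library; the ceiling rounding assumes the usual ResNet padding, and the dilation values 1, 2, 4 are only illustrative.

```python
import math


def feature_map_size(crop_size, output_stride):
    """Approximate (height, width) of the final feature map.

    Assumes each stride-2 layer rounds the size up, as padded ResNet
    convolutions typically do. Hypothetical helper for illustration.
    """
    height, width = crop_size
    return (math.ceil(height / output_stride), math.ceil(width / output_stride))


def effective_kernel_size(kernel_size, dilation):
    """Extent along one axis covered by a dilated (atrous) kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


if __name__ == "__main__":
    for crop in [(512, 1024), (769, 769), (512, 512), (480, 480)]:
        for stride in (8, 16):
            print(crop, "output stride", stride, "->", feature_map_size(crop, stride))

    # Illustrative dilation values only; how they combine with a stage's
    # base dilation depends on the backbone implementation.
    for dilation in (1, 2, 4):
        print("3x3 kernel, dilation", dilation, "covers", effective_kernel_size(3, dilation))
```

A smaller output stride keeps a larger feature map (for example 64x128 instead of 32x64 for a 512x1024 crop), which is consistent with the higher memory use of the D8 rows compared with the D16 rows at the same crop size.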

Cityscapes

| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DeepLabV3+ | R-50-D8 | 512x1024 | 40000 | 7.5 | 3.94 | 79.61 | 81.01 | model / log |
| DeepLabV3+ | R-101-D8 | 512x1024 | 40000 | 11 | 2.60 | 80.21 | 81.82 | model / log |
| DeepLabV3+ | R-50-D8 | 769x769 | 40000 | 8.5 | 1.72 | 78.97 | 80.46 | model / log |
| DeepLabV3+ | R-101-D8 | 769x769 | 40000 | 12.5 | 1.15 | 79.46 | 80.50 | model / log |
| DeepLabV3+ | R-18-D8 | 512x1024 | 80000 | 2.2 | 14.27 | 76.89 | 78.76 | model / log |
| DeepLabV3+ | R-50-D8 | 512x1024 | 80000 | - | - | 80.09 | 81.13 | model / log |
| DeepLabV3+ | R-101-D8 | 512x1024 | 80000 | - | - | 80.97 | 82.03 | model / log |
| DeepLabV3+ | R-18-D8 | 769x769 | 80000 | 2.5 | 5.74 | 76.26 | 77.91 | model / log |
| DeepLabV3+ | R-50-D8 | 769x769 | 80000 | - | - | 79.83 | 81.48 | model / log |
| DeepLabV3+ | R-101-D8 | 769x769 | 80000 | - | - | 80.98 | 82.18 | model / log |
| DeepLabV3+ | R-101-D16-MG124 | 512x1024 | 40000 | 5.8 | 7.48 | 79.09 | 80.36 | model / log |
| DeepLabV3+ | R-101-D16-MG124 | 512x1024 | 80000 | 9.9 | - | 79.90 | 81.33 | model / log |
| DeepLabV3+ | R-18b-D8 | 512x1024 | 80000 | 2.1 | 14.95 | 75.87 | 77.52 | model / log |
| DeepLabV3+ | R-50b-D8 | 512x1024 | 80000 | 7.4 | 3.94 | 80.28 | 81.44 | model / log |
| DeepLabV3+ | R-101b-D8 | 512x1024 | 80000 | 10.9 | 2.60 | 80.16 | 81.41 | model / log |
| DeepLabV3+ | R-18b-D8 | 769x769 | 80000 | 2.4 | 5.96 | 76.36 | 78.24 | model / log |
| DeepLabV3+ | R-50b-D8 | 769x769 | 80000 | 8.4 | 1.72 | 79.41 | 80.56 | model / log |
| DeepLabV3+ | R-101b-D8 | 769x769 | 80000 | 12.3 | 1.10 | 79.88 | 81.46 | model / log |

ADE20K

| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DeepLabV3+ | R-50-D8 | 512x512 | 80000 | 10.6 | 21.01 | 42.72 | 43.75 | model / log |
| DeepLabV3+ | R-101-D8 | 512x512 | 80000 | 14.1 | 14.16 | 44.60 | 46.06 | model / log |
| DeepLabV3+ | R-50-D8 | 512x512 | 160000 | - | - | 43.95 | 44.93 | model / log |
| DeepLabV3+ | R-101-D8 | 512x512 | 160000 | - | - | 45.47 | 46.35 | model / log |

Pascal VOC 2012 + Aug

| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DeepLabV3+ | R-50-D8 | 512x512 | 20000 | 7.6 | 21 | 75.93 | 77.50 | model / log |
| DeepLabV3+ | R-101-D8 | 512x512 | 20000 | 11 | 13.88 | 77.22 | 78.59 | model / log |
| DeepLabV3+ | R-50-D8 | 512x512 | 40000 | - | - | 76.81 | 77.57 | model / log |
| DeepLabV3+ | R-101-D8 | 512x512 | 40000 | - | - | 78.62 | 79.53 | model / log |

Pascal Context

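| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DeepLabV3+ | R-101-D8 | 480x480 | 40000 | - | 9.09 | 47.30 | 48.47 | model / log |
| DeepLabV3+ | R-101-D8 | 480x480 | 80000 | - | - | 47.23 | 48.26 | model / log |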