Prepare Datasets for OVSeg
This doc is a modification/extension of MaskFormer following the Detectron2 format.
A dataset can be used by accessing DatasetCatalog for its data, or MetadataCatalog for its metadata (class names, etc.).
This document explains how to set up the builtin datasets so they can be used by the above APIs.
"Use Custom Datasets" gives a deeper dive on how to use DatasetCatalog and MetadataCatalog, and how to add new datasets to them.
OVSeg has builtin support for a few datasets.
The datasets are assumed to exist in a directory specified by the environment variable DETECTRON2_DATASETS.
Under this directory, detectron2 will look for datasets in the structure described below, if needed.
$DETECTRON2_DATASETS/
  coco/                  # COCOStuff-171
  ADEChallengeData2016/  # ADE20K-150
  ADE20K_2021_17_01/     # ADE20K-847
  VOCdevkit/
    VOC2012/             # PASCALVOC-20
    VOC2010/             # PASCALContext-59, PASCALContext-459
You can set the location for builtin datasets with export DETECTRON2_DATASETS=/path/to/datasets.
If left unset, the default is ./datasets relative to your current working directory.
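The lookup rule above can be sketched in a few lines of Python. The helper name `resolve_dataset_root` is hypothetical and used only for illustration; it mirrors the documented behaviour (environment variable first, ./datasets otherwise) rather than quoting detectron2's own code.

```python
import os
from pathlib import Path


def resolve_dataset_root(env=None):
    """Return the dataset root: $DETECTRON2_DATASETS if set, else ./datasets."""
    # Hypothetical helper; `env` is injectable so the rule can be checked in isolation.
    env = os.environ if env is None else env
    # The fallback "datasets" is relative, so it resolves against the current working directory.
    root = env.get("DETECTRON2_DATASETS", "datasets")
    return Path(root).expanduser().resolve()


if __name__ == "__main__":
    print(resolve_dataset_root())
```

Printing the resolved path before running any preparation script is a cheap way to confirm that the scripts will read from and write to the directory you expect.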
Unless otherwise noted, our model is trained on COCOStuff-171 and evaluated on ADE20K-150, ADE20K-847, PASCALVOC-20, PASCALContext-59 and PASCALContext-459.
dataset | split | # images | # categories |
---|---|---|---|
COCO Stuff | train2017 | 118K | 171 |
ADE20K | val | 2K | 150/847 |
Pascal VOC | val | 1.5K | 20 |
Pascal Context | val | 5K | 59/459 |
Expected dataset structure for COCO Stuff:
coco/
  train2017/    # http://images.cocodataset.org/zips/train2017.zip
  annotations/  # http://images.cocodataset.org/annotations/annotations_trainval2017.zip
  stuffthingmaps/
    stuffthingmaps_trainval2017.zip  # http://calvin.inf.ed.ac.uk/wp-content/uploads/data/cocostuffdataset/stuffthingmaps_trainval2017.zip
    train2017/
  # below are generated
  stuffthingmaps_detectron2/
    train2017/
The directory stuffthingmaps_detectron2 is generated by running python datasets/prepare_coco_stuff_sem_seg.py.
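Before running the preparation script, it can help to confirm that the downloaded inputs are in place. The following is a minimal sketch, not part of OVSeg: the function name `missing_coco_stuff_inputs` is hypothetical, and it assumes the extracted label maps live in stuffthingmaps/train2017 under coco/, as the listing above suggests.

```python
import os
from pathlib import Path

# Input directories from the COCO Stuff layout, relative to coco/.
# Assumption: stuffthingmaps/train2017 holds the extracted label maps.
COCO_STUFF_INPUTS = (
    "train2017",
    "annotations",
    "stuffthingmaps/train2017",
)


def missing_coco_stuff_inputs(coco_dir):
    """Return the expected input directories that are absent under coco_dir."""
    coco_dir = Path(coco_dir)
    return [rel for rel in COCO_STUFF_INPUTS if not (coco_dir / rel).is_dir()]


if __name__ == "__main__":
    root = Path(os.getenv("DETECTRON2_DATASETS", "datasets"))
    missing = missing_coco_stuff_inputs(root / "coco")
    print("missing:", missing if missing else "nothing")
```

The check only looks at directory names; it does not verify that the archives were fully extracted or that image and label counts match.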
Expected dataset structure for ADE20K Scene Parsing (ADE20K-150):
ADEChallengeData2016/
  annotations/
  images/
  objectInfo150.txt
  # below are generated
  annotations_detectron2/
The directory annotations_detectron2 is generated by running python datasets/prepare_ade20k_sem_seg.py.
Expected dataset structure for ADE20K-Full (ADE20K-847):
ADE20K_2021_17_01/
  images/
  index_ade20k.pkl
  objects.txt
  # below are generated
  images_detectron2/
  annotations_detectron2/
The directories images_detectron2 and annotations_detectron2 are generated by running python datasets/prepare_ade20k_full_sem_seg.py.
Expected dataset structure for Pascal VOC 2012 (PASCALVOC-20):
VOCdevkit/VOC2012/
  Annotations/
  ImageSets/
  JPEGImages/
  SegmentationClass/
  SegmentationObject/
  SegmentationClassAug/  # https://github.com/kazuto1011/deeplab-pytorch/blob/master/data/datasets/voc12/README.md
  # below are generated
  images_detectron2/
  annotations_detectron2/
Start from the tar file VOCtrainval_11-May-2012.tar.
We use SBD augmented training data as SegmentationClassAug, following DeepLab.
The directories images_detectron2 and annotations_detectron2 are generated by running python datasets/prepare_voc_sem_seg.py.
Expected dataset structure for Pascal Context:
VOCdevkit/VOC2010/
  Annotations/
  ImageSets/
  JPEGImages/
  SegmentationClass/
  SegmentationObject/
  # below are from https://www.cs.stanford.edu/~roozbeh/pascal-context/trainval.tar.gz
  trainval/
  labels.txt
  59_labels.txt          # https://www.cs.stanford.edu/~roozbeh/pascal-context/59_labels.txt
  pascalcontext_val.txt  # https://drive.google.com/file/d/1BCbiOKtLvozjVnlTJX51koIveUZHCcUh/view?usp=sharing
  # below are generated
  annotations_detectron2/
    pc459_val
    pc59_val
Start from the tar file VOCtrainval_03-May-2010.tar. You may want to download the 5K validation set here.
The directory annotations_detectron2 is generated by running python datasets/prepare_pascal_context.py.
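The five preparation scripts named in this document can be chained from one small driver. This is a sketch under stated assumptions, not a script shipped with OVSeg: it assumes the commands are launched from the repository root (so the relative datasets/ paths resolve) and that each script takes no arguments, as in the commands shown above. The names `build_commands` and `run_all` are hypothetical.

```python
import subprocess
import sys

# Script paths exactly as named in this document, in the order they appear.
PREPARE_SCRIPTS = (
    "datasets/prepare_coco_stuff_sem_seg.py",
    "datasets/prepare_ade20k_sem_seg.py",
    "datasets/prepare_ade20k_full_sem_seg.py",
    "datasets/prepare_voc_sem_seg.py",
    "datasets/prepare_pascal_context.py",
)


def build_commands(python=sys.executable):
    """Return one argv list per preparation script."""
    return [[python, script] for script in PREPARE_SCRIPTS]


def run_all(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first script that exits with an error.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

The default is a dry run that only prints the commands; pass dry_run=False once the raw datasets are in place. Scripts for datasets you have not downloaded can be removed from the tuple.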