We use various synthetic and real datasets. More info is in Appendix F of the supplementary material. Some preprocessing scripts are included in tools/.

| Dataset | Type | Remarks |
|---|---|---|
| MJSynth | synthetic | Case-sensitive annotations were extracted from the image filenames |
| SynthText | synthetic | Processed with crop_by_word_bb_syn90k.py |
| IC13 | real | Three archives: 857, 1015, 1095 (full) |
| IC15 | real | Two archives: 1811, 2077 (full) |
| CUTE80 | real | [1] |
| IIIT5k | real | [1] |
| SVT | real | [1] |
| SVTP | real | [1] |
| ArT | real | [2] |
| LSVT | real | [2] |
| MLT19 | real | [2] |
| RCTW17 | real | [2] |
| ReCTS | real | [2] |
| Uber-Text | real | [2] |
| COCO-Text v1.4 | real | Processed with coco_text_converter.py |
| COCO-Text v2.0 | real | Processed with coco_2_converter.py |
| OpenVINO | real | Annotations for a subset of Open Images. Processed with openvino_converter.py. |
| TextOCR | real | Annotations for a subset of Open Images. Processed with textocr_converter.py. A horizontal version can be generated by passing --rectify_pose. |

[1] Case-sensitive annotations from Long and Yao, plus our corrections. Processed with case_sensitive_str_datasets_converter.py.
[2] Archives used as-is from Baek et al. They are included in the dataset release for convenience. Please refer to their work for more info about the datasets.
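
The MJSynth remark in the table says the labels come from the image filenames. A minimal sketch of that idea follows, assuming the filename pattern `<index>_<word>_<lexicon id>.jpg`; both the pattern and the helper name `label_from_filename` are assumptions made for illustration and are not taken from the scripts in tools/.

```python
from pathlib import PurePosixPath


def label_from_filename(path: str) -> str:
    """Return the word embedded in a filename, preserving its case.

    Assumes (hypothetically) the pattern <index>_<word>_<lexicon id>.<ext>.
    """
    stem = PurePosixPath(path).stem           # e.g. "182_Slinking_71711"
    parts = stem.split("_")
    if len(parts) < 3:
        raise ValueError(f"unexpected filename pattern: {path!r}")
    # Drop the leading index and the trailing lexicon id; rejoin the rest
    # in case the word itself contains an underscore.
    return "_".join(parts[1:-1])


if __name__ == "__main__":
    print(label_from_filename("3000/7/182_Slinking_71711.jpg"))
```

Because the word is read directly from the filename, its original capitalization is kept rather than being lowercased.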

The preprocessed archives are available here: val + test + most of train, TextOCR + OpenVINO

The expected filesystem structure is as follows:

data
├── test
│   ├── ArT
│   ├── COCOv1.4
│   ├── CUTE80
│   ├── IC13_1015
│   ├── IC13_1095  # Full IC13 test set. Typically not used for benchmarking but provided here for convenience.
│   ├── IC13_857
│   ├── IC15_1811
│   ├── IC15_2077
│   ├── IIIT5k
│   ├── SVT
│   ├── SVTP
│   └── Uber
├── train
│   ├── real
│   │   ├── ArT
│   │   │   ├── train
│   │   │   └── val
│   │   ├── COCOv2.0
│   │   │   ├── train
│   │   │   └── val
│   │   ├── LSVT
│   │   │   ├── test
│   │   │   ├── train
│   │   │   └── val
│   │   ├── MLT19
│   │   │   ├── test
│   │   │   ├── train
│   │   │   └── val
│   │   ├── OpenVINO
│   │   │   ├── train_1
│   │   │   ├── train_2
│   │   │   ├── train_5
│   │   │   ├── train_f
│   │   │   └── validation
│   │   ├── RCTW17
│   │   │   ├── test
│   │   │   ├── train
│   │   │   └── val
│   │   ├── ReCTS
│   │   │   ├── test
│   │   │   ├── train
│   │   │   └── val
│   │   ├── TextOCR
│   │   │   ├── train
│   │   │   └── val
│   │   └── Uber
│   │       ├── train
│   │       └── val
│   └── synth
│       ├── MJ
│       │   ├── test
│       │   ├── train
│       │   └── val
│       └── ST
└── val
    ├── IC13
    ├── IC15
    ├── IIIT5k
    └── SVT
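
After extracting the archives, it can help to confirm that the layout matches the tree above. The following is a minimal sketch rather than part of the repository: the names `EXPECTED`, `expected_dirs`, and `missing_dirs` are hypothetical, and the root path `data` is assumed to be relative to the current working directory.

```python
from pathlib import Path

# Leaf directories copied from the tree above, grouped by parent directory.
EXPECTED = {
    "test": ["ArT", "COCOv1.4", "CUTE80", "IC13_1015", "IC13_1095",
             "IC13_857", "IC15_1811", "IC15_2077", "IIIT5k", "SVT",
             "SVTP", "Uber"],
    "train/real": [
        "ArT/train", "ArT/val",
        "COCOv2.0/train", "COCOv2.0/val",
        "LSVT/test", "LSVT/train", "LSVT/val",
        "MLT19/test", "MLT19/train", "MLT19/val",
        "OpenVINO/train_1", "OpenVINO/train_2", "OpenVINO/train_5",
        "OpenVINO/train_f", "OpenVINO/validation",
        "RCTW17/test", "RCTW17/train", "RCTW17/val",
        "ReCTS/test", "ReCTS/train", "ReCTS/val",
        "TextOCR/train", "TextOCR/val",
        "Uber/train", "Uber/val",
    ],
    "train/synth": ["MJ/test", "MJ/train", "MJ/val", "ST"],
    "val": ["IC13", "IC15", "IIIT5k", "SVT"],
}


def expected_dirs():
    """Return every expected leaf directory, relative to the data root."""
    return [f"{parent}/{child}"
            for parent, children in EXPECTED.items()
            for child in children]


def missing_dirs(root):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected_dirs() if not (root / rel).is_dir()]


if __name__ == "__main__":
    for rel in missing_dirs("data"):
        print("missing:", rel)
```

The check only looks at directory names; it does not inspect the contents of each dataset directory.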