Type-R official repository

This repository contains model weights and data resources used in the Type-R project. The dataset is designed to support text-to-image generation, OCR, text erasing, editing, and evaluation pipelines used in the Type-R system.

📘 Directory structure

⚠️ The Type-R code expects the resources to be laid out exactly as shown in this structure.

resources/
├── weight/ 
│     ├── ocr/ # OCR-related model weights
│     │    ├── solo.pth # ⚠️ Manual download required
│     │    ├── masktextspotterv3.pth # ⚠️ Manual download required
│     │    ├── modelscope
│     │    ├── craft
│     │    ├── clova
│     │    └── hisam_weight
│     ├── text_eraser/ # Text erasure model weights 
│     │    ├── big-lama.pt
│     │    └── garnet.pth
│     ├── text_editor/ # Text editing model weights 
│     │    ├── anytext.ckpt
│     │    └── udifftext
│     └── t2i/ # Text-to-image model weights 
│          ├── (weights will be cached here)
│          ~ 
├── data/ 
│     ├── marioevalbench/ # Mario-Eval benchmark dataset 
│     │    └── hfds
│     ├── arial_unicode_ms.ttf # ⚠️ Manual download required
│     └── LiberationSans-Regular.ttf 
└── prompt
      └── example.txt
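
Because the code depends on this layout, it can help to verify it before running anything. The following is a minimal sketch, not part of the Type-R code: the helper name `missing_manual_files` is hypothetical, and the checked paths are the three entries marked for manual download in the tree above.

```python
from pathlib import Path

# Entries marked for manual download in the tree above, relative to resources/.
MANUAL_FILES = [
    "weight/ocr/solo.pth",
    "weight/ocr/masktextspotterv3.pth",
    "data/arial_unicode_ms.ttf",
]


def missing_manual_files(root):
    """Return the manually downloaded files that are absent under `root`."""
    root = Path(root)
    return [rel for rel in MANUAL_FILES if not (root / rel).is_file()]


def report(root="resources"):
    """Print one line per missing file (hypothetical convenience wrapper)."""
    for rel in missing_manual_files(root):
        print(f"missing: {Path(root) / rel}")
```

Calling `report()` from the directory that contains `resources/` lists whatever still has to be downloaded; an empty output means all three files are in place.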

📘 ⚠️ Data requiring manual download ⚠️

resources/weight/ocr/solo.pth
- Please download this weight from the official DeepSolo implementation. [link]
- Use the checkpoint that has ViTAEv2-S as its backbone and is trained on Synth150K+Total-Text+MLT17+IC13+IC15+TextOCR.
resources/weight/ocr/masktextspotterv3.pth
- Please download this weight from the official MaskTextSpotterV3 implementation. [link]
resources/data/arial_unicode_ms.ttf
- The Arial font cannot be redistributed, so please obtain it through your operating system or another legal source. Alternatively, you may use an open font such as Liberation Sans (resources/data/LiberationSans-Regular.ttf). Note, however, that under our best configuration, using AnyText with Liberation Sans lowered OCR accuracy on the Mario-Eval benchmark by 1–2 points.
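
The font choice described above can be sketched as a simple fallback. This is an illustration under stated assumptions, not the selection logic used by Type-R: the function name `pick_font` is hypothetical, and only the two file names come from this repository.

```python
from pathlib import Path


def pick_font(data_dir="resources/data"):
    """Prefer the Arial font; fall back to the bundled Liberation Sans."""
    data_dir = Path(data_dir)
    arial = data_dir / "arial_unicode_ms.ttf"
    if arial.is_file():
        return arial
    fallback = data_dir / "LiberationSans-Regular.ttf"
    if fallback.is_file():
        # Expect slightly lower OCR accuracy with this font (see the note above).
        return fallback
    raise FileNotFoundError(f"no usable font found in {data_dir}")
```

Raising an error when neither file exists keeps a missing font from surfacing later as a harder-to-trace rendering failure.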

📘 Dataset details

weight/
- This directory contains the pretrained weights used by the modules of the Type-R pipeline.
- ocr/: Models for OCR detection/recognition.
- text_eraser/: Inpainting or erasure modules for removing text.
- text_editor/: Models for rendering text into images.
- t2i/: Large text-to-image models.
  - If the T2I model requires authentication, make sure to log in to Hugging Face (e.g., using huggingface-cli login) before executing the pipeline.
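
As an alternative to the interactive huggingface-cli login, authentication can also be done from Python. The sketch below assumes the token is supplied through the `HF_TOKEN` environment variable and that the `huggingface_hub` package is installed; the helper names `hf_token_from_env` and `ensure_hf_login` are hypothetical and not part of Type-R.

```python
import os


def hf_token_from_env(env=None):
    """Return the token stored in HF_TOKEN, or None if it is unset or empty."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def ensure_hf_login():
    """Log in with HF_TOKEN if present; return True when a login was attempted."""
    token = hf_token_from_env()
    if token is None:
        print("HF_TOKEN is not set; run `huggingface-cli login` for gated models.")
        return False
    # Imported lazily so the helper above works without huggingface_hub.
    from huggingface_hub import login
    login(token=token)
    return True
```

Calling `ensure_hf_login()` once before the pipeline starts is enough; models that do not require authentication are unaffected.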
data/marioevalbench/
- This dataset contains the prompts and reference images used to evaluate Type-R.
- hfds/: includes prompts, augmented prompts, and images of the Mario-Eval Benchmark

📘 License

Weights

DeepSolo: resources/weight/ocr/solo.pth — Licensed under the AdelaiDet license
MaskTextSpotterV3: resources/weight/ocr/masktextspotterv3.pth — Licensed under Creative Commons
Paddle: resources/weight/ocr/modelscope — Licensed under Apache 2.0
CRAFT: resources/weight/ocr/craft — Licensed under MIT License
Clova Recognition: resources/weight/ocr/clova — Licensed under Apache 2.0
Hi-SAM: resources/weight/ocr/hisam_weight — Licensed under Apache 2.0
LaMa: resources/weight/text_eraser/big-lama.pt — Licensed under Apache 2.0
GaRNet: resources/weight/text_eraser/garnet.pth — Licensed under Apache 2.0
AnyText: resources/weight/text_editor/anytext.ckpt — Licensed under Apache 2.0
UDiffText: resources/weight/text_editor/udifftext — Licensed under MIT License

Data

Mario-Eval Benchmark: resources/data/marioevalbench — Licensed under MIT License
Arial Font: resources/data/arial_unicode_ms.ttf - Licensed under the Microsoft fonts license
Liberation Sans: resources/data/LiberationSans-Regular.ttf - Licensed under OFL 1.1