---
license: apache-2.0
language:
- ar
---

# Surya OCR Arabic

This repository contains the `surya-ocr-arabic-segment` model, which is based on a modified SegFormer architecture. The model is fine-tuned for document segmentation.

## Setup Instructions

### Clone the Surya OCR GitHub Repository

The `SegformerForRegressionMask` class is defined in the Surya OCR GitHub repository, so clone it first:

```bash
git clone https://github.com/vikp/surya.git
cd surya
```

### Switch to v0.4.14

```bash
git checkout f7c6c04
```

### Install Dependencies

Install the required dependencies with:

```bash
pip install -r requirements.txt
```

### Import and Use the Model

Load the `surya-ocr-arabic-segment` model as follows:

```python
import torch

# `SegformerForRegressionMask` is imported from the cloned Surya OCR repository.
# The `surya.surya` prefix assumes this script runs from the directory that
# contains the `surya` clone.
from surya.surya.model.detection.segformer import SegformerForRegressionMask

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Download the weights from the Hugging Face Hub and move the model to the device.
model = SegformerForRegressionMask.from_pretrained(
    "ketanmore/surya-ocr-arabic-segment",
    torch_dtype=torch.float32,
).to(device)
```
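
Whether the import above resolves depends on where the script runs relative to the clone. The following is a minimal sketch, not part of Surya itself, of one way to make the checkout importable from any working directory. The helper name `add_repo_to_path` and the `repo_dir` argument are hypothetical and introduced only for illustration.

```python
import sys
from pathlib import Path


def add_repo_to_path(repo_dir: str) -> Path:
    """Put a local Surya checkout on sys.path so its packages can be imported.

    `repo_dir` is a hypothetical path to the cloned repository, e.g. "./surya".
    """
    repo = Path(repo_dir).expanduser().resolve()
    if not repo.is_dir():
        raise FileNotFoundError(f"Surya checkout not found: {repo}")
    repo_str = str(repo)
    # Insert at the front only once, so repeated calls do not grow sys.path.
    if repo_str not in sys.path:
        sys.path.insert(0, repo_str)
    return repo
```

After calling `add_repo_to_path("surya")` from the directory that contains the clone, an import such as `from surya.model.detection.segformer import SegformerForRegressionMask` should resolve against the checkout, assuming the package layout at the pinned commit matches the module path used in this README.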