OCR-ReportLab-Notebooks

1.png

OCR-ReportLab is a collection of Colab notebooks designed to perform Optical Character Recognition (OCR) on images and generate DOCX or PDF documents containing both the original image and the extracted text. It supports multiple state-of-the-art vision-language models for experimentation and practical use.

All the notebooks are here : https://huggingface.co/prithivMLmods/OCR-ReportLab-Notebooks/tree/main

Notebooks

You can launch and run the following notebooks directly in Google Colab:

Features

  • Extracts text from input images using various OCR models
  • Embeds the image and extracted text into DOCX or PDF formats
  • Designed for quick deployment via Google Colab

Supported Models

The repository currently supports the following OCR implementations:

  • Nanonets OCR
  • Monkey OCR
  • OCRFlux 3B
  • Typhoon OCR 3B
  • and more ...

Installation

No installation is required. Simply click on the links above to run the notebooks in Google Colab. Make sure to upload your image file(s) when prompted and follow the instructions in the notebook.


Other Images


OCR

OCR

Caption

Caption

Screenshot 2025-07-27 at 23-42-00 Gradio.png Screenshot 2025-07-27 at 23-37-02 Gradio.png

222


Dependencies

The notebooks are built using:

  • Python
  • PyTorch
  • Hugging Face Transformers
  • ReportLab
  • Gradio (for UI)
  • (Qwen2.5-VL based)

All dependencies are automatically installed in the Colab environment.

Github

OCR-ReportLab-Notebooks : https://github.com/PRITHIVSAKTHIUR/OCR-ReportLab-Notebooks

Author

Created and maintained by PRITHIVSAKTHIUR

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support