OCR on image

#28

by glitchyordis - opened Sep 10, 2024

Discussion

glitchyordis

Sep 10, 2024

Obtaining key information is quite straightforward but Is there a way to obtain bbox locations from texts detected?

glitchyordis changed discussion title from OCR text to OCR on image Sep 10, 2024

maxiw

Sep 10, 2024

You can prompt the model to return bbox locations (see here: https://huggingface.co/spaces/maxiw/Qwen2-VL-Detection). I also tried "detect all texts" but the results are not super precise.

bingw5

Sep 29, 2024

I tried OCR on a not-that-clear text screenshot, it's working nearly perfectly. But the model seems not good at recognize twisted text. E.g. words on bottle.

qinghuiyyds

Dec 3, 2024

This comment has been hidden

qinghuiyyds

Dec 3, 2024

测试

mikeleatila

Jan 16

•

edited Jan 17

Has anyone managed to get OCR text detections and their corresponding bounding boxes using QWEN2-VL-7B-Instruct model accurately? I am able to get OCR detections correctly but not the boxes. The boxes are quite misplaced and random I'd say.

computerlover

Feb 21

Has anyone managed to get OCR text detections and their corresponding bounding boxes using QWEN2-VL-7B-Instruct model accurately? I am able to get OCR detections correctly but not the boxes. The boxes are quite misplaced and random I'd say.

what's your settings especially your system prompts? I wonder how to get the ocr text without bbox.Thanks

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

Your need to confirm your account before you can post a new comment.

· Sign up or log in to comment