simple script

#4
by adnanp - opened

Hello,

I am trying to use your magiv3 model, but I am running into errors and cannot get it to work. I would be very grateful for your help.

I tried the following code:

model = AutoModelForCausalLM.from_pretrained("ragavsachdeva/magiv3", torch_dtype=torch.float16, trust_remote_code=True).cuda().eval()
processor = AutoProcessor.from_pretrained("ragavsachdeva/magiv3", trust_remote_code=True)

model.predict_detections_and_associations(images, processor)
model.predict_ocr(images, processor)
model.predict_character_grounding(images, captions, processor)

but I don't understand the parameters, the dependencies, or the requirements.txt. Nothing :(

When I try to use the model with the standard Florence-2 API, I get a TypeError on the processor. It seems to not accept text, task_prompt, or images as keyword arguments.

Here is the code I am using:


import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

# Load model and processor

model = AutoModelForCausalLM.from_pretrained(
    "ragavsachdeva/magiv3",
    torch_dtype=torch.float16,
    trust_remote_code=True
).cuda().eval()

processor = AutoProcessor.from_pretrained(
    "ragavsachdeva/magiv3",
    trust_remote_code=True
)

# Prepare data

image_paths = ["path/to/page1.png"] # I use my real paths here
images = [Image.open(path).convert("RGB") for path in image_paths]
ocr_prompt = ""

# Attempt to run OCR

try:
    # This line causes a TypeError
    inputs = processor(images=images, text=ocr_prompt, return_tensors="pt").to("cuda", torch.float16)

    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024
    )
    generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
    print(generated_text)

except Exception as e:
    print(f"An error occurred: {e}")

The error I get is: TypeError: Florence2Processor.__call__() got an unexpected keyword argument 'text' (or 'images', or 'task_prompt').
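
A TypeError like this means the processor's call signature differs from the one the code assumes. One way to see which keyword arguments a callable really accepts is Python's standard `inspect.signature`. The sketch below is only an illustration: `StandInProcessor` and its parameter names are hypothetical placeholders so the snippet runs without downloading anything, and they say nothing about the real magiv3 processor.

```python
import inspect


def accepted_keywords(callable_obj):
    """Return the parameter names a callable declares, in order."""
    return list(inspect.signature(callable_obj).parameters)


# Hypothetical stand-in for the loaded processor. The parameter names
# below are invented for illustration and are NOT the magiv3 API.
class StandInProcessor:
    def __call__(self, first_arg, second_arg=None):
        return {}


processor = StandInProcessor()
names = accepted_keywords(processor)
print(names)

# Check the keywords from the failing call against the declared names.
for keyword in ("images", "text", "task_prompt"):
    status = "accepted" if keyword in names else "not accepted"
    print(f"{keyword}: {status}")
```

Running `accepted_keywords(processor)` on the processor actually returned by `AutoProcessor.from_pretrained` should list the names it expects in place of `text` or `images`. Those names are not confirmed here.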

Could you please provide a simple, working code snippet that shows the correct way to run OCR with magiv3?

Thank you.

Hi, let me look into this and get back to you this weekend.
