Batch inputs (image, prompt)

#10

by jeeyungk - opened May 8, 2024

Discussion

jeeyungk

May 8, 2024

Can we use a batch of images as input to LLaVA?

RaushanTurganbay

Llava Hugging Face org May 8, 2024

Hi! Yes, Llava-1.5 can take batched inputs; see the code snippet below:

import requests
from PIL import Image

import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration

model = LlavaForConditionalGeneration.from_pretrained("llava-hf/llava-1.5-7b-hf", torch_dtype=torch.float16, device_map="auto")
processor = AutoProcessor.from_pretrained("llava-hf/llava-1.5-7b-hf")

prompts = [
    "USER: <image>\nWhat are the things I should be cautious about when I visit this place? What should I bring with me? ASSISTANT:",
    "USER: <image>\nWhat is this? ASSISTANT:",
]

image1 = Image.open(requests.get("https://llava-vl.github.io/static/images/view.jpg", stream=True).raw)
image2 = Image.open(requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw)

inputs = processor(prompts, images=[image1, image2], padding=True, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20)
# Decode the generated token ids back into text
print(processor.batch_decode(output, skip_special_tokens=True))

ppbrown

Jul 7, 2024

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)

nielsr

Llava Hugging Face org Jul 7, 2024

Hi,

You need to place the inputs on the GPU as well, so the processor call in the snippet above should become:

inputs = processor(prompts, images=[image1, image2], padding=True, return_tensors="pt").to("cuda")
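
Put together, the full flow might look like the sketch below. It assumes a setup where `device_map="auto"` places the model on a GPU and `model.device` points at it; `generate_batch` and `extract_answers` are hypothetical helper names introduced here, not part of transformers. Left padding is set because it is generally recommended for batched generation with decoder-only models.

```python
def extract_answers(decoded_texts):
    """Keep only the text after the last 'ASSISTANT:' marker in each string."""
    return [text.rsplit("ASSISTANT:", 1)[-1].strip() for text in decoded_texts]


def generate_batch(prompts, images, model_id="llava-hf/llava-1.5-7b-hf", max_new_tokens=20):
    """Hypothetical helper: run one prompt/image pair per batch row."""
    # Heavy imports stay inside the function so extract_answers works without them.
    import torch
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    model = LlavaForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )
    processor = AutoProcessor.from_pretrained(model_id)
    # Pad on the left so generation continues directly from each prompt.
    processor.tokenizer.padding_side = "left"

    inputs = processor(text=prompts, images=images, padding=True, return_tensors="pt")
    # Move every tensor in the batch to the model's device, which avoids
    # the cuda:0 / cpu mismatch reported above.
    inputs = inputs.to(model.device)

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    decoded = processor.batch_decode(output_ids, skip_special_tokens=True)
    return extract_answers(decoded)
```

Calling `generate_batch(prompts, [image1, image2])` with the prompts and images from the snippet above should return one answer string per image.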

ZIHANGDU18

Jul 15, 2024

Why does it only recognize the first picture instead of replying about both pictures?

RaushanTurganbay

Llava Hugging Face org Jul 15, 2024

@ZIHANGDU18 the model was not trained in a multi-image setting and thus may perform poorly on such inputs without proper fine-tuning. Try out the newer Llava series, which was tuned on a multi-image dataset :)

https://huggingface.co/collections/llava-hf/llava-interleave-668e19a97da0036aad4a2f19
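
For context, and assuming the question is about putting two images into a single prompt: in the batched snippet earlier in this thread, each prompt carries exactly one `<image>` placeholder and is paired with exactly one image, so the pictures are answered independently rather than compared in one reply. A small sanity check along those lines might look like this; `build_prompts` and `check_one_image_per_prompt` are hypothetical helpers, and the template is simply the one used above.

```python
IMAGE_TOKEN = "<image>"


def build_prompts(questions):
    """Wrap each question in the Llava-1.5 style prompt used in this thread."""
    return [f"USER: {IMAGE_TOKEN}\n{q} ASSISTANT:" for q in questions]


def check_one_image_per_prompt(prompts, images):
    """Raise ValueError unless prompts and images pair up one-to-one."""
    if len(prompts) != len(images):
        raise ValueError(f"{len(prompts)} prompts but {len(images)} images")
    for i, prompt in enumerate(prompts):
        count = prompt.count(IMAGE_TOKEN)
        if count != 1:
            raise ValueError(
                f"prompt {i} has {count} {IMAGE_TOKEN} placeholders, expected 1"
            )


prompts = build_prompts(["What is this?", "What should I bring with me?"])
# The strings below are stand-ins for PIL images; only the count matters here.
check_one_image_per_prompt(prompts, images=["image1", "image2"])
print(prompts[0])
```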
