Fine-tuning ContactDoctor/Bio-Medical-MultiModal-Llama-3-8B-V1
#5
by eltorio - opened
Hi @SrikanthChellappa ,
I started to fine-tune a PEFT model for radiology. As a proof of concept, I created eltorio/IDEFICS3_ROCO based on Idefics3. Rather than fine-tuning Idefics3 with medical knowledge before using the medical imagery dataset, I’d like to test fine-tuning with your ContactDoctor/Bio-Medical-MultiModal-Llama-3-8B-V1 model.
My problem is that I cannot correctly tokenize the image. My original data collator was:
```python
class MyDataCollator:
    def __init__(self, processor):
        self.processor = processor
        # Look up the id of the "<image>" placeholder among the additional special tokens
        self.image_token_id = processor.tokenizer.additional_special_tokens_ids[
            processor.tokenizer.additional_special_tokens.index("<image>")
        ]

    def __call__(self, samples):
        texts = []
        images = []
        for sample in samples:
            image = sample["image"]
            answer = sample["caption"]
            messages = [
                {
                    "role": "system",
                    "content": [
                        {"type": "text", "text": prompt}  # `prompt` is defined elsewhere
                    ],
                },
                {
                    "role": "user",
                    "content": [
                        {"type": "image"},
                    ],
                },
                {
                    "role": "assistant",
                    "content": [
                        {"type": "text", "text": answer}
                    ],
                },
            ]
            text = self.processor.apply_chat_template(messages, add_generation_prompt=False)
            texts.append(text.strip())
            images.append([image.convert("RGB")])

        batch = self.processor(text=texts, images=images, return_tensors="pt", padding=True)

        # Copy input_ids and overwrite pad-token positions with the image token id
        labels = batch["input_ids"].clone()
        labels[labels == self.processor.tokenizer.pad_token_id] = self.image_token_id
        batch["labels"] = labels
        return batch
```
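For clarity, the last three lines of the collator replace pad-token positions in the copied `input_ids` with the image token id. A torch-free illustration of that step (the ids below are made up for the example, not taken from any real tokenizer):

```python
# Toy illustration of the label-building step in the collator above:
# pad-token positions in the copied input_ids are overwritten with
# the image token id before being used as labels.
PAD_ID, IMAGE_ID = 0, 32001  # hypothetical ids

input_ids = [128000, 32001, 17, 25, PAD_ID, PAD_ID]
labels = [IMAGE_ID if tok == PAD_ID else tok for tok in input_ids]
print(labels)  # → [128000, 32001, 17, 25, 32001, 32001]
```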
Obviously, the `image_token_id` is different for your model, but your tokenizer never finds any image token at all: it raises an error because the `re` match for the image placeholder returns nothing.
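In case it helps anyone debugging the same issue: the placeholder string varies between multimodal models, so hard-coding `"<image>"` can fail. A minimal sketch of a lookup that tries several candidate placeholders (the candidate strings here are guesses; the authoritative source is the model's own tokenizer config):

```python
def find_image_token(special_tokens, candidates=("<image>", "<|image|>", "<|image_pad|>")):
    """Return the first candidate image placeholder present in the
    tokenizer's additional special tokens, or None if absent."""
    for token in candidates:
        if token in special_tokens:
            return token
    return None

# Example with a hypothetical additional_special_tokens list:
tokens = ["<|begin_of_text|>", "<|image|>", "<|eot_id|>"]
print(find_image_token(tokens))  # → <|image|>
```

If this returns `None`, the processor likely inserts its image placeholder in some other way, and the `<image>`-based lookup in the collator needs to be replaced entirely.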
There must be something I have misunderstood.
Do you have a sample data collator with images for your model? Thank you for sharing your model with the community.
Best,
Ronan