SohanAnisetty
/

ofa-vqa-tiny


Sohan Anisetty committed on Mar 7, 2023

Commit

3ef4848

•

1 Parent(s): dc57090

add readme

Files changed (1)

README.md +49 -0

README.md ADDED

	@@ -0,0 +1,49 @@

+---
+license: apache-2.0
+---
+# OFA-tiny
+## Introduction
+This is the **tiny** version of the OFA pretrained model, fine-tuned on VQAv2.
+The directory includes four files: `config.json`, which contains the model configuration; `vocab.json` and `merge.txt` for the OFA tokenizer; and `pytorch_model.bin`, which contains the model weights.
+## How to use
+Clone the code repository and the model repository as shown below.
+```bash
+git clone https://github.com/sohananisetty/OFA_VQA.git
+git clone https://huggingface.co/SohanAnisetty/ofa-vqa-tiny
+```
+Next, set `ckpt_dir` to the path of the cloned `ofa-vqa-tiny` directory and `path_to_image` to an image file for the example below.
+```python
+>>> from PIL import Image
+>>> from torchvision import transforms
+>>> from transformers import OFATokenizer, OFAModelForVQA
+>>> mean, std = [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]
+>>> resolution = 256
+>>> patch_resize_transform = transforms.Compose([
+...     lambda image: image.convert("RGB"),
+...     transforms.Resize((resolution, resolution), interpolation=Image.BICUBIC),
+...     transforms.ToTensor(),
+...     transforms.Normalize(mean=mean, std=std)
+... ])
+>>> tokenizer = OFATokenizer.from_pretrained(ckpt_dir)
+>>> txt = " what does the image describe?"
+>>> inputs = tokenizer([txt], return_tensors="pt").input_ids
+>>> img = Image.open(path_to_image)
+>>> patch_img = patch_resize_transform(img).unsqueeze(0)
+>>> model = OFAModelForVQA.from_pretrained(ckpt_dir, use_cache=False)
+>>> gen = model.generate(inputs, patch_images=patch_img, num_beams=5, no_repeat_ngram_size=3)
+>>> print(tokenizer.batch_decode(gen, skip_special_tokens=True))
+```
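
Before running the example above, it may help to confirm that the cloned checkpoint directory actually contains the four files the README lists. The following is a minimal sketch, not part of the commit: the helper name `missing_checkpoint_files` is hypothetical, the file names are copied verbatim from the README text (adjust them if the repository uses different names), and the default directory `ofa-vqa-tiny` assumes the clone was made in the current working directory.

```python
from pathlib import Path

# File names exactly as the README lists them (an assumption; the actual
# repository contents may differ, e.g. in the tokenizer merges file name).
EXPECTED_FILES = ("config.json", "vocab.json", "merge.txt", "pytorch_model.bin")


def missing_checkpoint_files(ckpt_dir):
    """Return the expected file names that are not present in ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Hypothetical default: the directory created by `git clone` above.
    missing = missing_checkpoint_files("ofa-vqa-tiny")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected checkpoint files are present.")
```

If `pytorch_model.bin` is reported missing or is only a few hundred bytes, the weights were probably not fetched; Hugging Face stores large files with Git LFS, so installing `git-lfs` before cloning is usually required.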