# Visual Question Answering using BLIP pre-trained model!

This implementation fine-tunes the pre-trained BLIP model for a visual question answering (VQA) task in the icon domain.

![The BLIP model for VQA task](https://i.postimg.cc/ncnxSnJw/image.png)

| ![Example input image](https://i.postimg.cc/1zSYsrmm/image.png) |  |
|--|--|
| How many dots are there? | 36 |

# Description
**Note:** The test dataset does not have labels. I evaluated the model by submitting predictions to the Kaggle competition and obtained 96% accuracy. Alternatively, you can hold out part of the training set and use it as a test set.
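
Holding out part of the training set could look like the sketch below. It is only an illustration: the record layout (`image`, `question`, `answer` keys), the helper names `split_holdout` and `accuracy`, and the case-insensitive exact-match scoring are assumptions, not taken from `finetuning.py` or `predicting.py`.

```python
import random


def split_holdout(samples, test_fraction=0.1, seed=0):
    """Shuffle a copy of `samples` and return (train, held_out)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(samples)
    # A fixed seed keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predictions, answers):
    """Fraction of predictions equal to the reference answer,
    compared case-insensitively after stripping whitespace."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        return 0.0
    correct = sum(
        p.strip().lower() == a.strip().lower()
        for p, a in zip(predictions, answers)
    )
    return correct / len(answers)


# Hypothetical records; the real dataset layout may differ.
samples = [
    {"image": f"img_{i}.png", "question": "How many dots are there?", "answer": str(i)}
    for i in range(20)
]
train, held_out = split_holdout(samples, test_fraction=0.2)
```

After fine-tuning on `train` only, the answers predicted for `held_out` can be scored with `accuracy` to get a local estimate without a Kaggle submission.
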
## Create data folder

Copy all data into the data folder, following the structure of the example.
You can download the data [here](https://drive.google.com/file/d/1tt6qJbOgevyPpfkylXpKYy-KaT4_aCYZ/view?usp=sharing).

## Install requirements.txt

    pip install -r requirements.txt

## Run finetuning code

    python finetuning.py

## Run prediction

    python predicting.py