Update README.md
README.md
CHANGED
@@ -6,7 +6,7 @@ metrics:
 pipeline_tag: image-to-text
 ---
 
-A Persian image captioning model constructed from a ViT + RoBERTa architecture trained on flickr30k-fa.
+A Persian image captioning model constructed from a ViT + RoBERTa architecture trained on [flickr30k-fa](https://www.kaggle.com/datasets/sajjadayobi360/flickrfa) (created by Sajjad Ayoubi).
 The encoder (ViT) was initialized from https://huggingface.co/google/vit-base-patch16-224 and the decoder (RoBERTa) was initialized
 from https://huggingface.co/HooshvareLab/roberta-fa-zwnj-base .
 