Update README.md
README.md
CHANGED
@@ -1,15 +1,12 @@
 ---
 license: apache-2.0
+
 widget:
--
-
-
-
-
-  example_title: Football Match
-- src: >-
-    https://huggingface.co/datasets/mishig/sample_images/resolve/main/airport.jpg
-  example_title: Airport
+- type: image-to-text
+  example:
+    image_url: "tiger.jpg"
+    prompt: "Describe this image in one sentence."
+
 language:
 - en
 metrics:
@@ -21,9 +18,10 @@ tags:
 - image_to_text
 - COCO
 - image-captioning
+
 pipeline_tag: image-to-text
 ---
-
+
 
 # vit-gpt2-image-captioning_COCO_FineTuned
 This repository contains the fine-tuned ViT-GPT2 model for image captioning, trained on the COCO dataset. The model combines a Vision Transformer (ViT) for image feature extraction and GPT-2 for text generation to create descriptive captions from images.