ashok2216 committed on
Commit
6b49fcc
1 Parent(s): f3ab247

Update README.md

  1. README.md +8 -10
README.md CHANGED
@@ -1,15 +1,12 @@
 ---
 license: apache-2.0
+
 widget:
-- src: >-
-    https://huggingface.co/datasets/mishig/sample_images/resolve/main/savanna.jpg
-  example_title: Savanna
-- src: >-
-    https://huggingface.co/datasets/mishig/sample_images/resolve/main/football-match.jpg
-  example_title: Football Match
-- src: >-
-    https://huggingface.co/datasets/mishig/sample_images/resolve/main/airport.jpg
-  example_title: Airport
+- type: image-to-text
+  example:
+    image_url: "tiger.jpg"
+    prompt: "Describe this image in one sentence."
+
 language:
 - en
 metrics:
@@ -21,9 +18,10 @@ tags:
 - image_to_text
 - COCO
 - image-captioning
+
 pipeline_tag: image-to-text
 ---
-
+
 
 # vit-gpt2-image-captioning_COCO_FineTuned
 This repository contains the fine-tuned ViT-GPT2 model for image captioning, trained on the COCO dataset. The model combines a Vision Transformer (ViT) for image feature extraction and GPT-2 for text generation to create descriptive captions from images.
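
The ViT-encoder plus GPT-2-decoder pairing described above can be sketched with the `transformers` `VisionEncoderDecoderModel` class. This is a minimal sketch, not the author's published usage. The repo ID is an assumption pieced together from the commit author and the README title, and it also assumes the repo ships an image-processor config and tokenizer files. The helper names `caption_image` and `tidy_caption` are hypothetical, and the `max_length` and `num_beams` defaults are illustrative only.

```python
# Assumed Hub repo ID (commit author + README title); not confirmed by the source.
ASSUMED_MODEL_ID = "ashok2216/vit-gpt2-image-captioning_COCO_FineTuned"


def tidy_caption(raw: str) -> str:
    """Collapse runs of whitespace and trim the decoded caption."""
    return " ".join(raw.split())


def caption_image(
    image_path: str,
    model_id: str = ASSUMED_MODEL_ID,
    max_length: int = 16,  # illustrative default, not taken from the README
    num_beams: int = 4,    # illustrative default, not taken from the README
) -> str:
    """Generate one caption for a local image file (hypothetical helper)."""
    # Heavy imports stay local so importing this module does not require them.
    from PIL import Image
    from transformers import (
        AutoTokenizer,
        VisionEncoderDecoderModel,
        ViTImageProcessor,
    )

    # ViT encodes the image; GPT-2 decodes the caption tokens.
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    processor = ViTImageProcessor.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # Beam search over the decoder, conditioned on the encoded image.
    output_ids = model.generate(
        pixel_values, max_length=max_length, num_beams=num_beams
    )
    text = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
    return tidy_caption(text)
```

Calling `caption_image("tiger.jpg")` would download the weights on first use and return a single caption string, assuming a local file of that name exists. Passing a different `model_id` swaps in any other checkpoint that loads as a `VisionEncoderDecoderModel`.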