NikshepShetty committed on
Commit 7c5d9f0 · verified · 1 Parent(s): 37bf3ba

Update README.md

  1. README.md +38 -1
README.md CHANGED
@@ -10,6 +10,27 @@ tags:
  - adapter
  - image-captioning
  - peft
+ model-index:
+ - name: Florence-2-DOCCI-FT
+   results:
+   - task:
+       type: image-to-text
+       name: Image Captioning
+     dataset:
+       name: foundation-multimodal-models/DetailCaps-4870
+       type: other
+     metrics:
+     - type: meteor
+       value: 0.250
+     - type: bleu
+       value: 0.155
+     - type: cider
+       value: 0.039
+     - type: capture
+       value: 0.555
+     - type: rouge-l
+       value: 0.298
+
  ---

  # Florence-2 PixelProse LoRA Adapter
@@ -57,4 +78,20 @@ This code demonstrates how to:
  2. Load the LoRA adapter
  3. Process an image and generate a detailed caption

- Note: Make sure you have the required libraries installed: transformers, peft, einops, flash_attn, timm, Pillow, and requests.
+ Note: Make sure you have the required libraries installed: transformers, peft, einops, flash_attn, timm, Pillow, and requests.
+
+ ## Evaluation results
+
+ Our LoRA adapter shows improvements over the base Florence-2 model across all metrics for MORE_DETAILED_CAPTION tag for 1000 images on the foundation-multimodal-models/DetailCaps-4870 dataset:
+
+
+ | Metric | Base Model | PixelProse Adapter | Improvement |
+ |---------|------------|---------------------|-------------|
+ | METEOR | 0.213 | 0.250 | +17.4% |
+ | BLEU | 0.110 | 0.155 | +40.9% |
+ | CIDEr | 0.031 | 0.039 | +25.8% |
+ | CAPTURE | 0.546 | 0.555 | +1.6% |
+ | ROUGE-L | 0.275 | 0.298 | +8.4% |
+
+
+ These results demonstrate that our LoRA adapter enhances the image captioning capabilities of the Florence-2 base model, particularly in generating more detailed and accurate captions.
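
The README does not state how the Improvement column was computed. The figures are consistent with a relative change over the base score, `(adapter - base) / base`, rounded to one decimal place. A minimal sketch under that assumption follows; the scores are copied from the table, and the helper name `relative_improvement` is hypothetical:

```python
# Scores copied from the evaluation table: metric -> (base model, PixelProse adapter).
scores = {
    "METEOR": (0.213, 0.250),
    "BLEU": (0.110, 0.155),
    "CIDEr": (0.031, 0.039),
    "CAPTURE": (0.546, 0.555),
    "ROUGE-L": (0.275, 0.298),
}


def relative_improvement(base: float, adapted: float) -> float:
    """Percentage change of `adapted` relative to `base` (assumed formula)."""
    return (adapted - base) / base * 100.0


# Round to one decimal place to match the precision used in the table.
improvements = {
    metric: round(relative_improvement(base, adapted), 1)
    for metric, (base, adapted) in scores.items()
}

for metric, pct in improvements.items():
    print(f"{metric}: {pct:+.1f}%")
```

Read this way, the percentages are relative gains, not absolute score differences: BLEU, for example, rises by 0.045 points in absolute terms, which is about 41% of its base value of 0.110.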