---
license: apache-2.0
base_model: nlpconnect/vit-gpt2-image-captioning
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: image-captioning-Vit-GPT2-Flickr8k
  results: []
---
# image-captioning-Vit-GPT2-Flickr8k

This model is a fine-tuned version of [nlpconnect/vit-gpt2-image-captioning](https://huggingface.co/nlpconnect/vit-gpt2-image-captioning) on an unknown dataset (the model name suggests Flickr8k). It achieves the following results on the evaluation set:
- Loss: 0.4624
- Rouge1: 38.4598
- Rouge2: 14.1356
- RougeL: 35.4001
- RougeLsum: 35.4044
- Gen Len: 12.1355
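The ROUGE scores above were presumably computed with the standard `rouge` metric (e.g. via the `evaluate` library). As a reminder of what they measure, here is a minimal, dependency-free sketch of ROUGE-1 F1 — unigram overlap between a predicted and a reference caption. It is an illustration only and omits details of the real implementation such as stemming:

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    # Clipped overlap: each unigram counts at most as often as it appears in both.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

# Identical captions score 1.0; sharing 2 of 3 words gives F1 = 2/3.
print(rouge1_f1("a dog runs", "a dog sleeps"))
```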
## Model description
More information needed
## Intended uses & limitations
More information needed
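The model can be used like its base checkpoint, as a `VisionEncoderDecoderModel` (ViT encoder, GPT-2 decoder). A hedged usage sketch — the repo id and generation parameters (`max_length`, `num_beams`) are illustrative assumptions, not values recorded in this card:

```python
from PIL import Image
from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

# Assumption: replace with the actual repo id hosting this checkpoint.
model_id = "image-captioning-Vit-GPT2-Flickr8k"
model = VisionEncoderDecoderModel.from_pretrained(model_id)
processor = ViTImageProcessor.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

def caption(image_path: str) -> str:
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    # max_length=16 leaves headroom over the ~12-token average generation length.
    output_ids = model.generate(pixel_values, max_length=16, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(caption("example.jpg"))
```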
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
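The hyperparameters above map onto `Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the exact training script: `output_dir` and the 500-step evaluation cadence (inferred from the results table below) are assumptions, and Adam with the listed betas/epsilon is simply the library default:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="image-captioning-Vit-GPT2-Flickr8k",  # assumption
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
    evaluation_strategy="steps",  # assumption: matches the 500-step rows in the table
    eval_steps=500,
    predict_with_generate=True,   # needed to compute ROUGE on generated captions
)
```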
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|:-------------|:------|:-----|:----------------|:-------|:-------|:-------|:----------|:--------|
0.5495 | 0.06 | 500 | 0.4942 | 35.0813 | 11.7169 | 32.4184 | 32.4273 | 11.5738 |
0.4945 | 0.12 | 1000 | 0.4903 | 35.4868 | 12.037 | 32.835 | 32.8388 | 11.8682 |
0.4984 | 0.19 | 1500 | 0.4862 | 35.3878 | 11.996 | 32.8196 | 32.8268 | 12.0544 |
0.4783 | 0.25 | 2000 | 0.4808 | 36.1063 | 12.3478 | 33.4632 | 33.4783 | 11.3468 |
0.4736 | 0.31 | 2500 | 0.4772 | 35.9266 | 12.3362 | 33.5046 | 33.5103 | 11.1066 |
0.4685 | 0.37 | 3000 | 0.4708 | 36.9089 | 13.0915 | 34.2896 | 34.2995 | 11.4739 |
0.4687 | 0.43 | 3500 | 0.4704 | 36.1844 | 12.5731 | 33.4609 | 33.4733 | 11.9201 |
0.4709 | 0.49 | 4000 | 0.4696 | 36.1774 | 12.8262 | 33.3824 | 33.3814 | 12.1733 |
0.4575 | 0.56 | 4500 | 0.4675 | 37.4417 | 13.7581 | 34.5386 | 34.5523 | 12.6302 |
0.4484 | 0.62 | 5000 | 0.4662 | 36.6864 | 13.0727 | 33.9056 | 33.9339 | 12.6007 |
0.4507 | 0.68 | 5500 | 0.4656 | 36.5144 | 12.7924 | 34.0484 | 34.0759 | 11.4316 |
0.4445 | 0.74 | 6000 | 0.4628 | 37.0553 | 13.3404 | 34.4096 | 34.4153 | 12.3211 |
0.4557 | 0.8 | 6500 | 0.4594 | 37.3241 | 13.1468 | 34.45 | 34.4658 | 12.2522 |
0.4451 | 0.87 | 7000 | 0.4600 | 37.33 | 13.5726 | 34.6534 | 34.6635 | 12.0494 |
0.4381 | 0.93 | 7500 | 0.4588 | 37.6255 | 13.8048 | 34.817 | 34.8252 | 12.1347 |
0.4357 | 0.99 | 8000 | 0.4571 | 37.2088 | 13.4177 | 34.3316 | 34.3372 | 12.2670 |
0.3869 | 1.05 | 8500 | 0.4612 | 37.7054 | 13.683 | 34.9637 | 34.9821 | 11.3216 |
0.377 | 1.11 | 9000 | 0.4616 | 37.2701 | 13.2182 | 34.3249 | 34.3396 | 12.3221 |
0.3736 | 1.17 | 9500 | 0.4607 | 37.2101 | 13.1285 | 34.3812 | 34.3767 | 11.8274 |
0.3801 | 1.24 | 10000 | 0.4617 | 37.9963 | 13.7537 | 35.2402 | 35.2374 | 11.6079 |
0.3816 | 1.3 | 10500 | 0.4599 | 37.3247 | 13.619 | 34.6494 | 34.6538 | 12.2101 |
0.377 | 1.36 | 11000 | 0.4619 | 37.2827 | 13.4471 | 34.3588 | 34.3861 | 12.3911 |
0.3745 | 1.42 | 11500 | 0.4604 | 37.5469 | 13.3948 | 34.5403 | 34.5613 | 12.2747 |
0.3785 | 1.48 | 12000 | 0.4568 | 38.085 | 14.0087 | 35.0549 | 35.0564 | 12.3179 |
0.3675 | 1.54 | 12500 | 0.4587 | 37.6241 | 13.8529 | 34.7614 | 34.7853 | 11.8732 |
0.3731 | 1.61 | 13000 | 0.4554 | 38.4418 | 14.1464 | 35.6658 | 35.6502 | 11.4294 |
0.3731 | 1.67 | 13500 | 0.4548 | 37.9045 | 13.7524 | 34.9001 | 34.9092 | 12.1241 |
0.371 | 1.73 | 14000 | 0.4542 | 38.412 | 14.212 | 35.473 | 35.4781 | 12.1014 |
0.3615 | 1.79 | 14500 | 0.4551 | 38.0734 | 14.1066 | 35.1289 | 35.1552 | 12.1135 |
0.3687 | 1.85 | 15000 | 0.4550 | 38.1762 | 14.1402 | 35.288 | 35.2936 | 12.2255 |
0.3711 | 1.92 | 15500 | 0.4532 | 37.6439 | 13.611 | 34.7558 | 34.7601 | 12.1632 |
0.3685 | 1.98 | 16000 | 0.4515 | 38.5682 | 14.5305 | 35.552 | 35.5703 | 11.9162 |
0.3333 | 2.04 | 16500 | 0.4626 | 38.4527 | 14.4649 | 35.6252 | 35.6307 | 11.9506 |
0.3129 | 2.1 | 17000 | 0.4660 | 38.203 | 14.0699 | 35.1626 | 35.1595 | 12.3313 |
0.3155 | 2.16 | 17500 | 0.4674 | 37.8903 | 13.9159 | 34.9097 | 34.9101 | 12.4853 |
0.3134 | 2.22 | 18000 | 0.4644 | 38.1489 | 13.9448 | 35.0351 | 35.0351 | 11.9748 |
0.3167 | 2.29 | 18500 | 0.4653 | 37.8449 | 13.9106 | 34.7773 | 34.7854 | 12.5273 |
0.322 | 2.35 | 19000 | 0.4673 | 37.9832 | 14.0115 | 34.8505 | 34.8597 | 12.4680 |
0.312 | 2.41 | 19500 | 0.4641 | 38.4627 | 14.2528 | 35.4297 | 35.4377 | 11.9315 |
0.3173 | 2.47 | 20000 | 0.4654 | 38.1591 | 13.9126 | 35.1114 | 35.1042 | 12.4845 |
0.3081 | 2.53 | 20500 | 0.4640 | 38.6969 | 14.3244 | 35.6933 | 35.692 | 11.8932 |
0.3093 | 2.6 | 21000 | 0.4633 | 38.2944 | 14.103 | 35.2407 | 35.2629 | 11.8932 |
0.3154 | 2.66 | 21500 | 0.4637 | 38.0668 | 13.7427 | 35.0547 | 35.0585 | 12.1310 |
0.3096 | 2.72 | 22000 | 0.4630 | 38.3647 | 14.0445 | 35.2568 | 35.2511 | 12.2591 |
0.3101 | 2.78 | 22500 | 0.4627 | 38.6366 | 14.3013 | 35.4955 | 35.4956 | 12.2836 |
0.309 | 2.84 | 23000 | 0.4620 | 38.3486 | 14.0403 | 35.3173 | 35.3265 | 12.3281 |
0.312 | 2.9 | 23500 | 0.4623 | 38.423 | 14.0759 | 35.3766 | 35.3853 | 12.2208 |
0.3135 | 2.97 | 24000 | 0.4624 | 38.4598 | 14.1356 | 35.4001 | 35.4044 | 12.1355 |
### Framework versions
- Transformers 4.39.3
- Pytorch 2.1.2
- Datasets 2.18.0
- Tokenizers 0.15.2