metadata
license: mit
base_model: microsoft/git-base
tags:
  - generated_from_trainer
datasets:
  - imagefolder
model-index:
  - name: git-base-captioning
    results: []

git-base-captioning

This model is a fine-tuned version of microsoft/git-base on the imagefolder dataset. At the final logged checkpoint (step 1150), it achieves the following results on the evaluation set:

  • Loss: 0.3817
  • WER (word error rate) score: 2.8621
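A WER above 1 can look surprising. The metric divides the number of word-level substitutions, deletions, and insertions by the number of reference words, so a generated caption much longer than its reference can score above 1. The following is a minimal pure-Python sketch of that standard definition; it is an illustration, not necessarily the exact metric implementation used during training, and the example captions are made up.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical captions, not taken from the evaluation set.
print(word_error_rate("a dog on grass", "a dog on grass"))  # 0.0
print(word_error_rate("a dog", "a dog a dog a dog a dog"))  # 3.0
```

In the second call the hypothesis repeats the two-word reference four times, so six inserted words against two reference words give a WER of 3.0.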

Model description

More information needed

Intended uses & limitations

More information needed
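Because the checkpoint is fine-tuned from microsoft/git-base, it can presumably be used for image captioning the same way as the base model. The sketch below assumes the weights are published under the repo id `Guspfc/git-base-captioning` (an assumption, adjust as needed) and loads the processor from the base model in case the fine-tuned repo does not ship one. The helper name `caption_image` and its defaults are hypothetical.

```python
def caption_image(
    image_path: str,
    model_id: str = "Guspfc/git-base-captioning",  # assumed repo id
    processor_id: str = "microsoft/git-base",      # base model named in this card
    max_length: int = 50,
) -> str:
    """Generate one caption for the image at image_path."""
    # Imported lazily so that defining the helper does not need the heavy dependencies.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    processor = AutoProcessor.from_pretrained(processor_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    generated_ids = model.generate(pixel_values=pixel_values, max_length=max_length)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]


# Example call (requires network access and a local image file):
# print(caption_image("photo.jpg"))
```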

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
  • mixed_precision_training: Native AMP
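The hyperparameters above map fairly directly onto keyword names of Hugging Face `TrainingArguments`. The sketch below records them as a plain dictionary, so it runs without any dependencies, and checks the reported total batch size. The single-device setting and the zero-warmup linear schedule are assumptions, since the card does not state the device count or any warmup.

```python
# Keyword names follow transformers.TrainingArguments; values come from the list above.
training_kwargs = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
    "fp16": True,  # the Trainer reports fp16 mixed precision as "Native AMP"
}

NUM_DEVICES = 1  # assumption: training ran on a single device

total_train_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * NUM_DEVICES
)
print(total_train_batch_size)  # 8


def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05) -> float:
    """Linear decay from base_lr to 0, assuming no warmup steps."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)
```

With one device, 4 samples per step times 2 accumulation steps reproduces the reported total_train_batch_size of 8.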

Training results

| Training Loss | Epoch  | Step | Validation Loss | WER Score |
|:-------------:|:------:|:----:|:---------------:|:---------:|
| 7.3416        | 0.4202 | 50   | 4.5198          | 4.7633    |
| 2.4704        | 0.8403 | 100  | 0.7015          | 0.8610    |
| 0.4735        | 1.2605 | 150  | 0.3923          | 0.8164    |
| 0.3669        | 1.6807 | 200  | 0.3762          | 0.8198    |
| 0.3075        | 2.1008 | 250  | 0.3680          | 0.8062    |
| 0.2837        | 2.5210 | 300  | 0.3683          | 0.8090    |
| 0.274         | 2.9412 | 350  | 0.3640          | 0.8401    |
| 0.2393        | 3.3613 | 400  | 0.3692          | 2.8282    |
| 0.2498        | 3.7815 | 450  | 0.3655          | 2.0712    |
| 0.2198        | 4.2017 | 500  | 0.3698          | 3.2164    |
| 0.2034        | 4.6218 | 550  | 0.3688          | 2.5853    |
| 0.1925        | 5.0420 | 600  | 0.3698          | 2.9119    |
| 0.1779        | 5.4622 | 650  | 0.3729          | 3.1333    |
| 0.1734        | 5.8824 | 700  | 0.3727          | 1.7605    |
| 0.1696        | 6.3025 | 750  | 0.3749          | 3.5226    |
| 0.15          | 6.7227 | 800  | 0.3773          | 2.8932    |
| 0.1595        | 7.1429 | 850  | 0.3762          | 2.7842    |
| 0.1507        | 7.5630 | 900  | 0.3803          | 1.0266    |
| 0.135         | 7.9832 | 950  | 0.3802          | 3.6090    |
| 0.1385        | 8.4034 | 1000 | 0.3801          | 3.3169    |
| 0.1311        | 8.8235 | 1050 | 0.3800          | 3.3966    |
| 0.1398        | 9.2437 | 1100 | 0.3815          | 2.1915    |
| 0.1293        | 9.6639 | 1150 | 0.3817          | 2.8621    |
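The evaluation log can be queried to see where each metric was best and how many optimizer steps make up one epoch. The sketch below copies the Step, Epoch, Validation Loss, and WER columns from the training results; the variable names are illustrative only.

```python
# (step, epoch, validation_loss, wer) copied from the training results.
EVAL_LOG = [
    (50, 0.4202, 4.5198, 4.7633),
    (100, 0.8403, 0.7015, 0.8610),
    (150, 1.2605, 0.3923, 0.8164),
    (200, 1.6807, 0.3762, 0.8198),
    (250, 2.1008, 0.3680, 0.8062),
    (300, 2.5210, 0.3683, 0.8090),
    (350, 2.9412, 0.3640, 0.8401),
    (400, 3.3613, 0.3692, 2.8282),
    (450, 3.7815, 0.3655, 2.0712),
    (500, 4.2017, 0.3698, 3.2164),
    (550, 4.6218, 0.3688, 2.5853),
    (600, 5.0420, 0.3698, 2.9119),
    (650, 5.4622, 0.3729, 3.1333),
    (700, 5.8824, 0.3727, 1.7605),
    (750, 6.3025, 0.3749, 3.5226),
    (800, 6.7227, 0.3773, 2.8932),
    (850, 7.1429, 0.3762, 2.7842),
    (900, 7.5630, 0.3803, 1.0266),
    (950, 7.9832, 0.3802, 3.6090),
    (1000, 8.4034, 0.3801, 3.3169),
    (1050, 8.8235, 0.3800, 3.3966),
    (1100, 9.2437, 0.3815, 2.1915),
    (1150, 9.6639, 0.3817, 2.8621),
]

# Checkpoints with the lowest validation loss and the lowest WER.
best_loss = min(EVAL_LOG, key=lambda row: row[2])
best_wer = min(EVAL_LOG, key=lambda row: row[3])
print(best_loss)  # (350, 2.9412, 0.364, 0.8401)
print(best_wer)   # (250, 2.1008, 0.368, 0.8062)

# Optimizer steps per epoch implied by the Step and Epoch columns.
steps_per_epoch = round(EVAL_LOG[0][0] / EVAL_LOG[0][1])
print(steps_per_epoch)  # 119
```

In this run the validation loss is lowest at step 350 and the WER at step 250, while the training loss keeps falling afterwards, which may point to overfitting after roughly three epochs. The final checkpoint at step 1150 is not the best one by either metric.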

Framework versions

  • Transformers 4.41.2
  • PyTorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
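To reproduce the run, it can help to check the local environment against these versions. The following is a small sketch using only the standard library; the PyPI distribution names (for example `torch` for PyTorch) and the helper name `version_mismatches` are assumptions made for illustration.

```python
from importlib import metadata

# Versions listed above, keyed by the assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.41.2",
    "torch": "2.3.0+cu121",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def version_mismatches(expected: dict) -> dict:
    """Return {package: installed_version_or_None} for every package that differs."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = installed
    return mismatches


print(version_mismatches(EXPECTED))
```

An empty dictionary means the environment matches the versions listed in this card.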