metadata
license: mit
base_model: microsoft/git-base
tags:
  - generated_from_trainer
model-index:
  - name: git-base-floors615images
    results: []

git-base-floors615images

This model is a fine-tuned version of microsoft/git-base. The training dataset is not named (the card records it as None). The model achieves the following results on the evaluation set:

  • Loss: 0.0971
  • Wer Score: 4.3530
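A word error rate above 1 can look like a mistake, but it is possible under the usual definition: WER divides the number of word-level substitutions, deletions, and insertions by the length of the reference, so a generated caption that is much longer than its reference can accumulate more errors than the reference has words. The following is a minimal sketch of that standard definition, assuming whitespace tokenization. The function name `wer` is illustrative, and this is not necessarily the exact metric implementation used to produce the number above.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# A one-word reference against a five-word hypothesis: four insertions.
print(wer("kitchen", "kitchen with a large window"))
```

In this example the reference has one word and the hypothesis adds four, so the ratio is 4 edits over 1 reference word. Scores in the 3 to 5 range would therefore be consistent with generated captions that are several times longer than their references, although the card does not say whether that is the cause here.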

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
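The relationship between the batch-size entries can be checked with a little arithmetic. The sketch below assumes a single training device, which the card does not state, and takes the final step count (3450 at epoch 50.0) from the training results table. The example count it derives is only a rough estimate, since the last batch of an epoch may be partial.

```python
# Values copied from the hyperparameter list.
train_batch_size = 4
gradient_accumulation_steps = 2
num_epochs = 50

# Assumption: one device. The card does not report the device count.
num_devices = 1

# Gradients from several micro-batches are accumulated before each
# optimizer step, so the effective batch size is their product.
total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)

# Final optimizer step as reported in the training results table.
final_step = 3450
steps_per_epoch = final_step / num_epochs

# Rough size of the training split implied by these numbers.
approx_examples_per_epoch = steps_per_epoch * total_train_batch_size

print(total_train_batch_size, steps_per_epoch, approx_examples_per_epoch)
```

Under these assumptions the product matches the reported total_train_batch_size of 8, and the run works out to 69 optimizer steps per epoch.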

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer Score |
|---|---|---|---|---|
| 7.0263 | 0.72 | 50 | 4.1468 | 2.9871 |
| 1.9906 | 1.45 | 100 | 0.2971 | 0.4260 |
| 0.1401 | 2.17 | 150 | 0.0827 | 2.2259 |
| 0.0772 | 2.9 | 200 | 0.0704 | 3.9564 |
| 0.0567 | 3.62 | 250 | 0.0677 | 3.6360 |
| 0.0501 | 4.35 | 300 | 0.0678 | 3.2940 |
| 0.0443 | 5.07 | 350 | 0.0664 | 3.3094 |
| 0.0386 | 5.8 | 400 | 0.0666 | 3.0884 |
| 0.035 | 6.52 | 450 | 0.0681 | 3.3818 |
| 0.0329 | 7.25 | 500 | 0.0667 | 3.2474 |
| 0.0297 | 7.97 | 550 | 0.0681 | 3.6648 |
| 0.0271 | 8.7 | 600 | 0.0700 | 3.9405 |
| 0.0257 | 9.42 | 650 | 0.0697 | 3.0393 |
| 0.0243 | 10.14 | 700 | 0.0701 | 3.5973 |
| 0.0233 | 10.87 | 750 | 0.0695 | 3.0565 |
| 0.0213 | 11.59 | 800 | 0.0721 | 3.6980 |
| 0.0203 | 12.32 | 850 | 0.0730 | 3.8883 |
| 0.0206 | 13.04 | 900 | 0.0718 | 4.9104 |
| 0.0187 | 13.77 | 950 | 0.0729 | 4.4850 |
| 0.0192 | 14.49 | 1000 | 0.0743 | 4.6366 |
| 0.0184 | 15.22 | 1050 | 0.0729 | 4.9141 |
| 0.0169 | 15.94 | 1100 | 0.0754 | 4.6446 |
| 0.0167 | 16.67 | 1150 | 0.0769 | 4.8232 |
| 0.0162 | 17.39 | 1200 | 0.0766 | 4.7569 |
| 0.0165 | 18.12 | 1250 | 0.0768 | 4.3266 |
| 0.0157 | 18.84 | 1300 | 0.0776 | 4.1375 |
| 0.0144 | 19.57 | 1350 | 0.0778 | 3.9724 |
| 0.015 | 20.29 | 1400 | 0.0790 | 4.7041 |
| 0.0146 | 21.01 | 1450 | 0.0780 | 4.5003 |
| 0.0137 | 21.74 | 1500 | 0.0794 | 4.2167 |
| 0.0141 | 22.46 | 1550 | 0.0792 | 4.4856 |
| 0.0137 | 23.19 | 1600 | 0.0805 | 4.2634 |
| 0.0137 | 23.91 | 1650 | 0.0817 | 4.4162 |
| 0.0127 | 24.64 | 1700 | 0.0804 | 4.0319 |
| 0.0127 | 25.36 | 1750 | 0.0829 | 4.3628 |
| 0.013 | 26.09 | 1800 | 0.0826 | 4.4211 |
| 0.0121 | 26.81 | 1850 | 0.0823 | 4.8932 |
| 0.0119 | 27.54 | 1900 | 0.0835 | 4.6636 |
| 0.012 | 28.26 | 1950 | 0.0842 | 3.8926 |
| 0.0118 | 28.99 | 2000 | 0.0844 | 3.9994 |
| 0.011 | 29.71 | 2050 | 0.0833 | 4.1743 |
| 0.0109 | 30.43 | 2100 | 0.0864 | 4.4217 |
| 0.0108 | 31.16 | 2150 | 0.0851 | 4.8029 |
| 0.0103 | 31.88 | 2200 | 0.0855 | 4.0694 |
| 0.01 | 32.61 | 2250 | 0.0871 | 4.3198 |
| 0.0102 | 33.33 | 2300 | 0.0863 | 4.4082 |
| 0.0099 | 34.06 | 2350 | 0.0871 | 4.2112 |
| 0.0094 | 34.78 | 2400 | 0.0872 | 4.1774 |
| 0.0092 | 35.51 | 2450 | 0.0887 | 3.9742 |
| 0.009 | 36.23 | 2500 | 0.0882 | 4.1958 |
| 0.0088 | 36.96 | 2550 | 0.0893 | 4.2591 |
| 0.0084 | 37.68 | 2600 | 0.0885 | 4.2983 |
| 0.0079 | 38.41 | 2650 | 0.0894 | 4.6550 |
| 0.008 | 39.13 | 2700 | 0.0904 | 4.1277 |
| 0.0076 | 39.86 | 2750 | 0.0908 | 3.6771 |
| 0.0072 | 40.58 | 2800 | 0.0912 | 4.1252 |
| 0.0072 | 41.3 | 2850 | 0.0908 | 4.5660 |
| 0.0069 | 42.03 | 2900 | 0.0917 | 3.9441 |
| 0.0062 | 42.75 | 2950 | 0.0924 | 4.2259 |
| 0.006 | 43.48 | 3000 | 0.0924 | 4.2167 |
| 0.0059 | 44.2 | 3050 | 0.0937 | 4.6047 |
| 0.0055 | 44.93 | 3100 | 0.0945 | 4.4408 |
| 0.0048 | 45.65 | 3150 | 0.0950 | 3.9871 |
| 0.0048 | 46.38 | 3200 | 0.0952 | 4.2259 |
| 0.0046 | 47.1 | 3250 | 0.0962 | 4.2204 |
| 0.0042 | 47.83 | 3300 | 0.0963 | 4.2750 |
| 0.0037 | 48.55 | 3350 | 0.0971 | 4.1891 |
| 0.0039 | 49.28 | 3400 | 0.0970 | 4.3100 |
| 0.0036 | 50.0 | 3450 | 0.0971 | 4.3530 |
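The results above show validation loss reaching its minimum early (0.0664 at step 350) and then drifting up to 0.0971 by step 3450 while training loss keeps falling, a pattern consistent with overfitting. The sketch below shows one way to pick a checkpoint from such a log. It uses only a handful of rows copied from the table, and the names `EvalRow` and `rows` are illustrative rather than part of any library.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    validation_loss: float
    wer: float


# A subset of rows copied from the training results table.
rows = [
    EvalRow(50, 4.1468, 2.9871),
    EvalRow(100, 0.2971, 0.4260),
    EvalRow(150, 0.0827, 2.2259),
    EvalRow(350, 0.0664, 3.3094),
    EvalRow(1000, 0.0743, 4.6366),
    EvalRow(2000, 0.0844, 3.9994),
    EvalRow(3450, 0.0971, 4.3530),
]

# The two metrics disagree about which checkpoint is "best",
# so the selection criterion has to be chosen deliberately.
best_by_loss = min(rows, key=lambda row: row.validation_loss)
best_by_wer = min(rows, key=lambda row: row.wer)

print("lowest validation loss:", best_by_loss)
print("lowest WER:", best_by_wer)
```

With the Hugging Face Trainer, a similar selection can be made automatically through the `load_best_model_at_end` and `metric_for_best_model` training arguments. Whether that was done for this run is not stated, and the reported evaluation results match the final step rather than the lowest-loss step.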

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.0.1
  • Datasets 2.18.0
  • Tokenizers 0.15.2
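
To reproduce the run, it can help to confirm that the local environment matches the versions listed above. The following is a minimal sketch using only the standard library. The helper name `find_mismatches` is hypothetical, and the package names assume the usual PyPI distributions (`torch` for Pytorch).

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above, keyed by assumed PyPI package name.
PINNED = {
    "transformers": "4.38.2",
    "torch": "2.0.1",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def find_mismatches(pinned, lookup=version):
    """Return {package: (expected, installed)} for packages that differ.

    `installed` is None when the package is not installed.
    """
    mismatches = {}
    for package, expected in pinned.items():
        try:
            installed = lookup(package)
        except PackageNotFoundError:
            installed = None
        # Ignore local build tags such as "+cu118" when comparing.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[package] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {installed}")
```

The `lookup` parameter exists only so the check can be exercised without installing anything. In normal use the default, `importlib.metadata.version`, is sufficient.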