git-base-isg-288

This model is a fine-tuned version of microsoft/git-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0937
  • Wer Score: 2.7076
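
Below is a minimal inference sketch, assuming the checkpoint is used for image captioning in the same way as the base microsoft/git-base model; the image path and generation settings are placeholders, not the author's exact setup.

```python
# Minimal usage sketch (assumption: the model is an image-captioning GIT
# checkpoint loaded the same way as microsoft/git-base).
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForCausalLM

checkpoint = "ssalvo41/git-base-isg-288"
processor = AutoProcessor.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

image = Image.open("example.jpg")  # hypothetical input image
pixel_values = processor(images=image, return_tensors="pt").pixel_values

with torch.no_grad():
    generated_ids = model.generate(pixel_values=pixel_values, max_length=50)

caption = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(caption)
```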

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
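
The sketch below shows how these settings map onto transformers TrainingArguments; the output directory is an assumption, and any trainer wiring beyond these arguments is omitted.

```python
# Sketch of the listed hyperparameters expressed as transformers
# TrainingArguments; output_dir is an assumption, the other values mirror
# the list above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="git-base-isg-288",   # assumed output directory
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,   # effective train batch size: 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
)
```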

Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer Score |
|:-------------:|:-------:|:----:|:---------------:|:---------:|
| 13.4828       | 5.5882  | 50   | 4.2850          | 16.7473   |
| 3.9992        | 11.1176 | 100  | 0.3655          | 0.7942    |
| 0.2368        | 16.7059 | 150  | 0.0692          | 0.6679    |
| 0.0533        | 22.2353 | 200  | 0.0733          | 0.7004    |
| 0.0339        | 27.8235 | 250  | 0.0765          | 0.8520    |
| 0.0249        | 33.3529 | 300  | 0.0795          | 1.8592    |
| 0.0165        | 38.9412 | 350  | 0.0821          | 2.3827    |
| 0.0074        | 44.4706 | 400  | 0.0861          | 2.0542    |
| 0.0034        | 50.0    | 450  | 0.0885          | 3.0361    |
| 0.0023        | 55.5882 | 500  | 0.0909          | 2.4946    |
| 0.0018        | 61.1176 | 550  | 0.0920          | 2.6426    |
| 0.0016        | 66.7059 | 600  | 0.0930          | 2.6354    |
| 0.0015        | 72.2353 | 650  | 0.0930          | 2.2527    |
| 0.0013        | 77.8235 | 700  | 0.0935          | 2.6859    |
| 0.0013        | 83.3529 | 750  | 0.0937          | 2.7726    |
| 0.0012        | 88.9412 | 800  | 0.0937          | 2.7076    |
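
The Wer Score column above is presumably a word-error-rate metric computed over decoded captions; below is a hedged sketch of such a computation with the evaluate library. The decoding and -100 label masking are assumptions, not the author's exact evaluation code.

```python
# Hedged sketch of a WER-style metric over decoded captions using the
# `evaluate` library; the argmax decoding and -100 label masking are
# assumptions about how the reported Wer Score was obtained.
import numpy as np
import evaluate

wer_metric = evaluate.load("wer")

def compute_metrics(eval_pred, processor):
    logits, labels = eval_pred
    predicted_ids = np.argmax(logits, axis=-1)
    decoded_preds = processor.batch_decode(predicted_ids, skip_special_tokens=True)
    # Replace ignored label positions before decoding the references.
    labels = np.where(labels == -100, processor.tokenizer.pad_token_id, labels)
    decoded_labels = processor.batch_decode(labels, skip_special_tokens=True)
    wer = wer_metric.compute(predictions=decoded_preds, references=decoded_labels)
    return {"wer_score": wer}
```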

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0