results__fullrun__0710-151659

This model is a PEFT adapter fine-tuned from google/paligemma-3b-mix-224 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 4.9365

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 50
  • mixed_precision_training: Native AMP
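The per-device batch size and the gradient accumulation steps together determine the total_train_batch_size reported above. A minimal sketch of that relationship, using the values from the list:

```python
# Hyperparameters from the list above.
train_batch_size = 8             # per-device micro-batch size
gradient_accumulation_steps = 8  # micro-batches accumulated per optimizer step

# Each optimizer step accumulates gradients over 8 micro-batches of 8
# samples, so the effective (total) train batch size is their product.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 64
```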

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 2.6543        | 0.9952  | 180  | 2.5976          |
| 2.3815        | 1.9959  | 361  | 2.4431          |
| 2.202         | 2.9965  | 542  | 2.3760          |
| 2.0724        | 3.9972  | 723  | 2.3569          |
| 1.9581        | 4.9979  | 904  | 2.3635          |
| 1.8454        | 5.9986  | 1085 | 2.3840          |
| 1.7332        | 6.9993  | 1266 | 2.4254          |
| 1.6564        | 8.0     | 1447 | 2.4647          |
| 1.5437        | 8.9952  | 1627 | 2.5172          |
| 1.4799        | 9.9959  | 1808 | 2.5699          |
| 1.4155        | 10.9965 | 1989 | 2.6371          |
| 1.3326        | 11.9972 | 2170 | 2.7292          |
| 1.2682        | 12.9979 | 2351 | 2.7818          |
| 1.2501        | 13.9986 | 2532 | 2.8409          |
| 1.1692        | 14.9993 | 2713 | 2.8931          |
| 1.166         | 16.0    | 2894 | 2.9522          |
| 1.0736        | 16.9952 | 3074 | 3.0303          |
| 1.0849        | 17.9959 | 3255 | 3.0626          |
| 1.0453        | 18.9965 | 3436 | 3.1208          |
| 0.9778        | 19.9972 | 3617 | 3.1514          |
| 0.9626        | 20.9979 | 3798 | 3.2182          |
| 0.9285        | 21.9986 | 3979 | 3.2926          |
| 0.9047        | 22.9993 | 4160 | 3.3494          |
| 0.8471        | 24.0    | 4341 | 3.3960          |
| 0.8123        | 24.9952 | 4521 | 3.4674          |
| 0.7798        | 25.9959 | 4702 | 3.5216          |
| 0.762         | 26.9965 | 4883 | 3.6214          |
| 0.7284        | 27.9972 | 5064 | 3.6831          |
| 0.6922        | 28.9979 | 5245 | 3.6883          |
| 0.6732        | 29.9986 | 5426 | 3.7731          |
| 0.6575        | 30.9993 | 5607 | 3.8512          |
| 0.608         | 32.0    | 5788 | 3.9272          |
| 0.5817        | 32.9952 | 5968 | 3.9129          |
| 0.5567        | 33.9959 | 6149 | 4.0213          |
| 0.5448        | 34.9965 | 6330 | 4.0896          |
| 0.5159        | 35.9972 | 6511 | 4.1160          |
| 0.4992        | 36.9979 | 6692 | 4.1914          |
| 0.4644        | 37.9986 | 6873 | 4.2932          |
| 0.4412        | 38.9993 | 7054 | 4.3430          |
| 0.4619        | 40.0    | 7235 | 4.4002          |
| 0.4155        | 40.9952 | 7415 | 4.3920          |
| 0.3948        | 41.9959 | 7596 | 4.5190          |
| 0.3714        | 42.9965 | 7777 | 4.5420          |
| 0.3536        | 43.9972 | 7958 | 4.6237          |
| 0.3402        | 44.9979 | 8139 | 4.6591          |
| 0.3336        | 45.9986 | 8320 | 4.7309          |
| 0.3026        | 46.9993 | 8501 | 4.7661          |
| 0.2914        | 48.0    | 8682 | 4.8383          |
| 0.2711        | 48.9952 | 8862 | 4.8839          |
| 0.2498        | 49.7581 | 9000 | 4.9365          |
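Note that training loss falls monotonically while validation loss bottoms out at 2.3569 near epoch 4 and climbs steadily afterward, i.e. the run overfits. A minimal sketch of early-stopping-style checkpoint selection over this history (a few rows hand-copied from the table, epochs rounded to the nearest integer):

```python
# (epoch, validation loss) pairs copied from a subset of the table above.
history = [
    (1, 2.5976), (2, 2.4431), (3, 2.3760), (4, 2.3569),
    (5, 2.3635), (10, 2.5699), (25, 3.4674), (50, 4.9365),
]

# Select the checkpoint with the lowest validation loss.
best_epoch, best_loss = min(history, key=lambda pair: pair[1])
print(best_epoch, best_loss)  # 4 2.3569
```

Under this criterion the epoch-4 checkpoint, not the final one, would be kept.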

Framework versions

  • PEFT 0.13.0
  • Transformers 4.45.1
  • Pytorch 2.3.0.post101
  • Datasets 2.19.1
  • Tokenizers 0.19.1