---
base_model: google/paligemma-3b-mix-448
library_name: peft
license: gemma
tags:
  - generated_from_trainer
model-index:
  - name: results__fullrun__2110-104610
    results: []
---

# results__fullrun__2110-104610

This model is a fine-tuned version of [google/paligemma-3b-mix-448](https://huggingface.co/google/paligemma-3b-mix-448) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 3.2421
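
Since this is a PEFT adapter for `google/paligemma-3b-mix-448` rather than a full checkpoint, loading it would look roughly like the sketch below. The adapter id `adishourya/medpix_pg` is assumed from the repository name and is not stated in the card itself:

```python
import torch
from peft import PeftModel
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

base_id = "google/paligemma-3b-mix-448"
adapter_id = "adishourya/medpix_pg"  # assumed from the repo name; verify before use

# Load the frozen base model, then attach the fine-tuned PEFT adapter on top.
model = PaliGemmaForConditionalGeneration.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)
processor = AutoProcessor.from_pretrained(base_id)
```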

## Model description

More information needed

## Intended uses & limitations

More information needed
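
No intended uses or limitations are documented. Purely as an illustration of how the adapter would be run, a standard PaliGemma generation call is sketched below; the image path and the `caption en` prompt are placeholders, and the adapter id is again assumed from the repository name:

```python
import torch
from PIL import Image
from peft import PeftModel
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

base_id = "google/paligemma-3b-mix-448"
model = PaliGemmaForConditionalGeneration.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, "adishourya/medpix_pg")  # assumed adapter id
processor = AutoProcessor.from_pretrained(base_id)

image = Image.open("example.png").convert("RGB")  # placeholder image
prompt = "caption en"  # mix checkpoints expect a task-prefix prompt

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
with torch.inference_mode():
    generated = model.generate(**inputs, max_new_tokens=64, do_sample=False)

# The output sequence includes the prompt tokens; strip them before decoding.
input_len = inputs["input_ids"].shape[-1]
print(processor.decode(generated[0][input_len:], skip_special_tokens=True))
```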

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged `TrainingArguments` reconstruction follows the list):

- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_steps: 2
- num_epochs: 20
- mixed_precision_training: Native AMP
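
These values correspond roughly to the `transformers` configuration sketched below. This is a hedged reconstruction, not the actual training script; the output directory is taken from the run name, and the optimizer and precision flags reflect the defaults implied by the list above:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="results__fullrun__2110-104610",
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,  # 8 x 8 = total train batch size of 64
    num_train_epochs=20,
    lr_scheduler_type="constant",
    warmup_steps=2,
    seed=42,
    fp16=True,  # "Native AMP" mixed precision; bf16 is an alternative on newer GPUs
)
```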

### Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 2.6855        | 0.9952  | 180  | 2.6276          |
| 2.4044        | 1.9959  | 361  | 2.4716          |
| 2.2275        | 2.9965  | 542  | 2.4070          |
| 2.092         | 3.9972  | 723  | 2.3871          |
| 1.9761        | 4.9979  | 904  | 2.3929          |
| 1.8674        | 5.9986  | 1085 | 2.4194          |
| 1.7501        | 6.9993  | 1266 | 2.4726          |
| 1.6706        | 8.0     | 1447 | 2.5062          |
| 1.5599        | 8.9952  | 1627 | 2.5492          |
| 1.4896        | 9.9959  | 1808 | 2.6080          |
| 1.4289        | 10.9965 | 1989 | 2.6687          |
| 1.3458        | 11.9972 | 2170 | 2.7300          |
| 1.2746        | 12.9979 | 2351 | 2.7933          |
| 1.2656        | 13.9986 | 2532 | 2.8295          |
| 1.1751        | 14.9993 | 2713 | 2.9203          |
| 1.1792        | 16.0    | 2894 | 2.9811          |
| 1.0851        | 16.9952 | 3074 | 3.0481          |
| 1.0966        | 17.9959 | 3255 | 3.0981          |
| 1.0581        | 18.9965 | 3436 | 3.1394          |
| 1.0055        | 19.9032 | 3600 | 3.2421          |

Note that validation loss reaches its minimum (2.3871, around epoch 4) and climbs steadily afterwards while training loss keeps falling, which suggests the later checkpoints, including the final one whose loss is reported above, overfit the training data.

### Framework versions

- PEFT 0.13.0
- Transformers 4.45.1
- Pytorch 2.3.0.post101
- Datasets 2.19.1
- Tokenizers 0.20.0
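
To approximate this environment, the listed versions can be pinned with, e.g., `pip install peft==0.13.0 transformers==4.45.1 datasets==2.19.1 tokenizers==0.20.0`; `2.3.0.post101` appears to be a platform-specific PyTorch build, so a plain `torch==2.3.0` is the closest PyPI substitute.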