---
base_model: google/paligemma-3b-mix-224
library_name: peft
license: gemma
tags:
  - generated_from_trainer
model-index:
  - name: results__fullrun__0710-151659
    results: []
---

# results__fullrun__0710-151659

This model is a fine-tuned version of [google/paligemma-3b-mix-224](https://huggingface.co/google/paligemma-3b-mix-224) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 4.9365
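
The card does not include a usage snippet, so here is a minimal loading sketch. It assumes this repository (`adishourya/medpix_pg`) hosts the PEFT adapter for the base model above; the image path and prompt are placeholders, not values from the training run.

```python
import torch
from PIL import Image
from peft import PeftModel
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

BASE_ID = "google/paligemma-3b-mix-224"
ADAPTER_ID = "adishourya/medpix_pg"  # assumption: this repo holds the adapter weights

# Load the frozen base model, then attach the fine-tuned adapter on top of it.
processor = AutoProcessor.from_pretrained(BASE_ID)
model = PaliGemmaForConditionalGeneration.from_pretrained(BASE_ID)
model = PeftModel.from_pretrained(model, ADAPTER_ID)
model.eval()

image = Image.open("example.png").convert("RGB")  # placeholder image
prompt = "describe this image"                    # placeholder prompt
inputs = processor(text=prompt, images=image, return_tensors="pt")

with torch.inference_mode():
    generated = model.generate(**inputs, max_new_tokens=64)

# Decode only the newly generated tokens, skipping the echoed prompt.
new_tokens = generated[0][inputs["input_ids"].shape[-1]:]
print(processor.decode(new_tokens, skip_special_tokens=True))
```

On a GPU, pass `torch_dtype=torch.bfloat16` and `device_map="auto"` to `from_pretrained` and move `inputs` to the same device before generating.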

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged `TrainingArguments` reconstruction follows the list):

- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_steps: 2
- num_epochs: 50
- mixed_precision_training: Native AMP
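
These settings map directly onto Hugging Face `TrainingArguments`; the sketch below is a hedged reconstruction, not the original training script. The `output_dir` is guessed from the run name, and the Adam betas/epsilon listed above are the `TrainingArguments` defaults, so they are left implicit.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="results__fullrun__0710-151659",  # assumption: named after the run
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,  # 8 per device x 8 accumulation steps = 64 total
    lr_scheduler_type="constant",
    warmup_steps=2,
    num_train_epochs=50,
    fp16=True,  # "Native AMP"; bf16=True is the usual alternative on recent GPUs
    # adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8 are already the defaults.
)
```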

### Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 2.6543        | 0.9952  | 180  | 2.5976          |
| 2.3815        | 1.9959  | 361  | 2.4431          |
| 2.202         | 2.9965  | 542  | 2.3760          |
| 2.0724        | 3.9972  | 723  | 2.3569          |
| 1.9581        | 4.9979  | 904  | 2.3635          |
| 1.8454        | 5.9986  | 1085 | 2.3840          |
| 1.7332        | 6.9993  | 1266 | 2.4254          |
| 1.6564        | 8.0     | 1447 | 2.4647          |
| 1.5437        | 8.9952  | 1627 | 2.5172          |
| 1.4799        | 9.9959  | 1808 | 2.5699          |
| 1.4155        | 10.9965 | 1989 | 2.6371          |
| 1.3326        | 11.9972 | 2170 | 2.7292          |
| 1.2682        | 12.9979 | 2351 | 2.7818          |
| 1.2501        | 13.9986 | 2532 | 2.8409          |
| 1.1692        | 14.9993 | 2713 | 2.8931          |
| 1.166         | 16.0    | 2894 | 2.9522          |
| 1.0736        | 16.9952 | 3074 | 3.0303          |
| 1.0849        | 17.9959 | 3255 | 3.0626          |
| 1.0453        | 18.9965 | 3436 | 3.1208          |
| 0.9778        | 19.9972 | 3617 | 3.1514          |
| 0.9626        | 20.9979 | 3798 | 3.2182          |
| 0.9285        | 21.9986 | 3979 | 3.2926          |
| 0.9047        | 22.9993 | 4160 | 3.3494          |
| 0.8471        | 24.0    | 4341 | 3.3960          |
| 0.8123        | 24.9952 | 4521 | 3.4674          |
| 0.7798        | 25.9959 | 4702 | 3.5216          |
| 0.762         | 26.9965 | 4883 | 3.6214          |
| 0.7284        | 27.9972 | 5064 | 3.6831          |
| 0.6922        | 28.9979 | 5245 | 3.6883          |
| 0.6732        | 29.9986 | 5426 | 3.7731          |
| 0.6575        | 30.9993 | 5607 | 3.8512          |
| 0.608         | 32.0    | 5788 | 3.9272          |
| 0.5817        | 32.9952 | 5968 | 3.9129          |
| 0.5567        | 33.9959 | 6149 | 4.0213          |
| 0.5448        | 34.9965 | 6330 | 4.0896          |
| 0.5159        | 35.9972 | 6511 | 4.1160          |
| 0.4992        | 36.9979 | 6692 | 4.1914          |
| 0.4644        | 37.9986 | 6873 | 4.2932          |
| 0.4412        | 38.9993 | 7054 | 4.3430          |
| 0.4619        | 40.0    | 7235 | 4.4002          |
| 0.4155        | 40.9952 | 7415 | 4.3920          |
| 0.3948        | 41.9959 | 7596 | 4.5190          |
| 0.3714        | 42.9965 | 7777 | 4.5420          |
| 0.3536        | 43.9972 | 7958 | 4.6237          |
| 0.3402        | 44.9979 | 8139 | 4.6591          |
| 0.3336        | 45.9986 | 8320 | 4.7309          |
| 0.3026        | 46.9993 | 8501 | 4.7661          |
| 0.2914        | 48.0    | 8682 | 4.8383          |
| 0.2711        | 48.9952 | 8862 | 4.8839          |
| 0.2498        | 49.7581 | 9000 | 4.9365          |

Training loss falls monotonically while validation loss bottoms out at 2.3569 around epoch 4 and then climbs steadily, a classic overfitting pattern; the 4.9365 reported at the top of this card is the evaluation loss at the final logged step, not the best checkpoint.

### Framework versions

- PEFT 0.13.0
- Transformers 4.45.1
- Pytorch 2.3.0.post101
- Datasets 2.19.1
- Tokenizers 0.19.1