qwen2-2b-instruct-trl-sft-mrg

This model is a fine-tuned version of Qwen/Qwen2-VL-2B-Instruct on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2568
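
As a quick-start, the sketch below shows one way to load this checkpoint, assuming it is published as a PEFT adapter (see the model tree at the end of this card) on top of Qwen/Qwen2-VL-2B-Instruct:

```python
# Loading sketch, assuming this checkpoint is a PEFT adapter hosted at
# DLingo/qwen2-2b-instruct-trl-sft-mrg on top of Qwen/Qwen2-VL-2B-Instruct.
import torch
from peft import PeftModel
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

base = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-2B-Instruct",
    torch_dtype=torch.bfloat16,  # matches the BF16 tensor type listed below
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "DLingo/qwen2-2b-instruct-trl-sft-mrg")
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-2B-Instruct")
```

After loading, `model.merge_and_unload()` from PEFT can fold the adapter into the base weights if standalone inference is preferred.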

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the equivalent configuration sketch after the list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 15
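
As a sketch, these settings correspond roughly to the following TRL SFTConfig; the output directory and bf16 flag are assumptions, not taken from this card:

```python
# Hedged reconstruction of the training configuration from the bullet list
# above. output_dir and bf16 are assumptions; everything else mirrors the
# listed hyperparameters. SFTConfig subclasses transformers.TrainingArguments.
from trl import SFTConfig

training_args = SFTConfig(
    output_dir="qwen2-2b-instruct-trl-sft-mrg",  # assumed name
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,   # 4 * 8 = total train batch size of 32
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=15,
    bf16=True,  # assumption, consistent with the BF16 tensor type below
)
```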

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 3.0236        | 0.4942  | 32   | 2.9521          |
| 2.6642        | 0.9884  | 64   | 2.4341          |
| 2.156         | 1.4846  | 96   | 1.9677          |
| 1.9011        | 1.9788  | 128  | 1.7312          |
| 1.6955        | 2.4749  | 160  | 1.6093          |
| 1.5552        | 2.9691  | 192  | 1.5437          |
| 1.5361        | 3.4653  | 224  | 1.4991          |
| 1.4831        | 3.9595  | 256  | 1.4554          |
| 1.5036        | 4.4556  | 288  | 1.4261          |
| 1.3815        | 4.9498  | 320  | 1.3991          |
| 1.3762        | 5.4459  | 352  | 1.3760          |
| 1.3636        | 5.9402  | 384  | 1.3562          |
| 1.2826        | 6.4363  | 416  | 1.3424          |
| 1.3178        | 6.9305  | 448  | 1.3256          |
| 1.2689        | 7.4266  | 480  | 1.3123          |
| 1.2163        | 7.9208  | 512  | 1.3019          |
| 1.284         | 8.4170  | 544  | 1.2920          |
| 1.3356        | 8.9112  | 576  | 1.2862          |
| 1.2359        | 9.4073  | 608  | 1.2820          |
| 1.2157        | 9.9015  | 640  | 1.2746          |
| 1.1936        | 10.3977 | 672  | 1.2709          |
| 1.3181        | 10.8919 | 704  | 1.2659          |
| 1.2266        | 11.3880 | 736  | 1.2641          |
| 1.213         | 11.8822 | 768  | 1.2605          |
| 1.1997        | 12.3784 | 800  | 1.2603          |
| 1.2584        | 12.8726 | 832  | 1.2577          |
| 1.2547        | 13.3687 | 864  | 1.2576          |
| 1.2544        | 13.8629 | 896  | 1.2574          |
| 1.203         | 14.3591 | 928  | 1.2569          |
| 1.1467        | 14.8533 | 960  | 1.2568          |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.46.3
  • Pytorch 2.3.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.0

Safetensors

  • Model size: 2.21B params
  • Tensor type: BF16

Model tree for DLingo/qwen2-2b-instruct-trl-sft-mrg

  • Base model: Qwen/Qwen2-VL-2B
  • This model is published as a PEFT adapter of the base model.