qwen2-7b-instruct-trl-sft-mrg

This model is a fine-tuned version of Qwen/Qwen2-VL-7B-Instruct on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0746
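
Because this checkpoint was trained as a PEFT adapter (see the framework versions below), inference means attaching the adapter to the base model. The following is a minimal loading sketch, assuming the adapter lives in this repo (DLingo/qwen2-7b-instruct-trl-sft-mrg) and using a text-only prompt for brevity; the dtype, device placement, and prompt are illustrative choices, not part of the card:

```python
# Minimal inference sketch (assumptions: bf16, device_map="auto", text-only prompt).
import torch
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration
from peft import PeftModel

base_id = "Qwen/Qwen2-VL-7B-Instruct"
adapter_id = "DLingo/qwen2-7b-instruct-trl-sft-mrg"  # this repo

# Load the frozen base model, then attach the fine-tuned PEFT adapter.
model = Qwen2VLForConditionalGeneration.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)
processor = AutoProcessor.from_pretrained(base_id)

# Build a chat-formatted prompt and generate.
messages = [{"role": "user", "content": [{"type": "text", "text": "Describe this model's task."}]}]
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[prompt], return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(out[:, inputs.input_ids.shape[1]:], skip_special_tokens=True)[0])
```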

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TRL's SFTConfig follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 15
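
Since this card was produced by a TRL SFT run, the values above map naturally onto TRL's SFTConfig (whose fields mirror transformers.TrainingArguments). A hedged sketch; output_dir is an assumption not recorded in the card:

```python
# Sketch of the hyperparameters above as a TRL SFTConfig.
from trl import SFTConfig

training_args = SFTConfig(
    output_dir="qwen2-7b-instruct-trl-sft-mrg",  # assumption; not stated in the card
    learning_rate=1e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=16,   # total train batch size: 2 * 16 = 32
    seed=42,
    lr_scheduler_type="constant",     # note: a plain constant schedule applies no warmup
    warmup_ratio=0.03,                # listed in the card; only effective with a warmup scheduler
    num_train_epochs=15,
)
```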

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 2.0164        | 0.4947  | 32   | 1.8794          |
| 1.7004        | 0.9894  | 64   | 1.5669          |
| 1.4913        | 1.4841  | 96   | 1.4239          |
| 1.4568        | 1.9787  | 128  | 1.3451          |
| 1.3309        | 2.4734  | 160  | 1.2841          |
| 1.2276        | 2.9681  | 192  | 1.2427          |
| 1.2531        | 3.4628  | 224  | 1.2152          |
| 1.1983        | 3.9575  | 256  | 1.1806          |
| 1.1846        | 4.4522  | 288  | 1.1549          |
| 1.1210        | 4.9469  | 320  | 1.1357          |
| 1.0960        | 5.4415  | 352  | 1.1263          |
| 1.0818        | 5.9362  | 384  | 1.1133          |
| 1.0113        | 6.4309  | 416  | 1.1039          |
| 1.0566        | 6.9256  | 448  | 1.0896          |
| 1.0195        | 7.4203  | 480  | 1.0926          |
| 0.9407        | 7.9150  | 512  | 1.0759          |
| 1.0077        | 8.4097  | 544  | 1.0790          |
| 1.0554        | 8.9043  | 576  | 1.0694          |
| 0.9664        | 9.3990  | 608  | 1.0774          |
| 0.9157        | 9.8937  | 640  | 1.0643          |
| 0.8869        | 10.3884 | 672  | 1.0705          |
| 0.9715        | 10.8831 | 704  | 1.0660          |
| 0.8926        | 11.3778 | 736  | 1.0752          |
| 0.8906        | 11.8725 | 768  | 1.0718          |
| 0.8649        | 12.3671 | 800  | 1.0686          |
| 0.9162        | 12.8618 | 832  | 1.0685          |
| 0.8708        | 13.3565 | 864  | 1.0798          |
| 0.9127        | 13.8512 | 896  | 1.0702          |
| 0.8355        | 14.3459 | 928  | 1.0732          |
| 0.8060        | 14.8406 | 960  | 1.0746          |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.45.2
  • PyTorch 2.3.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.0
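
The "-mrg" suffix in the repo name suggests the adapter was (or was intended to be) merged into the base weights. A speculative sketch of that step using peft's merge_and_unload(); the adapter path is a placeholder:

```python
# Speculative merge sketch; "path/to/adapter" is a placeholder, not a real path.
import torch
from transformers import Qwen2VLForConditionalGeneration
from peft import PeftModel

base = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct", torch_dtype=torch.bfloat16
)
merged = PeftModel.from_pretrained(base, "path/to/adapter").merge_and_unload()
merged.save_pretrained("qwen2-7b-instruct-trl-sft-mrg")
```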