# collapse_gemma-2-2b_hs2_accumulate_iter2_sftsd2
This model is a fine-tuned version of [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.0882
- Num Input Tokens Seen: 10720872
## Model description
More information needed
## Intended uses & limitations
More information needed
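Pending fuller documentation, below is a minimal loading sketch. It assumes the checkpoint works with the standard transformers causal-LM auto classes; the prompt is purely illustrative.

```python
# Minimal sketch; assumes the standard transformers AutoClasses
# load this checkpoint. The prompt below is illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RylanSchaeffer/collapse_gemma-2-2b_hs2_accumulate_iter2_sftsd2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```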
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 8e-06
- train_batch_size: 8
- eval_batch_size: 16
- seed: 2
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1
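The total train batch size of 128 follows from the per-device batch size times the gradient accumulation steps (8 × 16), assuming a single device. As a rough sketch, the values above map onto transformers `TrainingArguments` as follows; `output_dir` is a placeholder.

```python
# Sketch of the hyperparameters above as TrainingArguments;
# output_dir is a hypothetical placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="collapse_gemma-2-2b_hs2_accumulate_iter2_sftsd2",
    learning_rate=8e-6,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    seed=2,
    gradient_accumulation_steps=16,  # 8 x 16 = 128 effective batch size
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1,
    adam_beta1=0.9,   # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```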
### Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| No log | 0 | 0 | 1.3909 | 0 |
| 1.593 | 0.0266 | 5 | 1.3326 | 282968 |
| 1.468 | 0.0531 | 10 | 1.2105 | 570240 |
| 1.2681 | 0.0797 | 15 | 1.1563 | 856832 |
| 1.2072 | 0.1063 | 20 | 1.1314 | 1140048 |
| 1.1683 | 0.1328 | 25 | 1.1132 | 1423424 |
| 1.0607 | 0.1594 | 30 | 1.1129 | 1709744 |
| 1.1077 | 0.1860 | 35 | 1.1118 | 2001304 |
| 1.0482 | 0.2126 | 40 | 1.1135 | 2294992 |
| 1.074 | 0.2391 | 45 | 1.1164 | 2584408 |
| 0.9352 | 0.2657 | 50 | 1.1174 | 2875416 |
| 0.8739 | 0.2923 | 55 | 1.1168 | 3154376 |
| 0.8673 | 0.3188 | 60 | 1.1216 | 3443824 |
| 0.8946 | 0.3454 | 65 | 1.1154 | 3729728 |
| 0.7916 | 0.3720 | 70 | 1.1306 | 4012624 |
| 0.9486 | 0.3985 | 75 | 1.1155 | 4302744 |
| 0.721 | 0.4251 | 80 | 1.1205 | 4583480 |
| 0.8319 | 0.4517 | 85 | 1.1200 | 4859656 |
| 0.6664 | 0.4782 | 90 | 1.1144 | 5141344 |
| 0.7822 | 0.5048 | 95 | 1.1131 | 5420960 |
| 0.736 | 0.5314 | 100 | 1.1124 | 5708552 |
| 0.8007 | 0.5580 | 105 | 1.1126 | 5987736 |
| 0.6431 | 0.5845 | 110 | 1.1073 | 6271504 |
| 0.6754 | 0.6111 | 115 | 1.1048 | 6559824 |
| 0.8061 | 0.6377 | 120 | 1.1066 | 6848192 |
| 0.7043 | 0.6642 | 125 | 1.1044 | 7131224 |
| 0.6619 | 0.6908 | 130 | 1.1028 | 7410960 |
| 0.6988 | 0.7174 | 135 | 1.0991 | 7699432 |
| 0.7132 | 0.7439 | 140 | 1.0989 | 7986208 |
| 0.6748 | 0.7705 | 145 | 1.0963 | 8274264 |
| 0.7033 | 0.7971 | 150 | 1.0959 | 8560800 |
| 0.7145 | 0.8236 | 155 | 1.0943 | 8847792 |
| 0.6951 | 0.8502 | 160 | 1.0928 | 9134464 |
| 0.6958 | 0.8768 | 165 | 1.0908 | 9420928 |
| 0.6325 | 0.9034 | 170 | 1.0915 | 9697944 |
| 0.6244 | 0.9299 | 175 | 1.0886 | 9987896 |
| 0.6517 | 0.9565 | 180 | 1.0902 | 10269648 |
| 0.7256 | 0.9831 | 185 | 1.0874 | 10553360 |
### Framework versions
- Transformers 4.44.0
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1