llm3br256

This model is a fine-tuned version of meta-llama/Llama-3.2-3B-Instruct on the akoul_whitehorseliquidity_25c dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0149
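
This repository contains a PEFT adapter rather than full model weights (see the framework versions below). As a minimal loading sketch, assuming the adapter is hosted at sizhkhy/akoul_whitehorseliquidity_25c and that you have access to the gated base model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-3B-Instruct"
adapter_id = "sizhkhy/akoul_whitehorseliquidity_25c"  # assumed adapter repo id

# Load the base model, then attach the fine-tuned adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)

# Simple chat-style generation using the instruct chat template.
messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```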

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5.0
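
For reference, the list above maps onto a transformers TrainingArguments configuration roughly as follows. This is a sketch rather than the original training script, and the output directory is assumed:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llm3br256",           # assumed; not stated in this card
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=8,    # 4 per device x 8 steps = 32 total batch size
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=5.0,
)
```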

Training results

Training Loss | Epoch | Step | Validation Loss
0.0904 0.0463 5 0.0951
0.053 0.0926 10 0.0489
0.0417 0.1389 15 0.0416
0.0396 0.1852 20 0.0358
0.029 0.2315 25 0.0329
0.0311 0.2778 30 0.0309
0.0295 0.3241 35 0.0280
0.0262 0.3704 40 0.0266
0.0272 0.4167 45 0.0261
0.0224 0.4630 50 0.0249
0.0229 0.5093 55 0.0245
0.0233 0.5556 60 0.0233
0.0217 0.6019 65 0.0226
0.0247 0.6481 70 0.0229
0.0193 0.6944 75 0.0228
0.0173 0.7407 80 0.0213
0.0207 0.7870 85 0.0204
0.0213 0.8333 90 0.0199
0.0199 0.8796 95 0.0202
0.0188 0.9259 100 0.0207
0.0193 0.9722 105 0.0203
0.016 1.0185 110 0.0197
0.0166 1.0648 115 0.0199
0.0189 1.1111 120 0.0195
0.0211 1.1574 125 0.0184
0.0171 1.2037 130 0.0189
0.0191 1.25 135 0.0192
0.0184 1.2963 140 0.0186
0.0162 1.3426 145 0.0187
0.0165 1.3889 150 0.0182
0.0157 1.4352 155 0.0186
0.0159 1.4815 160 0.0189
0.0174 1.5278 165 0.0187
0.0184 1.5741 170 0.0183
0.0165 1.6204 175 0.0183
0.0173 1.6667 180 0.0178
0.0131 1.7130 185 0.0172
0.0132 1.7593 190 0.0180
0.0157 1.8056 195 0.0181
0.0154 1.8519 200 0.0171
0.0139 1.8981 205 0.0169
0.0169 1.9444 210 0.0170
0.0158 1.9907 215 0.0170
0.0139 2.0370 220 0.0170
0.0146 2.0833 225 0.0170
0.0115 2.1296 230 0.0174
0.0138 2.1759 235 0.0168
0.0138 2.2222 240 0.0171
0.0134 2.2685 245 0.0167
0.0167 2.3148 250 0.0164
0.0123 2.3611 255 0.0164
0.0139 2.4074 260 0.0163
0.0125 2.4537 265 0.0161
0.0126 2.5 270 0.0160
0.0138 2.5463 275 0.0160
0.0125 2.5926 280 0.0152
0.0133 2.6389 285 0.0162
0.0125 2.6852 290 0.0161
0.0147 2.7315 295 0.0158
0.0134 2.7778 300 0.0161
0.0124 2.8241 305 0.0158
0.0132 2.8704 310 0.0154
0.0146 2.9167 315 0.0152
0.014 2.9630 320 0.0150
0.0122 3.0093 325 0.0151
0.0118 3.0556 330 0.0155
0.0108 3.1019 335 0.0155
0.0103 3.1481 340 0.0155
0.0099 3.1944 345 0.0154
0.0115 3.2407 350 0.0155
0.0099 3.2870 355 0.0154
0.0129 3.3333 360 0.0158
0.0105 3.3796 365 0.0154
0.0121 3.4259 370 0.0155
0.0096 3.4722 375 0.0155
0.0112 3.5185 380 0.0152
0.0118 3.5648 385 0.0147
0.0082 3.6111 390 0.0145
0.0112 3.6574 395 0.0146
0.0086 3.7037 400 0.0149
0.0102 3.75 405 0.0150
0.0116 3.7963 410 0.0149
0.0126 3.8426 415 0.0147
0.0112 3.8889 420 0.0146
0.0107 3.9352 425 0.0146
0.0113 3.9815 430 0.0147
0.0091 4.0278 435 0.0147
0.0094 4.0741 440 0.0150
0.0096 4.1204 445 0.0150
0.0091 4.1667 450 0.0152
0.0089 4.2130 455 0.0155
0.0063 4.2593 460 0.0156
0.0099 4.3056 465 0.0157
0.0085 4.3519 470 0.0156
0.011 4.3981 475 0.0156
0.0067 4.4444 480 0.0154
0.0092 4.4907 485 0.0153
0.0072 4.5370 490 0.0152
0.0078 4.5833 495 0.0152
0.0091 4.6296 500 0.0152
0.007 4.6759 505 0.0152
0.0064 4.7222 510 0.0152
0.0099 4.7685 515 0.0152
0.0088 4.8148 520 0.0152
0.0089 4.8611 525 0.0152
0.0083 4.9074 530 0.0152
0.0103 4.9537 535 0.0153
0.0092 5.0 540 0.0153

Framework versions

  • PEFT 0.12.0
  • Transformers 4.46.1
  • Pytorch 2.4.0+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.3