# llm3br256
This model is a fine-tuned version of meta-llama/Llama-3.2-3B-Instruct on the akoul_whitehorseliquidity_25c dataset. It achieves the following results on the evaluation set:
- Loss: 0.0149
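Since this is a PEFT adapter rather than a full model, it is loaded on top of the base model. The snippet below is a minimal usage sketch, not part of the original card; it assumes the adapter weights are published under the repo id `sizhkhy/akoul_whitehorseliquidity_25c` shown in this page's metadata.

```python
# Illustrative only: load the base model, then apply this PEFT adapter.
# The adapter repo id "sizhkhy/akoul_whitehorseliquidity_25c" is an assumption
# taken from this page's metadata.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
model = PeftModel.from_pretrained(base, "sizhkhy/akoul_whitehorseliquidity_25c")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")

# Llama 3.2 Instruct expects its chat template.
messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```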
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (an illustrative `TrainingArguments` mapping is shown after the list):
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5.0
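For illustration only, here is how the values above would map onto a `transformers` `TrainingArguments` object. Only the listed hyperparameters come from this card; `output_dir` is a placeholder, and the actual training script is not included here.

```python
# Hypothetical reconstruction of the training configuration; only the values
# listed above are taken from the card, everything else is a placeholder.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="llm3br256",          # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=8,   # 4 per device x 8 steps = 32 total (assuming one device)
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=5.0,
)
```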
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:---:|:---:|:---:|:---:|
| 0.0904 | 0.0463 | 5 | 0.0951 |
| 0.053 | 0.0926 | 10 | 0.0489 |
| 0.0417 | 0.1389 | 15 | 0.0416 |
| 0.0396 | 0.1852 | 20 | 0.0358 |
| 0.029 | 0.2315 | 25 | 0.0329 |
| 0.0311 | 0.2778 | 30 | 0.0309 |
| 0.0295 | 0.3241 | 35 | 0.0280 |
| 0.0262 | 0.3704 | 40 | 0.0266 |
| 0.0272 | 0.4167 | 45 | 0.0261 |
| 0.0224 | 0.4630 | 50 | 0.0249 |
| 0.0229 | 0.5093 | 55 | 0.0245 |
| 0.0233 | 0.5556 | 60 | 0.0233 |
| 0.0217 | 0.6019 | 65 | 0.0226 |
| 0.0247 | 0.6481 | 70 | 0.0229 |
| 0.0193 | 0.6944 | 75 | 0.0228 |
| 0.0173 | 0.7407 | 80 | 0.0213 |
| 0.0207 | 0.7870 | 85 | 0.0204 |
| 0.0213 | 0.8333 | 90 | 0.0199 |
| 0.0199 | 0.8796 | 95 | 0.0202 |
| 0.0188 | 0.9259 | 100 | 0.0207 |
| 0.0193 | 0.9722 | 105 | 0.0203 |
| 0.016 | 1.0185 | 110 | 0.0197 |
| 0.0166 | 1.0648 | 115 | 0.0199 |
| 0.0189 | 1.1111 | 120 | 0.0195 |
| 0.0211 | 1.1574 | 125 | 0.0184 |
| 0.0171 | 1.2037 | 130 | 0.0189 |
| 0.0191 | 1.25 | 135 | 0.0192 |
| 0.0184 | 1.2963 | 140 | 0.0186 |
| 0.0162 | 1.3426 | 145 | 0.0187 |
| 0.0165 | 1.3889 | 150 | 0.0182 |
| 0.0157 | 1.4352 | 155 | 0.0186 |
| 0.0159 | 1.4815 | 160 | 0.0189 |
| 0.0174 | 1.5278 | 165 | 0.0187 |
| 0.0184 | 1.5741 | 170 | 0.0183 |
| 0.0165 | 1.6204 | 175 | 0.0183 |
| 0.0173 | 1.6667 | 180 | 0.0178 |
| 0.0131 | 1.7130 | 185 | 0.0172 |
| 0.0132 | 1.7593 | 190 | 0.0180 |
| 0.0157 | 1.8056 | 195 | 0.0181 |
| 0.0154 | 1.8519 | 200 | 0.0171 |
| 0.0139 | 1.8981 | 205 | 0.0169 |
| 0.0169 | 1.9444 | 210 | 0.0170 |
| 0.0158 | 1.9907 | 215 | 0.0170 |
| 0.0139 | 2.0370 | 220 | 0.0170 |
| 0.0146 | 2.0833 | 225 | 0.0170 |
| 0.0115 | 2.1296 | 230 | 0.0174 |
| 0.0138 | 2.1759 | 235 | 0.0168 |
| 0.0138 | 2.2222 | 240 | 0.0171 |
| 0.0134 | 2.2685 | 245 | 0.0167 |
| 0.0167 | 2.3148 | 250 | 0.0164 |
| 0.0123 | 2.3611 | 255 | 0.0164 |
| 0.0139 | 2.4074 | 260 | 0.0163 |
| 0.0125 | 2.4537 | 265 | 0.0161 |
| 0.0126 | 2.5 | 270 | 0.0160 |
| 0.0138 | 2.5463 | 275 | 0.0160 |
| 0.0125 | 2.5926 | 280 | 0.0152 |
| 0.0133 | 2.6389 | 285 | 0.0162 |
| 0.0125 | 2.6852 | 290 | 0.0161 |
| 0.0147 | 2.7315 | 295 | 0.0158 |
| 0.0134 | 2.7778 | 300 | 0.0161 |
| 0.0124 | 2.8241 | 305 | 0.0158 |
| 0.0132 | 2.8704 | 310 | 0.0154 |
| 0.0146 | 2.9167 | 315 | 0.0152 |
| 0.014 | 2.9630 | 320 | 0.0150 |
| 0.0122 | 3.0093 | 325 | 0.0151 |
| 0.0118 | 3.0556 | 330 | 0.0155 |
| 0.0108 | 3.1019 | 335 | 0.0155 |
| 0.0103 | 3.1481 | 340 | 0.0155 |
| 0.0099 | 3.1944 | 345 | 0.0154 |
| 0.0115 | 3.2407 | 350 | 0.0155 |
| 0.0099 | 3.2870 | 355 | 0.0154 |
| 0.0129 | 3.3333 | 360 | 0.0158 |
| 0.0105 | 3.3796 | 365 | 0.0154 |
| 0.0121 | 3.4259 | 370 | 0.0155 |
| 0.0096 | 3.4722 | 375 | 0.0155 |
| 0.0112 | 3.5185 | 380 | 0.0152 |
| 0.0118 | 3.5648 | 385 | 0.0147 |
| 0.0082 | 3.6111 | 390 | 0.0145 |
| 0.0112 | 3.6574 | 395 | 0.0146 |
| 0.0086 | 3.7037 | 400 | 0.0149 |
| 0.0102 | 3.75 | 405 | 0.0150 |
| 0.0116 | 3.7963 | 410 | 0.0149 |
| 0.0126 | 3.8426 | 415 | 0.0147 |
| 0.0112 | 3.8889 | 420 | 0.0146 |
| 0.0107 | 3.9352 | 425 | 0.0146 |
| 0.0113 | 3.9815 | 430 | 0.0147 |
| 0.0091 | 4.0278 | 435 | 0.0147 |
| 0.0094 | 4.0741 | 440 | 0.0150 |
| 0.0096 | 4.1204 | 445 | 0.0150 |
| 0.0091 | 4.1667 | 450 | 0.0152 |
| 0.0089 | 4.2130 | 455 | 0.0155 |
| 0.0063 | 4.2593 | 460 | 0.0156 |
| 0.0099 | 4.3056 | 465 | 0.0157 |
| 0.0085 | 4.3519 | 470 | 0.0156 |
| 0.011 | 4.3981 | 475 | 0.0156 |
| 0.0067 | 4.4444 | 480 | 0.0154 |
| 0.0092 | 4.4907 | 485 | 0.0153 |
| 0.0072 | 4.5370 | 490 | 0.0152 |
| 0.0078 | 4.5833 | 495 | 0.0152 |
| 0.0091 | 4.6296 | 500 | 0.0152 |
| 0.007 | 4.6759 | 505 | 0.0152 |
| 0.0064 | 4.7222 | 510 | 0.0152 |
| 0.0099 | 4.7685 | 515 | 0.0152 |
| 0.0088 | 4.8148 | 520 | 0.0152 |
| 0.0089 | 4.8611 | 525 | 0.0152 |
| 0.0083 | 4.9074 | 530 | 0.0152 |
| 0.0103 | 4.9537 | 535 | 0.0153 |
| 0.0092 | 5.0 | 540 | 0.0153 |
### Framework versions
- PEFT 0.12.0
- Transformers 4.46.1
- Pytorch 2.4.0+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3
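As an optional sanity check (a sketch, not part of the original card), a local environment can be compared against the versions reported above:

```python
# Print installed versions next to the ones reported in this card.
import datasets, peft, tokenizers, torch, transformers

reported = {
    "peft": "0.12.0",
    "transformers": "4.46.1",
    "torch": "2.4.0+cu121",
    "datasets": "3.1.0",
    "tokenizers": "0.20.3",
}
for mod in (peft, transformers, torch, datasets, tokenizers):
    print(f"{mod.__name__}: installed {mod.__version__}, trained with {reported[mod.__name__]}")
```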