llm3br256

This model is a fine-tuned version of meta-llama/Llama-3.2-3B-Instruct on the akoul_whitehorseliquidity_25c dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0149
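
This repository contains a PEFT adapter rather than full model weights (see the framework versions below). As a minimal loading sketch, assuming the adapter is hosted at sizhkhy/akoul_whitehorseliquidity_25c and that you have access to the gated base model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-3B-Instruct"
adapter_id = "sizhkhy/akoul_whitehorseliquidity_25c"  # assumed adapter repo id

# Load the base model, then attach the fine-tuned adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)

# Simple chat-style generation using the instruct chat template.
messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```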

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5.0
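
For reference, the list above maps onto a transformers TrainingArguments configuration roughly as follows. This is a sketch rather than the original training script, and the output directory is assumed:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llm3br256",           # assumed; not stated in this card
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=8,    # 4 per device x 8 steps = 32 total batch size
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=5.0,
)
```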

Training results

Training Loss | Epoch | Step | Validation Loss
0.0904 0.0463 5 0.0951
0.053 0.0926 10 0.0489
0.0417 0.1389 15 0.0416
0.0396 0.1852 20 0.0358
0.029 0.2315 25 0.0329
0.0311 0.2778 30 0.0309
0.0295 0.3241 35 0.0280
0.0262 0.3704 40 0.0266
0.0272 0.4167 45 0.0261
0.0224 0.4630 50 0.0249
0.0229 0.5093 55 0.0245
0.0233 0.5556 60 0.0233
0.0217 0.6019 65 0.0226
0.0247 0.6481 70 0.0229
0.0193 0.6944 75 0.0228
0.0173 0.7407 80 0.0213
0.0207 0.7870 85 0.0204
0.0213 0.8333 90 0.0199
0.0199 0.8796 95 0.0202
0.0188 0.9259 100 0.0207
0.0193 0.9722 105 0.0203
0.016 1.0185 110 0.0197
0.0166 1.0648 115 0.0199
0.0189 1.1111 120 0.0195
0.0211 1.1574 125 0.0184
0.0171 1.2037 130 0.0189
0.0191 1.25 135 0.0192
0.0184 1.2963 140 0.0186
0.0162 1.3426 145 0.0187
0.0165 1.3889 150 0.0182
0.0157 1.4352 155 0.0186
0.0159 1.4815 160 0.0189
0.0174 1.5278 165 0.0187
0.0184 1.5741 170 0.0183
0.0165 1.6204 175 0.0183
0.0173 1.6667 180 0.0178
0.0131 1.7130 185 0.0172
0.0132 1.7593 190 0.0180
0.0157 1.8056 195 0.0181
0.0154 1.8519 200 0.0171
0.0139 1.8981 205 0.0169
0.0169 1.9444 210 0.0170
0.0158 1.9907 215 0.0170
0.0139 2.0370 220 0.0170
0.0146 2.0833 225 0.0170
0.0115 2.1296 230 0.0174
0.0138 2.1759 235 0.0168
0.0138 2.2222 240 0.0171
0.0134 2.2685 245 0.0167
0.0167 2.3148 250 0.0164
0.0123 2.3611 255 0.0164
0.0139 2.4074 260 0.0163
0.0125 2.4537 265 0.0161
0.0126 2.5 270 0.0160
0.0138 2.5463 275 0.0160
0.0125 2.5926 280 0.0152
0.0133 2.6389 285 0.0162
0.0125 2.6852 290 0.0161
0.0147 2.7315 295 0.0158
0.0134 2.7778 300 0.0161
0.0124 2.8241 305 0.0158
0.0132 2.8704 310 0.0154
0.0146 2.9167 315 0.0152
0.014 2.9630 320 0.0150
0.0122 3.0093 325 0.0151
0.0118 3.0556 330 0.0155
0.0108 3.1019 335 0.0155
0.0103 3.1481 340 0.0155
0.0099 3.1944 345 0.0154
0.0115 3.2407 350 0.0155
0.0099 3.2870 355 0.0154
0.0129 3.3333 360 0.0158
0.0105 3.3796 365 0.0154
0.0121 3.4259 370 0.0155
0.0096 3.4722 375 0.0155
0.0112 3.5185 380 0.0152
0.0118 3.5648 385 0.0147
0.0082 3.6111 390 0.0145
0.0112 3.6574 395 0.0146
0.0086 3.7037 400 0.0149
0.0102 3.75 405 0.0150
0.0116 3.7963 410 0.0149
0.0126 3.8426 415 0.0147
0.0112 3.8889 420 0.0146
0.0107 3.9352 425 0.0146
0.0113 3.9815 430 0.0147
0.0091 4.0278 435 0.0147
0.0094 4.0741 440 0.0150
0.0096 4.1204 445 0.0150
0.0091 4.1667 450 0.0152
0.0089 4.2130 455 0.0155
0.0063 4.2593 460 0.0156
0.0099 4.3056 465 0.0157
0.0085 4.3519 470 0.0156
0.011 4.3981 475 0.0156
0.0067 4.4444 480 0.0154
0.0092 4.4907 485 0.0153
0.0072 4.5370 490 0.0152
0.0078 4.5833 495 0.0152
0.0091 4.6296 500 0.0152
0.007 4.6759 505 0.0152
0.0064 4.7222 510 0.0152
0.0099 4.7685 515 0.0152
0.0088 4.8148 520 0.0152
0.0089 4.8611 525 0.0152
0.0083 4.9074 530 0.0152
0.0103 4.9537 535 0.0153
0.0092 5.0 540 0.0153

Framework versions

  • PEFT 0.12.0
  • Transformers 4.46.1
  • Pytorch 2.4.0+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.3