llm3br256

This model is a PEFT adapter fine-tuned from meta-llama/Llama-3.2-3B-Instruct on the asianpaints dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0114

Model description

More information needed

Intended uses & limitations

More information needed
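
Pending proper documentation, below is a minimal inference sketch showing how a PEFT adapter like this one is typically loaded on top of its base model. The adapter repo id sizhkhy/asianpaints is an assumption; adjust it (and the generation settings) as needed.

```python
# Minimal sketch: load the PEFT adapter on top of the base model.
# Assumes the adapter is hosted as "sizhkhy/asianpaints" (assumption).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-3B-Instruct"
adapter_id = "sizhkhy/asianpaints"  # assumption; replace with the actual repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

# Llama 3.2 Instruct expects chat-formatted prompts.
messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```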

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (mirrored in the sketch after this list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5.0
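
As a hedged reconstruction (the training script itself is not published), these values map onto transformers.TrainingArguments roughly as follows; names follow the Transformers 4.46 API, and output_dir is illustrative:

```python
# Illustrative mapping of the listed hyperparameters onto TrainingArguments.
# Effective batch size: 4 per device x 8 accumulation steps = 32, which
# matches the reported total_train_batch_size.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="llm3br256",          # illustrative
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=8,
    optim="adamw_torch",             # AdamW with betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=5.0,
)
```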

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.0567 | 0.0460 | 5   | 0.0584 |
| 0.0378 | 0.0920 | 10  | 0.0384 |
| 0.0301 | 0.1379 | 15  | 0.0318 |
| 0.0248 | 0.1839 | 20  | 0.0281 |
| 0.0241 | 0.2299 | 25  | 0.0256 |
| 0.021  | 0.2759 | 30  | 0.0234 |
| 0.0213 | 0.3218 | 35  | 0.0225 |
| 0.0211 | 0.3678 | 40  | 0.0214 |
| 0.0185 | 0.4138 | 45  | 0.0200 |
| 0.0162 | 0.4598 | 50  | 0.0196 |
| 0.0177 | 0.5057 | 55  | 0.0189 |
| 0.0168 | 0.5517 | 60  | 0.0184 |
| 0.017  | 0.5977 | 65  | 0.0182 |
| 0.0143 | 0.6437 | 70  | 0.0177 |
| 0.0143 | 0.6897 | 75  | 0.0176 |
| 0.0155 | 0.7356 | 80  | 0.0176 |
| 0.0162 | 0.7816 | 85  | 0.0169 |
| 0.0164 | 0.8276 | 90  | 0.0164 |
| 0.0154 | 0.8736 | 95  | 0.0162 |
| 0.0164 | 0.9195 | 100 | 0.0159 |
| 0.0156 | 0.9655 | 105 | 0.0160 |
| 0.0145 | 1.0115 | 110 | 0.0159 |
| 0.0133 | 1.0575 | 115 | 0.0156 |
| 0.0126 | 1.1034 | 120 | 0.0155 |
| 0.0145 | 1.1494 | 125 | 0.0154 |
| 0.0125 | 1.1954 | 130 | 0.0150 |
| 0.0122 | 1.2414 | 135 | 0.0148 |
| 0.0127 | 1.2874 | 140 | 0.0147 |
| 0.0139 | 1.3333 | 145 | 0.0144 |
| 0.0122 | 1.3793 | 150 | 0.0144 |
| 0.0138 | 1.4253 | 155 | 0.0139 |
| 0.0143 | 1.4713 | 160 | 0.0139 |
| 0.0124 | 1.5172 | 165 | 0.0138 |
| 0.0124 | 1.5632 | 170 | 0.0135 |
| 0.0138 | 1.6092 | 175 | 0.0132 |
| 0.0112 | 1.6552 | 180 | 0.0136 |
| 0.0102 | 1.7011 | 185 | 0.0135 |
| 0.0135 | 1.7471 | 190 | 0.0133 |
| 0.01   | 1.7931 | 195 | 0.0135 |
| 0.0115 | 1.8391 | 200 | 0.0131 |
| 0.0113 | 1.8851 | 205 | 0.0127 |
| 0.0107 | 1.9310 | 210 | 0.0128 |
| 0.0122 | 1.9770 | 215 | 0.0128 |
| 0.0099 | 2.0230 | 220 | 0.0128 |
| 0.0121 | 2.0690 | 225 | 0.0129 |
| 0.0103 | 2.1149 | 230 | 0.0128 |
| 0.01   | 2.1609 | 235 | 0.0127 |
| 0.0089 | 2.2069 | 240 | 0.0127 |
| 0.0089 | 2.2529 | 245 | 0.0127 |
| 0.0105 | 2.2989 | 250 | 0.0125 |
| 0.0093 | 2.3448 | 255 | 0.0124 |
| 0.0097 | 2.3908 | 260 | 0.0126 |
| 0.0091 | 2.4368 | 265 | 0.0126 |
| 0.0095 | 2.4828 | 270 | 0.0124 |
| 0.0094 | 2.5287 | 275 | 0.0123 |
| 0.0092 | 2.5747 | 280 | 0.0119 |
| 0.0084 | 2.6207 | 285 | 0.0121 |
| 0.0098 | 2.6667 | 290 | 0.0120 |
| 0.0097 | 2.7126 | 295 | 0.0122 |
| 0.0093 | 2.7586 | 300 | 0.0121 |
| 0.0096 | 2.8046 | 305 | 0.0119 |
| 0.0097 | 2.8506 | 310 | 0.0117 |
| 0.0101 | 2.8966 | 315 | 0.0118 |
| 0.0088 | 2.9425 | 320 | 0.0118 |
| 0.0096 | 2.9885 | 325 | 0.0118 |
| 0.0078 | 3.0345 | 330 | 0.0119 |
| 0.0064 | 3.0805 | 335 | 0.0119 |
| 0.0073 | 3.1264 | 340 | 0.0121 |
| 0.0066 | 3.1724 | 345 | 0.0121 |
| 0.0067 | 3.2184 | 350 | 0.0117 |
| 0.007  | 3.2644 | 355 | 0.0118 |
| 0.0072 | 3.3103 | 360 | 0.0116 |
| 0.0074 | 3.3563 | 365 | 0.0117 |
| 0.0067 | 3.4023 | 370 | 0.0117 |
| 0.0072 | 3.4483 | 375 | 0.0117 |
| 0.0069 | 3.4943 | 380 | 0.0117 |
| 0.0076 | 3.5402 | 385 | 0.0116 |
| 0.0068 | 3.5862 | 390 | 0.0114 |
| 0.0074 | 3.6322 | 395 | 0.0115 |
| 0.0065 | 3.6782 | 400 | 0.0114 |
| 0.007  | 3.7241 | 405 | 0.0112 |
| 0.0064 | 3.7701 | 410 | 0.0112 |
| 0.0073 | 3.8161 | 415 | 0.0111 |
| 0.0065 | 3.8621 | 420 | 0.0113 |
| 0.0069 | 3.9080 | 425 | 0.0111 |
| 0.0065 | 3.9540 | 430 | 0.0111 |
| 0.0076 | 4.0000 | 435 | 0.0111 |
| 0.0047 | 4.0460 | 440 | 0.0115 |
| 0.0053 | 4.0920 | 445 | 0.0119 |
| 0.0053 | 4.1379 | 450 | 0.0120 |
| 0.0055 | 4.1839 | 455 | 0.0119 |
| 0.0053 | 4.2299 | 460 | 0.0117 |
| 0.0053 | 4.2759 | 465 | 0.0117 |
| 0.0053 | 4.3218 | 470 | 0.0117 |
| 0.0058 | 4.3678 | 475 | 0.0116 |
| 0.0053 | 4.4138 | 480 | 0.0116 |
| 0.0053 | 4.4598 | 485 | 0.0118 |
| 0.0051 | 4.5057 | 490 | 0.0117 |
| 0.0053 | 4.5517 | 495 | 0.0117 |
| 0.0059 | 4.5977 | 500 | 0.0117 |
| 0.0055 | 4.6437 | 505 | 0.0117 |
| 0.0054 | 4.6897 | 510 | 0.0116 |
| 0.0055 | 4.7356 | 515 | 0.0117 |
| 0.0056 | 4.7816 | 520 | 0.0116 |
| 0.0048 | 4.8276 | 525 | 0.0116 |
| 0.0049 | 4.8736 | 530 | 0.0116 |
| 0.0043 | 4.9195 | 535 | 0.0116 |
| 0.0046 | 4.9655 | 540 | 0.0116 |
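
Validation loss flattens around 0.0116 over the final epoch, consistent with the cosine schedule annealing the learning rate toward zero. The log spans roughly 540 optimizer steps, so a warmup ratio of 0.1 implies about 54 warmup steps; below is a hedged sketch of the implied schedule using the standard Transformers helper (step counts approximated from the table above):

```python
# Sketch of the cosine-with-warmup LR schedule implied by the hyperparameters.
# total_steps = 540 is approximated from the last logged step in the table.
import torch
from transformers import get_cosine_schedule_with_warmup

total_steps = 540
warmup_steps = int(0.1 * total_steps)  # = 54

# Dummy optimizer just to drive the scheduler; lr matches learning_rate above.
opt = torch.optim.AdamW([torch.nn.Parameter(torch.zeros(1))], lr=1e-4)
sched = get_cosine_schedule_with_warmup(
    opt, num_warmup_steps=warmup_steps, num_training_steps=total_steps
)

lrs = []
for _ in range(total_steps):
    lrs.append(sched.get_last_lr()[0])  # LR used at this step
    opt.step()
    sched.step()

print(f"LR at end of warmup (step 54): {lrs[54]:.1e}")   # peak, 1.0e-04
print(f"LR at decay midpoint (step 297): {lrs[297]:.1e}")  # half the peak
```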

Framework versions

  • PEFT 0.12.0
  • Transformers 4.46.1
  • Pytorch 2.4.0+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.3
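
To reproduce the reported results, it helps to match these versions; a small runtime check:

```python
# Quick runtime check against the pinned framework versions above.
import datasets
import peft
import tokenizers
import torch
import transformers

expected = {
    "peft": "0.12.0",
    "transformers": "4.46.1",
    "torch": "2.4.0+cu121",
    "datasets": "3.1.0",
    "tokenizers": "0.20.3",
}
for mod in (peft, transformers, torch, datasets, tokenizers):
    print(f"{mod.__name__}: {mod.__version__} (expected {expected[mod.__name__]})")
```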