---
library_name: peft
license: other
base_model: unsloth/Llama-3.2-3B-Instruct
tags:
- llama-factory
- lora
- unsloth
- generated_from_trainer
model-index:
- name: llm3br256
  results: []
---

# llm3br256

This model is a fine-tuned version of [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) on the rommel_importgenius_4b8 dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0130

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5.0
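For reference, the settings above correspond roughly to the following `transformers.TrainingArguments`. This is a minimal sketch, assuming a single training device (so 4 per-device samples times 8 accumulation steps gives the effective batch size of 32); the run was actually launched through LLaMA-Factory, whose config keys differ, and `output_dir` is a placeholder.

```python
from transformers import TrainingArguments

# Hedged reconstruction of the hyperparameters listed above. The original
# run used LLaMA-Factory, so the actual config keys differed.
training_args = TrainingArguments(
    output_dir="llm3br256",          # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,   # 4 * 8 = 32 effective (single device assumed)
    seed=42,
    optim="adamw_torch",             # betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=5.0,
)
```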
### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.0672        | 0.0418 | 5    | 0.0755          |
| 0.0476        | 0.0837 | 10   | 0.0452          |
| 0.0337        | 0.1255 | 15   | 0.0356          |
| 0.0333        | 0.1674 | 20   | 0.0308          |
| 0.0258        | 0.2092 | 25   | 0.0272          |
| 0.023         | 0.2510 | 30   | 0.0255          |
| 0.0202        | 0.2929 | 35   | 0.0234          |
| 0.0188        | 0.3347 | 40   | 0.0218          |
| 0.0185        | 0.3766 | 45   | 0.0208          |
| 0.0199        | 0.4184 | 50   | 0.0200          |
| 0.0198        | 0.4603 | 55   | 0.0195          |
| 0.0179        | 0.5021 | 60   | 0.0189          |
| 0.0185        | 0.5439 | 65   | 0.0186          |
| 0.0174        | 0.5858 | 70   | 0.0186          |
| 0.0157        | 0.6276 | 75   | 0.0183          |
| 0.0175        | 0.6695 | 80   | 0.0176          |
| 0.0175        | 0.7113 | 85   | 0.0176          |
| 0.0164        | 0.7531 | 90   | 0.0171          |
| 0.0182        | 0.7950 | 95   | 0.0168          |
| 0.019         | 0.8368 | 100  | 0.0167          |
| 0.0163        | 0.8787 | 105  | 0.0158          |
| 0.0145        | 0.9205 | 110  | 0.0158          |
| 0.0165        | 0.9623 | 115  | 0.0155          |
| 0.0205        | 1.0042 | 120  | 0.0152          |
| 0.0105        | 1.0460 | 125  | 0.0155          |
| 0.0147        | 1.0879 | 130  | 0.0157          |
| 0.0148        | 1.1297 | 135  | 0.0160          |
| 0.0115        | 1.1715 | 140  | 0.0153          |
| 0.0166        | 1.2134 | 145  | 0.0153          |
| 0.015         | 1.2552 | 150  | 0.0156          |
| 0.0148        | 1.2971 | 155  | 0.0157          |
| 0.0112        | 1.3389 | 160  | 0.0159          |
| 0.0128        | 1.3808 | 165  | 0.0153          |
| 0.0125        | 1.4226 | 170  | 0.0151          |
| 0.0137        | 1.4644 | 175  | 0.0150          |
| 0.0131        | 1.5063 | 180  | 0.0145          |
| 0.0105        | 1.5481 | 185  | 0.0145          |
| 0.0126        | 1.5900 | 190  | 0.0144          |
| 0.0119        | 1.6318 | 195  | 0.0145          |
| 0.016         | 1.6736 | 200  | 0.0147          |
| 0.0143        | 1.7155 | 205  | 0.0150          |
| 0.0139        | 1.7573 | 210  | 0.0150          |
| 0.0139        | 1.7992 | 215  | 0.0145          |
| 0.0161        | 1.8410 | 220  | 0.0143          |
| 0.0098        | 1.8828 | 225  | 0.0138          |
| 0.0108        | 1.9247 | 230  | 0.0140          |
| 0.0117        | 1.9665 | 235  | 0.0141          |
| 0.0109        | 2.0084 | 240  | 0.0138          |
| 0.0093        | 2.0502 | 245  | 0.0145          |
| 0.0102        | 2.0921 | 250  | 0.0143          |
| 0.0104        | 2.1339 | 255  | 0.0141          |
| 0.0108        | 2.1757 | 260  | 0.0147          |
| 0.0104        | 2.2176 | 265  | 0.0142          |
| 0.0103        | 2.2594 | 270  | 0.0144          |
| 0.0107        | 2.3013 | 275  | 0.0144          |
| 0.0104        | 2.3431 | 280  | 0.0141          |
| 0.0092        | 2.3849 | 285  | 0.0143          |
| 0.0107        | 2.4268 | 290  | 0.0140          |
| 0.0112        | 2.4686 | 295  | 0.0143          |
| 0.01          | 2.5105 | 300  | 0.0143          |
| 0.0096        | 2.5523 | 305  | 0.0138          |
| 0.0096        | 2.5941 | 310  | 0.0137          |
| 0.0099        | 2.6360 | 315  | 0.0137          |
| 0.009         | 2.6778 | 320  | 0.0138          |
| 0.0097        | 2.7197 | 325  | 0.0137          |
| 0.0097        | 2.7615 | 330  | 0.0136          |
| 0.0108        | 2.8033 | 335  | 0.0136          |
| 0.0092        | 2.8452 | 340  | 0.0132          |
| 0.0092        | 2.8870 | 345  | 0.0132          |
| 0.0095        | 2.9289 | 350  | 0.0130          |
| 0.0094        | 2.9707 | 355  | 0.0127          |
| 0.0088        | 3.0126 | 360  | 0.0127          |
| 0.0086        | 3.0544 | 365  | 0.0131          |
| 0.0094        | 3.0962 | 370  | 0.0134          |
| 0.0075        | 3.1381 | 375  | 0.0137          |
| 0.0068        | 3.1799 | 380  | 0.0136          |
| 0.0096        | 3.2218 | 385  | 0.0136          |
| 0.0088        | 3.2636 | 390  | 0.0137          |
| 0.008         | 3.3054 | 395  | 0.0138          |
| 0.0085        | 3.3473 | 400  | 0.0137          |
| 0.0091        | 3.3891 | 405  | 0.0136          |
| 0.0049        | 3.4310 | 410  | 0.0134          |
| 0.0072        | 3.4728 | 415  | 0.0131          |
| 0.0063        | 3.5146 | 420  | 0.0133          |
| 0.0076        | 3.5565 | 425  | 0.0131          |
| 0.0076        | 3.5983 | 430  | 0.0129          |
| 0.0074        | 3.6402 | 435  | 0.0130          |
| 0.0074        | 3.6820 | 440  | 0.0132          |
| 0.0067        | 3.7238 | 445  | 0.0132          |
| 0.0064        | 3.7657 | 450  | 0.0130          |
| 0.0091        | 3.8075 | 455  | 0.0130          |
| 0.0074        | 3.8494 | 460  | 0.0131          |
| 0.0076        | 3.8912 | 465  | 0.0132          |
| 0.007         | 3.9331 | 470  | 0.0132          |
| 0.0082        | 3.9749 | 475  | 0.0132          |
| 0.0059        | 4.0167 | 480  | 0.0133          |
| 0.0066        | 4.0586 | 485  | 0.0135          |
| 0.0063        | 4.1004 | 490  | 0.0140          |
| 0.0059        | 4.1423 | 495  | 0.0144          |
| 0.0066        | 4.1841 | 500  | 0.0142          |
| 0.0055        | 4.2259 | 505  | 0.0142          |
| 0.0067        | 4.2678 | 510  | 0.0142          |
| 0.0065        | 4.3096 | 515  | 0.0143          |
| 0.0062        | 4.3515 | 520  | 0.0142          |
| 0.0065        | 4.3933 | 525  | 0.0141          |
| 0.007         | 4.4351 | 530  | 0.0139          |
| 0.0058        | 4.4770 | 535  | 0.0139          |
| 0.0056        | 4.5188 | 540  | 0.0139          |
| 0.0062        | 4.5607 | 545  | 0.0139          |
| 0.0061        | 4.6025 | 550  | 0.0139          |
| 0.0061        | 4.6444 | 555  | 0.0139          |
| 0.0068        | 4.6862 | 560  | 0.0138          |
| 0.0069        | 4.7280 | 565  | 0.0139          |
| 0.0063        | 4.7699 | 570  | 0.0139          |
| 0.0065        | 4.8117 | 575  | 0.0139          |
| 0.0064        | 4.8536 | 580  | 0.0139          |
| 0.0062        | 4.8954 | 585  | 0.0139          |
| 0.0065        | 4.9372 | 590  | 0.0139          |
| 0.0055        | 4.9791 | 595  | 0.0139          |

### Framework versions

- PEFT 0.12.0
- Transformers 4.46.1
- PyTorch 2.4.0+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3
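The LoRA adapter can be loaded on top of the base model with the PEFT and Transformers versions listed above. A minimal sketch follows; `your-username/llm3br256` is a hypothetical repo id, so substitute wherever these adapter weights are actually hosted.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Base model from the card's metadata; the adapter id below is a
# placeholder, not a confirmed repo location.
base_id = "unsloth/Llama-3.2-3B-Instruct"
adapter_id = "your-username/llm3br256"  # hypothetical

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)

messages = [{"role": "user", "content": "Hello!"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```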