---
license: llama2
base_model: meta-llama/Llama-2-7b-chat-hf
tags:
  - generated_from_trainer
model-index:
  - name: MSc_llama2_finetuned_model_secondData6
    results: []
library_name: peft
---

# MSc_llama2_finetuned_model_secondData6

This model is a fine-tuned version of [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6856
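This repository contains a PEFT adapter rather than full model weights, so it is normally loaded on top of the base model. Below is a minimal inference sketch, assuming the adapter is available under the placeholder repo id `Casper0508/MSc_llama2_finetuned_model_secondData6` and that you have access to the gated Llama 2 base weights; you could equally load the base model with the 4-bit config shown under Training procedure.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "meta-llama/Llama-2-7b-chat-hf"
# Placeholder adapter id -- substitute the actual path of this repository.
ADAPTER = "Casper0508/MSc_llama2_finetuned_model_secondData6"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
# Attach the PEFT adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, ADAPTER)
model.eval()

prompt = "[INST] Summarise the purpose of parameter-efficient fine-tuning. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```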

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

The following bitsandbytes quantization config was used during training:

  • quant_method: bitsandbytes
  • _load_in_8bit: False
  • _load_in_4bit: True
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: nf4
  • bnb_4bit_use_double_quant: True
  • bnb_4bit_compute_dtype: bfloat16
  • load_in_4bit: True
  • load_in_8bit: False
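For reference, a sketch of how the settings above map onto `transformers.BitsAndBytesConfig`. This is not the exact object serialized during training, only equivalent public arguments; the underscore-prefixed fields are set internally from `load_in_4bit` / `load_in_8bit`.

```python
import torch
from transformers import BitsAndBytesConfig

# Equivalent public arguments for the 4-bit NF4 config listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    load_in_8bit=False,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    llm_int8_threshold=6.0,
    llm_int8_skip_modules=None,
    llm_int8_enable_fp32_cpu_offload=False,
    llm_int8_has_fp16_weight=False,
)
```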

### Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • training_steps: 250
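These correspond roughly to the following `transformers.TrainingArguments`; a sketch assuming a single-GPU run (32 per-device × 2 accumulation steps = 64 total), with `output_dir` as a placeholder rather than a value taken from the original training script.

```python
from transformers import TrainingArguments

# Hyperparameters from the list above; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="MSc_llama2_finetuned_model_secondData6",
    learning_rate=3e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # effective batch size: 32 * 2 = 64
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    max_steps=250,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```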

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.9863        | 1.33  | 10   | 3.6593          |
| 3.3725        | 2.67  | 20   | 2.9649          |
| 2.6441        | 4.0   | 30   | 2.1968          |
| 1.9553        | 5.33  | 40   | 1.7116          |
| 1.6093        | 6.67  | 50   | 1.4445          |
| 1.317         | 8.0   | 60   | 1.1217          |
| 0.9709        | 9.33  | 70   | 0.8562          |
| 0.8196        | 10.67 | 80   | 0.7974          |
| 0.7604        | 12.0  | 90   | 0.7608          |
| 0.7056        | 13.33 | 100  | 0.7340          |
| 0.6698        | 14.67 | 110  | 0.7142          |
| 0.6319        | 16.0  | 120  | 0.7030          |
| 0.6102        | 17.33 | 130  | 0.6942          |
| 0.5813        | 18.67 | 140  | 0.6916          |
| 0.572         | 20.0  | 150  | 0.6906          |
| 0.5581        | 21.33 | 160  | 0.6842          |
| 0.5377        | 22.67 | 170  | 0.6850          |
| 0.535         | 24.0  | 180  | 0.6862          |
| 0.5263        | 25.33 | 190  | 0.6841          |
| 0.5182        | 26.67 | 200  | 0.6861          |
| 0.5204        | 28.0  | 210  | 0.6857          |
| 0.5161        | 29.33 | 220  | 0.6855          |
| 0.5084        | 30.67 | 230  | 0.6858          |
| 0.5144        | 32.0  | 240  | 0.6863          |
| 0.5104        | 33.33 | 250  | 0.6856          |

### Framework versions

  • PEFT 0.4.0
  • Transformers 4.38.2
  • Pytorch 2.3.1+cu121
  • Datasets 2.13.1
  • Tokenizers 0.15.2