---
license: other
base_model: meta-llama/Meta-Llama-3-8B
tags:
  - llama-factory
  - full
  - generated_from_trainer
model-index:
  - name: C013_llama3-8b-base_instruct_20240428_005832
    results: []
---

# C013_llama3-8b-base_instruct_20240428_005832

This model is a fine-tuned version of `./output/training_results/C013_llama3-8b-base_pretrain_20240428_005832/` on the instructions_curated dataset. It achieves the following results on the evaluation set:

- Loss: 0.8123 (the minimum validation loss, reached at step 15; see the training results table below)
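
The card does not yet include usage instructions, so here is a minimal inference sketch with `transformers`. The repo ID is an assumption (the checkpoint may only exist at the local training path above); substitute the actual location:

```python
# Minimal inference sketch. The repo ID below is an assumption -- replace it
# with the actual Hub repo or local checkpoint directory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TianyiQ/C013_llama3-8b-base_instruct_20240428_005832"  # hypothetical

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 8B weights fit in ~16 GB at bf16
    device_map="auto",           # requires `accelerate`
)

prompt = "Explain the difference between pretraining and instruction tuning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```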

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a rough `TrainingArguments` equivalent follows the list):

- learning_rate: 1.5e-05
- train_batch_size: 8
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 64
- total_eval_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: polynomial
- lr_scheduler_warmup_steps: 20
- num_epochs: 4.0
- mixed_precision_training: Native AMP
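
These values map onto Hugging Face `TrainingArguments` roughly as sketched below. Note the totals are consistent: 64 = 8 per-device train batch × 8 GPUs, and 128 = 16 × 8 for eval, so no gradient accumulation was needed. This is a readability reconstruction, not the exact LLaMA-Factory launch command:

```python
# Approximate TrainingArguments equivalent to the listed hyperparameters.
# Output path and mixed-precision flag are assumptions.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./output/training_results/C013_llama3-8b-base_instruct_20240428_005832",
    learning_rate=1.5e-5,
    per_device_train_batch_size=8,   # x 8 GPUs = total train batch size 64
    per_device_eval_batch_size=16,   # x 8 GPUs = total eval batch size 128
    seed=42,
    num_train_epochs=4.0,
    lr_scheduler_type="polynomial",
    warmup_steps=20,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,  # "Native AMP"; could equally have been bf16=True on this hardware
)
```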

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.9805        | 0.0208 | 1    | 0.9737          |
| 0.9446        | 0.1042 | 5    | 0.9455          |
| 0.8481        | 0.2083 | 10   | 0.8154          |
| 0.7794        | 0.3125 | 15   | 0.8123          |
| 0.7798        | 0.4167 | 20   | 0.8411          |
| 0.8576        | 0.5208 | 25   | 0.8676          |
| 0.8852        | 0.625  | 30   | 0.8673          |
| 0.8529        | 0.7292 | 35   | 0.8561          |
| 0.8224        | 0.8333 | 40   | 0.8470          |
| 0.8536        | 0.9375 | 45   | 0.8378          |
| 0.662         | 1.0417 | 50   | 0.8294          |
| 0.437         | 1.1458 | 55   | 0.8531          |
| 0.4402        | 1.25   | 60   | 0.8569          |
| 0.4244        | 1.3542 | 65   | 0.8569          |
| 0.4495        | 1.4583 | 70   | 0.8547          |
| 0.4689        | 1.5625 | 75   | 0.8494          |
| 0.4309        | 1.6667 | 80   | 0.8461          |
| 0.4299        | 1.7708 | 85   | 0.8446          |
| 0.4461        | 1.875  | 90   | 0.8440          |
| 0.4474        | 1.9792 | 95   | 0.8439          |
| 0.3614        | 2.0833 | 100  | 0.8445          |
| 0.3861        | 2.1875 | 105  | 0.8457          |
| 0.3829        | 2.2917 | 110  | 0.8473          |
| 0.3764        | 2.3958 | 115  | 0.8488          |
| 0.3655        | 2.5    | 120  | 0.8500          |
| 0.4243        | 2.6042 | 125  | 0.8511          |
| 0.3884        | 2.7083 | 130  | 0.8520          |
| 0.3634        | 2.8125 | 135  | 0.8528          |
| 0.3846        | 2.9167 | 140  | 0.8537          |
| 0.3872        | 3.0208 | 145  | 0.8547          |
| 0.3869        | 3.125  | 150  | 0.8558          |
| 0.3876        | 3.2292 | 155  | 0.8566          |
| 0.3844        | 3.3333 | 160  | 0.8573          |
| 0.3535        | 3.4375 | 165  | 0.8579          |
| 0.3488        | 3.5417 | 170  | 0.8588          |
| 0.3464        | 3.6458 | 175  | 0.8598          |
| 0.361         | 3.75   | 180  | 0.8607          |
| 0.3674        | 3.8542 | 185  | 0.8612          |
| 0.3988        | 3.9583 | 190  | 0.8612          |

### Framework versions

- Transformers 4.40.0
- Pytorch 2.1.2+cu121
- Datasets 2.18.0
- Tokenizers 0.19.1
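
To check a local environment against these pinned versions before attempting to reproduce the run, a small sketch (the expected strings are taken directly from the list above):

```python
# Compare installed package versions against the ones reported on this card.
import datasets    # noqa: F401 -- imported so __import__ below resolves
import tokenizers  # noqa: F401
import torch       # noqa: F401
import transformers  # noqa: F401

expected = {
    "transformers": "4.40.0",
    "torch": "2.1.2+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.19.1",
}

for name, want in expected.items():
    have = __import__(name).__version__
    status = "OK" if have == want else f"MISMATCH (have {have})"
    print(f"{name}=={want}: {status}")
```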