metadata

license: other
base_model: meta-llama/Meta-Llama-3-8B
tags:
  - llama-factory
  - full
  - generated_from_trainer
model-index:
  - name: C013_llama3-8b-base_instruct_20240428_005832
    results: []

C013_llama3-8b-base_instruct_20240428_005832

This model is a fine-tuned version of the local checkpoint ./output/training_results/C013_llama3-8b-base_pretrain_20240428_005832/ on the instructions_curated dataset; the card metadata lists meta-llama/Meta-Llama-3-8B as the base model. It achieves the following results on the evaluation set:

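Loss: 0.8123 (the lowest validation loss in the training results, recorded at step 15, epoch 0.3125; the last evaluation, at step 190, gives 0.8612)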

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1.5e-05
train_batch_size: 8
eval_batch_size: 16
seed: 42
distributed_type: multi-GPU
num_devices: 8
total_train_batch_size: 64
total_eval_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: polynomial
lr_scheduler_warmup_steps: 20
num_epochs: 4.0
mixed_precision_training: Native AMP
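
The batch-size totals and the learning-rate schedule listed above can be reproduced with a little arithmetic. The sketch below is illustrative only and rests on several assumptions that the card does not state: gradient accumulation is taken to be 1 (the card lists none, and 8 x 8 already gives 64), the run is taken to last 192 optimizer steps (inferred from the results table, where step 5 corresponds to epoch 0.1042, i.e. about 48 steps per epoch over 4 epochs), and the polynomial schedule is assumed to use power=1.0 and a final rate of 1e-07. The function names are hypothetical helpers, not part of any library.

```python
def effective_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Global batch size across all devices (grad_accum_steps=1 is an assumption)."""
    return per_device * num_devices * grad_accum_steps


def polynomial_lr(step, base_lr=1.5e-05, warmup_steps=20,
                  total_steps=192, lr_end=1e-07, power=1.0):
    """Linear warmup followed by polynomial decay.

    total_steps, lr_end and power are assumptions; only base_lr and
    warmup_steps come from the hyperparameter list.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    if step > total_steps:
        return lr_end
    # Decay: interpolate from base_lr down to lr_end.
    decay_steps = total_steps - warmup_steps
    pct_remaining = 1 - (step - warmup_steps) / decay_steps
    return (base_lr - lr_end) * pct_remaining ** power + lr_end


if __name__ == "__main__":
    print(effective_batch_size(8, 8))    # train: 8 per device on 8 devices -> 64
    print(effective_batch_size(16, 8))   # eval: 16 per device on 8 devices -> 128
    for step in (0, 10, 20, 100, 192):
        print(step, polynomial_lr(step))
```

With power=1.0 the polynomial schedule reduces to a linear decay, so under these assumptions the learning rate peaks at 1.5e-05 at step 20 and falls steadily for the rest of the run.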

Training results

Training Loss	Epoch	Step	Validation Loss
0.9805	0.0208	1	0.9737
0.9446	0.1042	5	0.9455
0.8481	0.2083	10	0.8154
0.7794	0.3125	15	0.8123
0.7798	0.4167	20	0.8411
0.8576	0.5208	25	0.8676
0.8852	0.625	30	0.8673
0.8529	0.7292	35	0.8561
0.8224	0.8333	40	0.8470
0.8536	0.9375	45	0.8378
0.662	1.0417	50	0.8294
0.437	1.1458	55	0.8531
0.4402	1.25	60	0.8569
0.4244	1.3542	65	0.8569
0.4495	1.4583	70	0.8547
0.4689	1.5625	75	0.8494
0.4309	1.6667	80	0.8461
0.4299	1.7708	85	0.8446
0.4461	1.875	90	0.8440
0.4474	1.9792	95	0.8439
0.3614	2.0833	100	0.8445
0.3861	2.1875	105	0.8457
0.3829	2.2917	110	0.8473
0.3764	2.3958	115	0.8488
0.3655	2.5	120	0.8500
0.4243	2.6042	125	0.8511
0.3884	2.7083	130	0.8520
0.3634	2.8125	135	0.8528
0.3846	2.9167	140	0.8537
0.3872	3.0208	145	0.8547
0.3869	3.125	150	0.8558
0.3876	3.2292	155	0.8566
0.3844	3.3333	160	0.8573
0.3535	3.4375	165	0.8579
0.3488	3.5417	170	0.8588
0.3464	3.6458	175	0.8598
0.361	3.75	180	0.8607
0.3674	3.8542	185	0.8612
0.3988	3.9583	190	0.8612
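
One way to read the table above is to locate the evaluation with the lowest validation loss and compare the gap between validation and training loss at that point with the gap at the end of training. The sketch below does this on a handful of rows copied from the table; the helper names are hypothetical and the row selection is only a sample, not the full log.

```python
# (training_loss, epoch, step, validation_loss) rows copied from the table above.
ROWS = [
    (0.9805, 0.0208, 1, 0.9737),
    (0.8481, 0.2083, 10, 0.8154),
    (0.7794, 0.3125, 15, 0.8123),
    (0.7798, 0.4167, 20, 0.8411),
    (0.8536, 0.9375, 45, 0.8378),
    (0.6620, 1.0417, 50, 0.8294),
    (0.4474, 1.9792, 95, 0.8439),
    (0.3872, 3.0208, 145, 0.8547),
    (0.3988, 3.9583, 190, 0.8612),
]


def best_row(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def generalization_gap(row):
    """Validation loss minus training loss for a single row."""
    train_loss, _epoch, _step, val_loss = row
    return val_loss - train_loss


if __name__ == "__main__":
    best, last = best_row(ROWS), ROWS[-1]
    print("best step:", best[2], "validation loss:", best[3])
    print("gap at best step:", round(generalization_gap(best), 4))
    print("gap at last step:", round(generalization_gap(last), 4))
```

On these rows the lowest validation loss falls at step 15, and the gap between validation and training loss is much larger at step 190 than at step 15. That pattern is consistent with the model fitting the training set more closely after the first epoch without improving on the evaluation set.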

Framework versions

Transformers 4.40.0
Pytorch 2.1.2+cu121
Datasets 2.18.0
Tokenizers 0.19.1
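
To check whether a local environment matches the versions listed above, something along these lines could be used. It is a minimal sketch: the package names are the PyPI distribution names (PyTorch is published as torch), the helper name version_mismatches is hypothetical, and the +cu121 suffix will only match a PyTorch wheel built for CUDA 12.1.

```python
from importlib import metadata

# Versions as listed in the card, keyed by PyPI distribution name.
EXPECTED = {
    "transformers": "4.40.0",
    "torch": "2.1.2+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.19.1",
}


def version_mismatches(expected, get_version=metadata.version):
    """Return {package: (wanted, found)} for every package that differs.

    found is None when the package is not installed.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            found = get_version(package)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in version_mismatches(EXPECTED).items():
        print(f"{package}: card lists {wanted}, found {found or 'not installed'}")
```

An exact match is a strict criterion; nearby patch releases may well work, but the card gives no information about that.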