---
base_model: meta-llama/Meta-Llama-3-8B
datasets:
  - generator
library_name: peft
license: llama3
tags:
  - trl
  - sft
  - generated_from_trainer
model-index:
  - name: Meta-Llama-3-8B_AviationQA-cosine
    results: []
---

# Meta-Llama-3-8B_AviationQA-cosine

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on the generator dataset. It achieves the following results on the evaluation set (a loading sketch follows the list):

- Loss: 0.6061
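
This repository holds a PEFT adapter rather than full model weights, so it is loaded on top of the gated base model. Below is a minimal loading sketch; the adapter repo id `frankmorales2020/Meta-Llama-3-8B_AviationQA-cosine` and the example prompt are assumptions for illustration, not confirmed by this card.

```python
# Minimal sketch: load the PEFT adapter on top of the base model.
# The adapter repo id and the prompt are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B"
adapter_id = "frankmorales2020/Meta-Llama-3-8B_AviationQA-cosine"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the fine-tuned adapter

prompt = "What is the purpose of a stall warning system?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```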

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 0.0001
- train_batch_size: 3
- eval_batch_size: 6
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 6
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 3
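
Given the `trl` and `sft` tags, these values map directly onto `transformers.TrainingArguments` as consumed by TRL's `SFTTrainer`. A sketch of that mapping follows; the output directory and evaluation cadence are assumptions (the 50-step cadence is inferred from the results table below, not stated in the card).

```python
# Sketch: the hyperparameters above expressed as TrainingArguments.
# output_dir and the eval cadence are assumptions, not stated in the card.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="Meta-Llama-3-8B_AviationQA-cosine",  # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=6,
    seed=42,
    gradient_accumulation_steps=2,   # total train batch size: 3 * 2 = 6
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=3,
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    eval_strategy="steps",           # inferred from the 50-step eval cadence
    eval_steps=50,
)
```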

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.7872 | 0.0590 | 50 | 0.7652 |
| 0.7373 | 0.1181 | 100 | 0.7328 |
| 0.7242 | 0.1771 | 150 | 0.7182 |
| 0.7143 | 0.2361 | 200 | 0.7107 |
| 0.73 | 0.2952 | 250 | 0.7046 |
| 0.7159 | 0.3542 | 300 | 0.6973 |
| 0.7211 | 0.4132 | 350 | 0.6921 |
| 0.7096 | 0.4723 | 400 | 0.6873 |
| 0.6845 | 0.5313 | 450 | 0.6824 |
| 0.7251 | 0.5903 | 500 | 0.6783 |
| 0.6685 | 0.6494 | 550 | 0.6720 |
| 0.697 | 0.7084 | 600 | 0.6667 |
| 0.7006 | 0.7674 | 650 | 0.6639 |
| 0.6952 | 0.8264 | 700 | 0.6618 |
| 0.6649 | 0.8855 | 750 | 0.6596 |
| 0.6877 | 0.9445 | 800 | 0.6553 |
| 0.6673 | 1.0035 | 850 | 0.6531 |
| 0.6611 | 1.0626 | 900 | 0.6487 |
| 0.6971 | 1.1216 | 950 | 0.6452 |
| 0.6652 | 1.1806 | 1000 | 0.6423 |
| 0.645 | 1.2397 | 1050 | 0.6397 |
| 0.6494 | 1.2987 | 1100 | 0.6388 |
| 0.6623 | 1.3577 | 1150 | 0.6359 |
| 0.6552 | 1.4168 | 1200 | 0.6334 |
| 0.6465 | 1.4758 | 1250 | 0.6297 |
| 0.6495 | 1.5348 | 1300 | 0.6285 |
| 0.6521 | 1.5939 | 1350 | 0.6272 |
| 0.6505 | 1.6529 | 1400 | 0.6261 |
| 0.6773 | 1.7119 | 1450 | 0.6238 |
| 0.6487 | 1.7710 | 1500 | 0.6225 |
| 0.639 | 1.8300 | 1550 | 0.6208 |
| 0.6465 | 1.8890 | 1600 | 0.6194 |
| 0.6528 | 1.9481 | 1650 | 0.6182 |
| 0.6265 | 2.0071 | 1700 | 0.6164 |
| 0.6161 | 2.0661 | 1750 | 0.6137 |
| 0.6236 | 2.1251 | 1800 | 0.6118 |
| 0.6371 | 2.1842 | 1850 | 0.6111 |
| 0.6294 | 2.2432 | 1900 | 0.6093 |
| 0.6257 | 2.3022 | 1950 | 0.6087 |
| 0.6204 | 2.3613 | 2000 | 0.6081 |
| 0.6133 | 2.4203 | 2050 | 0.6073 |
| 0.6108 | 2.4793 | 2100 | 0.6068 |
| 0.622 | 2.5384 | 2150 | 0.6066 |
| 0.6233 | 2.5974 | 2200 | 0.6064 |
| 0.6183 | 2.6564 | 2250 | 0.6063 |
| 0.6237 | 2.7155 | 2300 | 0.6062 |
| 0.6388 | 2.7745 | 2350 | 0.6062 |
| 0.6236 | 2.8335 | 2400 | 0.6062 |
| 0.6236 | 2.8926 | 2450 | 0.6062 |
| 0.6205 | 2.9516 | 2500 | 0.6061 |

### Framework versions

- PEFT 0.11.1
- Transformers 4.41.2
- Pytorch 2.3.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
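
A quick way to check a local environment against these pins is to print the installed versions; a small sketch:

```python
# Print installed versions to compare against the pinned list above.
import datasets, peft, tokenizers, torch, transformers

for name, module in [
    ("PEFT", peft),
    ("Transformers", transformers),
    ("Pytorch", torch),
    ("Datasets", datasets),
    ("Tokenizers", tokenizers),
]:
    print(f"{name} {module.__version__}")
```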