---
license: llama2
base_model: meta-llama/Llama-2-7b-hf
tags:
- generated_from_trainer
datasets:
- tyzhu/lmind_nq_train6000_eval6489_v1_qa
metrics:
- accuracy
model-index:
- name: lmind_nq_train6000_eval6489_v1_qa_5e-4_lora2
  results:
  - task:
      name: Causal Language Modeling
      type: text-generation
    dataset:
      name: tyzhu/lmind_nq_train6000_eval6489_v1_qa
      type: tyzhu/lmind_nq_train6000_eval6489_v1_qa
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.3654871794871795
---

# lmind_nq_train6000_eval6489_v1_qa_5e-4_lora2

This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on the tyzhu/lmind_nq_train6000_eval6489_v1_qa dataset. It achieves the following results on the evaluation set:

- Loss: 5.5751
- Accuracy: 0.3655
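The `lora2` suffix and the `base_model` field suggest this repository hosts LoRA adapter weights rather than a full model. Under that assumption, a minimal loading sketch with 🤗 PEFT might look like this; the adapter repo id is taken from this card, and the prompt format is purely illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the adapter from this repo.
# Assumes the repo contains PEFT adapter files (suggested by the
# "lora2" suffix; not confirmed by the card itself).
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
model = PeftModel.from_pretrained(
    base, "tyzhu/lmind_nq_train6000_eval6489_v1_qa_5e-4_lora2"
)

# Illustrative prompt only; the actual QA prompt template is not documented.
prompt = "Question: who wrote the declaration of independence?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```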

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `TrainingArguments` follows the list):

- learning_rate: 0.0005
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 50.0
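A minimal `transformers.TrainingArguments` sketch reproducing the listed values; only `output_dir` is an assumption, and the 4-GPU launch (e.g. via `torchrun`) is what turns the per-device batch size of 2 into the total of 32:

```python
from transformers import TrainingArguments

# Effective train batch = 2 per device x 4 GPUs x 4 accumulation steps = 32.
args = TrainingArguments(
    output_dir="lmind_nq_train6000_eval6489_v1_qa_5e-4_lora2",  # assumed
    learning_rate=5e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=4,
    adam_beta1=0.9,    # Adam betas/epsilon as listed (transformers defaults)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    warmup_ratio=0.05,  # reproduced as listed; note that the plain "constant"
                        # schedule in transformers applies no warmup
    num_train_epochs=50.0,
)
```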

### Training results

| Training Loss | Epoch | Step | Accuracy | Validation Loss |
|:-------------:|:-----:|:----:|:--------:|:---------------:|
| 1.43          | 1.0   | 187  | 0.6162   | 1.2683          |
| 1.0285        | 2.0   | 375  | 0.6129   | 1.3220          |
| 0.7318        | 3.0   | 562  | 0.6076   | 1.4645          |
| 0.5898        | 4.0   | 750  | 0.6050   | 1.5454          |
| 0.5309        | 5.0   | 937  | 0.6026   | 1.6439          |
| 0.4985        | 6.0   | 1125 | 0.6034   | 1.7220          |
| 0.5091        | 7.0   | 1312 | 0.6008   | 1.8008          |
| 0.4796        | 8.0   | 1500 | 0.6001   | 1.7782          |
| 0.4453        | 9.0   | 1687 | 0.5985   | 1.8255          |
| 0.448         | 10.0  | 1875 | 0.5931   | 1.7979          |
| 0.4522        | 11.0  | 2062 | 0.5959   | 1.8272          |
| 0.4552        | 12.0  | 2250 | 0.5946   | 1.8670          |
| 0.4551        | 13.0  | 2437 | 0.5950   | 1.8706          |
| 0.4559        | 14.0  | 2625 | 0.5925   | 1.8731          |
| 0.4581        | 15.0  | 2812 | 0.5932   | 1.8531          |
| 0.4535        | 16.0  | 3000 | 0.5923   | 1.9492          |
| 0.4308        | 17.0  | 3187 | 0.5915   | 1.8944          |
| 0.4312        | 18.0  | 3375 | 0.5904   | 1.9315          |
| 0.4372        | 19.0  | 3562 | 0.5899   | 1.9201          |
| 0.4359        | 20.0  | 3750 | 0.5895   | 1.9753          |
| 0.4363        | 21.0  | 3937 | 0.5877   | 1.9932          |
| 0.4404        | 22.0  | 4125 | 0.5866   | 2.0326          |
| 0.4436        | 23.0  | 4312 | 0.5848   | 2.0008          |
| 0.4438        | 24.0  | 4500 | 0.5877   | 2.0186          |
| 0.4233        | 25.0  | 4687 | 0.5863   | 2.0452          |
| 0.4237        | 26.0  | 4875 | 0.5843   | 2.0520          |
| 0.4289        | 27.0  | 5062 | 0.5828   | 2.0817          |
| 0.4325        | 28.0  | 5250 | 0.5833   | 2.0512          |
| 0.4329        | 29.0  | 5437 | 0.5828   | 2.0906          |
| 0.4314        | 30.0  | 5625 | 0.5824   | 2.0403          |
| 0.431         | 31.0  | 5812 | 0.5824   | 2.1194          |
| 0.4318        | 32.0  | 6000 | 0.5829   | 2.0985          |
| 0.414         | 33.0  | 6187 | 0.5805   | 2.1533          |
| 0.4214        | 34.0  | 6375 | 0.5779   | 2.1918          |
| 0.4264        | 35.0  | 6562 | 0.5774   | 2.1835          |
| 0.4361        | 36.0  | 6750 | 0.5771   | 2.1864          |
| 0.4369        | 37.0  | 6937 | 0.5761   | 2.1546          |
| 0.4362        | 38.0  | 7125 | 0.5752   | 2.1423          |
| 0.4322        | 39.0  | 7312 | 0.5778   | 2.1938          |
| 0.4359        | 40.0  | 7500 | 0.5752   | 2.2000          |
| 0.4153        | 41.0  | 7687 | 0.5751   | 2.2344          |
| 0.4195        | 42.0  | 7875 | 0.5747   | 2.2526          |
| 0.9164        | 43.0  | 8062 | 0.5717   | 2.1985          |
| 0.4295        | 44.0  | 8250 | 0.5718   | 2.2145          |
| 0.4298        | 45.0  | 8437 | 0.5714   | 2.2211          |
| 0.4446        | 46.0  | 8625 | 0.5703   | 2.2656          |
| 2.0935        | 47.0  | 8812 | 0.5081   | 2.6962          |
| 3.096         | 48.0  | 9000 | 0.4494   | 3.2961          |
| 2.9615        | 49.0  | 9187 | 0.4241   | 4.3483          |
| 4.5736        | 49.87 | 9350 | 0.3655   | 5.5751          |
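Note that the run diverges late in training: validation loss bottoms out after epoch 1 (1.2683) and climbs steadily, with training-loss spikes around epochs 43 and 47-49, so the headline evaluation numbers above come from the final, diverged checkpoint. Since the validation loss is mean token cross-entropy, perplexity is `exp(loss)`; a quick comparison of the best and final checkpoints:

```python
import math

# Perplexity = exp(mean token cross-entropy), using the table's values.
best_ppl = math.exp(1.2683)   # epoch 1 checkpoint  -> ~3.6
final_ppl = math.exp(5.5751)  # final checkpoint    -> ~264
print(f"best: {best_ppl:.2f}, final: {final_ppl:.2f}")
```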

### Framework versions

- Transformers 4.34.0
- Pytorch 2.1.0+cu121
- Datasets 2.18.0
- Tokenizers 0.14.1