---
tags:
- generated_from_trainer
model-index:
- name: llama3_question
  results: []
library_name: peft
---

# llama3_question

This model was fine-tuned from a Llama 3 base checkpoint; the base model and training dataset are not recorded in this card.
It achieves the following results on the evaluation set:
- Loss: 0.8999

## Model description

More information needed

## Intended uses & limitations

More information needed
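
Since this repository holds PEFT adapter weights (`library_name: peft` above) rather than a full model, inference requires loading the adapter on top of its base model. The card does not record which base checkpoint was used, so `BASE_MODEL_ID` in the sketch below is a placeholder; treat this as an untested illustration, not an official recipe.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

BASE_MODEL_ID = "..."  # placeholder: the base Llama 3 checkpoint is not recorded in this card
ADAPTER_ID = "llama3_question"  # placeholder: use the full Hub repo id of this adapter

# Load the base model in 4-bit, mirroring the quantization used during
# training (see "Training procedure" below).
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    ),
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)

# Attach the trained LoRA adapter on top of the quantized base model.
model = PeftModel.from_pretrained(base, ADAPTER_ID)
```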

## Training and evaluation data

More information needed

## Training procedure

The following `bitsandbytes` quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float16
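
The values above correspond roughly to the following `BitsAndBytesConfig`; this is a reconstruction from the list, not the original training script.

```python
import torch
from transformers import BitsAndBytesConfig

# QLoRA-style 4-bit NF4 quantization, as listed above.
# Options not set here keep their library defaults
# (e.g. llm_int8_threshold=6.0).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float16,
)
```
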
### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 6
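
For reference, a minimal `TrainingArguments` sketch matching the values above; `output_dir` is a placeholder and any option not listed keeps its Transformers 4.37.2 default.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama3_question",  # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="constant",
    warmup_ratio=0.03,  # note: the plain constant schedule does not apply warmup
    num_train_epochs=6,
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the Trainer defaults.
)
```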

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.9948        | 0.14  | 1    | 2.8184          |
| 2.8697        | 0.29  | 2    | 2.6592          |
| 2.6264        | 0.43  | 3    | 2.4946          |
| 2.625         | 0.57  | 4    | 2.3588          |
| 2.3888        | 0.71  | 5    | 2.2385          |
| 2.2949        | 0.86  | 6    | 2.1219          |
| 2.5261        | 1.0   | 7    | 2.0221          |
| 2.0264        | 1.14  | 8    | 1.9246          |
| 1.9661        | 1.29  | 9    | 1.8298          |
| 1.9106        | 1.43  | 10   | 1.7456          |
| 1.8448        | 1.57  | 11   | 1.6686          |
| 1.619         | 1.71  | 12   | 1.6050          |
| 1.5881        | 1.86  | 13   | 1.5468          |
| 1.6859        | 2.0   | 14   | 1.4939          |
| 1.4643        | 2.14  | 15   | 1.4453          |
| 1.4583        | 2.29  | 16   | 1.3949          |
| 1.4086        | 2.43  | 17   | 1.3441          |
| 1.3314        | 2.57  | 18   | 1.2914          |
| 1.3502        | 2.71  | 19   | 1.2400          |
| 1.226         | 2.86  | 20   | 1.1892          |
| 1.073         | 3.0   | 21   | 1.1445          |
| 1.1113        | 3.14  | 22   | 1.0995          |
| 1.1292        | 3.29  | 23   | 1.0570          |
| 1.0242        | 3.43  | 24   | 1.0164          |
| 0.9279        | 3.57  | 25   | 0.9826          |
| 0.8518        | 3.71  | 26   | 0.9617          |
| 1.0302        | 3.86  | 27   | 0.9491          |
| 1.1736        | 4.0   | 28   | 0.9418          |
| 0.8832        | 4.14  | 29   | 0.9352          |
| 0.9151        | 4.29  | 30   | 0.9301          |
| 0.7495        | 4.43  | 31   | 0.9256          |
| 0.8785        | 4.57  | 32   | 0.9220          |
| 0.8635        | 4.71  | 33   | 0.9180          |
| 0.9499        | 4.86  | 34   | 0.9150          |
| 0.8744        | 5.0   | 35   | 0.9125          |
| 0.8221        | 5.14  | 36   | 0.9093          |
| 0.7826        | 5.29  | 37   | 0.9064          |
| 0.8421        | 5.43  | 38   | 0.9047          |
| 0.8155        | 5.57  | 39   | 0.9029          |
| 0.9097        | 5.71  | 40   | 0.9010          |
| 0.7449        | 5.86  | 41   | 0.9003          |
| 0.9502        | 6.0   | 42   | 0.8999          |


### Framework versions

- PEFT 0.5.0
- Transformers 4.37.2
- Pytorch 2.1.2
- Datasets 2.18.0
- Tokenizers 0.15.1