tridungduong16 committed
Commit ffc06d5 · Parent(s): 3f114ba

update model

README.md ADDED
@@ -0,0 +1,3104 @@
---
library_name: peft
---
## Training procedure

The following `bitsandbytes` quantization config was used during training (an equivalent `BitsAndBytesConfig` call is sketched below):
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: bfloat16
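
A minimal sketch of how the config above maps onto `transformers.BitsAndBytesConfig` when loading the base model for QLoRA-style training or inference. The base model name is a placeholder, not taken from this card:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Mirrors the quantization config listed above: 4-bit NF4 with
# double quantization and bfloat16 compute.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# "base-model-name" is a placeholder; substitute the model this
# adapter was trained on.
base_model = AutoModelForCausalLM.from_pretrained(
    "base-model-name",
    quantization_config=bnb_config,
    device_map="auto",
)
```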
1441
+ - llm_int8_skip_modules: None
1442
+ - llm_int8_enable_fp32_cpu_offload: False
1443
+ - llm_int8_has_fp16_weight: False
1444
+ - bnb_4bit_quant_type: nf4
1445
+ - bnb_4bit_use_double_quant: True
1446
+ - bnb_4bit_compute_dtype: bfloat16
1447
+
1448
+ The following `bitsandbytes` quantization config was used during training:
1449
+ - load_in_8bit: False
1450
+ - load_in_4bit: True
1451
+ - llm_int8_threshold: 6.0
1452
+ - llm_int8_skip_modules: None
1453
+ - llm_int8_enable_fp32_cpu_offload: False
1454
+ - llm_int8_has_fp16_weight: False
1455
+ - bnb_4bit_quant_type: nf4
1456
+ - bnb_4bit_use_double_quant: True
1457
+ - bnb_4bit_compute_dtype: bfloat16
1458
+
1459
+ The following `bitsandbytes` quantization config was used during training:
1460
+ - load_in_8bit: False
1461
+ - load_in_4bit: True
1462
+ - llm_int8_threshold: 6.0
1463
+ - llm_int8_skip_modules: None
1464
+ - llm_int8_enable_fp32_cpu_offload: False
1465
+ - llm_int8_has_fp16_weight: False
1466
+ - bnb_4bit_quant_type: nf4
1467
+ - bnb_4bit_use_double_quant: True
1468
+ - bnb_4bit_compute_dtype: bfloat16
1469
+
1470
+ The following `bitsandbytes` quantization config was used during training:
1471
+ - load_in_8bit: False
1472
+ - load_in_4bit: True
1473
+ - llm_int8_threshold: 6.0
1474
+ - llm_int8_skip_modules: None
1475
+ - llm_int8_enable_fp32_cpu_offload: False
1476
+ - llm_int8_has_fp16_weight: False
1477
+ - bnb_4bit_quant_type: nf4
1478
+ - bnb_4bit_use_double_quant: True
1479
+ - bnb_4bit_compute_dtype: bfloat16
1480
+
1481
+ The following `bitsandbytes` quantization config was used during training:
1482
+ - load_in_8bit: False
1483
+ - load_in_4bit: True
1484
+ - llm_int8_threshold: 6.0
1485
+ - llm_int8_skip_modules: None
1486
+ - llm_int8_enable_fp32_cpu_offload: False
1487
+ - llm_int8_has_fp16_weight: False
1488
+ - bnb_4bit_quant_type: nf4
1489
+ - bnb_4bit_use_double_quant: True
1490
+ - bnb_4bit_compute_dtype: bfloat16
1491
+
1492
+ The following `bitsandbytes` quantization config was used during training:
1493
+ - load_in_8bit: False
1494
+ - load_in_4bit: True
1495
+ - llm_int8_threshold: 6.0
1496
+ - llm_int8_skip_modules: None
1497
+ - llm_int8_enable_fp32_cpu_offload: False
1498
+ - llm_int8_has_fp16_weight: False
1499
+ - bnb_4bit_quant_type: nf4
1500
+ - bnb_4bit_use_double_quant: True
1501
+ - bnb_4bit_compute_dtype: bfloat16
1502
+
1503
+ The following `bitsandbytes` quantization config was used during training:
1504
+ - load_in_8bit: False
1505
+ - load_in_4bit: True
1506
+ - llm_int8_threshold: 6.0
1507
+ - llm_int8_skip_modules: None
1508
+ - llm_int8_enable_fp32_cpu_offload: False
1509
+ - llm_int8_has_fp16_weight: False
1510
+ - bnb_4bit_quant_type: nf4
1511
+ - bnb_4bit_use_double_quant: True
1512
+ - bnb_4bit_compute_dtype: bfloat16
1513
+
1514
+ The following `bitsandbytes` quantization config was used during training:
1515
+ - load_in_8bit: False
1516
+ - load_in_4bit: True
1517
+ - llm_int8_threshold: 6.0
1518
+ - llm_int8_skip_modules: None
1519
+ - llm_int8_enable_fp32_cpu_offload: False
1520
+ - llm_int8_has_fp16_weight: False
1521
+ - bnb_4bit_quant_type: nf4
1522
+ - bnb_4bit_use_double_quant: True
1523
+ - bnb_4bit_compute_dtype: bfloat16
1524
+
1525
+ The following `bitsandbytes` quantization config was used during training:
1526
+ - load_in_8bit: False
1527
+ - load_in_4bit: True
1528
+ - llm_int8_threshold: 6.0
1529
+ - llm_int8_skip_modules: None
1530
+ - llm_int8_enable_fp32_cpu_offload: False
1531
+ - llm_int8_has_fp16_weight: False
1532
+ - bnb_4bit_quant_type: nf4
1533
+ - bnb_4bit_use_double_quant: True
1534
+ - bnb_4bit_compute_dtype: bfloat16
1535
+
1536
+ The following `bitsandbytes` quantization config was used during training:
1537
+ - load_in_8bit: False
1538
+ - load_in_4bit: True
1539
+ - llm_int8_threshold: 6.0
1540
+ - llm_int8_skip_modules: None
1541
+ - llm_int8_enable_fp32_cpu_offload: False
1542
+ - llm_int8_has_fp16_weight: False
1543
+ - bnb_4bit_quant_type: nf4
1544
+ - bnb_4bit_use_double_quant: True
1545
+ - bnb_4bit_compute_dtype: bfloat16
1546
+
1547
+ The following `bitsandbytes` quantization config was used during training:
1548
+ - load_in_8bit: False
1549
+ - load_in_4bit: True
1550
+ - llm_int8_threshold: 6.0
1551
+ - llm_int8_skip_modules: None
1552
+ - llm_int8_enable_fp32_cpu_offload: False
1553
+ - llm_int8_has_fp16_weight: False
1554
+ - bnb_4bit_quant_type: nf4
1555
+ - bnb_4bit_use_double_quant: True
1556
+ - bnb_4bit_compute_dtype: bfloat16
1557
+
1558
+ The following `bitsandbytes` quantization config was used during training:
1559
+ - load_in_8bit: False
1560
+ - load_in_4bit: True
1561
+ - llm_int8_threshold: 6.0
1562
+ - llm_int8_skip_modules: None
1563
+ - llm_int8_enable_fp32_cpu_offload: False
1564
+ - llm_int8_has_fp16_weight: False
1565
+ - bnb_4bit_quant_type: nf4
1566
+ - bnb_4bit_use_double_quant: True
1567
+ - bnb_4bit_compute_dtype: bfloat16
1568
+
1569
+ The following `bitsandbytes` quantization config was used during training:
1570
+ - load_in_8bit: False
1571
+ - load_in_4bit: True
1572
+ - llm_int8_threshold: 6.0
1573
+ - llm_int8_skip_modules: None
1574
+ - llm_int8_enable_fp32_cpu_offload: False
1575
+ - llm_int8_has_fp16_weight: False
1576
+ - bnb_4bit_quant_type: nf4
1577
+ - bnb_4bit_use_double_quant: True
1578
+ - bnb_4bit_compute_dtype: bfloat16
1579
+
1580
+ The following `bitsandbytes` quantization config was used during training:
1581
+ - load_in_8bit: False
1582
+ - load_in_4bit: True
1583
+ - llm_int8_threshold: 6.0
1584
+ - llm_int8_skip_modules: None
1585
+ - llm_int8_enable_fp32_cpu_offload: False
1586
+ - llm_int8_has_fp16_weight: False
1587
+ - bnb_4bit_quant_type: nf4
1588
+ - bnb_4bit_use_double_quant: True
1589
+ - bnb_4bit_compute_dtype: bfloat16
1590
+
1591
+ The following `bitsandbytes` quantization config was used during training:
1592
+ - load_in_8bit: False
1593
+ - load_in_4bit: True
1594
+ - llm_int8_threshold: 6.0
1595
+ - llm_int8_skip_modules: None
1596
+ - llm_int8_enable_fp32_cpu_offload: False
1597
+ - llm_int8_has_fp16_weight: False
1598
+ - bnb_4bit_quant_type: nf4
1599
+ - bnb_4bit_use_double_quant: True
1600
+ - bnb_4bit_compute_dtype: bfloat16
1601
+
1602
+ The following `bitsandbytes` quantization config was used during training:
1603
+ - load_in_8bit: False
1604
+ - load_in_4bit: True
1605
+ - llm_int8_threshold: 6.0
1606
+ - llm_int8_skip_modules: None
1607
+ - llm_int8_enable_fp32_cpu_offload: False
1608
+ - llm_int8_has_fp16_weight: False
1609
+ - bnb_4bit_quant_type: nf4
1610
+ - bnb_4bit_use_double_quant: True
1611
+ - bnb_4bit_compute_dtype: bfloat16
1612
+
1613
+ The following `bitsandbytes` quantization config was used during training:
1614
+ - load_in_8bit: False
1615
+ - load_in_4bit: True
1616
+ - llm_int8_threshold: 6.0
1617
+ - llm_int8_skip_modules: None
1618
+ - llm_int8_enable_fp32_cpu_offload: False
1619
+ - llm_int8_has_fp16_weight: False
1620
+ - bnb_4bit_quant_type: nf4
1621
+ - bnb_4bit_use_double_quant: True
1622
+ - bnb_4bit_compute_dtype: bfloat16
1623
+
1624
+ The following `bitsandbytes` quantization config was used during training:
1625
+ - load_in_8bit: False
1626
+ - load_in_4bit: True
1627
+ - llm_int8_threshold: 6.0
1628
+ - llm_int8_skip_modules: None
1629
+ - llm_int8_enable_fp32_cpu_offload: False
1630
+ - llm_int8_has_fp16_weight: False
1631
+ - bnb_4bit_quant_type: nf4
1632
+ - bnb_4bit_use_double_quant: True
1633
+ - bnb_4bit_compute_dtype: bfloat16
1634
+
1635
+ The following `bitsandbytes` quantization config was used during training:
1636
+ - load_in_8bit: False
1637
+ - load_in_4bit: True
1638
+ - llm_int8_threshold: 6.0
1639
+ - llm_int8_skip_modules: None
1640
+ - llm_int8_enable_fp32_cpu_offload: False
1641
+ - llm_int8_has_fp16_weight: False
1642
+ - bnb_4bit_quant_type: nf4
1643
+ - bnb_4bit_use_double_quant: True
1644
+ - bnb_4bit_compute_dtype: bfloat16
1645
+
1646
+ The following `bitsandbytes` quantization config was used during training:
1647
+ - load_in_8bit: False
1648
+ - load_in_4bit: True
1649
+ - llm_int8_threshold: 6.0
1650
+ - llm_int8_skip_modules: None
1651
+ - llm_int8_enable_fp32_cpu_offload: False
1652
+ - llm_int8_has_fp16_weight: False
1653
+ - bnb_4bit_quant_type: nf4
1654
+ - bnb_4bit_use_double_quant: True
1655
+ - bnb_4bit_compute_dtype: bfloat16
1656
+
1657
+ The following `bitsandbytes` quantization config was used during training:
1658
+ - load_in_8bit: False
1659
+ - load_in_4bit: True
1660
+ - llm_int8_threshold: 6.0
1661
+ - llm_int8_skip_modules: None
1662
+ - llm_int8_enable_fp32_cpu_offload: False
1663
+ - llm_int8_has_fp16_weight: False
1664
+ - bnb_4bit_quant_type: nf4
1665
+ - bnb_4bit_use_double_quant: True
1666
+ - bnb_4bit_compute_dtype: bfloat16
1667
+
1668
+ The following `bitsandbytes` quantization config was used during training:
1669
+ - load_in_8bit: False
1670
+ - load_in_4bit: True
1671
+ - llm_int8_threshold: 6.0
1672
+ - llm_int8_skip_modules: None
1673
+ - llm_int8_enable_fp32_cpu_offload: False
1674
+ - llm_int8_has_fp16_weight: False
1675
+ - bnb_4bit_quant_type: nf4
1676
+ - bnb_4bit_use_double_quant: True
1677
+ - bnb_4bit_compute_dtype: bfloat16
1678
+
1679
+ The following `bitsandbytes` quantization config was used during training:
1680
+ - load_in_8bit: False
1681
+ - load_in_4bit: True
1682
+ - llm_int8_threshold: 6.0
1683
+ - llm_int8_skip_modules: None
1684
+ - llm_int8_enable_fp32_cpu_offload: False
1685
+ - llm_int8_has_fp16_weight: False
1686
+ - bnb_4bit_quant_type: nf4
1687
+ - bnb_4bit_use_double_quant: True
1688
+ - bnb_4bit_compute_dtype: bfloat16
1689
+
1690
+ The following `bitsandbytes` quantization config was used during training:
1691
+ - load_in_8bit: False
1692
+ - load_in_4bit: True
1693
+ - llm_int8_threshold: 6.0
1694
+ - llm_int8_skip_modules: None
1695
+ - llm_int8_enable_fp32_cpu_offload: False
1696
+ - llm_int8_has_fp16_weight: False
1697
+ - bnb_4bit_quant_type: nf4
1698
+ - bnb_4bit_use_double_quant: True
1699
+ - bnb_4bit_compute_dtype: bfloat16
1700
+
1701
+ The following `bitsandbytes` quantization config was used during training:
1702
+ - load_in_8bit: False
1703
+ - load_in_4bit: True
1704
+ - llm_int8_threshold: 6.0
1705
+ - llm_int8_skip_modules: None
1706
+ - llm_int8_enable_fp32_cpu_offload: False
1707
+ - llm_int8_has_fp16_weight: False
1708
+ - bnb_4bit_quant_type: nf4
1709
+ - bnb_4bit_use_double_quant: True
1710
+ - bnb_4bit_compute_dtype: bfloat16
1711
+
1712
+ The following `bitsandbytes` quantization config was used during training:
1713
+ - load_in_8bit: False
1714
+ - load_in_4bit: True
1715
+ - llm_int8_threshold: 6.0
1716
+ - llm_int8_skip_modules: None
1717
+ - llm_int8_enable_fp32_cpu_offload: False
1718
+ - llm_int8_has_fp16_weight: False
1719
+ - bnb_4bit_quant_type: nf4
1720
+ - bnb_4bit_use_double_quant: True
1721
+ - bnb_4bit_compute_dtype: bfloat16
1722
+
1723
+ The following `bitsandbytes` quantization config was used during training:
1724
+ - load_in_8bit: False
1725
+ - load_in_4bit: True
1726
+ - llm_int8_threshold: 6.0
1727
+ - llm_int8_skip_modules: None
1728
+ - llm_int8_enable_fp32_cpu_offload: False
1729
+ - llm_int8_has_fp16_weight: False
1730
+ - bnb_4bit_quant_type: nf4
1731
+ - bnb_4bit_use_double_quant: True
1732
+ - bnb_4bit_compute_dtype: bfloat16
1733
+
1734
+ The following `bitsandbytes` quantization config was used during training:
1735
+ - load_in_8bit: False
1736
+ - load_in_4bit: True
1737
+ - llm_int8_threshold: 6.0
1738
+ - llm_int8_skip_modules: None
1739
+ - llm_int8_enable_fp32_cpu_offload: False
1740
+ - llm_int8_has_fp16_weight: False
1741
+ - bnb_4bit_quant_type: nf4
1742
+ - bnb_4bit_use_double_quant: True
1743
+ - bnb_4bit_compute_dtype: bfloat16
1744
+
1745
+ The following `bitsandbytes` quantization config was used during training:
1746
+ - load_in_8bit: False
1747
+ - load_in_4bit: True
1748
+ - llm_int8_threshold: 6.0
1749
+ - llm_int8_skip_modules: None
1750
+ - llm_int8_enable_fp32_cpu_offload: False
1751
+ - llm_int8_has_fp16_weight: False
1752
+ - bnb_4bit_quant_type: nf4
1753
+ - bnb_4bit_use_double_quant: True
1754
+ - bnb_4bit_compute_dtype: bfloat16
1755
+
1756
+ The following `bitsandbytes` quantization config was used during training:
1757
+ - load_in_8bit: False
1758
+ - load_in_4bit: True
1759
+ - llm_int8_threshold: 6.0
1760
+ - llm_int8_skip_modules: None
1761
+ - llm_int8_enable_fp32_cpu_offload: False
1762
+ - llm_int8_has_fp16_weight: False
1763
+ - bnb_4bit_quant_type: nf4
1764
+ - bnb_4bit_use_double_quant: True
1765
+ - bnb_4bit_compute_dtype: bfloat16
1766
+
1767
+ The following `bitsandbytes` quantization config was used during training:
1768
+ - load_in_8bit: False
1769
+ - load_in_4bit: True
1770
+ - llm_int8_threshold: 6.0
1771
+ - llm_int8_skip_modules: None
1772
+ - llm_int8_enable_fp32_cpu_offload: False
1773
+ - llm_int8_has_fp16_weight: False
1774
+ - bnb_4bit_quant_type: nf4
1775
+ - bnb_4bit_use_double_quant: True
1776
+ - bnb_4bit_compute_dtype: bfloat16
1777
+
1778
+ The following `bitsandbytes` quantization config was used during training:
1779
+ - load_in_8bit: False
1780
+ - load_in_4bit: True
1781
+ - llm_int8_threshold: 6.0
1782
+ - llm_int8_skip_modules: None
1783
+ - llm_int8_enable_fp32_cpu_offload: False
1784
+ - llm_int8_has_fp16_weight: False
1785
+ - bnb_4bit_quant_type: nf4
1786
+ - bnb_4bit_use_double_quant: True
1787
+ - bnb_4bit_compute_dtype: bfloat16
1788
+
1789
+ The following `bitsandbytes` quantization config was used during training:
1790
+ - load_in_8bit: False
1791
+ - load_in_4bit: True
1792
+ - llm_int8_threshold: 6.0
1793
+ - llm_int8_skip_modules: None
1794
+ - llm_int8_enable_fp32_cpu_offload: False
1795
+ - llm_int8_has_fp16_weight: False
1796
+ - bnb_4bit_quant_type: nf4
1797
+ - bnb_4bit_use_double_quant: True
1798
+ - bnb_4bit_compute_dtype: bfloat16
1799
+
1800
+ The following `bitsandbytes` quantization config was used during training:
1801
+ - load_in_8bit: False
1802
+ - load_in_4bit: True
1803
+ - llm_int8_threshold: 6.0
1804
+ - llm_int8_skip_modules: None
1805
+ - llm_int8_enable_fp32_cpu_offload: False
1806
+ - llm_int8_has_fp16_weight: False
1807
+ - bnb_4bit_quant_type: nf4
1808
+ - bnb_4bit_use_double_quant: True
1809
+ - bnb_4bit_compute_dtype: bfloat16
1810
+
1811
+ The following `bitsandbytes` quantization config was used during training:
1812
+ - load_in_8bit: False
1813
+ - load_in_4bit: True
1814
+ - llm_int8_threshold: 6.0
1815
+ - llm_int8_skip_modules: None
1816
+ - llm_int8_enable_fp32_cpu_offload: False
1817
+ - llm_int8_has_fp16_weight: False
1818
+ - bnb_4bit_quant_type: nf4
1819
+ - bnb_4bit_use_double_quant: True
1820
+ - bnb_4bit_compute_dtype: bfloat16
1821
+
1822
+ The following `bitsandbytes` quantization config was used during training:
1823
+ - load_in_8bit: False
1824
+ - load_in_4bit: True
1825
+ - llm_int8_threshold: 6.0
1826
+ - llm_int8_skip_modules: None
1827
+ - llm_int8_enable_fp32_cpu_offload: False
1828
+ - llm_int8_has_fp16_weight: False
1829
+ - bnb_4bit_quant_type: nf4
1830
+ - bnb_4bit_use_double_quant: True
1831
+ - bnb_4bit_compute_dtype: bfloat16
1832
+
1833
+ The following `bitsandbytes` quantization config was used during training:
1834
+ - load_in_8bit: False
1835
+ - load_in_4bit: True
1836
+ - llm_int8_threshold: 6.0
1837
+ - llm_int8_skip_modules: None
1838
+ - llm_int8_enable_fp32_cpu_offload: False
1839
+ - llm_int8_has_fp16_weight: False
1840
+ - bnb_4bit_quant_type: nf4
1841
+ - bnb_4bit_use_double_quant: True
1842
+ - bnb_4bit_compute_dtype: bfloat16
1843
+
1844
+ The following `bitsandbytes` quantization config was used during training:
1845
+ - load_in_8bit: False
1846
+ - load_in_4bit: True
1847
+ - llm_int8_threshold: 6.0
1848
+ - llm_int8_skip_modules: None
1849
+ - llm_int8_enable_fp32_cpu_offload: False
1850
+ - llm_int8_has_fp16_weight: False
1851
+ - bnb_4bit_quant_type: nf4
1852
+ - bnb_4bit_use_double_quant: True
1853
+ - bnb_4bit_compute_dtype: bfloat16
1854
+
1855
+ The following `bitsandbytes` quantization config was used during training:
1856
+ - load_in_8bit: False
1857
+ - load_in_4bit: True
1858
+ - llm_int8_threshold: 6.0
1859
+ - llm_int8_skip_modules: None
1860
+ - llm_int8_enable_fp32_cpu_offload: False
1861
+ - llm_int8_has_fp16_weight: False
1862
+ - bnb_4bit_quant_type: nf4
1863
+ - bnb_4bit_use_double_quant: True
1864
+ - bnb_4bit_compute_dtype: bfloat16
1865
+
1866
+ The following `bitsandbytes` quantization config was used during training:
1867
+ - load_in_8bit: False
1868
+ - load_in_4bit: True
1869
+ - llm_int8_threshold: 6.0
1870
+ - llm_int8_skip_modules: None
1871
+ - llm_int8_enable_fp32_cpu_offload: False
1872
+ - llm_int8_has_fp16_weight: False
1873
+ - bnb_4bit_quant_type: nf4
1874
+ - bnb_4bit_use_double_quant: True
1875
+ - bnb_4bit_compute_dtype: bfloat16
1876
+
1877
+ The following `bitsandbytes` quantization config was used during training:
1878
+ - load_in_8bit: False
1879
+ - load_in_4bit: True
1880
+ - llm_int8_threshold: 6.0
1881
+ - llm_int8_skip_modules: None
1882
+ - llm_int8_enable_fp32_cpu_offload: False
1883
+ - llm_int8_has_fp16_weight: False
1884
+ - bnb_4bit_quant_type: nf4
1885
+ - bnb_4bit_use_double_quant: True
1886
+ - bnb_4bit_compute_dtype: bfloat16
1887
+
1888
+ The following `bitsandbytes` quantization config was used during training:
1889
+ - load_in_8bit: False
1890
+ - load_in_4bit: True
1891
+ - llm_int8_threshold: 6.0
1892
+ - llm_int8_skip_modules: None
1893
+ - llm_int8_enable_fp32_cpu_offload: False
1894
+ - llm_int8_has_fp16_weight: False
1895
+ - bnb_4bit_quant_type: nf4
1896
+ - bnb_4bit_use_double_quant: True
1897
+ - bnb_4bit_compute_dtype: bfloat16
1898
+
1899
+ The following `bitsandbytes` quantization config was used during training:
1900
+ - load_in_8bit: False
1901
+ - load_in_4bit: True
1902
+ - llm_int8_threshold: 6.0
1903
+ - llm_int8_skip_modules: None
1904
+ - llm_int8_enable_fp32_cpu_offload: False
1905
+ - llm_int8_has_fp16_weight: False
1906
+ - bnb_4bit_quant_type: nf4
1907
+ - bnb_4bit_use_double_quant: True
1908
+ - bnb_4bit_compute_dtype: bfloat16
1909
+
1910
+ The following `bitsandbytes` quantization config was used during training:
1911
+ - load_in_8bit: False
1912
+ - load_in_4bit: True
1913
+ - llm_int8_threshold: 6.0
1914
+ - llm_int8_skip_modules: None
1915
+ - llm_int8_enable_fp32_cpu_offload: False
1916
+ - llm_int8_has_fp16_weight: False
1917
+ - bnb_4bit_quant_type: nf4
1918
+ - bnb_4bit_use_double_quant: True
1919
+ - bnb_4bit_compute_dtype: bfloat16
1920
+
1921
+ The following `bitsandbytes` quantization config was used during training:
1922
+ - load_in_8bit: False
1923
+ - load_in_4bit: True
1924
+ - llm_int8_threshold: 6.0
1925
+ - llm_int8_skip_modules: None
1926
+ - llm_int8_enable_fp32_cpu_offload: False
1927
+ - llm_int8_has_fp16_weight: False
1928
+ - bnb_4bit_quant_type: nf4
1929
+ - bnb_4bit_use_double_quant: True
1930
+ - bnb_4bit_compute_dtype: bfloat16
1931
+
1932
+ The following `bitsandbytes` quantization config was used during training:
1933
+ - load_in_8bit: False
1934
+ - load_in_4bit: True
1935
+ - llm_int8_threshold: 6.0
1936
+ - llm_int8_skip_modules: None
1937
+ - llm_int8_enable_fp32_cpu_offload: False
1938
+ - llm_int8_has_fp16_weight: False
1939
+ - bnb_4bit_quant_type: nf4
1940
+ - bnb_4bit_use_double_quant: True
1941
+ - bnb_4bit_compute_dtype: bfloat16
1942
+
1943
+ The following `bitsandbytes` quantization config was used during training:
1944
+ - load_in_8bit: False
1945
+ - load_in_4bit: True
1946
+ - llm_int8_threshold: 6.0
1947
+ - llm_int8_skip_modules: None
1948
+ - llm_int8_enable_fp32_cpu_offload: False
1949
+ - llm_int8_has_fp16_weight: False
1950
+ - bnb_4bit_quant_type: nf4
1951
+ - bnb_4bit_use_double_quant: True
1952
+ - bnb_4bit_compute_dtype: bfloat16
1953
+
1954
+ The following `bitsandbytes` quantization config was used during training:
1955
+ - load_in_8bit: False
1956
+ - load_in_4bit: True
1957
+ - llm_int8_threshold: 6.0
1958
+ - llm_int8_skip_modules: None
1959
+ - llm_int8_enable_fp32_cpu_offload: False
1960
+ - llm_int8_has_fp16_weight: False
1961
+ - bnb_4bit_quant_type: nf4
1962
+ - bnb_4bit_use_double_quant: True
1963
+ - bnb_4bit_compute_dtype: bfloat16
1964
+
1965
+ The following `bitsandbytes` quantization config was used during training:
1966
+ - load_in_8bit: False
1967
+ - load_in_4bit: True
1968
+ - llm_int8_threshold: 6.0
1969
+ - llm_int8_skip_modules: None
1970
+ - llm_int8_enable_fp32_cpu_offload: False
1971
+ - llm_int8_has_fp16_weight: False
1972
+ - bnb_4bit_quant_type: nf4
1973
+ - bnb_4bit_use_double_quant: True
1974
+ - bnb_4bit_compute_dtype: bfloat16
1975
+
1976
+ The following `bitsandbytes` quantization config was used during training:
1977
+ - load_in_8bit: False
1978
+ - load_in_4bit: True
1979
+ - llm_int8_threshold: 6.0
1980
+ - llm_int8_skip_modules: None
1981
+ - llm_int8_enable_fp32_cpu_offload: False
1982
+ - llm_int8_has_fp16_weight: False
1983
+ - bnb_4bit_quant_type: nf4
1984
+ - bnb_4bit_use_double_quant: True
1985
+ - bnb_4bit_compute_dtype: bfloat16
1986
+
1987
+ The following `bitsandbytes` quantization config was used during training:
1988
+ - load_in_8bit: False
1989
+ - load_in_4bit: True
1990
+ - llm_int8_threshold: 6.0
1991
+ - llm_int8_skip_modules: None
1992
+ - llm_int8_enable_fp32_cpu_offload: False
1993
+ - llm_int8_has_fp16_weight: False
1994
+ - bnb_4bit_quant_type: nf4
1995
+ - bnb_4bit_use_double_quant: True
1996
+ - bnb_4bit_compute_dtype: bfloat16
1997
+
1998
+ The following `bitsandbytes` quantization config was used during training:
1999
+ - load_in_8bit: False
2000
+ - load_in_4bit: True
2001
+ - llm_int8_threshold: 6.0
2002
+ - llm_int8_skip_modules: None
2003
+ - llm_int8_enable_fp32_cpu_offload: False
2004
+ - llm_int8_has_fp16_weight: False
2005
+ - bnb_4bit_quant_type: nf4
2006
+ - bnb_4bit_use_double_quant: True
2007
+ - bnb_4bit_compute_dtype: bfloat16
2008
+
2009
+ The following `bitsandbytes` quantization config was used during training:
2010
+ - load_in_8bit: False
2011
+ - load_in_4bit: True
2012
+ - llm_int8_threshold: 6.0
2013
+ - llm_int8_skip_modules: None
2014
+ - llm_int8_enable_fp32_cpu_offload: False
2015
+ - llm_int8_has_fp16_weight: False
2016
+ - bnb_4bit_quant_type: nf4
2017
+ - bnb_4bit_use_double_quant: True
2018
+ - bnb_4bit_compute_dtype: bfloat16
2019
+
2020
+ The following `bitsandbytes` quantization config was used during training:
2021
+ - load_in_8bit: False
2022
+ - load_in_4bit: True
2023
+ - llm_int8_threshold: 6.0
2024
+ - llm_int8_skip_modules: None
2025
+ - llm_int8_enable_fp32_cpu_offload: False
2026
+ - llm_int8_has_fp16_weight: False
2027
+ - bnb_4bit_quant_type: nf4
2028
+ - bnb_4bit_use_double_quant: True
2029
+ - bnb_4bit_compute_dtype: bfloat16
2030
+
2031
+ The following `bitsandbytes` quantization config was used during training:
2032
+ - load_in_8bit: False
2033
+ - load_in_4bit: True
2034
+ - llm_int8_threshold: 6.0
2035
+ - llm_int8_skip_modules: None
2036
+ - llm_int8_enable_fp32_cpu_offload: False
2037
+ - llm_int8_has_fp16_weight: False
2038
+ - bnb_4bit_quant_type: nf4
2039
+ - bnb_4bit_use_double_quant: True
2040
+ - bnb_4bit_compute_dtype: bfloat16
2041
+
2042
+ The following `bitsandbytes` quantization config was used during training:
2043
+ - load_in_8bit: False
2044
+ - load_in_4bit: True
2045
+ - llm_int8_threshold: 6.0
2046
+ - llm_int8_skip_modules: None
2047
+ - llm_int8_enable_fp32_cpu_offload: False
2048
+ - llm_int8_has_fp16_weight: False
2049
+ - bnb_4bit_quant_type: nf4
2050
+ - bnb_4bit_use_double_quant: True
2051
+ - bnb_4bit_compute_dtype: bfloat16
2052
+
2053
+ The following `bitsandbytes` quantization config was used during training:
2054
+ - load_in_8bit: False
2055
+ - load_in_4bit: True
2056
+ - llm_int8_threshold: 6.0
2057
+ - llm_int8_skip_modules: None
2058
+ - llm_int8_enable_fp32_cpu_offload: False
2059
+ - llm_int8_has_fp16_weight: False
2060
+ - bnb_4bit_quant_type: nf4
2061
+ - bnb_4bit_use_double_quant: True
2062
+ - bnb_4bit_compute_dtype: bfloat16
2063
+
2064
+ The following `bitsandbytes` quantization config was used during training:
2065
+ - load_in_8bit: False
2066
+ - load_in_4bit: True
2067
+ - llm_int8_threshold: 6.0
2068
+ - llm_int8_skip_modules: None
2069
+ - llm_int8_enable_fp32_cpu_offload: False
2070
+ - llm_int8_has_fp16_weight: False
2071
+ - bnb_4bit_quant_type: nf4
2072
+ - bnb_4bit_use_double_quant: True
2073
+ - bnb_4bit_compute_dtype: bfloat16
2074
+
2075
+ The following `bitsandbytes` quantization config was used during training:
2076
+ - load_in_8bit: False
2077
+ - load_in_4bit: True
2078
+ - llm_int8_threshold: 6.0
2079
+ - llm_int8_skip_modules: None
2080
+ - llm_int8_enable_fp32_cpu_offload: False
2081
+ - llm_int8_has_fp16_weight: False
2082
+ - bnb_4bit_quant_type: nf4
2083
+ - bnb_4bit_use_double_quant: True
2084
+ - bnb_4bit_compute_dtype: bfloat16
2085
+
2086
+ The following `bitsandbytes` quantization config was used during training:
2087
+ - load_in_8bit: False
2088
+ - load_in_4bit: True
2089
+ - llm_int8_threshold: 6.0
2090
+ - llm_int8_skip_modules: None
2091
+ - llm_int8_enable_fp32_cpu_offload: False
2092
+ - llm_int8_has_fp16_weight: False
2093
+ - bnb_4bit_quant_type: nf4
2094
+ - bnb_4bit_use_double_quant: True
2095
+ - bnb_4bit_compute_dtype: bfloat16
2096
+
2097
+ The following `bitsandbytes` quantization config was used during training:
2098
+ - load_in_8bit: False
2099
+ - load_in_4bit: True
2100
+ - llm_int8_threshold: 6.0
2101
+ - llm_int8_skip_modules: None
2102
+ - llm_int8_enable_fp32_cpu_offload: False
2103
+ - llm_int8_has_fp16_weight: False
2104
+ - bnb_4bit_quant_type: nf4
2105
+ - bnb_4bit_use_double_quant: True
2106
+ - bnb_4bit_compute_dtype: bfloat16
2107
+
2108
+ The following `bitsandbytes` quantization config was used during training:
2109
+ - load_in_8bit: False
2110
+ - load_in_4bit: True
2111
+ - llm_int8_threshold: 6.0
2112
+ - llm_int8_skip_modules: None
2113
+ - llm_int8_enable_fp32_cpu_offload: False
2114
+ - llm_int8_has_fp16_weight: False
2115
+ - bnb_4bit_quant_type: nf4
2116
+ - bnb_4bit_use_double_quant: True
2117
+ - bnb_4bit_compute_dtype: bfloat16
2118
+
2119
+ The following `bitsandbytes` quantization config was used during training:
2120
+ - load_in_8bit: False
2121
+ - load_in_4bit: True
2122
+ - llm_int8_threshold: 6.0
2123
+ - llm_int8_skip_modules: None
2124
+ - llm_int8_enable_fp32_cpu_offload: False
2125
+ - llm_int8_has_fp16_weight: False
2126
+ - bnb_4bit_quant_type: nf4
2127
+ - bnb_4bit_use_double_quant: True
2128
+ - bnb_4bit_compute_dtype: bfloat16
2129
+
2130
+ The following `bitsandbytes` quantization config was used during training:
2131
+ - load_in_8bit: False
2132
+ - load_in_4bit: True
2133
+ - llm_int8_threshold: 6.0
2134
+ - llm_int8_skip_modules: None
2135
+ - llm_int8_enable_fp32_cpu_offload: False
2136
+ - llm_int8_has_fp16_weight: False
2137
+ - bnb_4bit_quant_type: nf4
2138
+ - bnb_4bit_use_double_quant: True
2139
+ - bnb_4bit_compute_dtype: bfloat16
2140
+
2141
+ The following `bitsandbytes` quantization config was used during training:
2142
+ - load_in_8bit: False
2143
+ - load_in_4bit: True
2144
+ - llm_int8_threshold: 6.0
2145
+ - llm_int8_skip_modules: None
2146
+ - llm_int8_enable_fp32_cpu_offload: False
2147
+ - llm_int8_has_fp16_weight: False
2148
+ - bnb_4bit_quant_type: nf4
2149
+ - bnb_4bit_use_double_quant: True
2150
+ - bnb_4bit_compute_dtype: bfloat16
2151
+
2152
+ The following `bitsandbytes` quantization config was used during training:
2153
+ - load_in_8bit: False
2154
+ - load_in_4bit: True
2155
+ - llm_int8_threshold: 6.0
2156
+ - llm_int8_skip_modules: None
2157
+ - llm_int8_enable_fp32_cpu_offload: False
2158
+ - llm_int8_has_fp16_weight: False
2159
+ - bnb_4bit_quant_type: nf4
2160
+ - bnb_4bit_use_double_quant: True
2161
+ - bnb_4bit_compute_dtype: bfloat16
2162
+
2163
+ The following `bitsandbytes` quantization config was used during training:
2164
+ - load_in_8bit: False
2165
+ - load_in_4bit: True
2166
+ - llm_int8_threshold: 6.0
2167
+ - llm_int8_skip_modules: None
2168
+ - llm_int8_enable_fp32_cpu_offload: False
2169
+ - llm_int8_has_fp16_weight: False
2170
+ - bnb_4bit_quant_type: nf4
2171
+ - bnb_4bit_use_double_quant: True
2172
+ - bnb_4bit_compute_dtype: bfloat16
2173
+
2174
+ The following `bitsandbytes` quantization config was used during training:
2175
+ - load_in_8bit: False
2176
+ - load_in_4bit: True
2177
+ - llm_int8_threshold: 6.0
2178
+ - llm_int8_skip_modules: None
2179
+ - llm_int8_enable_fp32_cpu_offload: False
2180
+ - llm_int8_has_fp16_weight: False
2181
+ - bnb_4bit_quant_type: nf4
2182
+ - bnb_4bit_use_double_quant: True
2183
+ - bnb_4bit_compute_dtype: bfloat16
2184
+
2185
+ The following `bitsandbytes` quantization config was used during training:
2186
+ - load_in_8bit: False
2187
+ - load_in_4bit: True
2188
+ - llm_int8_threshold: 6.0
2189
+ - llm_int8_skip_modules: None
2190
+ - llm_int8_enable_fp32_cpu_offload: False
2191
+ - llm_int8_has_fp16_weight: False
2192
+ - bnb_4bit_quant_type: nf4
2193
+ - bnb_4bit_use_double_quant: True
2194
+ - bnb_4bit_compute_dtype: bfloat16
2195
+
2196
+ The following `bitsandbytes` quantization config was used during training:
2197
+ - load_in_8bit: False
2198
+ - load_in_4bit: True
2199
+ - llm_int8_threshold: 6.0
2200
+ - llm_int8_skip_modules: None
2201
+ - llm_int8_enable_fp32_cpu_offload: False
2202
+ - llm_int8_has_fp16_weight: False
2203
+ - bnb_4bit_quant_type: nf4
2204
+ - bnb_4bit_use_double_quant: True
2205
+ - bnb_4bit_compute_dtype: bfloat16
2206
+
2207
+ The following `bitsandbytes` quantization config was used during training:
2208
+ - load_in_8bit: False
2209
+ - load_in_4bit: True
2210
+ - llm_int8_threshold: 6.0
2211
+ - llm_int8_skip_modules: None
2212
+ - llm_int8_enable_fp32_cpu_offload: False
2213
+ - llm_int8_has_fp16_weight: False
2214
+ - bnb_4bit_quant_type: nf4
2215
+ - bnb_4bit_use_double_quant: True
2216
+ - bnb_4bit_compute_dtype: bfloat16
2217
+
2218
+ The following `bitsandbytes` quantization config was used during training:
2219
+ - load_in_8bit: False
2220
+ - load_in_4bit: True
2221
+ - llm_int8_threshold: 6.0
2222
+ - llm_int8_skip_modules: None
2223
+ - llm_int8_enable_fp32_cpu_offload: False
2224
+ - llm_int8_has_fp16_weight: False
2225
+ - bnb_4bit_quant_type: nf4
2226
+ - bnb_4bit_use_double_quant: True
2227
+ - bnb_4bit_compute_dtype: bfloat16
2228
+
2229
+ The following `bitsandbytes` quantization config was used during training:
2230
+ - load_in_8bit: False
2231
+ - load_in_4bit: True
2232
+ - llm_int8_threshold: 6.0
2233
+ - llm_int8_skip_modules: None
2234
+ - llm_int8_enable_fp32_cpu_offload: False
2235
+ - llm_int8_has_fp16_weight: False
2236
+ - bnb_4bit_quant_type: nf4
2237
+ - bnb_4bit_use_double_quant: True
2238
+ - bnb_4bit_compute_dtype: bfloat16
2239
+
2240
+ The following `bitsandbytes` quantization config was used during training:
2241
+ - load_in_8bit: False
2242
+ - load_in_4bit: True
2243
+ - llm_int8_threshold: 6.0
2244
+ - llm_int8_skip_modules: None
2245
+ - llm_int8_enable_fp32_cpu_offload: False
2246
+ - llm_int8_has_fp16_weight: False
2247
+ - bnb_4bit_quant_type: nf4
2248
+ - bnb_4bit_use_double_quant: True
2249
+ - bnb_4bit_compute_dtype: bfloat16
2250
+
2251
+ The following `bitsandbytes` quantization config was used during training:
2252
+ - load_in_8bit: False
2253
+ - load_in_4bit: True
2254
+ - llm_int8_threshold: 6.0
2255
+ - llm_int8_skip_modules: None
2256
+ - llm_int8_enable_fp32_cpu_offload: False
2257
+ - llm_int8_has_fp16_weight: False
2258
+ - bnb_4bit_quant_type: nf4
2259
+ - bnb_4bit_use_double_quant: True
2260
+ - bnb_4bit_compute_dtype: bfloat16
2261
+
2262
+ The following `bitsandbytes` quantization config was used during training:
2263
+ - load_in_8bit: False
2264
+ - load_in_4bit: True
2265
+ - llm_int8_threshold: 6.0
2266
+ - llm_int8_skip_modules: None
2267
+ - llm_int8_enable_fp32_cpu_offload: False
2268
+ - llm_int8_has_fp16_weight: False
2269
+ - bnb_4bit_quant_type: nf4
2270
+ - bnb_4bit_use_double_quant: True
2271
+ - bnb_4bit_compute_dtype: bfloat16
2272
+
2273
+ The following `bitsandbytes` quantization config was used during training:
2274
+ - load_in_8bit: False
2275
+ - load_in_4bit: True
2276
+ - llm_int8_threshold: 6.0
2277
+ - llm_int8_skip_modules: None
2278
+ - llm_int8_enable_fp32_cpu_offload: False
2279
+ - llm_int8_has_fp16_weight: False
2280
+ - bnb_4bit_quant_type: nf4
2281
+ - bnb_4bit_use_double_quant: True
2282
+ - bnb_4bit_compute_dtype: bfloat16
2283
+
2284
+ The following `bitsandbytes` quantization config was used during training:
2285
+ - load_in_8bit: False
2286
+ - load_in_4bit: True
2287
+ - llm_int8_threshold: 6.0
2288
+ - llm_int8_skip_modules: None
2289
+ - llm_int8_enable_fp32_cpu_offload: False
2290
+ - llm_int8_has_fp16_weight: False
2291
+ - bnb_4bit_quant_type: nf4
2292
+ - bnb_4bit_use_double_quant: True
2293
+ - bnb_4bit_compute_dtype: bfloat16
2294
+
2295
+ The following `bitsandbytes` quantization config was used during training:
2296
+ - load_in_8bit: False
2297
+ - load_in_4bit: True
2298
+ - llm_int8_threshold: 6.0
2299
+ - llm_int8_skip_modules: None
2300
+ - llm_int8_enable_fp32_cpu_offload: False
2301
+ - llm_int8_has_fp16_weight: False
2302
+ - bnb_4bit_quant_type: nf4
2303
+ - bnb_4bit_use_double_quant: True
2304
+ - bnb_4bit_compute_dtype: bfloat16
2305
+
2306
+ The following `bitsandbytes` quantization config was used during training:
2307
+ - load_in_8bit: False
2308
+ - load_in_4bit: True
2309
+ - llm_int8_threshold: 6.0
2310
+ - llm_int8_skip_modules: None
2311
+ - llm_int8_enable_fp32_cpu_offload: False
2312
+ - llm_int8_has_fp16_weight: False
2313
+ - bnb_4bit_quant_type: nf4
2314
+ - bnb_4bit_use_double_quant: True
2315
+ - bnb_4bit_compute_dtype: bfloat16
2316
+
2317
+ The following `bitsandbytes` quantization config was used during training:
2318
+ - load_in_8bit: False
2319
+ - load_in_4bit: True
2320
+ - llm_int8_threshold: 6.0
2321
+ - llm_int8_skip_modules: None
2322
+ - llm_int8_enable_fp32_cpu_offload: False
2323
+ - llm_int8_has_fp16_weight: False
2324
+ - bnb_4bit_quant_type: nf4
2325
+ - bnb_4bit_use_double_quant: True
2326
+ - bnb_4bit_compute_dtype: bfloat16
2327
+
2328
+ The following `bitsandbytes` quantization config was used during training:
2329
+ - load_in_8bit: False
2330
+ - load_in_4bit: True
2331
+ - llm_int8_threshold: 6.0
2332
+ - llm_int8_skip_modules: None
2333
+ - llm_int8_enable_fp32_cpu_offload: False
2334
+ - llm_int8_has_fp16_weight: False
2335
+ - bnb_4bit_quant_type: nf4
2336
+ - bnb_4bit_use_double_quant: True
2337
+ - bnb_4bit_compute_dtype: bfloat16
2338
+
2339
+ The following `bitsandbytes` quantization config was used during training:
2340
+ - load_in_8bit: False
2341
+ - load_in_4bit: True
2342
+ - llm_int8_threshold: 6.0
2343
+ - llm_int8_skip_modules: None
2344
+ - llm_int8_enable_fp32_cpu_offload: False
2345
+ - llm_int8_has_fp16_weight: False
2346
+ - bnb_4bit_quant_type: nf4
2347
+ - bnb_4bit_use_double_quant: True
2348
+ - bnb_4bit_compute_dtype: bfloat16
2349
+
2350
+ The following `bitsandbytes` quantization config was used during training:
2351
+ - load_in_8bit: False
2352
+ - load_in_4bit: True
2353
+ - llm_int8_threshold: 6.0
2354
+ - llm_int8_skip_modules: None
2355
+ - llm_int8_enable_fp32_cpu_offload: False
2356
+ - llm_int8_has_fp16_weight: False
2357
+ - bnb_4bit_quant_type: nf4
2358
+ - bnb_4bit_use_double_quant: True
2359
+ - bnb_4bit_compute_dtype: bfloat16
2360
+
2361
+ The following `bitsandbytes` quantization config was used during training:
2362
+ - load_in_8bit: False
2363
+ - load_in_4bit: True
2364
+ - llm_int8_threshold: 6.0
2365
+ - llm_int8_skip_modules: None
2366
+ - llm_int8_enable_fp32_cpu_offload: False
2367
+ - llm_int8_has_fp16_weight: False
2368
+ - bnb_4bit_quant_type: nf4
2369
+ - bnb_4bit_use_double_quant: True
2370
+ - bnb_4bit_compute_dtype: bfloat16
2371
+
2372
+ The following `bitsandbytes` quantization config was used during training:
2373
+ - load_in_8bit: False
2374
+ - load_in_4bit: True
2375
+ - llm_int8_threshold: 6.0
2376
+ - llm_int8_skip_modules: None
2377
+ - llm_int8_enable_fp32_cpu_offload: False
2378
+ - llm_int8_has_fp16_weight: False
2379
+ - bnb_4bit_quant_type: nf4
2380
+ - bnb_4bit_use_double_quant: True
2381
+ - bnb_4bit_compute_dtype: bfloat16
2382
+
2383
+ The following `bitsandbytes` quantization config was used during training:
2384
+ - load_in_8bit: False
2385
+ - load_in_4bit: True
2386
+ - llm_int8_threshold: 6.0
2387
+ - llm_int8_skip_modules: None
2388
+ - llm_int8_enable_fp32_cpu_offload: False
2389
+ - llm_int8_has_fp16_weight: False
2390
+ - bnb_4bit_quant_type: nf4
2391
+ - bnb_4bit_use_double_quant: True
2392
+ - bnb_4bit_compute_dtype: bfloat16
2393
+
2394
+ The following `bitsandbytes` quantization config was used during training:
2395
+ - load_in_8bit: False
2396
+ - load_in_4bit: True
2397
+ - llm_int8_threshold: 6.0
2398
+ - llm_int8_skip_modules: None
2399
+ - llm_int8_enable_fp32_cpu_offload: False
2400
+ - llm_int8_has_fp16_weight: False
2401
+ - bnb_4bit_quant_type: nf4
2402
+ - bnb_4bit_use_double_quant: True
2403
+ - bnb_4bit_compute_dtype: bfloat16
2404
+
2405
+ The following `bitsandbytes` quantization config was used during training:
2406
+ - load_in_8bit: False
2407
+ - load_in_4bit: True
2408
+ - llm_int8_threshold: 6.0
2409
+ - llm_int8_skip_modules: None
2410
+ - llm_int8_enable_fp32_cpu_offload: False
2411
+ - llm_int8_has_fp16_weight: False
2412
+ - bnb_4bit_quant_type: nf4
2413
+ - bnb_4bit_use_double_quant: True
2414
+ - bnb_4bit_compute_dtype: bfloat16
2415
+
2416
+ The following `bitsandbytes` quantization config was used during training:
2417
+ - load_in_8bit: False
2418
+ - load_in_4bit: True
2419
+ - llm_int8_threshold: 6.0
2420
+ - llm_int8_skip_modules: None
2421
+ - llm_int8_enable_fp32_cpu_offload: False
2422
+ - llm_int8_has_fp16_weight: False
2423
+ - bnb_4bit_quant_type: nf4
2424
+ - bnb_4bit_use_double_quant: True
2425
+ - bnb_4bit_compute_dtype: bfloat16
2426
+
2427
+ The following `bitsandbytes` quantization config was used during training:
2428
+ - load_in_8bit: False
2429
+ - load_in_4bit: True
2430
+ - llm_int8_threshold: 6.0
2431
+ - llm_int8_skip_modules: None
2432
+ - llm_int8_enable_fp32_cpu_offload: False
2433
+ - llm_int8_has_fp16_weight: False
2434
+ - bnb_4bit_quant_type: nf4
2435
+ - bnb_4bit_use_double_quant: True
2436
+ - bnb_4bit_compute_dtype: bfloat16
2437
+
2438
+ The following `bitsandbytes` quantization config was used during training:
2439
+ - load_in_8bit: False
2440
+ - load_in_4bit: True
2441
+ - llm_int8_threshold: 6.0
2442
+ - llm_int8_skip_modules: None
2443
+ - llm_int8_enable_fp32_cpu_offload: False
2444
+ - llm_int8_has_fp16_weight: False
2445
+ - bnb_4bit_quant_type: nf4
2446
+ - bnb_4bit_use_double_quant: True
2447
+ - bnb_4bit_compute_dtype: bfloat16
2448
+
2449
+ The following `bitsandbytes` quantization config was used during training:
2450
+ - load_in_8bit: False
2451
+ - load_in_4bit: True
2452
+ - llm_int8_threshold: 6.0
2453
+ - llm_int8_skip_modules: None
2454
+ - llm_int8_enable_fp32_cpu_offload: False
2455
+ - llm_int8_has_fp16_weight: False
2456
+ - bnb_4bit_quant_type: nf4
2457
+ - bnb_4bit_use_double_quant: True
2458
+ - bnb_4bit_compute_dtype: bfloat16
2459
+
2460
+ The following `bitsandbytes` quantization config was used during training:
2461
+ - load_in_8bit: False
2462
+ - load_in_4bit: True
2463
+ - llm_int8_threshold: 6.0
2464
+ - llm_int8_skip_modules: None
2465
+ - llm_int8_enable_fp32_cpu_offload: False
2466
+ - llm_int8_has_fp16_weight: False
2467
+ - bnb_4bit_quant_type: nf4
2468
+ - bnb_4bit_use_double_quant: True
2469
+ - bnb_4bit_compute_dtype: bfloat16
2470
+
2471
+ The following `bitsandbytes` quantization config was used during training:
2472
+ - load_in_8bit: False
2473
+ - load_in_4bit: True
2474
+ - llm_int8_threshold: 6.0
2475
+ - llm_int8_skip_modules: None
2476
+ - llm_int8_enable_fp32_cpu_offload: False
2477
+ - llm_int8_has_fp16_weight: False
2478
+ - bnb_4bit_quant_type: nf4
2479
+ - bnb_4bit_use_double_quant: True
2480
+ - bnb_4bit_compute_dtype: bfloat16
2481
+
2482
+ The following `bitsandbytes` quantization config was used during training:
2483
+ - load_in_8bit: False
2484
+ - load_in_4bit: True
2485
+ - llm_int8_threshold: 6.0
2486
+ - llm_int8_skip_modules: None
2487
+ - llm_int8_enable_fp32_cpu_offload: False
2488
+ - llm_int8_has_fp16_weight: False
2489
+ - bnb_4bit_quant_type: nf4
2490
+ - bnb_4bit_use_double_quant: True
2491
+ - bnb_4bit_compute_dtype: bfloat16
2492
+
2493
+ The following `bitsandbytes` quantization config was used during training:
2494
+ - load_in_8bit: False
2495
+ - load_in_4bit: True
2496
+ - llm_int8_threshold: 6.0
2497
+ - llm_int8_skip_modules: None
2498
+ - llm_int8_enable_fp32_cpu_offload: False
2499
+ - llm_int8_has_fp16_weight: False
2500
+ - bnb_4bit_quant_type: nf4
2501
+ - bnb_4bit_use_double_quant: True
2502
+ - bnb_4bit_compute_dtype: bfloat16
2503
+
2504
+ The following `bitsandbytes` quantization config was used during training:
2505
+ - load_in_8bit: False
2506
+ - load_in_4bit: True
2507
+ - llm_int8_threshold: 6.0
2508
+ - llm_int8_skip_modules: None
2509
+ - llm_int8_enable_fp32_cpu_offload: False
2510
+ - llm_int8_has_fp16_weight: False
2511
+ - bnb_4bit_quant_type: nf4
2512
+ - bnb_4bit_use_double_quant: True
2513
+ - bnb_4bit_compute_dtype: bfloat16
2514
+
2515
+ The following `bitsandbytes` quantization config was used during training:
2516
+ - load_in_8bit: False
2517
+ - load_in_4bit: True
2518
+ - llm_int8_threshold: 6.0
2519
+ - llm_int8_skip_modules: None
2520
+ - llm_int8_enable_fp32_cpu_offload: False
2521
+ - llm_int8_has_fp16_weight: False
2522
+ - bnb_4bit_quant_type: nf4
2523
+ - bnb_4bit_use_double_quant: True
2524
+ - bnb_4bit_compute_dtype: bfloat16
2525
+
2526
+ The following `bitsandbytes` quantization config was used during training:
2527
+ - load_in_8bit: False
2528
+ - load_in_4bit: True
2529
+ - llm_int8_threshold: 6.0
2530
+ - llm_int8_skip_modules: None
2531
+ - llm_int8_enable_fp32_cpu_offload: False
2532
+ - llm_int8_has_fp16_weight: False
2533
+ - bnb_4bit_quant_type: nf4
2534
+ - bnb_4bit_use_double_quant: True
2535
+ - bnb_4bit_compute_dtype: bfloat16
2536
+
2537
+ The following `bitsandbytes` quantization config was used during training:
2538
+ - load_in_8bit: False
2539
+ - load_in_4bit: True
2540
+ - llm_int8_threshold: 6.0
2541
+ - llm_int8_skip_modules: None
2542
+ - llm_int8_enable_fp32_cpu_offload: False
2543
+ - llm_int8_has_fp16_weight: False
2544
+ - bnb_4bit_quant_type: nf4
2545
+ - bnb_4bit_use_double_quant: True
2546
+ - bnb_4bit_compute_dtype: bfloat16
2547
+
2548
+ The following `bitsandbytes` quantization config was used during training:
2549
+ - load_in_8bit: False
2550
+ - load_in_4bit: True
2551
+ - llm_int8_threshold: 6.0
2552
+ - llm_int8_skip_modules: None
2553
+ - llm_int8_enable_fp32_cpu_offload: False
2554
+ - llm_int8_has_fp16_weight: False
2555
+ - bnb_4bit_quant_type: nf4
2556
+ - bnb_4bit_use_double_quant: True
2557
+ - bnb_4bit_compute_dtype: bfloat16
2558
+
2559
+ The following `bitsandbytes` quantization config was used during training:
2560
+ - load_in_8bit: False
2561
+ - load_in_4bit: True
2562
+ - llm_int8_threshold: 6.0
2563
+ - llm_int8_skip_modules: None
2564
+ - llm_int8_enable_fp32_cpu_offload: False
2565
+ - llm_int8_has_fp16_weight: False
2566
+ - bnb_4bit_quant_type: nf4
2567
+ - bnb_4bit_use_double_quant: True
2568
+ - bnb_4bit_compute_dtype: bfloat16
2569
+
2570
+ The following `bitsandbytes` quantization config was used during training:
2571
+ - load_in_8bit: False
2572
+ - load_in_4bit: True
2573
+ - llm_int8_threshold: 6.0
2574
+ - llm_int8_skip_modules: None
2575
+ - llm_int8_enable_fp32_cpu_offload: False
2576
+ - llm_int8_has_fp16_weight: False
2577
+ - bnb_4bit_quant_type: nf4
2578
+ - bnb_4bit_use_double_quant: True
2579
+ - bnb_4bit_compute_dtype: bfloat16
2580
+
2581
+ The following `bitsandbytes` quantization config was used during training:
2582
+ - load_in_8bit: False
2583
+ - load_in_4bit: True
2584
+ - llm_int8_threshold: 6.0
2585
+ - llm_int8_skip_modules: None
2586
+ - llm_int8_enable_fp32_cpu_offload: False
2587
+ - llm_int8_has_fp16_weight: False
2588
+ - bnb_4bit_quant_type: nf4
2589
+ - bnb_4bit_use_double_quant: True
2590
+ - bnb_4bit_compute_dtype: bfloat16
2591
+
2592
+ The following `bitsandbytes` quantization config was used during training:
2593
+ - load_in_8bit: False
2594
+ - load_in_4bit: True
2595
+ - llm_int8_threshold: 6.0
2596
+ - llm_int8_skip_modules: None
2597
+ - llm_int8_enable_fp32_cpu_offload: False
2598
+ - llm_int8_has_fp16_weight: False
2599
+ - bnb_4bit_quant_type: nf4
2600
+ - bnb_4bit_use_double_quant: True
2601
+ - bnb_4bit_compute_dtype: bfloat16
2602
+
2603
+ The following `bitsandbytes` quantization config was used during training:
2604
+ - load_in_8bit: False
2605
+ - load_in_4bit: True
2606
+ - llm_int8_threshold: 6.0
2607
+ - llm_int8_skip_modules: None
2608
+ - llm_int8_enable_fp32_cpu_offload: False
2609
+ - llm_int8_has_fp16_weight: False
2610
+ - bnb_4bit_quant_type: nf4
2611
+ - bnb_4bit_use_double_quant: True
2612
+ - bnb_4bit_compute_dtype: bfloat16
2613
+
2614
+ The following `bitsandbytes` quantization config was used during training:
2615
+ - load_in_8bit: False
2616
+ - load_in_4bit: True
2617
+ - llm_int8_threshold: 6.0
2618
+ - llm_int8_skip_modules: None
2619
+ - llm_int8_enable_fp32_cpu_offload: False
2620
+ - llm_int8_has_fp16_weight: False
2621
+ - bnb_4bit_quant_type: nf4
2622
+ - bnb_4bit_use_double_quant: True
2623
+ - bnb_4bit_compute_dtype: bfloat16
2624
+
2625
+ The following `bitsandbytes` quantization config was used during training:
2626
+ - load_in_8bit: False
2627
+ - load_in_4bit: True
2628
+ - llm_int8_threshold: 6.0
2629
+ - llm_int8_skip_modules: None
2630
+ - llm_int8_enable_fp32_cpu_offload: False
2631
+ - llm_int8_has_fp16_weight: False
2632
+ - bnb_4bit_quant_type: nf4
2633
+ - bnb_4bit_use_double_quant: True
2634
+ - bnb_4bit_compute_dtype: bfloat16
2635
+
2636
+ The following `bitsandbytes` quantization config was used during training:
2637
+ - load_in_8bit: False
2638
+ - load_in_4bit: True
2639
+ - llm_int8_threshold: 6.0
2640
+ - llm_int8_skip_modules: None
2641
+ - llm_int8_enable_fp32_cpu_offload: False
2642
+ - llm_int8_has_fp16_weight: False
2643
+ - bnb_4bit_quant_type: nf4
2644
+ - bnb_4bit_use_double_quant: True
2645
+ - bnb_4bit_compute_dtype: bfloat16
2646
+
2647
+ The following `bitsandbytes` quantization config was used during training:
2648
+ - load_in_8bit: False
2649
+ - load_in_4bit: True
2650
+ - llm_int8_threshold: 6.0
2651
+ - llm_int8_skip_modules: None
2652
+ - llm_int8_enable_fp32_cpu_offload: False
2653
+ - llm_int8_has_fp16_weight: False
2654
+ - bnb_4bit_quant_type: nf4
2655
+ - bnb_4bit_use_double_quant: True
2656
+ - bnb_4bit_compute_dtype: bfloat16
2657
+
2658
+ The following `bitsandbytes` quantization config was used during training:
2659
+ - load_in_8bit: False
2660
+ - load_in_4bit: True
2661
+ - llm_int8_threshold: 6.0
2662
+ - llm_int8_skip_modules: None
2663
+ - llm_int8_enable_fp32_cpu_offload: False
2664
+ - llm_int8_has_fp16_weight: False
2665
+ - bnb_4bit_quant_type: nf4
2666
+ - bnb_4bit_use_double_quant: True
2667
+ - bnb_4bit_compute_dtype: bfloat16
2668
+
2669
+ The following `bitsandbytes` quantization config was used during training:
2670
+ - load_in_8bit: False
2671
+ - load_in_4bit: True
2672
+ - llm_int8_threshold: 6.0
2673
+ - llm_int8_skip_modules: None
2674
+ - llm_int8_enable_fp32_cpu_offload: False
2675
+ - llm_int8_has_fp16_weight: False
2676
+ - bnb_4bit_quant_type: nf4
2677
+ - bnb_4bit_use_double_quant: True
2678
+ - bnb_4bit_compute_dtype: bfloat16
2679
+
2680
+ The following `bitsandbytes` quantization config was used during training:
2681
+ - load_in_8bit: False
2682
+ - load_in_4bit: True
2683
+ - llm_int8_threshold: 6.0
2684
+ - llm_int8_skip_modules: None
2685
+ - llm_int8_enable_fp32_cpu_offload: False
2686
+ - llm_int8_has_fp16_weight: False
2687
+ - bnb_4bit_quant_type: nf4
2688
+ - bnb_4bit_use_double_quant: True
2689
+ - bnb_4bit_compute_dtype: bfloat16
2690
+
2691
+ The following `bitsandbytes` quantization config was used during training:
2692
+ - load_in_8bit: False
2693
+ - load_in_4bit: True
2694
+ - llm_int8_threshold: 6.0
2695
+ - llm_int8_skip_modules: None
2696
+ - llm_int8_enable_fp32_cpu_offload: False
2697
+ - llm_int8_has_fp16_weight: False
2698
+ - bnb_4bit_quant_type: nf4
2699
+ - bnb_4bit_use_double_quant: True
2700
+ - bnb_4bit_compute_dtype: bfloat16
2701
+
2702
+ The following `bitsandbytes` quantization config was used during training:
2703
+ - load_in_8bit: False
2704
+ - load_in_4bit: True
2705
+ - llm_int8_threshold: 6.0
2706
+ - llm_int8_skip_modules: None
2707
+ - llm_int8_enable_fp32_cpu_offload: False
2708
+ - llm_int8_has_fp16_weight: False
2709
+ - bnb_4bit_quant_type: nf4
2710
+ - bnb_4bit_use_double_quant: True
2711
+ - bnb_4bit_compute_dtype: bfloat16
2712
+
2713
+ The following `bitsandbytes` quantization config was used during training:
2714
+ - load_in_8bit: False
2715
+ - load_in_4bit: True
2716
+ - llm_int8_threshold: 6.0
2717
+ - llm_int8_skip_modules: None
2718
+ - llm_int8_enable_fp32_cpu_offload: False
2719
+ - llm_int8_has_fp16_weight: False
2720
+ - bnb_4bit_quant_type: nf4
2721
+ - bnb_4bit_use_double_quant: True
2722
+ - bnb_4bit_compute_dtype: bfloat16
2723
+
2724
+ The following `bitsandbytes` quantization config was used during training:
2725
+ - load_in_8bit: False
2726
+ - load_in_4bit: True
2727
+ - llm_int8_threshold: 6.0
2728
+ - llm_int8_skip_modules: None
2729
+ - llm_int8_enable_fp32_cpu_offload: False
2730
+ - llm_int8_has_fp16_weight: False
2731
+ - bnb_4bit_quant_type: nf4
2732
+ - bnb_4bit_use_double_quant: True
2733
+ - bnb_4bit_compute_dtype: bfloat16
2734
+
2735
+ The following `bitsandbytes` quantization config was used during training:
2736
+ - load_in_8bit: False
2737
+ - load_in_4bit: True
2738
+ - llm_int8_threshold: 6.0
2739
+ - llm_int8_skip_modules: None
2740
+ - llm_int8_enable_fp32_cpu_offload: False
2741
+ - llm_int8_has_fp16_weight: False
2742
+ - bnb_4bit_quant_type: nf4
2743
+ - bnb_4bit_use_double_quant: True
2744
+ - bnb_4bit_compute_dtype: bfloat16
+
+ ### Framework versions
+
+ - PEFT 0.4.0
adapter_config.json ADDED
@@ -0,0 +1,21 @@
+ {
+   "auto_mapping": null,
+   "base_model_name_or_path": "Salesforce/xgen-7b-8k-base",
+   "bias": "none",
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "lora_alpha": 16,
+   "lora_dropout": 0.05,
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "r": 64,
+   "revision": null,
+   "target_modules": [
+     "up_proj",
+     "down_proj"
+   ],
+   "task_type": "CAUSAL_LM"
+ }
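A minimal sketch of attaching this LoRA adapter (rank 64, alpha 16, applied to `up_proj`/`down_proj` only) to its base model with PEFT 0.4.0; `path/to/this/repo` is a hypothetical placeholder for wherever these files are checked out, not a real hub id:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the frozen base model named in adapter_config.json; trust_remote_code
# pulls in the custom XGen tokenizer code shipped with the Salesforce repo.
base = AutoModelForCausalLM.from_pretrained(
    "Salesforce/xgen-7b-8k-base",
    trust_remote_code=True,
)

# Attach the LoRA adapter weights stored in adapter_model.bin.
model = PeftModel.from_pretrained(base, "path/to/this/repo")
model.eval()
```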
adapter_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7d3ba3dd7958d1ed8cb7ae0f27af58d19080253f81237893c125976b53623716
+ size 247509453
config.json ADDED
@@ -0,0 +1,22 @@
+ {
+   "architectures": [
+     "LlamaForCausalLM"
+   ],
+   "bos_token_id": 50256,
+   "eos_token_id": 50256,
+   "hidden_act": "silu",
+   "hidden_size": 4096,
+   "initializer_range": 0.02,
+   "intermediate_size": 11008,
+   "max_position_embeddings": 8192,
+   "model_type": "llama",
+   "num_attention_heads": 32,
+   "num_hidden_layers": 32,
+   "pad_token_id": 0,
+   "rms_norm_eps": 1e-06,
+   "tie_word_embeddings": false,
+   "torch_dtype": "float32",
+   "transformers_version": "4.29.2",
+   "use_cache": true,
+   "vocab_size": 51200
+ }
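A quick sanity check on the config above: a back-of-the-envelope parameter count, sketched under the usual LLaMA-style layout (untied embeddings, SwiGLU MLP, RMSNorm):

```python
# Approximate parameter count implied by config.json.
vocab, hidden, inter, layers = 51200, 4096, 11008, 32

embeddings = vocab * hidden            # model.embed_tokens
lm_head = vocab * hidden               # untied output projection
attn = 4 * hidden * hidden             # q/k/v/o projections per layer
mlp = 3 * hidden * inter               # gate/up/down projections per layer
norms = 2 * hidden                     # two RMSNorms per layer

total = embeddings + lm_head + layers * (attn + mlp + norms) + hidden  # + final norm
print(f"{total:,}")  # 6,895,702,016 -> roughly 6.9B parameters
```

At two bytes per parameter that is about 13.79 GB, which lines up with the `total_size` recorded in the shard index below, so the shards appear to be stored in 16-bit precision despite the `torch_dtype: float32` field.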
generation_config.json ADDED
@@ -0,0 +1,6 @@
+ {
+   "_from_model_config": true,
+   "bos_token_id": 50256,
+   "eos_token_id": 50256,
+   "transformers_version": "4.32.0.dev0"
+ }
pytorch_model-00001-of-00002.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7e359d2d05c32603542c47e8683fd4ae0123713a7f5ac49fad7e44f6b60eef64
+ size 9953551450
pytorch_model-00002-of-00002.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:db2b1504413b09c7b553a1486e8035167c27de4f98660bfec18afcd872b65989
+ size 3837975899
pytorch_model.bin.index.json ADDED
@@ -0,0 +1,330 @@
+ {
+   "metadata": {
+     "total_size": 13791412224
+   },
+   "weight_map": {
+     "lm_head.weight": "pytorch_model-00002-of-00002.bin",
+     "model.embed_tokens.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.0.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.0.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.0.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.0.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.0.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.0.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.0.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.0.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.0.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.0.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.1.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.1.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.1.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.1.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.1.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.1.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.1.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.1.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.1.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.1.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.10.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.10.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.10.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.10.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.10.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.10.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.10.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.10.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.10.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.10.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.11.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.11.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.11.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.11.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.11.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.11.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.11.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.11.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.11.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.11.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.12.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.12.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.12.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.12.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.12.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.12.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.12.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.12.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.12.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.12.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.13.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.13.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.13.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.13.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.13.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.13.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.13.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.13.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.13.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.13.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.14.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.14.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.14.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.14.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.14.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.14.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.14.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.14.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.14.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.14.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.15.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.15.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.15.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.15.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.15.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.15.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.15.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.15.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.15.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.15.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.16.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.16.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.16.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.16.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.16.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.16.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.16.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.16.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.16.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.16.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.17.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.17.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.17.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.17.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.17.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.17.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.17.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.17.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.17.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.17.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.18.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.18.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.18.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.18.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.18.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.18.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.18.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.18.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.18.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.18.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.19.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.19.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.19.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.19.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.19.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.19.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.19.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.19.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.19.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.19.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.2.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.2.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.2.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.2.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.2.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.2.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.2.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.2.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.2.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.2.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.20.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.20.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.20.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.20.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.20.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.20.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.20.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.20.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.20.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.20.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.21.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.21.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.21.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.21.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.21.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.21.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.21.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.21.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.21.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.21.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.22.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.22.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.22.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.22.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.22.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.22.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.22.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.22.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.22.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.22.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.23.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.23.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.23.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.23.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.23.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.23.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.23.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.23.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.23.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.23.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.24.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.24.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.24.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.24.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.24.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.24.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.24.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.24.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.24.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
+     "model.layers.24.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.25.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.25.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.25.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.25.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.25.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.25.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.25.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.25.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.25.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
+     "model.layers.25.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.26.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.26.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.26.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.26.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.26.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.26.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.26.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.26.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.26.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
+     "model.layers.26.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.27.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.27.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.27.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.27.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.27.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.27.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.27.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.27.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.27.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
+     "model.layers.27.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.28.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.28.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.28.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.28.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.28.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.28.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.28.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.28.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.28.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
+     "model.layers.28.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.29.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.29.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.29.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.29.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.29.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.29.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.29.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.29.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.29.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
+     "model.layers.29.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.3.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.3.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.3.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.3.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.3.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.3.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.3.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.3.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.3.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.3.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.30.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.30.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.30.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.30.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.30.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.30.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.30.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.30.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.30.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
+     "model.layers.30.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.31.input_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.31.mlp.down_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.31.mlp.gate_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.31.mlp.up_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.31.post_attention_layernorm.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.31.self_attn.k_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.31.self_attn.o_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.31.self_attn.q_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.31.self_attn.rotary_emb.inv_freq": "pytorch_model-00002-of-00002.bin",
+     "model.layers.31.self_attn.v_proj.weight": "pytorch_model-00002-of-00002.bin",
+     "model.layers.4.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.4.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.4.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.4.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.4.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.4.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.4.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.4.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.4.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.4.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.5.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.5.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.5.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.5.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.5.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.5.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.5.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.5.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.5.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.5.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.6.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.6.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.6.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.6.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.6.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.6.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.6.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.6.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.6.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.6.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.7.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.7.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.7.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.7.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.7.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.7.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.7.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.7.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.7.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.7.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.8.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.8.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.8.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.8.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.8.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.8.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.8.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.8.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.8.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.8.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.9.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.9.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.9.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.9.mlp.up_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.9.post_attention_layernorm.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.9.self_attn.k_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.9.self_attn.o_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.9.self_attn.q_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.layers.9.self_attn.rotary_emb.inv_freq": "pytorch_model-00001-of-00002.bin",
+     "model.layers.9.self_attn.v_proj.weight": "pytorch_model-00001-of-00002.bin",
+     "model.norm.weight": "pytorch_model-00002-of-00002.bin"
+   }
+ }
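The index above is what `from_pretrained` consults to locate each tensor's shard. A hedged sketch of resolving one entry by hand, using the file names shipped in this repo:

```python
import json
import torch

with open("pytorch_model.bin.index.json") as f:
    index = json.load(f)

# Look up which shard holds a given parameter, then load only that shard.
name = "model.layers.31.mlp.down_proj.weight"
shard_file = index["weight_map"][name]  # -> "pytorch_model-00002-of-00002.bin"
state = torch.load(shard_file, map_location="cpu")
print(state[name].shape)  # expected (4096, 11008), i.e. (hidden, intermediate)
```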
tokenization_xgen.py ADDED
@@ -0,0 +1,234 @@
+ # Copyright (c) 2023, salesforce.com, inc.
+ # All rights reserved.
+ # SPDX-License-Identifier: Apache-2.0
+ # For full license text, see the LICENSE file in the repo root or https://opensource.org/licenses/Apache-2.0
+ """Tokenization classes for xgen."""
+
+ from typing import List, Optional
+
+ from transformers.tokenization_utils import AddedToken, PreTrainedTokenizer
+ from transformers.utils import logging
+
+ try:
+     import tiktoken
+ except ModuleNotFoundError as e:
+     raise ModuleNotFoundError("XGen requires the installation of tiktoken. Please install it via `pip install tiktoken`.") from e
+
+
+ logger = logging.get_logger(__name__)
+
+ MAX_MODEL_INPUT_SIZES = {
+     "Salesforce/xgen-7b-4k-base": 4096,
+     "Salesforce/xgen-7b-8k-base": 8192,
+     "Salesforce/xgen-7b-4k-inst": 4096,
+     "Salesforce/xgen-7b-8k-inst": 8192
+ }
+
+
+ def tiktoken_tokenizer(base="gpt2", pad_token=None, add_special=True):
+     if not add_special:
+         return tiktoken.get_encoding(base)
+
+     def include_whitespace(n_min=2, n_max=20):
+         whitespaces = [" " * n for n in reversed(range(n_min, n_max))]
+         return whitespaces
+
+     def include_tabs(n_min=2, n_max=20):
+         tabs = ["\t" * n for n in reversed(range(n_min, n_max))]
+         return tabs
+
+     def include_fim_tokens():
+         fim_tokens = [
+             "<fim_prefix>",
+             "<fim_middle>",
+             "<fim_suffix>",
+             "<fim_pad>",
+             "<filename>",
+             "<gh_stars>",
+             "<issue_start>",
+             "<issue_comment>",
+             "<issue_closed>",
+             "<jupyter_start>",
+             "<jupyter_text>",
+             "<jupyter_code>",
+             "<jupyter_output>",
+             "<empty_output>",
+             "<commit_before>",
+             "<commit_msg>",
+             "<commit_after>",
+             "<reponame>"
+         ]
+         return fim_tokens
+
+     add_whitespaces = include_whitespace(n_min=2, n_max=32)
+     add_tabs = include_tabs(n_min=2, n_max=10)
+     fim_tokens = include_fim_tokens()
+
+     tokenizer = tiktoken.get_encoding(base)
+
+     idx = tokenizer.n_vocab
+
+     bpe_ranks = tokenizer._mergeable_ranks
+
+     for wsp in add_whitespaces:
+         bpe_ranks[bytes(wsp, 'ascii')] = idx
+         idx += 1
+     for t in add_tabs:
+         bpe_ranks[bytes(t, 'ascii')] = idx
+         idx += 1
+
+     special_tokens = dict()
+
+     for sp in fim_tokens:
+         special_tokens[sp] = idx
+         idx += 1
+
+     if pad_token and pad_token not in tokenizer._special_tokens and pad_token not in special_tokens:
+         special_tokens[pad_token] = idx
+         idx += 1
+     # In production, load the arguments directly instead of accessing private attributes
+     # See openai_public.py for examples of arguments for specific encodings
+     enc = tiktoken.Encoding(
+         # If you're changing the set of special tokens, make sure to use a different name
+         # It should be clear from the name what behaviour to expect.
+         name=base.replace("base", "im"),
+         pat_str=tokenizer._pat_str,
+         mergeable_ranks=bpe_ranks,
+         special_tokens={
+             **tokenizer._special_tokens,
+             **special_tokens
+         }
+     )
+     return enc
+
+
+ class XgenTokenizer(PreTrainedTokenizer):
+     """
+     Construct an Xgen tokenizer, based on byte-level Byte-Pair-Encoding.
+     Args:
+         vocab_file (`str`):
+             Path to the vocabulary file.
+     """
+     max_model_input_sizes = MAX_MODEL_INPUT_SIZES
+     model_input_names = ["input_ids", "attention_mask"]
+
+     def __init__(
+         self,
+         pad_token=None,
+         eos_token="<|endoftext|>",
+         add_eos_token=False,
+         add_special_tokens=True,
+         **kwargs,
+     ):
+         pad_token_added = AddedToken(pad_token, lstrip=False, rstrip=False) if isinstance(pad_token, str) else pad_token
+         eos_token_added = AddedToken(eos_token, lstrip=False, rstrip=False) if isinstance(eos_token, str) else eos_token
+         super().__init__(
+             pad_token=pad_token_added,
+             eos_token=eos_token_added,
+             add_eos_token=add_eos_token,
+             add_special_tokens=add_special_tokens,
+             **kwargs,
+         )
+         self.add_eos_token = add_eos_token
+         self.encoder = tiktoken_tokenizer(base="gpt2", pad_token=pad_token, add_special=add_special_tokens)
+
+     @property
+     def vocab_size(self):
+         """Returns the vocab size."""
+         return self.encoder.n_vocab
+
+     def get_vocab(self):
+         """Returns the vocab as a dict."""
+         vocab = {self.encoder.decode_single_token_bytes(i): i for i in range(self.vocab_size)}
+         return vocab
+
+     def _tokenize(self, text, **kwargs):
+         """Tokenizes text; the tiktoken encoder returns token ids directly."""
+         return self.encoder.encode(text, allowed_special="all")
+
+     def _convert_token_to_id(self, token):
+         """Converts a token (str) to an id using the vocab."""
+         if isinstance(token, str):
+             return self.encoder.encode_single_token(token)
+         else:
+             return token
+
+     def _convert_id_to_token(self, index):
+         """Converts an index (integer) to a token (str) using the vocab."""
+         return self.encoder.decode_single_token_bytes(index).decode("utf-8")
+
+     def _decode(self, token_ids: List[int], skip_special_tokens: bool = False, **kwargs):
+         if skip_special_tokens:
+             token_ids = [t for t in token_ids if t not in self.all_special_ids]
+         return self.encoder.decode(token_ids)
+
+     def build_inputs_with_special_tokens(self, token_ids_0, token_ids_1=None) -> List[int]:
+         """Build model inputs from a sequence by appending eos_token_id."""
+         eos_token_id = [self.eos_token_id] if self.add_eos_token else []
+
+         output = token_ids_0 + eos_token_id
+
+         if token_ids_1 is not None:
+             output = output + token_ids_1 + eos_token_id
+
+         return output
+
+     def get_special_tokens_mask(
+         self, token_ids_0: List[int], token_ids_1: Optional[List[int]] = None,
+         already_has_special_tokens: bool = False
+     ) -> List[int]:
+         """
+         Retrieve sequence ids from a token list that has no special tokens added. This method is called when adding
+         special tokens using the tokenizer `prepare_for_model` method.
+         Args:
+             token_ids_0 (`List[int]`):
+                 List of IDs.
+             token_ids_1 (`List[int]`, *optional*):
+                 Optional second list of IDs for sequence pairs.
+             already_has_special_tokens (`bool`, *optional*, defaults to `False`):
+                 Whether the token list is already formatted with special tokens for the model.
+         Returns:
+             `List[int]`: A list of integers in the range [0, 1]: 1 for a special token, 0 for a sequence token.
+         """
+         if already_has_special_tokens:
+             return super().get_special_tokens_mask(
+                 token_ids_0=token_ids_0, token_ids_1=token_ids_1, already_has_special_tokens=True
+             )
+
+         eos_token_id = [1] if self.add_eos_token else []
+
+         if token_ids_1 is None:
+             return ([0] * len(token_ids_0)) + eos_token_id
+         return ([0] * len(token_ids_0)) + eos_token_id + ([0] * len(token_ids_1)) + eos_token_id
+
+     def create_token_type_ids_from_sequences(
+         self, token_ids_0: List[int], token_ids_1: Optional[List[int]] = None
+     ) -> List[int]:
+         """
+         Creates a mask from the two sequences passed, to be used in a sequence-pair classification task. The
+         sequence pair mask has the following format:
+         ```
+         0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
+         | first sequence    | second sequence |
+         ```
+         If token_ids_1 is None, only the first portion of the mask (0s) is returned.
+         Args:
+             token_ids_0 (`List[int]`):
+                 List of ids.
+             token_ids_1 (`List[int]`, *optional*):
+                 Optional second list of IDs for sequence pairs.
+         Returns:
+             `List[int]`: List of [token type IDs](../glossary#token-type-ids) according to the given sequence(s).
+         """
+         eos_token_id = [self.eos_token_id] if self.add_eos_token else []
+
+         output = [0] * len(token_ids_0 + eos_token_id)
+
+         if token_ids_1 is not None:
+             output += [1] * len(token_ids_1 + eos_token_id)
+
+         return output
+
+     # The tokenizer has no vocab file to save; the tiktoken encoding is built in code.
+     def save_vocabulary(self, save_directory: str, filename_prefix: Optional[str] = None):
+         return ()
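A small usage sketch of the tiktoken-based encoder defined above (run from a directory containing `tokenization_xgen.py`, with `tiktoken` installed):

```python
from tokenization_xgen import tiktoken_tokenizer

# Extended gpt2 encoding: run-of-whitespace tokens, tab tokens, and FIM
# special tokens are appended above the base vocabulary.
enc = tiktoken_tokenizer(base="gpt2", add_special=True)

ids = enc.encode("def f():\n        return 1")
print(ids)
print(enc.decode(ids))  # round-trips to the original string
```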
tokenizer_config.json ADDED
@@ -0,0 +1,12 @@
+ {
+   "add_eos_token": false,
+   "add_special_tokens": true,
+   "clean_up_tokenization_spaces": true,
+   "eos_token": "<|endoftext|>",
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": null,
+   "tokenizer_class": "XgenTokenizer",
+   "auto_map": {
+     "AutoTokenizer": ["tokenization_xgen.XgenTokenizer", null]
+   }
+ }
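Putting the pieces together, a minimal end-to-end sketch (the repo path is a hypothetical placeholder; `trust_remote_code=True` is needed because `auto_map` points at the bundled `tokenization_xgen.XgenTokenizer`, and `device_map="auto"` assumes `accelerate` is installed):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "path/to/this/repo"  # placeholder for a local checkout of these files

tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tok("The capital of France is", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32, eos_token_id=50256)
print(tok.decode(out[0], skip_special_tokens=True))
```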