Commit a94002e (verified) · committed by royleibov · 1 parent: cc8acc6

Add ZipNN stuff

Files changed (1): README.md (+113 −55)
README.md CHANGED
@@ -5,13 +5,8 @@ license: apache-2.0
  datasets:
  - codeparrot/github-code-clean
  - bigcode/starcoderdata
- # - Stackexchange
- # - CommonCrawl
  - open-web-math/open-web-math
  - math-ai/StackMathQA
- # - Arxiv
- # - Wikipedia
- # - conceptofmind/FLAN_2022 # Original link is broken, we used IBM's filtered version
  metrics:
  - code_eval
  library_name: transformers
@@ -24,18 +19,18 @@ model-index:
  - task:
  type: text-generation
  dataset:
- type: bigcode/humanevalpack
- name: HumanEvalSynthesis (Python)
+ type: bigcode/humanevalpack
+ name: HumanEvalSynthesis (Python)
  metrics:
  - name: pass@1
  type: pass@1
- value: 36.0
+ value: 36
  verified: false
  - task:
  type: text-generation
  dataset:
- type: bigcode/humanevalpack
- name: HumanEvalSynthesis (Average)
+ type: bigcode/humanevalpack
+ name: HumanEvalSynthesis (Average)
  metrics:
  - name: pass@1
  type: pass@1
@@ -44,8 +39,8 @@ model-index:
  - task:
  type: text-generation
  dataset:
- type: bigcode/humanevalpack
- name: HumanEvalExplain (Average)
+ type: bigcode/humanevalpack
+ name: HumanEvalExplain (Average)
  metrics:
  - name: pass@1
  type: pass@1
@@ -54,8 +49,8 @@ model-index:
  - task:
  type: text-generation
  dataset:
- type: bigcode/humanevalpack
- name: HumanEvalFix (Average)
+ type: bigcode/humanevalpack
+ name: HumanEvalFix (Average)
  metrics:
  - name: pass@1
  type: pass@1
@@ -64,135 +59,193 @@ model-index:
  - task:
  type: text-generation
  dataset:
- type: repoqa
- name: RepoQA (Python@16K)
+ type: repoqa
+ name: RepoQA (Python@16K)
  metrics:
  - name: pass@1 (thresh=0.5)
  type: pass@1 (thresh=0.5)
- value: 40.0
+ value: 40
  verified: false
  - task:
  type: text-generation
  dataset:
- type: repoqa
- name: RepoQA (C++@16K)
+ type: repoqa
+ name: RepoQA (C++@16K)
  metrics:
  - name: pass@1 (thresh=0.5)
  type: pass@1 (thresh=0.5)
- value: 36.0
+ value: 36
  verified: false
  - task:
  type: text-generation
  dataset:
- type: repoqa
- name: RepoQA (Java@16K)
+ type: repoqa
+ name: RepoQA (Java@16K)
  metrics:
  - name: pass@1 (thresh=0.5)
  type: pass@1 (thresh=0.5)
- value: 37.0
+ value: 37
  verified: false
  - task:
  type: text-generation
  dataset:
- type: repoqa
- name: RepoQA (TypeScript@16K)
+ type: repoqa
+ name: RepoQA (TypeScript@16K)
  metrics:
  - name: pass@1 (thresh=0.5)
  type: pass@1 (thresh=0.5)
- value: 27.0
+ value: 27
  verified: false
  - task:
  type: text-generation
  dataset:
- type: repoqa
- name: RepoQA (Rust@16K)
+ type: repoqa
+ name: RepoQA (Rust@16K)
  metrics:
  - name: pass@1 (thresh=0.5)
  type: pass@1 (thresh=0.5)
- value: 29.0
+ value: 29
  verified: false
  - task:
  type: text-generation
  dataset:
- type: lcc
- name: LCC (Balanced)
+ type: lcc
+ name: LCC (Balanced)
  metrics:
- - name: Exact Match@4K
+ - name: Exact Match@4K
  type: Exact Match@4K
  value: 54.6
  verified: false
  - task:
  type: text-generation
  dataset:
- type: lcc
- name: LCC (Balanced)
+ type: lcc
+ name: LCC (Balanced)
  metrics:
- - name: Exact Match@8K
+ - name: Exact Match@8K
  type: Exact Match@8K
  value: 56.8
  verified: false
  - task:
  type: text-generation
  dataset:
- type: lcc
- name: LCC (Balanced)
+ type: lcc
+ name: LCC (Balanced)
  metrics:
- - name: Exact Match@16K
+ - name: Exact Match@16K
  type: Exact Match@16K
  value: 52.2
  verified: false
  - task:
  type: text-generation
  dataset:
- type: lcc
- name: LCC (Balanced)
+ type: lcc
+ name: LCC (Balanced)
  metrics:
- - name: Exact Match@32K
+ - name: Exact Match@32K
  type: Exact Match@32K
  value: 57.8
  verified: false
  - task:
  type: text-generation
  dataset:
- type: repobench
- name: RepoBench-P (Balanced)
+ type: repobench
+ name: RepoBench-P (Balanced)
  metrics:
- - name: Exact Match@4K
+ - name: Exact Match@4K
  type: Exact Match@4K
  value: 39.8
  verified: false
  - task:
  type: text-generation
  dataset:
- type: repobench
- name: RepoBench-P (Balanced)
+ type: repobench
+ name: RepoBench-P (Balanced)
  metrics:
- - name: Exact Match@8K
+ - name: Exact Match@8K
  type: Exact Match@8K
  value: 46.8
  verified: false
  - task:
  type: text-generation
  dataset:
- type: repobench
- name: RepoBench-P (Balanced)
+ type: repobench
+ name: RepoBench-P (Balanced)
  metrics:
- - name: Exact Match@16K
+ - name: Exact Match@16K
  type: Exact Match@16K
  value: 43.1
  verified: false
  - task:
  type: text-generation
  dataset:
- type: repobench
- name: RepoBench-Pn(Balanced)
+ type: repobench
+ name: RepoBench-P (Balanced)
  metrics:
- - name: Exact Match@32K
+ - name: Exact Match@32K
  type: Exact Match@32K
  value: 45.3
  verified: false
+ base_model:
+ - ibm-granite/granite-3b-code-base-128k
  ---
 
+ # Disclaimer and Requirements
+
+ This model is a clone of [**ibm-granite/granite-3b-code-base-128k**](https://huggingface.co/ibm-granite/granite-3b-code-base-128k) compressed using ZipNN. Lossless compression to 67% of its original size saves ~3 GB of storage and potentially ~2 TB of data transfer **monthly**.
+
+ ### Requirements
+
+ To use this model, ZipNN must be installed:
+ ```bash
+ pip install zipnn
+ ```
+ ### Use This Model
+ ```python
+ # Use a pipeline as a high-level helper
+ from transformers import pipeline
+ from zipnn import zipnn_hf
+
+ zipnn_hf()
+
+ messages = [
+     {"role": "user", "content": "Who are you?"},
+ ]
+ pipe = pipeline("text-generation", model="royleibov/granite-3b-code-base-128k-ZipNN-Compressed")
+ pipe(messages)
+ ```
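+ Since Granite-3B-Code-Base-128K is a base model rather than a chat-tuned one, plain text completion may behave more predictably than chat-style messages. A minimal sketch on top of the pipeline above (the prompt and generation settings are illustrative assumptions, not part of the original card):
+ ```python
+ # Plain code completion with the same pipeline (illustrative prompt and settings)
+ completion = pipe("def fibonacci(n):", max_new_tokens=64, do_sample=False)
+ print(completion[0]["generated_text"])
+ ```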
+ ```python
+ # Load model directly
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from zipnn import zipnn_hf
+
+ zipnn_hf()
+
+ model = AutoModelForCausalLM.from_pretrained(
+     "royleibov/granite-3b-code-base-128k-ZipNN-Compressed",
+     device_map="cuda",
+     torch_dtype="auto",
+     trust_remote_code=True,
+ )
+ tokenizer = AutoTokenizer.from_pretrained("royleibov/granite-3b-code-base-128k-ZipNN-Compressed")
+ ```
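+ A short generation loop on top of the snippet above might look as follows (a sketch; the prompt and decoding settings are assumptions, not part of the original card):
+ ```python
+ # Tokenize an illustrative prompt, generate greedily, and decode
+ inputs = tokenizer("def quicksort(arr):", return_tensors="pt").to(model.device)
+ with torch.no_grad():
+     output_ids = model.generate(**inputs, max_new_tokens=64)
+ print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
+ ```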
+ ### ZipNN
+ ZipNN also lets you seamlessly save local disk space in your cache after the model is downloaded.
+
+ To compress the cached model, simply run:
+ ```bash
+ python zipnn_compress_path.py safetensors --model royleibov/granite-3b-code-base-128k-ZipNN-Compressed --hf_cache
+ ```
+
+ The model will be decompressed automatically and safely as long as `zipnn_hf()` is added at the top of the file, as in the [example above](#use-this-model).
+
+ To decompress manually, simply run:
+ ```bash
+ python zipnn_decompress_path.py --model royleibov/granite-3b-code-base-128k-ZipNN-Compressed --hf_cache
+ ```
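+ To check how much cache space the model actually occupies before and after compression, one option is the cache scanner in `huggingface_hub` (a sketch; this is standard Hugging Face tooling, not part of ZipNN):
+ ```python
+ from huggingface_hub import scan_cache_dir
+
+ # Report the on-disk size of this repo's cached files
+ for repo in scan_cache_dir().repos:
+     if repo.repo_id == "royleibov/granite-3b-code-base-128k-ZipNN-Compressed":
+         print(f"{repo.repo_id}: {repo.size_on_disk / 1e9:.2f} GB on disk")
+ ```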
+
+
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62cd5057674cdb524450093d/1hzxoPwqkBJXshKVVe6_9.png)
 
  # Granite-3B-Code-Base-128K
@@ -217,11 +270,16 @@ This is a simple example of how to use **Granite-3B-Code-Base-128K** model.
  ```python
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer
+ from zipnn import zipnn_hf
+
+ zipnn_hf()
+
  device = "cuda" # or "cpu"
- model_path = "ibm-granite/granite-3b-code-base-128k"
+ model_path = "royleibov/granite-3b-code-base-128k-ZipNN-Compressed"
  tokenizer = AutoTokenizer.from_pretrained(model_path)
  # drop device_map if running on CPU
  model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
+
  model.eval()
  # change input text as desired
  input_text = "def generate():"
@@ -246,4 +304,4 @@ Starting from the base Granite model, this model was further pretrained on repos
  We train the Granite Code models using two of IBM's supercomputing clusters, Vela and Blue Vela, outfitted with NVIDIA A100 and H100 GPUs, respectively. These clusters provide a scalable and efficient infrastructure for training our models over thousands of GPUs.
 
  ## Ethical Considerations and Limitations
- The use of Large Language Models involves risks and ethical considerations people must be aware of. Regarding code generation, caution is urged against complete reliance on specific code models for crucial decisions or impactful information as the generated code is not guaranteed to work as intended. **Granite-3B-Code-Base-128K** model is not the exception in this regard. Even though this model is suited for multiple code-related tasks, it has not undergone any safety alignment, there it may produce problematic outputs. Additionally, it remains uncertain whether smaller models might exhibit increased susceptibility to hallucination in generation scenarios by copying source code verbatim from the training dataset due to their reduced sizes and memorization capacities. This aspect is currently an active area of research, and we anticipate more rigorous exploration, comprehension, and mitigations in this domain. Regarding ethics, a latent risk associated with all Large Language Models is their malicious utilization. We urge the community to use **Granite-3B-Code-Base-128K** model with ethical intentions and in a responsible way.
+ The use of Large Language Models involves risks and ethical considerations people must be aware of. Regarding code generation, caution is urged against complete reliance on specific code models for crucial decisions or impactful information, as the generated code is not guaranteed to work as intended. The **Granite-3B-Code-Base-128K** model is no exception in this regard. Even though this model is suited for multiple code-related tasks, it has not undergone any safety alignment, so it may produce problematic outputs. Additionally, it remains uncertain whether smaller models might be more susceptible to hallucination in generation scenarios, copying source code verbatim from the training dataset, due to their reduced size and memorization capacity. This is an active area of research, and we anticipate more rigorous exploration, comprehension, and mitigations in this domain. Regarding ethics, a latent risk associated with all Large Language Models is their malicious utilization. We urge the community to use the **Granite-3B-Code-Base-128K** model with ethical intentions and in a responsible way.