shailja committed — Commit 0f48bb0 · Parent: 29c4f4e

Create README.md
---
pipeline_tag: text-generation
inference: true
widget:
- text: module display_hello_word
  example_title: Hello world
  group: Verilog
license: bigcode-openrail-m
datasets:
- shailja/Verilog_GitHub
library_name: transformers
tags:
- code
model-index:
- name: VeriGen
  results:
  - task:
      type: text-generation
    dataset:
      type: openai_humaneval
      name: HumanEval (Prompted)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: openai_humaneval
      name: HumanEval
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: mbpp
      name: MBPP
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: ds1000
      name: DS-1000 (Overall Completion)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (C++)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (C#)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (D)
    metrics:
    - name: pass@1
      type: pass@1
      value: 0.1357
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (Go)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (Java)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (Julia)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (JavaScript)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (Lua)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (PHP)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (Perl)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (Python)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (R)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (Ruby)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (Racket)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (Rust)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (Scala)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (Bash)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (Swift)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (TypeScript)
    metrics:
    - name: pass@1
      type: pass@1
      value:
      verified: false
extra_gated_prompt: >-
  ## Model License Agreement

  Please read the BigCode [OpenRAIL-M
  license](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement)
  agreement before accepting it.
extra_gated_fields:
  I accept the above license agreement, and will use the Model complying with the set of use restrictions and sharing requirements: checkbox
---

# VeriGen

## Table of Contents

1. [Model Summary](#model-summary)
2. [Use](#use)
3. [Limitations](#limitations)
4. [Training](#training)
5. [License](#license)
6. [Citation](#citation)

## Model Summary

VeriGen is a family of models for Verilog code generation, obtained by fine-tuning the [SalesForce CodeGen](https://github.com/salesforce/CodeGen) models on Verilog source code collected from GitHub ([shailja/Verilog_GitHub](https://huggingface.co/datasets/shailja/Verilog_GitHub)) and from Verilog textbooks.

- **Repository:** [shailja-thakur/VGen](https://github.com/shailja-thakur/VGen)
- **Baseline LLM:** [SalesForce/CodeGen](https://github.com/salesforce/CodeGen)
- **Paper:** [Benchmarking Large Language Models for Automated Verilog RTL Code Generation](https://arxiv.org/abs/2212.11140)
- **Point of Contact:** [contact@shailja](mailto:[email protected])
- **Languages:** Verilog (Hardware Description Language)


## Use

### Intended use

The model was trained on Verilog from GitHub and textbooks. As such, it is _not_ an instruction model, and commands like "Write a module that implements a 2-to-1 Mux." do not work well. However, adding a partial module header such as "module mux" alongside the descriptive text in the prompt turns it into a capable Verilog teaching assistant.

**Feel free to share your generations in the Community tab!**

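For instance, seeding the completion with a comment plus a partial module header works far better than a bare instruction. A small sketch of the two prompt styles (the exact wording is illustrative):

```python
# Instruction-style prompt; this style tends NOT to work with this model.
instruction_only = "Write a module that implements a 2-to-1 Mux."

# Comment describing the intent, plus a partial module header for the
# model to complete; this style works much better.
primed_prompt = (
    "// 2-to-1 multiplexer: out = sel ? b : a\n"
    "module mux"
)
print(primed_prompt)
```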
### Generation
```python
# pip install -q transformers
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Prompt: a comment plus a partial module header for the model to complete
prompt = "//module half adder "
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load model and tokenizer
model_name = "shailja/fine-tuned-codegen-2B-Verilog"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

# Sample (do_sample=True so that temperature and top_p take effect)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
sample = model.generate(input_ids, max_length=128, do_sample=True, temperature=0.5, top_p=0.9)

# Truncate the generation at the first "endmodule" and re-append it
print(tokenizer.decode(sample[0], truncate_before_pattern=[r"endmodule"]) + "endmodule")
```
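The pass@1 entries in the metadata above belong to the pass@k family of metrics used in the paper's evaluation, where n completions are sampled per problem and c of them are functionally correct. For reference, a minimal sketch of the standard unbiased pass@k estimator (illustrative; this helper is not part of the repository):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generations of which c are
    correct, passes the tests."""
    if n - c < k:
        # Every size-k subset must contain at least one correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 10 generations per problem, 3 functionally correct:
print(round(pass_at_k(10, 3, 1), 3))  # pass@1 = c/n = 0.3
```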

### Attribution & Other Requirements

The pretraining dataset for the model was not filtered for permissive licenses only. Nevertheless, the model can generate source code verbatim from the dataset, and that code's license might require attribution and/or impose other specific requirements that must be respected.

## Limitations

The model has been trained on Verilog source code from open sources. The predominant natural language in source code is English, although other languages are also present. As such, the model can generate Verilog snippets given some context, but the generated code is not guaranteed to work as intended: it can be inefficient and may contain bugs or exploits. See [the paper](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) for an in-depth discussion of the model's limitations.

## Training

### Model

- **Architecture:** CodeGen-style autoregressive transformer
- **Pretraining steps:** 150k
- **Pretraining tokens:** ~72B
- **Precision:** fp16

### Hardware

- **GPUs:** 3 Tesla A100
- **Training time:** 8 days

## License

The model is licensed under the BigCode OpenRAIL-M v1 license agreement. You can find the full agreement [here](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement).

## Citation

```
@misc{https://doi.org/10.48550/arxiv.2212.11140,
  doi = {10.48550/ARXIV.2212.11140},
  url = {https://arxiv.org/abs/2212.11140},
  author = {Thakur, Shailja and Ahmad, Baleegh and Fan, Zhenxing and Pearce, Hammond and Tan, Benjamin and Karri, Ramesh and Dolan-Gavitt, Brendan and Garg, Siddharth},
  title = {Benchmarking Large Language Models for Automated Verilog RTL Code Generation},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
```