henryen/OriGen

henryen committed on Sep 6, 2024

Commit

96364e0

1 Parent(s): 1e9933d

update readme

Files changed (2)

README.md +63 -4
figures/evaluation.png +0 -0

README.md CHANGED

@@ -5,7 +5,66 @@ library_name: peft
 ---
-### Model Sources
-<!-- Provide the basic links for the model. -->
-- **Repository:** https://github.com/pku-liang/OriGen
-- **Paper:** https://arxiv.org/abs/2407.16237

 ---
+# OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection
+### Introduction
+OriGen is a LoRA adapter fine-tuned for Verilog code generation. It is trained on top of DeepSeek Coder 7B using datasets generated through code-to-code augmentation and self-reflection.
+**Repository:** [pku-liang/OriGen](https://github.com/pku-liang/OriGen)
+### Evaluation Results
+<img src="figures/evaluation.png" alt="evaluation" width="1000"/>
+### Quick Start
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
+import torch
+from peft import PeftModel
+model_name = "deepseek-ai/deepseek-coder-7b-instruct-v1.5"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    low_cpu_mem_usage=True,
+    torch_dtype=torch.float16,
+    attn_implementation="flash_attention_2",
+    device_map="auto",
+)
+model = PeftModel.from_pretrained(model, model_id="henryen/OriGen")
+model.eval()
+streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
+prompt = "### Instruction: Please act as a professional Verilog designer. and provide Verilog code based on the given instruction. Generate a concise Verilog module for a 8 bit full adder, don't include any unnecessary code.\n### Response: "
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)  # follow wherever device_map="auto" placed the model
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=1000,
+    do_sample=False,
+    temperature=0,
+    eos_token_id=tokenizer.eos_token_id,
+    pad_token_id=tokenizer.pad_token_id,
+    streamer=streamer
+)
+```
+### Paper
+**arXiv:** https://arxiv.org/abs/2407.16237
+Please cite our paper if you use this model.
+```
+@article{2024origen,
+  title={OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection},
+  author={Cui, Fan and Yin, Chenyang and Zhou, Kexing and Xiao, Youwei and Sun, Guangyu and Xu, Qiang and Guo, Qipeng and Song, Demin and Lin, Dahua and Zhang, Xingcheng and others},
+  journal={arXiv preprint arXiv:2407.16237},
+  year={2024}
+}
+```
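
The Quick Start prompt in the diff above wraps the task in an `### Instruction: ... ### Response: ` template. The sketch below is not part of the commit. It shows one way to rebuild that prompt for other tasks and to pull `module ... endmodule` spans out of generated text. The helper names `build_prompt` and `extract_modules` are hypothetical, and the regex assumes the model emits plain Verilog modules without nested `module` keywords in comments.

```python
import re
from typing import List

# Template pieces copied verbatim from the Quick Start prompt string.
INSTRUCTION_PREFIX = (
    "### Instruction: Please act as a professional Verilog designer. "
    "and provide Verilog code based on the given instruction. "
)
RESPONSE_SUFFIX = "\n### Response: "


def build_prompt(task: str) -> str:
    """Wrap a task description in the instruction/response template (hypothetical helper)."""
    return f"{INSTRUCTION_PREFIX}{task.strip()}{RESPONSE_SUFFIX}"


def extract_modules(generated: str) -> List[str]:
    """Return every `module ... endmodule` span found in the generated text (hypothetical helper)."""
    return re.findall(r"\bmodule\b.*?\bendmodule\b", generated, flags=re.DOTALL)
```

Passing `build_prompt(...)` to the tokenizer in place of the hard-coded `prompt` string should leave the rest of the Quick Start unchanged. `extract_modules` is meant to run on the decoded output, for example after `tokenizer.decode(outputs[0], skip_special_tokens=True)`.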

figures/evaluation.png ADDED