---
datasets:
- HuggingFaceH4/CodeAlpaca_20K
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- code
- LLaMa2
---

# LLaMaCoder

## Model Description

`LLaMaCoder` is based on the Llama 2 7B language model, fine-tuned using LoRA adapters.

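
To give a rough sense of why LoRA fine-tuning is lightweight, the sketch below counts the trainable parameters a LoRA adapter adds to a single square projection matrix. The hidden size of 4096 is that of Llama 2 7B; the rank of 8 is a hypothetical value chosen for illustration, since this card does not state the rank or the target modules that were actually used.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) weight matrix.

    LoRA keeps the original weight W frozen and learns a low-rank update
    B @ A, where A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


hidden = 4096  # hidden size of Llama 2 7B
rank = 8       # hypothetical rank; not taken from this model card

full = hidden * hidden                        # parameters in the frozen matrix
lora = lora_param_count(hidden, hidden, rank)  # parameters in the adapter

print(f"full matrix: {full:,} parameters")
print(f"LoRA update: {lora:,} parameters ({lora / full:.2%} of the full matrix)")
```

Under these assumptions the adapter for one such matrix is well under one percent of the size of the matrix it modifies, which is what makes adapter-based fine-tuning practical on a single GPU.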
## Usage

The following Python snippet loads LLaMaCoder with 4-bit quantization and generates code from a prompt:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_NAME = "Sakuna/LLaMaCoderAll"
device = "cuda:0"

# 4-bit NF4 quantization; matrix multiplications run in float16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Place the quantized weights on the GPU at load time. Calling .to(device)
# on a 4-bit bitsandbytes model raises an error in many transformers versions.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map={"": device},
    trust_remote_code=True,
)
model.eval()

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token  # reuse EOS as the padding token

prompt = "Write a Java program to calculate the factorial of a given number k"
text = f"{prompt}\n### Solution:\n"  # named `text` to avoid shadowing the built-in input()

inputs = tokenizer(text, return_tensors="pt").to(device)
# max_length counts prompt tokens plus generated tokens;
# temperature only takes effect when sampling is enabled.
outputs = model.generate(**inputs, max_length=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
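
Because `generate` returns the prompt tokens followed by the continuation, the decoded string above still contains the instruction and the `### Solution:` marker. A minimal sketch of two helpers that build the prompt in the same format and strip it from the decoded text is shown below. The names `build_prompt` and `extract_solution` are hypothetical and not part of this repository, and the model output is replaced by a stand-in string so the snippet runs without a GPU.

```python
SOLUTION_MARKER = "### Solution:"


def build_prompt(instruction: str) -> str:
    """Format an instruction the same way as the usage snippet above."""
    return f"{instruction.strip()}\n{SOLUTION_MARKER}\n"


def extract_solution(decoded: str) -> str:
    """Return the text after the first solution marker.

    If the marker is absent, the decoded text is returned unchanged
    (apart from surrounding whitespace).
    """
    _, found, tail = decoded.partition(SOLUTION_MARKER)
    return tail.strip() if found else decoded.strip()


prompt = build_prompt(
    "Write a Java program to calculate the factorial of a given number k"
)

# Stand-in for tokenizer.decode(outputs[0], skip_special_tokens=True).
decoded = prompt + "public class Factorial { /* ... */ }"
print(extract_solution(decoded))
```

Splitting on the first occurrence of the marker keeps any later `### Solution:` text that the model might generate itself as part of the returned solution.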