---
library_name: peft
license: mit
language:
- en
- it
- fr
datasets:
- kaitchup/opus-Italian-to-English
- kaitchup/opus-French-to-English
tags:
- translation
---

# Model Card for Llama-2-7b-mt-Italian-to-English

This is an adapter for Meta's Llama 2 7B fine-tuned for translating Italian text into English.

## Model Details

### Model Description

- **Developed by:** Bhuvnesh Saini
- **Model type:** LoRA Adapter for Llama 2 7B
- **Language(s) (NLP):** French, Italian, English
- **License:** MIT license

## Uses

This adapter must be loaded on top of Llama 2 7B. It was fine-tuned with QLoRA, so for best results the base model should be loaded with the same quantization configuration that was used during fine-tuning.

You can use the following code to load the base model and attach the adapter:

```
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_model = "meta-llama/Llama-2-7b-hf"
compute_dtype = torch.float16

# 4-bit NF4 quantization with double quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=True,
)

# Load the quantized base model onto GPU 0
model = AutoModelForCausalLM.from_pretrained(
    base_model, device_map={"": 0}, quantization_config=bnb_config
)
tokenizer = AutoTokenizer.from_pretrained(base_model, use_fast=True)

# Attach the LoRA adapter to the base model
model = PeftModel.from_pretrained(model, "kaitchup/Llama-2-7b-mt-Italian-to-English")
```

Then, run the model as follows:

```
my_text = ""  # put your text to translate here

# Prompt format: the source text followed by " ###>"
prompt = my_text + " ###>"

tokenized_input = tokenizer(prompt, return_tensors="pt")
input_ids = tokenized_input["input_ids"].cuda()

# Beam search with 10 beams, generating at most 130 new tokens
generation_output = model.generate(
    input_ids=input_ids,
    num_beams=10,
    return_dict_in_generate=True,
    output_scores=True,
    max_new_tokens=130,
)

# Print only the text that follows the "###>" marker
for seq in generation_output.sequences:
    output = tokenizer.decode(seq, skip_special_tokens=True)
    print(output.split("###>")[1].strip())
```
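
The prompt convention used above (source text followed by ` ###>`, with the translation read from the text after the marker) can be wrapped in two small helpers. This is only a minimal sketch: `PROMPT_MARKER`, `build_prompt`, and `extract_translation` are hypothetical names introduced here for illustration, not part of the adapter, `transformers`, or `peft`. They operate on plain strings, so they can be checked without loading the model.

```python
# Hypothetical helpers (not part of transformers or peft) that mirror the
# prompt format used in the generation example above.
PROMPT_MARKER = "###>"


def build_prompt(text: str) -> str:
    """Return the source text followed by a space and the prompt marker."""
    return f"{text} {PROMPT_MARKER}"


def extract_translation(decoded: str) -> str:
    """Return the text right after the first marker, or "" if it is missing."""
    parts = decoded.split(PROMPT_MARKER)
    # parts[1] is the segment following the first marker, as in the example above
    return parts[1].strip() if len(parts) > 1 else ""
```

With the model loaded, `tokenizer(build_prompt(my_text), return_tensors="pt")` could stand in for the manual string concatenation, and `extract_translation(output)` for the `split` call. Unlike direct indexing, the helper returns an empty string instead of raising `IndexError` when the marker does not appear in the decoded output.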