Bloomz-7b1-instruct
This is the Bloomz-7b1-mt model, fine-tuned on a multilingual instruction dataset using PEFT LoRA. The following languages are supported: English, German, French, Spanish, Hindi, Indonesian, Japanese, Malaysian, Portuguese, Russian, Thai, Vietnamese and Chinese.
Usage
The following code runs inference with this model:
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
peft_model_id = "cahya/bloomz-7b1-instruct"
config = PeftConfig.from_pretrained(peft_model_id)
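# Load the base model in 8-bit (requires bitsandbytes) and spread it across available devices
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map='auto',
)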
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
# Load the LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(model, peft_model_id)
batch = tokenizer("User: How old is the universe?\nAssistant: ", return_tensors='pt').to(0)
with torch.cuda.amp.autocast():
    output_tokens = model.generate(
        **batch,
        max_new_tokens=200,
        min_length=50,
        do_sample=True,
        top_k=40,
        top_p=0.9,
        temperature=0.2,
        repetition_penalty=1.2,
        num_return_sequences=1,
    )
print('\n\n', tokenizer.decode(output_tokens[0], skip_special_tokens=True))
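The inference code above uses a "User: ...\nAssistant: " prompt and prints the full decoded sequence, which includes the prompt itself. The following is a minimal sketch of two helpers for building that prompt and pulling out only the assistant's reply. The names build_prompt and extract_reply are hypothetical and not part of this model or any library, and cutting the text at a following "User:" turn is an assumption about how the model may continue, not documented behavior.

```python
# Hypothetical helpers; the template mirrors the prompt used in the usage example.
PROMPT_TEMPLATE = "User: {question}\nAssistant: "


def build_prompt(question: str) -> str:
    """Wrap a question in the User/Assistant format used in the usage example."""
    return PROMPT_TEMPLATE.format(question=question.strip())


def extract_reply(decoded: str, prompt: str) -> str:
    """Return only the assistant's reply from a decoded generation.

    A causal LM returns the prompt followed by the continuation, so the
    prompt is removed first. Trailing whitespace in the prompt is ignored
    because decoding may not reproduce it exactly.
    """
    head = prompt.rstrip()
    text = decoded[len(head):] if decoded.startswith(head) else decoded
    # Assumption: if the model starts a new "User:" turn, drop everything after it.
    return text.split("\nUser:", 1)[0].strip()
```

With the usage example, the prompt would be built as prompt = build_prompt("How old is the universe?"), passed to the tokenizer, and the reply recovered with extract_reply(tokenizer.decode(output_tokens[0], skip_special_tokens=True), prompt).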