Schaapje 2B Chat V1.0
This is the DPO-aligned model based on the SFT-trained model Schaapje-2B-Chat-SFT-V1.0.
General Dutch chat and/or instruction following works quite well with this model.
Below is a basic example of how to use this DPO-aligned model for chat or instruction following.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

device = 'cuda'
model_name = 'robinsmits/Schaapje-2B-Chat-V1.0'

# Load the model in bfloat16 and let Accelerate place it on the available device(s).
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             device_map = "auto",
                                             torch_dtype = torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Format the conversation with the model's chat template.
messages = [{"role": "user", "content": "Hoi hoe gaat het ermee?"}]
chat = tokenizer.apply_chat_template(messages,
                                     tokenize = False,
                                     add_generation_prompt = True)

# Tokenize the prompt and generate a response.
input_tokens = tokenizer(chat, return_tensors = "pt").to(device)
output = model.generate(**input_tokens,
                        max_new_tokens = 512,
                        do_sample = True)

# Decode the full sequence, including the special chat-template tokens.
output = tokenizer.decode(output[0], skip_special_tokens = False)
print(output)
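
The decoded output above contains the prompt and the special chat-template tokens as well as the reply. If you only want the newly generated assistant text, a minimal sketch (reusing the model, tokenizer and input_tokens from the example above) is to slice off the prompt tokens before decoding:

# Optional: decode only the tokens generated after the prompt.
prompt_length = input_tokens['input_ids'].shape[1]
generated = model.generate(**input_tokens,
                           max_new_tokens = 512,
                           do_sample = True)
reply = tokenizer.decode(generated[0][prompt_length:], skip_special_tokens = True)
print(reply)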
As with all LLMs, this model can exhibit bias and hallucinations. Regardless of how you use this model, always perform the necessary testing and validation.
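As a starting point, a minimal smoke test could run a handful of Dutch prompts through the model and check that each produces a non-empty reply. The prompts below are illustrative placeholders and reuse the model and tokenizer loaded above; this is no substitute for a proper evaluation:

# Illustrative smoke test, reusing the model and tokenizer from the example above.
test_prompts = ["Wat is de hoofdstad van Nederland?",
                "Schrijf een korte samenvatting over windenergie."]
for prompt in test_prompts:
    messages = [{"role": "user", "content": prompt}]
    chat = tokenizer.apply_chat_template(messages,
                                         tokenize = False,
                                         add_generation_prompt = True)
    tokens = tokenizer(chat, return_tensors = "pt").to(device)
    out = model.generate(**tokens, max_new_tokens = 128, do_sample = False)
    reply = tokenizer.decode(out[0][tokens['input_ids'].shape[1]:],
                             skip_special_tokens = True)
    # A real test suite would also check content quality, not just length.
    assert len(reply.strip()) > 0, f"Empty reply for prompt: {prompt}"
    print(prompt, "->", reply)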
The following dataset was used for DPO alignment:
The notebook used to train this DPO-aligned model is available at the following link: Schaapje-2B-Chat-DPO-V1.0
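The actual training setup is in the linked notebook. For orientation only, here is a rough sketch of what DPO training with recent versions of the TRL library typically looks like; the dataset name and all hyperparameters below are placeholders, not the values used for this model:

import torch
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForCausalLM
from trl import DPOConfig, DPOTrainer

# DPO starts from the SFT model and aligns it using preference pairs.
model_name = 'robinsmits/Schaapje-2B-Chat-SFT-V1.0'
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype = torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Placeholder dataset: DPOTrainer expects 'prompt', 'chosen' and 'rejected' columns.
dataset = load_dataset('some-org/some-dutch-dpo-dataset', split = 'train')

training_args = DPOConfig(output_dir = 'schaapje-2b-chat-dpo',
                          beta = 0.1,  # strength of the implicit KL penalty
                          per_device_train_batch_size = 2,
                          num_train_epochs = 1)

trainer = DPOTrainer(model = model,
                     args = training_args,
                     train_dataset = dataset,
                     processing_class = tokenizer)
trainer.train()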
Base model: ibm-granite/granite-3.0-2b-base