
Schaapje-2B-Chat-V1.0

Model description

This is the DPO-aligned model, built on the SFT-trained model Schaapje-2B-Chat-SFT-V1.0.

General Dutch chat and instruction following work quite well with this model.

Model usage

The following basic example shows how to use this DPO-aligned model for chat or instruction following.

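import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = 'robinsmits/Schaapje-2B-Chat-V1.0'

# Load the model in bfloat16; device_map = "auto" places it on the available device(s)
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             device_map = "auto",
                                             torch_dtype = torch.bfloat16)

tokenizer = AutoTokenizer.from_pretrained(model_name)

# A single-turn conversation; the prompt is Dutch for "Hi, how are you doing?"
messages = [{"role": "user", "content": "Hoi hoe gaat het ermee?"}]

# Render the conversation to a prompt string using the model's chat template
chat = tokenizer.apply_chat_template(messages,
                                     tokenize = False,
                                     add_generation_prompt = True)

# Tokenize the prompt and move it to the same device as the model
input_tokens = tokenizer(chat, return_tensors = "pt").to(model.device)

# Sample up to 512 new tokens
output = model.generate(**input_tokens,
                        max_new_tokens = 512,
                        do_sample = True)

# Decode the full sequence (prompt and reply), keeping special tokens visible
output = tokenizer.decode(output[0], skip_special_tokens = False)
print(output)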
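
The example above prints the whole sequence, prompt included. If only the model's reply is wanted, one option is to drop the prompt tokens before decoding. The helper below is a minimal sketch of that idea, not part of the original example; the name `split_prompt_and_reply` is hypothetical, and it works on plain lists of token ids so it can be tried without loading the model.

```python
from typing import List, Sequence, Tuple


def split_prompt_and_reply(output_ids: Sequence[int],
                           prompt_length: int) -> Tuple[List[int], List[int]]:
    """Split generated token ids into (prompt ids, newly generated ids).

    Causal language models return the prompt followed by the new tokens,
    so the reply starts at index `prompt_length`.
    """
    if not 0 <= prompt_length <= len(output_ids):
        raise ValueError("prompt_length must be between 0 and len(output_ids)")
    ids = list(output_ids)
    return ids[:prompt_length], ids[prompt_length:]


# Hypothetical usage with the objects from the example above,
# applied to the tensor returned by model.generate before it is decoded:
#
#   generated = model.generate(**input_tokens, max_new_tokens = 512, do_sample = True)
#   prompt_length = input_tokens["input_ids"].shape[1]
#   _, reply_ids = split_prompt_and_reply(generated[0].tolist(), prompt_length)
#   reply = tokenizer.decode(reply_ids, skip_special_tokens = True)
```

Decoding with `skip_special_tokens = True` in this variant hides the chat-template markers, which is usually what an application wants to show to a user.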

Intended uses & limitations

As with all LLMs, this model can exhibit bias and hallucinations. Regardless of how you use this model, always perform the necessary testing and validation.
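
As one possible starting point for such testing, the sketch below runs a fixed set of prompts through a reply function and reports expected keywords that are missing. Everything in it is an assumption for illustration: `run_prompt_checks` and `generate_reply` are hypothetical names, and a keyword check is a crude smoke test that will not catch bias or hallucinations in general.

```python
from typing import Callable, Dict, List


def run_prompt_checks(generate_reply: Callable[[str], str],
                      cases: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return, for each failing prompt, the expected keywords missing from the reply.

    `generate_reply` is any function mapping a prompt to the model's reply text.
    `cases` maps each prompt to keywords the reply should contain (case-insensitive).
    """
    failures: Dict[str, List[str]] = {}
    for prompt, expected_keywords in cases.items():
        reply = generate_reply(prompt).lower()
        missing = [kw for kw in expected_keywords if kw.lower() not in reply]
        if missing:
            failures[prompt] = missing
    return failures


# Stand-in for a real call to the model, so the sketch runs without a GPU.
def fake_reply(prompt: str) -> str:
    return "Amsterdam is de hoofdstad van Nederland."


cases = {"Wat is de hoofdstad van Nederland?": ["Amsterdam"]}
print(run_prompt_checks(fake_reply, cases))
```

In practice `fake_reply` would be replaced by a function that wraps the tokenizer and `model.generate` calls from the usage example, and the cases would come from the target application.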

Datasets and Licenses

The following dataset was used for DPO alignment:

Model Training

The notebook used to train this DPO-aligned model is available at the following link: Schaapje-2B-Chat-DPO-V1.0

Model size: 2.53B params (Safetensors, tensor type BF16)