File size: 3,994 Bytes
b07c663 826951d 070a394 826951d 1905d4f 1d34bb3 070a394 ab62897 12836ce 1d34bb3 b5d8a7f 1d34bb3 4074583 c5b0d2d 1d34bb3 593b0b3 1d34bb3 070a394 1d34bb3 a0d9543 1d34bb3 c5b0d2d 1d34bb3 a0d9543 1d34bb3 a0d9543 1d34bb3 f729556 41e7b5b c5b0d2d |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 |
---
tags:
- generated_from_trainer
- conversational
- polish
license: mit
language:
- pl
datasets:
- eryk-mazus/polka-dpo-v1
pipeline_tag: text-generation
inference: false
---
![image/png](https://cdn-uploads.huggingface.co/production/uploads/61bf0e11c88f3fd22f654059/FiMCITBAaEyMyxCHhfWVD.png)
# Polka-1.1B-Chat
`eryk-mazus/polka-1.1b-chat` **is the first Polish model trained to act as a helpful, conversational assistant that can be run locally.**
The model is based on [TinyLlama-1.1B](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T) with the custom, extended tokenizer for more efficient Polish text generation, that was additionally pretrained on 5.7 billion tokens. **It was then fine-tuned on around 60k synthetically generated and machine-translated multi-turn conversations with the [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) performed on top of it.**
Context size: 4,096 tokens
In addition, we're releasing:
* [polka-1.1b](https://huggingface.co/eryk-mazus/polka-1.1b) - our base model with an extended tokenizer and additional pre-training on Polish corpus sampled using [DSIR](https://github.com/p-lambda/dsir)
* [polka-pretrain-en-pl-v1](https://huggingface.co/datasets/eryk-mazus/polka-pretrain-en-pl-v1) - the pre-training dataset
* [polka-dpo-v1](https://huggingface.co/datasets/eryk-mazus/polka-dpo-v1) - dataset of DPO pairs
* [polka-1.1b-chat-gguf](https://huggingface.co/eryk-mazus/polka-1.1b-chat-gguf) - GGUF files for the chat model
## Usage
Sample code:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer
model_name = "eryk-mazus/polka-1.1b-chat"
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype=torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16,
device_map="auto"
)
streamer = TextStreamer(tokenizer, skip_prompt=True)
# You are a helpful assistant.
system_prompt = "Jesteś pomocnym asystentem."
chat = [{"role": "system", "content": system_prompt}]
# Compose a short song on programming.
user_input = "Napisz krótką piosenkę o programowaniu."
chat.append({"role": "user", "content": user_input})
# Generate - add_generation_prompt to make sure it continues as assistant
inputs = tokenizer.apply_chat_template(chat, add_generation_prompt=True, return_tensors="pt")
# For multi-GPU, find the device of the first parameter of the model
first_param_device = next(model.parameters()).device
inputs = inputs.to(first_param_device)
with torch.no_grad():
outputs = model.generate(
inputs,
pad_token_id=tokenizer.eos_token_id,
max_new_tokens=512,
temperature=0.2,
repetition_penalty=1.15,
top_p=0.95,
do_sample=True,
streamer=streamer,
)
# Add just the new tokens to our chat
new_tokens = outputs[0, inputs.size(1):]
response = tokenizer.decode(new_tokens, skip_special_tokens=True)
chat.append({"role": "assistant", "content": response})
```
The model works seamlessly with [vLLM](https://github.com/vllm-project/vllm) as well.
## Prompt format
This model uses ChatML as the prompt format:
```
<|im_start|>system
Jesteś pomocnym asystentem.
<|im_start|>user
Jakie jest dzienne zapotrzebowanie kaloryczne dorosłej osoby?<|im_end|>
<|im_start|>assistant
Dla dorosłych osób zaleca się spożywanie około 2000-3000 kcal dziennie, aby utrzymać optymalne zdrowie i dobre samopoczucie.<|im_end|>
```
This prompt is available as a [chat template](https://huggingface.co/docs/transformers/chat_templating), which means you can format messages using the `tokenizer.apply_chat_template()` method, as demonstrated in the example above.
***
We've actively looking for additional compute to train better and larger models for this project. If you want to collaborate, please reach out at: eryk.mazus at gmail dot com |