---
library_name: transformers
tags: []
license: other
license_name: llama3
---

# g-ronimo/llama3-8b-SlimHermes
* `meta-llama/Meta-Llama-3-8B` fine-tuned on the 10k longest samples from `teknium/OpenHermes-2.5`

## Sample Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "g-ronimo/llama3-8b-SlimHermes"
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

messages = [
    {"role": "system", "content": "Talk like a pirate."},
    {"role": "user", "content": "hello"}
]

input_tokens = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)  # works regardless of where device_map placed the model
output_tokens = model.generate(input_tokens, max_new_tokens=100)
output = tokenizer.decode(output_tokens[0], skip_special_tokens=False)

print(output)
```

## Sample Output

```
<|im_start|>system
Talk like a pirate.<|im_end|>
<|im_start|>user
hello<|im_end|>
<|im_start|>assistant
hello there, matey! How be ye doin' today? Arrrr!<|im_end|>
```
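
As the sample output shows, this model uses ChatML-style tags (`<|im_start|>`, `<|im_end|>`) rather than the stock Llama 3 chat format. For illustration, here is a minimal stand-alone sketch of what `apply_chat_template` produces for this tokenizer (a hypothetical helper written from the output above; the tokenizer's own template remains authoritative):

```python
# Sketch of the ChatML formatting seen in the sample output.
# This is an illustrative approximation, not the tokenizer's actual template code.
def format_chatml(messages, add_generation_prompt=True):
    prompt = ""
    for m in messages:
        # Each turn: <|im_start|>{role}\n{content}<|im_end|>\n
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here
        prompt += "<|im_start|>assistant\n"
    return prompt

messages = [
    {"role": "system", "content": "Talk like a pirate."},
    {"role": "user", "content": "hello"},
]
print(format_chatml(messages))
```

Passing `add_generation_prompt=True` (as in the usage example above) appends the opening `<|im_start|>assistant` tag, which is why the model's reply begins immediately after it in the sample output.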