---
library_name: transformers
tags: []
license: other
license_name: llama3
---
# g-ronimo/llama3-8b-SlimHermes
* `meta-llama/Meta-Llama-3-8B` fine-tuned on the 10k longest samples from `teknium/OpenHermes-2.5`
## Sample Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "g-ronimo/llama3-8b-SlimHermes"

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

messages = [
    {"role": "system", "content": "Talk like a pirate."},
    {"role": "user", "content": "hello"},
]

input_tokens = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to("cuda")

output_tokens = model.generate(input_tokens, max_new_tokens=100)
output = tokenizer.decode(output_tokens[0], skip_special_tokens=False)
print(output)
```
## Sample Output
```
<|im_start|>system
Talk like a pirate.<|im_end|>
<|im_start|>user
hello<|im_end|>
<|im_start|>assistant
hello there, matey! How be ye doin' today? Arrrr!<|im_end|>
```
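Note that `generate` returns the prompt tokens followed by the newly generated ones, which is why the output above repeats the full chat template. To print only the assistant's reply, slice off the prompt length before decoding. A minimal sketch of the slicing, using small stand-in tensors in place of real model output:

```python
import torch

# Stand-ins: in real usage these come from apply_chat_template and model.generate.
input_tokens = torch.tensor([[1, 2, 3]])            # tokenized prompt
output_tokens = torch.tensor([[1, 2, 3, 7, 8, 9]])  # prompt + generated tokens

# Drop the prompt portion, keeping only the newly generated tokens.
new_tokens = output_tokens[0][input_tokens.shape[-1]:]
print(new_tokens.tolist())  # [7, 8, 9]
```

With the real model, decode the sliced tokens with `tokenizer.decode(new_tokens, skip_special_tokens=True)` to get just the reply text.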