---
base_model: MaziyarPanahi/Llama-3-8B-Instruct-v0.8
library_name: transformers
tags:
- axolotl
- finetune
- facebook
- meta
- pytorch
- llama
- llama-3
language:
- en
pipeline_tag: text-generation
license: other
license_name: llama3
license_link: LICENSE
inference: false
model_creator: MaziyarPanahi
model_name: Llama-3-8B-Instruct-v0.8
quantized_by: MaziyarPanahi
---

<img src="./llama-3-merges.webp" alt="Llama-3 DPO Logo" width="500" style="margin-left:auto; margin-right:auto; display:block"/>

# Llama-3-8B-Instruct-v0.8

This model builds on the `MaziyarPanahi/Llama-3-8B-Instruct-v0.4` model.

# Quantized GGUF

All GGUF quantizations are available here: [MaziyarPanahi/Llama-3-8B-Instruct-v0.8-GGUF](https://huggingface.co/MaziyarPanahi/Llama-3-8B-Instruct-v0.8-GGUF)

# Prompt Template

This model uses the Llama-3 prompt template:

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```

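As a sanity check, the template above can be reproduced with plain string formatting. This is a hypothetical single-turn helper for illustration only; in practice, prefer the tokenizer's built-in `apply_chat_template` shown below:

```python
def render_llama3_prompt(system_prompt: str, prompt: str) -> str:
    """Render a single-turn conversation in the Llama-3 chat format above.

    The trailing assistant header (plus blank line) is where generation
    continues, matching add_generation_prompt=True behavior.
    """
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )

print(render_llama3_prompt("You are a helpful assistant.", "Hello!"))
```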
# How to use

You can use this model by passing `MaziyarPanahi/Llama-3-8B-Instruct-v0.8` as the model name to Hugging Face's
`transformers` library.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline

model_id = "MaziyarPanahi/Llama-3-8B-Instruct-v0.8"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
    # attn_implementation="flash_attention_2"
)

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True
)

# Stream generated tokens to stdout as they are produced.
streamer = TextStreamer(tokenizer)

# Use a distinct name so the imported `pipeline` factory is not shadowed.
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    streamer=streamer,
)

# Then you can use the pipeline to generate text.

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Llama-3 ends a turn with either the EOS token or <|eot_id|>.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = pipe(
    prompt,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
print(outputs[0]["generated_text"][len(prompt):])
```
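
The `top_p=0.95` setting above enables nucleus sampling: at each step, candidates are restricted to the smallest set of tokens whose cumulative probability reaches `top_p`. A minimal sketch of that filtering step on a toy distribution, for intuition only (the real `transformers` implementation operates on logits):

```python
def top_p_filter(probs: dict[str, float], top_p: float) -> dict[str, float]:
    """Keep the smallest set of most-likely tokens whose cumulative
    probability reaches top_p, then renormalize. Illustrative only."""
    kept, total = {}, 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        total += p
        if total >= top_p:
            break
    return {token: p / total for token, p in kept.items()}

# With top_p=0.8, the low-probability tail ("c") is pruned and the
# remaining mass is renormalized over "a" and "b".
print(top_p_filter({"a": 0.6, "b": 0.3, "c": 0.1}, top_p=0.8))
```

Lower `top_p` (or lower `temperature`) makes output more deterministic; the values used above are a common balance for chat models.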