Muhammad2003 committed on
Commit e10656d · verified · 1 Parent(s): f008cde

Update README.md

Files changed (1):
  1. README.md +1 -227
README.md CHANGED
@@ -3,230 +3,4 @@ language:
  - en
  license: other
  library_name: transformers
- tags:
- - axolotl
- - finetune
- - dpo
- - facebook
- - meta
- - pytorch
- - llama
- - llama-3
- base_model: meta-llama/Meta-Llama-3-8B-Instruct
- datasets:
- - argilla/ultrafeedback-binarized-preferences
- model_name: Llama-3-8B-Instruct-DPO-v0.2
- pipeline_tag: text-generation
- license_name: llama3
- license_link: LICENSE
- inference: false
- model_creator: MaziyarPanahi
- quantized_by: MaziyarPanahi
- model-index:
- - name: Llama-3-8B-Instruct-DPO-v0.2
-   results:
-   - task:
-       type: text-generation
-       name: Text Generation
-     dataset:
-       name: AI2 Reasoning Challenge (25-Shot)
-       type: ai2_arc
-       config: ARC-Challenge
-       split: test
-       args:
-         num_few_shot: 25
-     metrics:
-     - type: acc_norm
-       value: 62.46
-       name: normalized accuracy
-     source:
-       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.2
-       name: Open LLM Leaderboard
-   - task:
-       type: text-generation
-       name: Text Generation
-     dataset:
-       name: HellaSwag (10-Shot)
-       type: hellaswag
-       split: validation
-       args:
-         num_few_shot: 10
-     metrics:
-     - type: acc_norm
-       value: 79.5
-       name: normalized accuracy
-     source:
-       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.2
-       name: Open LLM Leaderboard
-   - task:
-       type: text-generation
-       name: Text Generation
-     dataset:
-       name: MMLU (5-Shot)
-       type: cais/mmlu
-       config: all
-       split: test
-       args:
-         num_few_shot: 5
-     metrics:
-     - type: acc
-       value: 68.21
-       name: accuracy
-     source:
-       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.2
-       name: Open LLM Leaderboard
-   - task:
-       type: text-generation
-       name: Text Generation
-     dataset:
-       name: TruthfulQA (0-shot)
-       type: truthful_qa
-       config: multiple_choice
-       split: validation
-       args:
-         num_few_shot: 0
-     metrics:
-     - type: mc2
-       value: 53.27
-     source:
-       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.2
-       name: Open LLM Leaderboard
-   - task:
-       type: text-generation
-       name: Text Generation
-     dataset:
-       name: Winogrande (5-shot)
-       type: winogrande
-       config: winogrande_xl
-       split: validation
-       args:
-         num_few_shot: 5
-     metrics:
-     - type: acc
-       value: 75.93
-       name: accuracy
-     source:
-       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.2
-       name: Open LLM Leaderboard
-   - task:
-       type: text-generation
-       name: Text Generation
-     dataset:
-       name: GSM8k (5-shot)
-       type: gsm8k
-       config: main
-       split: test
-       args:
-         num_few_shot: 5
-     metrics:
-     - type: acc
-       value: 70.81
-       name: accuracy
-     source:
-       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.2
-       name: Open LLM Leaderboard
- ---
-
- <img src="./llama-3-merges.webp" alt="Llama-3 DPO Logo" width="500" style="margin-left:auto; margin-right:auto; display:block"/>
-
- # Llama-3-8B-Instruct-DPO-v0.2
-
- This model is a DPO fine-tune of the `meta-llama/Meta-Llama-3-8B-Instruct` model.
-
- # Quantized GGUF
-
- All GGUF models are available here: [MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.2-GGUF](https://huggingface.co/MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.2-GGUF)
-
- # Prompt Template
-
- This model uses the `ChatML` prompt template:
-
- ```
- <|im_start|>system
- {System}
- <|im_end|>
- <|im_start|>user
- {User}
- <|im_end|>
- <|im_start|>assistant
- {Assistant}
- ```
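
For illustration, the template can be rendered with a few lines of plain Python. This is only a sketch of the format shown above; in practice the tokenizer's `apply_chat_template` builds this string for you.

```python
# Illustrative only: render a message list in the ChatML format shown above.
def to_chatml(messages, add_generation_prompt=True):
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
])
```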
-
- # How to use
-
- You can use this model with Hugging Face's `transformers` library by passing
- `MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.2` as the model name.
-
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline
- import torch
-
- model_id = "MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.2"
-
- model = AutoModelForCausalLM.from_pretrained(
-     model_id,
-     torch_dtype=torch.bfloat16,
-     device_map="auto",
-     trust_remote_code=True,
-     # attn_implementation="flash_attention_2"
- )
-
- tokenizer = AutoTokenizer.from_pretrained(
-     model_id,
-     trust_remote_code=True
- )
-
- streamer = TextStreamer(tokenizer)
-
- # Name the object `pipe` so it does not shadow the imported `pipeline` function.
- pipe = pipeline(
-     "text-generation",
-     model=model,
-     tokenizer=tokenizer,
-     streamer=streamer
- )
-
- # Then you can use the pipeline to generate text.
- messages = [
-     {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
-     {"role": "user", "content": "Who are you?"},
- ]
-
- prompt = tokenizer.apply_chat_template(
-     messages,
-     tokenize=False,
-     add_generation_prompt=True
- )
-
- terminators = [
-     tokenizer.eos_token_id,
-     tokenizer.convert_tokens_to_ids("<|im_end|>")
- ]
-
- outputs = pipe(
-     prompt,
-     max_new_tokens=256,
-     eos_token_id=terminators,
-     do_sample=True,
-     temperature=0.6,
-     top_p=0.95,
- )
- print(outputs[0]["generated_text"][len(prompt):])
- ```
- # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
-
- Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__Llama-3-8B-Instruct-DPO-v0.2).
-
- | Metric                            | Value |
- |-----------------------------------|------:|
- | Avg.                              | 68.36 |
- | AI2 Reasoning Challenge (25-Shot) | 62.46 |
- | HellaSwag (10-Shot)               | 79.50 |
- | MMLU (5-Shot)                     | 68.21 |
- | TruthfulQA (0-shot)               | 53.27 |
- | Winogrande (5-shot)               | 75.93 |
- | GSM8k (5-shot)                    | 70.81 |
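
The Avg. row is simply the arithmetic mean of the six benchmark scores, which a quick check confirms:

```python
# Reproduce the reported Avg. as the mean of the six benchmark scores.
scores = [62.46, 79.50, 68.21, 53.27, 75.93, 70.81]
average = round(sum(scores) / len(scores), 2)
print(average)  # 68.36
```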
232
-
 
+ ---