Commit 23f0368 (verified) committed by chencyudel
Parent(s): 33d7b53

Update README.md

Files changed (1): README.md (+241, -114)

README.md CHANGED
@@ -1,12 +1,10 @@
  ---
- license: apache-2.0
  tasks:
  - code-generation
  ---
- # Model Card for CodeFuse-CodeLlama-34B
- <p align="center">
- <img src="https://modelscope.cn/api/v1/models/codefuse-ai/CodeFuse-QWen-14B/repo?Revision=master&FilePath=LOGO.jpg&View=true" width="800"/>
- <p>

  [[中文]](#chinese) [[English]](#english)
 
@@ -16,13 +14,27 @@ tasks:

  ## Model Description

- CodeFuse-CodeLlama-34B is a 34B Code-LLM finetuned with QLoRA on multiple code tasks (600k instructions/answers) on top of the base model CodeLlama-34b-Python.
- The finetuning context length is 4K, and it can be extended to 16K if necessary.
  <br>

  ## News and Updates

- 🔥🔥🔥 CodeFuse-CodeLlama34B-MFT has achieved 74.4% pass@1 on HumanEval, which is SOTA at present.

  <br>

@@ -36,12 +48,21 @@ The context length of finetuning is 4K while it is able to be finetuned by 16k c

  + If you wish to see a demo of the model, you can visit ✨[CodeFuse Demo](https://github.com/codefuse-ai/codefuse)✨✨

  ## Performance

  | Model | HumanEval(pass@1) | Date |
  |:----------------------------|:-----------------:|:-------:|
- | **CodeFuse-CodeLlama-34B** | **74.4%** | 2023.9 |
  | WizardCoder-Python-34B-V1.0 | 73.2% | 2023.8 |
  | GPT-4(zero-shot) | 67.0% | 2023.3 |
  | PanGu-Coder2 15B | 61.6% | 2023.8 |
@@ -50,7 +71,14 @@ The context length of finetuning is 4K while it is able to be finetuned by 16k c
  | GPT-3.5(zero-shot) | 48.1% | 2022.11 |
  | OctoCoder | 46.2% | 2023.8 |
  | StarCoder-15B | 33.6% | 2023.5 |
- | LLaMA 2 70B(zero-shot) | 29.9% | 2023.7 |

  <br>

@@ -58,7 +86,7 @@ The context length of finetuning is 4K while it is able to be finetuned by 16k c

  * python>=3.8
  * pytorch>=2.0.0
- * transformers==4.32.0
  * Sentencepiece
  * CUDA 11.4
  <br>
@@ -66,93 +94,143 @@ The context length of finetuning is 4K while it is able to be finetuned by 16k c

  ## Inference String Format

  The inference string is a concatenated string formed by combining conversation data (system, human and bot contents) in the training data format. It is used as input during the inference process.
- Here is an example format of the concatenated string:

  ```python
  """
- <|role_start|>system<|role_end|>System instruction
- <|role_start|>human<|role_end|>Human 1st round input
- <|role_start|>bot<|role_end|>Bot 1st round output</s>
- <|role_start|>human<|role_end|>Human 2nd round input
- <|role_start|>bot<|role_end|>Bot 2nd round output</s>
  ...
  ...
  ...
- <|role_start|>human<|role_end|>Human nth round input
- <|role_start|>bot<|role_end|>{Bot output to be generated}</s>
  """
  ```

- When applying inference, always make your input string end with "<|role_start|>bot<|role_end|>" to prompt the model to generate an answer.
 
88
- ## Quickstart
 
 
 
 
 
89
 
90
- ```bash
91
- pip install -r requirements.txt
92
  ```
 
 
 
 
 
 
 
 
 
 
 
 
93
 
94
- ```python
- import torch
- from modelscope import AutoTokenizer, AutoModelForCausalLM, snapshot_download
-
- model_dir = snapshot_download('codefuse-ai/CodeFuse-CodeLlama-34B', revision='v1.0.0')
- tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True, use_fast=False, legacy=False)
- tokenizer.padding_side = "left"
- tokenizer.pad_token_id = tokenizer.convert_tokens_to_ids("<unk>")
- tokenizer.eos_token_id = tokenizer.convert_tokens_to_ids("</s>")
- model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True,
-                                              device_map='auto',
-                                              torch_dtype=torch.bfloat16)
-
- HUMAN_ROLE_START_TAG = "<|role_start|>human<|role_end|>"
- BOT_ROLE_START_TAG = "<|role_start|>bot<|role_end|>"
-
- text = f"{HUMAN_ROLE_START_TAG}write a python function of quick sort.{BOT_ROLE_START_TAG}"
- inputs = tokenizer(text, return_tensors='pt', padding=True, add_special_tokens=False).to("cuda")
- outputs = model.generate(
-     inputs=inputs["input_ids"],
-     attention_mask=inputs["attention_mask"],
      max_new_tokens=512,
      top_p=0.95,
-     temperature=0.1,
-     do_sample=True,
-     eos_token_id=tokenizer.eos_token_id,
-     pad_token_id=tokenizer.pad_token_id
- )
- gen_text = tokenizer.batch_decode(outputs[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
- print(gen_text)
  ```

- ## MD5
- We have noticed that model files may be corrupted during transfer. Please check the MD5 value before use.

- | Model File | MD5 Value |
- |:---------------------------------|:--------------------------------:|
- | pytorch_model-00001-of-00007.bin | 8d544b1bcb3449934184d4141137329c |
- | pytorch_model-00002-of-00007.bin | 9d5dbb30911e48a42fb6d0fcabb322a4 |
- | pytorch_model-00003-of-00007.bin | b0d4aecee0457d9332005a187e1fffed |
- | pytorch_model-00004-of-00007.bin | 5c7e002de5eab77d0194a2b0f6de0c24 |
- | pytorch_model-00005-of-00007.bin | d22a511aa26b5b17117b665a877490ab |
- | pytorch_model-00006-of-00007.bin | a5c28ac277fac07d16dd66537e54d109 |
- | pytorch_model-00007-of-00007.bin | a967e2c6195477b7407089c0bffa2d53 |

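To check these values programmatically, a minimal sketch such as the following can be used (not part of the original card; the local directory path is a placeholder):

```python
# Minimal sketch: verify the MD5 checksums listed in the table above.
# The model directory path is a placeholder; point it at the downloaded shards.
import hashlib
from pathlib import Path

EXPECTED = {
    "pytorch_model-00001-of-00007.bin": "8d544b1bcb3449934184d4141137329c",
    "pytorch_model-00007-of-00007.bin": "a967e2c6195477b7407089c0bffa2d53",
    # ... add the remaining shards from the table ...
}

def md5sum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 of a file, reading it in chunks to bound memory use."""
    h = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

model_dir = Path("./CodeFuse-CodeLlama-34B")  # placeholder local path
for name, expected in EXPECTED.items():
    status = "OK" if md5sum(model_dir / name) == expected else "MISMATCH"
    print(f"{name}: {status}")
```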
 
  <a id="chinese"></a>

  ## Model Description

- CodeFuse-CodeLlama34B-MFT is a Code-LLM obtained by finetuning the base model CodeLlama-34b-Python on multiple code tasks with QLoRA. Finetuning used a 4k context, which can be extended to 16k if necessary.
  <br>

  ## News

- 🔥🔥🔥 CodeFuse-CodeLlama34B-MFT reaches 74.4% pass@1 on HumanEval, the current open-source SOTA.

  <br>

  ## Community
- **Home base**: 🏡 https://github.com/codefuse-ai (**Please support our projects with a Star🌟 + Fork🚀 + Watch👀**)

  + If you wish to finetune the model yourself, you can visit ✨[MFTCoder](https://github.com/codefuse-ai/MFTCoder)✨✨

@@ -160,11 +238,18 @@ CodeFuse-CodeLlama34B-MFT 是一个通过QLoRA对基座模型CodeLlama-34b-Pytho

  + If you wish to see a demo of the model, you can visit ✨[CodeFuse Demo](https://github.com/codefuse-ai/codefuse)✨✨

- ## Performance (Code)

  | Model | HumanEval(pass@1) | Date |
  |:----------------------------|:-----------------:|:-------:|
- | **CodeFuse-CodeLlama-34B** | **74.4%** | 2023.9 |
  | WizardCoder-Python-34B-V1.0 | 73.2% | 2023.8 |
  | GPT-4(zero-shot) | 67.0% | 2023.3 |
  | PanGu-Coder2 15B | 61.6% | 2023.8 |
@@ -173,83 +258,125 @@ CodeFuse-CodeLlama34B-MFT 是一个通过QLoRA对基座模型CodeLlama-34b-Pytho
  | GPT-3.5(zero-shot) | 48.1% | 2022.11 |
  | OctoCoder | 46.2% | 2023.8 |
  | StarCoder-15B | 33.6% | 2023.5 |
- | LLaMA 2 70B(zero-shot) | 29.9% | 2023.7 |
- <br>

  ## Requirements

  * python>=3.8
  * pytorch>=2.0.0
- * transformers==4.32.0
  * CUDA 11.4
  <br>

  ## Inference Data Format

- The inference data is a string concatenated in the training data format; the input prompt is concatenated the same way at inference time:

  ```python
  """
- <|role_start|>system<|role_end|>System instruction
- <|role_start|>human<|role_end|>User input of the 1st round
- <|role_start|>bot<|role_end|>Model output of the 1st round</s>
- <|role_start|>human<|role_end|>User input of the 2nd round
- <|role_start|>bot<|role_end|>Model output of the 2nd round</s>
  ...
  ...
  ...
- <|role_start|>human<|role_end|>User input of the nth round
- <|role_start|>bot<|role_end|>{Content to be generated by the model}</s>
  """
  ```

- At inference time, make sure the concatenated prompt string ends with "<|role_start|>bot<|role_end|>" to guide the model to generate an answer.

- ## Quickstart

  ```python
- import torch
- from modelscope import AutoTokenizer, AutoModelForCausalLM, snapshot_download
-
- model_dir = snapshot_download('codefuse-ai/CodeFuse-CodeLlama-34B', revision='v1.0.0')
- tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True, use_fast=False, legacy=False)
- tokenizer.padding_side = "left"
- tokenizer.pad_token_id = tokenizer.convert_tokens_to_ids("<unk>")
- tokenizer.eos_token_id = tokenizer.convert_tokens_to_ids("</s>")
- model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True,
-                                              device_map='auto',
-                                              torch_dtype=torch.bfloat16)
-
- HUMAN_ROLE_START_TAG = "<|role_start|>human<|role_end|>"
- BOT_ROLE_START_TAG = "<|role_start|>bot<|role_end|>"
-
- text = f"{HUMAN_ROLE_START_TAG}write a python function of quick sort.{BOT_ROLE_START_TAG}"
- inputs = tokenizer(text, return_tensors='pt', padding=True, add_special_tokens=False).to("cuda")
- outputs = model.generate(
-     inputs=inputs["input_ids"],
-     attention_mask=inputs["attention_mask"],
      max_new_tokens=512,
      top_p=0.95,
-     temperature=0.1,
-     do_sample=True,
-     eos_token_id=tokenizer.eos_token_id,
-     pad_token_id=tokenizer.pad_token_id
- )
- gen_text = tokenizer.batch_decode(outputs[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
- print(gen_text)
  ```

-
- ## MD5
- We have noticed that model files may be corrupted during transfer. Please check the MD5 value before use.
-
- | Model File | MD5 Value |
- |:---------------------------------|:--------------------------------:|
- | pytorch_model-00001-of-00007.bin | 8d544b1bcb3449934184d4141137329c |
- | pytorch_model-00002-of-00007.bin | 9d5dbb30911e48a42fb6d0fcabb322a4 |
- | pytorch_model-00003-of-00007.bin | b0d4aecee0457d9332005a187e1fffed |
- | pytorch_model-00004-of-00007.bin | 5c7e002de5eab77d0194a2b0f6de0c24 |
- | pytorch_model-00005-of-00007.bin | d22a511aa26b5b17117b665a877490ab |
- | pytorch_model-00006-of-00007.bin | a5c28ac277fac07d16dd66537e54d109 |
- | pytorch_model-00007-of-00007.bin | a967e2c6195477b7407089c0bffa2d53 |
 
  ---
+ license: other
  tasks:
  - code-generation
  ---
+ # Model Card for CodeFuse-Mixtral-8x7B
+ ![logo](LOGO.jpg)

  [[中文]](#chinese) [[English]](#english)


  ## Model Description

+ CodeFuse-Mixtral-8x7B is a Code-LLM finetuned by QLoRA on multiple code-related tasks, on top of the base model Mixtral-8x7B-v0.1 (Mixture of Experts).
+
  <br>

  ## News and Updates

+ 🔥🔥🔥 2024-01-12 CodeFuse-DeepSeek-33B has been released, achieving a pass@1 (greedy decoding) score of 78.65% on HumanEval.
+
+ 🔥🔥🔥 2024-01-12 CodeFuse-Mixtral-8x7B has been released, achieving a pass@1 (greedy decoding) score of 56.1% on HumanEval, which is a 15% increase compared to Mixtral-8x7b's 40%.
+
+ 🔥🔥 2023-11-10 CodeFuse-CodeGeeX2-6B has been released, achieving a pass@1 (greedy decoding) score of 45.12% on HumanEval, which is a 9.22% increase compared to CodeGeeX2's 35.9%.
+
+ 🔥🔥 2023-10-20 The CodeFuse-QWen-14B technical documentation has been released. For those interested, please refer to the CodeFuse article on our WeChat official account: https://mp.weixin.qq.com/s/PCQPkvbvfxSPzsqjOILCDw
+
+ 🔥🔥 2023-10-16 CodeFuse-QWen-14B has been released, achieving a pass@1 (greedy decoding) score of 48.78% on HumanEval, which is a 16% increase compared to Qwen-14b's 32.3%.
+
+ 🔥🔥 2023-09-27 CodeFuse-StarCoder-15B has been released, achieving a pass@1 (greedy decoding) score of 54.9% on HumanEval, which is a 21% increase compared to StarCoder's 33.6%.
+
+ 🔥🔥 2023-09-26 We are pleased to announce the release of the [4-bit quantized version](https://modelscope.cn/models/codefuse-ai/CodeFuse-CodeLlama-34B-4bits/summary) of [CodeFuse-CodeLlama-34B](https://modelscope.cn/models/codefuse-ai/CodeFuse-CodeLlama-34B/summary). Despite quantization, the model still achieves a remarkable 73.8% pass@1 (greedy decoding) on HumanEval.
+
+ 🔥🔥 2023-09-11 [CodeFuse-CodeLlama34B](https://modelscope.cn/models/codefuse-ai/CodeFuse-CodeLlama-34B/summary) has achieved a pass@1 (greedy decoding) score of 74.4% on HumanEval, which is the SOTA result for open-source LLMs at present.

  <br>


  + If you wish to see a demo of the model, you can visit ✨[CodeFuse Demo](https://github.com/codefuse-ai/codefuse)✨✨

+ <br>

  ## Performance

+ ### Code
+
  | Model | HumanEval(pass@1) | Date |
  |:----------------------------|:-----------------:|:-------:|
+ | **CodeFuse-DeepSeek-33B** | **78.65%** | 2024.01 |
+ | **CodeFuse-Mixtral-8x7B** | **56.10%** | 2024.01 |
+ | **CodeFuse-CodeLlama-34B** | 74.4% | 2023.9 |
+ | **CodeFuse-CodeLlama-34B-4bits** | 73.8% | 2023.9 |
+ | **CodeFuse-StarCoder-15B** | 54.9% | 2023.9 |
+ | **CodeFuse-QWen-14B** | 48.78% | 2023.10 |
+ | **CodeFuse-CodeGeeX2-6B** | 45.12% | 2023.11 |
  | WizardCoder-Python-34B-V1.0 | 73.2% | 2023.8 |
  | GPT-4(zero-shot) | 67.0% | 2023.3 |
  | PanGu-Coder2 15B | 61.6% | 2023.8 |
  | GPT-3.5(zero-shot) | 48.1% | 2022.11 |
  | OctoCoder | 46.2% | 2023.8 |
  | StarCoder-15B | 33.6% | 2023.5 |
+ | Qwen-14b | 32.3% | 2023.10 |
+
+ ### NLP
+
+ ![NLP Performance Radar](codefuse-deepseek-33b-nlp.png)

  <br>

  * python>=3.8
  * pytorch>=2.0.0
+ * transformers>=4.33.2
  * Sentencepiece
  * CUDA 11.4
  <br>
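
To confirm a local environment meets these minimums before loading the model, a short check along these lines can help (an illustrative sketch, not part of the original card; the `packaging` helper and import names are assumptions):

```python
# Minimal sketch: sanity-check the environment against the requirements above.
# Import names (torch, transformers, sentencepiece) are assumed.
import sys
import torch
import transformers
import sentencepiece  # noqa: F401  (needed by the slow/legacy tokenizer)
from packaging.version import parse

assert sys.version_info >= (3, 8), "python>=3.8 is required"
assert parse(torch.__version__) >= parse("2.0.0"), "pytorch>=2.0.0 is required"
assert parse(transformers.__version__) >= parse("4.33.2"), "transformers>=4.33.2 is required"
print("Environment looks OK")
```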
 
  ## Inference String Format

  The inference string is a concatenated string formed by combining conversation data (system, human and bot contents) in the training data format. It is used as input during the inference process.
+ Here are examples of the prompts used to query the model:

+ **Multi-Round with System Prompt:**
  ```python
  """
+ <s>system
+ System instruction
+ <s>human
+ Human 1st round input
+ <s>bot
+ Bot 1st round output</s>
+ <s>human
+ Human 2nd round input
+ <s>bot
+ Bot 2nd round output</s>
  ...
  ...
  ...
+ <s>human
+ Human nth round input
+ <s>bot
  """
  ```

+ **Single-Round without System Prompt:**
+ ```python
+ """
+ <s>human
+ User prompt...
+ <s>bot
+
+ """
+ ```
+
+ In this format, the system section is optional and the conversation can be either single-turn or multi-turn. When applying inference, always make your input string end with "\<s\>bot" to prompt the model to generate an answer.
+
+ For example, the format used when running inference on HumanEval looks like this:

  ```
+ <s>human
+ # language: Python
+ from typing import List
+ def separate_paren_groups(paren_string: str) -> List[str]:
+     """ Input to this function is a string containing multiple groups of nested parentheses. Your goal is to
+     separate those group into separate strings and return the list of those.
+     Separate groups are balanced (each open brace is properly closed) and not nested within each other
+     Ignore any spaces in the input string.
+     >>> separate_paren_groups('( ) (( )) (( )( ))')
+     ['()', '(())', '(()())']
+     """
+ <s>bot

+ ```

+ Specifically, we also add the Programming Language Tag (e.g. "```# language: Python```" for Python) used by CodeGeeX models.
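
To make the concatenation concrete, here is a small helper that assembles such a prompt from a list of turns. This is an illustrative sketch based on the format above, not an official API; the function name and message structure are our own:

```python
# Sketch: build an inference prompt in the "<s>role\ncontent" format described above.
# The {"role": ..., "content": ...} message structure is illustrative only.
def build_prompt(messages, system=None):
    parts = []
    if system:
        parts.append(f"<s>system\n{system}\n")
    for msg in messages:
        if msg["role"] == "human":
            parts.append(f"<s>human\n{msg['content']}\n")
        else:
            # Completed bot turns are terminated with the EOS marker shown above.
            parts.append(f"<s>bot\n{msg['content']}</s>\n")
    # Always end with the bot tag so the model generates the next answer.
    parts.append("<s>bot\n")
    return "".join(parts)

prompt = build_prompt(
    [{"role": "human", "content": "Write a QuickSort program\n#Python"}],
    system="You are an expert coding assistant.",
)
```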
 
+ ## Quickstart

+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
+
+ def load_model_tokenizer(model_path):
+     tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True, use_fast=False, legacy=False)
+     tokenizer.eos_token = "</s>"
+     tokenizer.pad_token = "</s>"
+     tokenizer.eos_token_id = tokenizer.convert_tokens_to_ids(tokenizer.eos_token)
+     tokenizer.pad_token_id = tokenizer.convert_tokens_to_ids(tokenizer.pad_token)
+     tokenizer.padding_side = "left"
+
+     model = AutoModelForCausalLM.from_pretrained(model_path, device_map='auto', torch_dtype=torch.bfloat16, trust_remote_code=True)
+     return model, tokenizer
+
+
+ HUMAN_ROLE_START_TAG = "<s>human\n"
+ BOT_ROLE_START_TAG = "<s>bot\n"
+
+ text_list = [f'{HUMAN_ROLE_START_TAG}Write a QuickSort program\n#Python\n{BOT_ROLE_START_TAG}']
+
+ model, tokenizer = load_model_tokenizer("codefuse-ai/CodeFuse-DeepSeek-33B")
+ inputs = tokenizer(text_list, return_tensors='pt', padding=True, add_special_tokens=False).to('cuda')
+ input_ids = inputs["input_ids"]
+ attention_mask = inputs["attention_mask"]
+ generation_config = GenerationConfig(
+     eos_token_id=tokenizer.eos_token_id,
+     pad_token_id=tokenizer.pad_token_id,
+     temperature=0.1,
      max_new_tokens=512,
+     num_return_sequences=1,
+     num_beams=1,
      top_p=0.95,
+     do_sample=False
+ )
+ outputs = model.generate(
+     inputs=input_ids,
+     attention_mask=attention_mask,
+     **generation_config.to_dict()
+ )
+ gen_text = tokenizer.batch_decode(outputs[:, input_ids.shape[1]:], skip_special_tokens=True)
+ print(gen_text[0])
  ```
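
To continue the conversation for a second round, one option is to append the generated answer (closed with the EOS token) plus the next user turn to the same prompt and call `generate` again. This is a sketch that reuses the variables from the snippet above and the conversation format described earlier; it is not part of the original card:

```python
# Sketch: extend the Quickstart above to a second conversation round.
# Assumes model, tokenizer, generation_config, text_list and gen_text
# from the previous snippet are still in scope.
second_round = (
    text_list[0]
    + gen_text[0] + tokenizer.eos_token + "\n"   # close the first bot turn
    + HUMAN_ROLE_START_TAG + "Now add type hints to the function\n"
    + BOT_ROLE_START_TAG
)
inputs = tokenizer([second_round], return_tensors='pt', padding=True, add_special_tokens=False).to('cuda')
outputs = model.generate(
    inputs=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    **generation_config.to_dict()
)
print(tokenizer.batch_decode(outputs[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0])
```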
  <a id="chinese"></a>

  ## Model Description

+ CodeFuse-DeepSeek-33B is a Code-LLM obtained by finetuning the base model DeepSeek-Coder-33B on multiple code tasks with QLoRA.
  <br>

  ## News

+ 🔥🔥🔥 2024-01-12 CodeFuse-DeepSeek-33B has been released, achieving a pass@1 (greedy decoding) score of 78.65% on HumanEval.
+
+ 🔥🔥🔥 2023-11-10 CodeFuse-CodeGeeX2-6B has been open-sourced, achieving a pass@1 (greedy decoding) score of 45.12% on HumanEval, a 9.22% improvement over CodeGeeX2.
+
+ 🔥🔥🔥 2023-10-20 The CodeFuse-QWen-14B technical documentation has been published; for details see the CodeFuse article on our WeChat official account: https://mp.weixin.qq.com/s/PCQPkvbvfxSPzsqjOILCDw
+
+ 🔥🔥🔥 2023-10-16 CodeFuse-QWen-14B has been open-sourced, achieving a pass@1 (greedy decoding) score of 48.78% on HumanEval, a 16% improvement over Qwen-14b.
+
+ 🔥🔥🔥 2023-09-27 CodeFuse-StarCoder-15B has been open-sourced, achieving a pass@1 (greedy decoding) score of 54.9% on HumanEval, a 21% improvement over StarCoder.
+
+ 🔥🔥🔥 2023-09-26 The [4-bit quantized version of CodeFuse-CodeLlama-34B](https://modelscope.cn/models/codefuse-ai/CodeFuse-CodeLlama-34B-4bits/summary) has been released; after quantization it still reaches 73.8% pass@1 (greedy decoding) on HumanEval.
+
+ 🔥🔥🔥 2023-09-11 [CodeFuse-CodeLlama-34B](https://modelscope.cn/models/codefuse-ai/CodeFuse-CodeLlama-34B/summary) has been released, reaching 74.4% pass@1 (greedy decoding) on HumanEval, the open-source SOTA at the time.

  <br>

  ## Community
+ **Home base**: 🏡 https://github.com/codefuse-ai (**Please support our projects with a Star🌟 + Fork🚀 + Watch👀**)

  + If you wish to finetune the model yourself, you can visit ✨[MFTCoder](https://github.com/codefuse-ai/MFTCoder)✨✨


  + If you wish to see a demo of the model, you can visit ✨[CodeFuse Demo](https://github.com/codefuse-ai/codefuse)✨✨

+ <br>
+
+ ## Performance
+
+ ### Code
+
  | Model | HumanEval(pass@1) | Date |
  |:----------------------------|:-----------------:|:-------:|
+ | **CodeFuse-CodeLlama-34B** | 74.4% | 2023.9 |
+ | **CodeFuse-CodeLlama-34B-4bits** | 73.8% | 2023.9 |
  | WizardCoder-Python-34B-V1.0 | 73.2% | 2023.8 |
  | GPT-4(zero-shot) | 67.0% | 2023.3 |
  | PanGu-Coder2 15B | 61.6% | 2023.8 |
  | GPT-3.5(zero-shot) | 48.1% | 2022.11 |
  | OctoCoder | 46.2% | 2023.8 |
  | StarCoder-15B | 33.6% | 2023.5 |
+ | Qwen-14b | 32.3% | 2023.10 |
+ | **CodeFuse-StarCoder-15B** | 54.9% | 2023.9 |
+ | **CodeFuse-QWen-14B** | 48.78% | 2023.10 |
+ | **CodeFuse-CodeGeeX2-6B** | 45.12% | 2023.11 |
+ | **CodeFuse-DeepSeek-33B** | **78.65%** | 2024.01 |

  ## Requirements

  * python>=3.8
  * pytorch>=2.0.0
+ * transformers>=4.33.2
+ * Sentencepiece
  * CUDA 11.4
  <br>

  ## Inference Data Format

+ The inference data is a string concatenated in the training data format; the input prompt is concatenated the same way at inference time. Below are the multi-round format with a system prompt and the single-round format without one:

+ **Multi-Round with System Prompt:**
  ```python
  """
+ <s>system
+ System instruction
+ <s>human
+ Human 1st round input
+ <s>bot
+ Bot 1st round output<|end▁of▁sentence|>
+ <s>human
+ Human 2nd round input
+ <s>bot
+ Bot 2nd round output<|end▁of▁sentence|>
  ...
  ...
  ...
+ <s>human
+ Human nth round input
+ <s>bot
+ """
+ ```
+
+ **Single-Round without System Prompt:**
+ ```python
+ """
+ <s>human
+ User prompt...
+ <s>bot
+
  """
  ```

+ In this format, the System prompt is optional (set it as needed), and both single-turn and multi-turn conversations are supported. At inference time, make sure the concatenated prompt string ends with "\<s\>bot\n" to guide the model to generate an answer.

+ For example, the format used when running inference on HumanEval is as follows:

  ```python
+ <s>human
+ # language: Python
+ from typing import List
+ def separate_paren_groups(paren_string: str) -> List[str]:
+     """ Input to this function is a string containing multiple groups of nested parentheses. Your goal is to
+     separate those group into separate strings and return the list of those.
+     Separate groups are balanced (each open brace is properly closed) and not nested within each other
+     Ignore any spaces in the input string.
+     >>> separate_paren_groups('( ) (( )) (( )( ))')
+     ['()', '(())', '(()())']
+     """
+ <s>bot
+
+ ```

+ In particular, we also use the programming-language tag adopted by the CodeGeeX family of models (for example, for Python we use "```# language: Python```").

+ ## Quickstart

+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
+
+ def load_model_tokenizer(model_path):
+     tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True, use_fast=False, legacy=False)
+     tokenizer.eos_token = "<|end▁of▁sentence|>"
+     tokenizer.pad_token = "<|end▁of▁sentence|>"
+     tokenizer.eos_token_id = tokenizer.convert_tokens_to_ids(tokenizer.eos_token)
+     tokenizer.pad_token_id = tokenizer.convert_tokens_to_ids(tokenizer.pad_token)
+     tokenizer.padding_side = "left"
+
+     model = AutoModelForCausalLM.from_pretrained(model_path, device_map='auto', torch_dtype=torch.bfloat16, trust_remote_code=True)
+     return model, tokenizer
+
+
+ HUMAN_ROLE_START_TAG = "<s>human\n"
+ BOT_ROLE_START_TAG = "<s>bot\n"
+
+ text_list = [f'{HUMAN_ROLE_START_TAG}Write a QuickSort program\n#Python\n{BOT_ROLE_START_TAG}']
+
+ model, tokenizer = load_model_tokenizer("codefuse-ai/CodeFuse-Mixtral-8x7b")
+ inputs = tokenizer(text_list, return_tensors='pt', padding=True, add_special_tokens=False).to('cuda')
+ input_ids = inputs["input_ids"]
+ attention_mask = inputs["attention_mask"]
+ generation_config = GenerationConfig(
+     eos_token_id=tokenizer.eos_token_id,
+     pad_token_id=tokenizer.pad_token_id,
+     temperature=0.2,
      max_new_tokens=512,
+     num_return_sequences=1,
+     num_beams=1,
      top_p=0.95,
+     do_sample=False
+ )
+ outputs = model.generate(
+     inputs=input_ids,
+     attention_mask=attention_mask,
+     **generation_config.to_dict()
+ )
+ gen_text = tokenizer.batch_decode(outputs[:, input_ids.shape[1]:], skip_special_tokens=True)
+ print(gen_text[0])
  ```