Update README.md
Practicality-wise:
- Breeze-7B-Instruct can be used as is for common tasks such as Q&A, RAG, multi-round chat, and summarization.
- In particular, Breeze-7B-Instruct-64k can perform tasks at a document level, not a chapter level.

Performance-wise:
- Breeze-7B-Instruct demonstrates impressive performance in benchmarks for Traditional Chinese and English when compared to similar-sized open-source contemporaries such as Taiwan-LLM-7B/13B-chat, Qwen-7B-Chat, and Yi-6B-Chat. [See [Chat Model Performance](#chat-model-performance).]

*A project by the members (in alphabetical order): Chan-Jan Hsu 許湛然, Chang-Le Liu 劉昶樂, Feng-Ting Liao 廖峰挺, Po-Chun Hsu 許博竣, Yi-Chang Chen 陳宜昌, and the supervisor Da-Shan Shiu 許大山.*

## Features
\* Taiwan-LLM models respond to multi-turn questions (English) in Traditional Chinese.

| Details of MT-Bench-tw (0 shot):<br/>Models | STEM | Extraction | Reasoning | Math | Coding | Roleplay | Writing | Humanities | ↑ AVG |
|-----------------------------------------------------|------|------------|-----------|------|--------|----------|---------|------------|-------|
| gpt-3.5-turbo                                       | 7.8  | 6.1        | 5.1       | 6.4  | 6.2    | 8.7      | 7.4     | 9.3        | 7.1   |
| Yi-34B-Chat                                         | 9.0  | 4.8        | 5.7       | 4.0  | 4.7    | 8.5      | 8.7     | 9.8        | 6.9   |
| Taiwan-LLM-13B-v2.0-chat                            | 6.1  | 3.4        | 4.1       | 2.3  | 3.1    | 7.4      | 6.6     | 6.8        | 5.0   |
| Taiwan-LLM-7B-v2.1-chat                             | 5.2  | 2.6        | 2.3       | 1.2  | 3.4    | 6.6      | 5.7     | 6.8        | 4.2   |
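For the MT-Bench-tw rows shown above, the ↑ AVG column matches the unweighted mean of the eight category scores (an observation from the numbers themselves, not a documented scoring rule); a quick sanity check:

```python
# Check that ↑ AVG is the unweighted mean of the eight category scores
# (to rounding) for the MT-Bench-tw rows shown above.
rows = {
    "gpt-3.5-turbo":            ([7.8, 6.1, 5.1, 6.4, 6.2, 8.7, 7.4, 9.3], 7.1),
    "Yi-34B-Chat":              ([9.0, 4.8, 5.7, 4.0, 4.7, 8.5, 8.7, 9.8], 6.9),
    "Taiwan-LLM-13B-v2.0-chat": ([6.1, 3.4, 4.1, 2.3, 3.1, 7.4, 6.6, 6.8], 5.0),
    "Taiwan-LLM-7B-v2.1-chat":  ([5.2, 2.6, 2.3, 1.2, 3.4, 6.6, 5.7, 6.8], 4.2),
}
for name, (scores, avg) in rows.items():
    # Each reported average is within rounding distance (0.05) of the mean.
    assert abs(sum(scores) / len(scores) - avg) <= 0.05, name
```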

| Details of TMMLU+ (0 shot):<br/>Model | STEM  | Social Science | Humanities | Other | ↑ AVG |
|-----------------------------------------------------|-------|----------------|------------|-------|-------|
| Yi-34B-Chat                                         | 47.65 | 64.25          | 52.73      | 54.91 | 54.87 |
| Qwen-14B-Chat                                       | 43.83 | 55.00          | 48.55      | 46.22 | 48.41 |

```
pip install flash-attn
```
Then load the model in transformers:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "MediaTek-Research/Breeze-7B-Instruct-v0.1",
    device_map="auto",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # optional; requires flash-attn
)
tokenizer = AutoTokenizer.from_pretrained("MediaTek-Research/Breeze-7B-Instruct-v0.1")
```

The structure of the query template follows that of Mistral-7B-Instruct, as shown below.
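As an illustrative sketch of that Mistral-style structure, a multi-turn prompt can be assembled as below. The system-prompt text and the exact spacing here are assumptions for illustration, not the official Breeze template:

```python
# Illustrative sketch only: the system prompt and spacing are assumptions
# in the spirit of Mistral-7B-Instruct, not the official Breeze template.
DEFAULT_SYS = "You are a helpful AI assistant built by MediaTek Research."

def build_prompt(turns, system_prompt=DEFAULT_SYS):
    """Assemble a Mistral-style prompt.

    turns: list of (query, response) pairs; response is None for the
    final, not-yet-answered query.
    """
    prompt = f"<s>{system_prompt} "
    for query, response in turns:
        prompt += f"[INST] {query} [/INST] "
        if response is not None:
            prompt += f"{response} "
    return prompt

# Single-turn example; the resulting string would be tokenized and
# passed to model.generate.
print(build_prompt([("請問台灣最高的山是?", None)]))
```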