---
license: apache-2.0
extra_gated_prompt: "We will release in the near future."
extra_gated_fields:
Name: text
Company: text
Title: text
---
# Model Card for MediaTek Research Breeze-7B-FC-v1_0
MediaTek Research Breeze-7B-FC (hereinafter referred to as Breeze-7B-FC) is an advanced language model developed by MediaTek Research, building on [Breeze-7B-Base](https://huggingface.co/MediaTek-Research/Breeze-7B-Base-v1_0). Breeze-7B-FC extends its predecessor with a key feature: function calling. This enhancement makes Breeze-7B-FC more versatile and capable of handling a wider range of tasks efficiently.
## 🏆 Performance
| Models | #Parameters | Organization | License | 🧰 Function Calling? | 💬 Instruction Following? |
|--------------------------------------------------------------------------------------------|-------------|------------|------------|-------------------|----------|
| [Breeze-7B-Instruct-v1_0](https://huggingface.co/MediaTek-Research/Breeze-7B-Instruct-v1_0)| 7B | MediaTek Research | Apache 2.0 | ❌ | ✅ |
| [**Breeze-7B-FC-v1_0**](https://huggingface.co/MediaTek-Research/Breeze-7B-FC-v1_0) | 7B | MediaTek Research | Apache 2.0 | ✅ | ✅ |
| [Gorilla-OpenFunctions-v2](https://huggingface.co/gorilla-llm/gorilla-openfunctions-v2) | 7B | Gorilla LLM | Apache 2.0 | ✅ | ❌ |
| [GPT-3.5-Turbo-0125](https://openai.com) | | OpenAI | Proprietary| ✅ | ✅ |
**Evaluate function calling on EN benchmark**
We evaluate function-calling performance in English on the [Berkeley function-calling leaderboard](https://gorilla.cs.berkeley.edu/blogs/8_berkeley_function_calling_leaderboard.html).
| Models | ↑ Overall | Irrelevance<br/>Detection | AST/<br/>Simple | AST/<br/>Multiple | AST/<br/>Parallel | AST/<br/>Parallel-Multiple | Exec/<br/>Simple | Exec/<br/>Multiple | Exec/<br/>Parallel | Exec/<br/>Parallel-Multiple |
|-----------------------------------|----------|---------------------|------------|--------------|--------------|------------------------|--------------|---------------------|---------------------|-------------------------------|
| **Breeze-7B-FC-v1_0 (FC)** | 86.89 | 76.25 | 90.00 | 93.00 | 84.00 | 84.00 | 100.00 | 92.00 | 88.00 | 77.50 |
| Gorilla-OpenFunctions-v2 (FC) | 85.95 | 60.00 | 94.25 | 95.50 | 86.50 | 86.00 | 97.00 | 96.00 | 80.00 | 75.00 |
| GPT-3.5-Turbo-0125 (FC) | 72.77 | 4.58 | 87.75 | 90.50 | 88.50 | 82.50 | 91.00 | 82.00 | 78.00 | 52.50 |
![](misc/radar_chart_en.png)
**Evaluate function calling on ZHTW benchmark**
We evaluate function-calling performance in Traditional Chinese on the [function-calling-leaderboard-for-zhtw](https://github.com/mtkresearch/function-calling-leaderboard-for-zhtw) benchmark.
| Models | ↑ Overall | Irrelevance<br/>Detection | AST/<br/>Simple | AST/<br/>Multiple | AST/<br/>Parallel | AST/<br/>Parallel-Multiple | Exec/<br/>Simple | Exec/<br/>Multiple | Exec/<br/>Parallel | Exec/<br/>Parallel-Multiple |
|-----------------------------------|----------|---------------------|------------|--------------|--------------|------------------------|--------------|---------------------|---------------------|-------------------------------|
| **Breeze-7B-FC-v1_0 (FC)**       | 78.18    | 72.50               | 82.00      | 86.00        | 76.50        | 67.00                  | 88.00        | 88.00               | 80.00               | 60.00                         |
| Gorilla-OpenFunctions-v2 (FC) | 75.68 | 53.75 | 84.75 | 86.50 | 72.50 | 68.00 | 92.00 | 92.00 | 62.00 | 72.50 |
| GPT-3.5-Turbo-0125 (FC) | 66.15 | 7.50 | 83.75 | 83.50 | 73.00 | 65.50 | 88.00 | 84.00 | 72.00 | 40.00 |
![](misc/radar_chart_zhtw.png)
**Evaluate instruction following on EN benchmark**
We evaluate instruction-following performance in English on [MT-Bench](https://github.com/lm-sys/FastChat/blob/main/fastchat/llm_judge/README.md).
| | Win | Tie | Lose |
|---|---|---|---|
| **Breeze-7B-FC-v1_0** *v.s.* Breeze-7B-Instruct-v1_0 | 29 (18.1%) | 55 (34.3%) | 76 (47.5%) |
**Evaluate instruction following on ZHTW benchmark**
We evaluate instruction-following performance in Traditional Chinese on [MT-Bench-TC](https://github.com/mtkresearch/TCEval).
| | Win | Tie | Lose |
|---|---|---|---|
| **Breeze-7B-FC-v1_0** *v.s.* Breeze-7B-Instruct-v1_0 | 35 (21.9%) | 73 (45.6%) | 52 (32.5%) |
## 👩💻 How to use
**Demo with Kaggle Kernel**
Start by clicking the "Copy & Edit" button on https://www.kaggle.com/code/ycckaggle/run-breeze-fc
**Dependency**
Install the `mtkresearch` package:
```
pip install mtkresearch
```
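The serving example below also relies on [vLLM](https://github.com/vllm-project/vllm); if it is not already in your environment, install it as well (exact version requirements may differ):
```
pip install vllm
```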
**Hosting the model with vLLM**
```python
from vllm import LLM, SamplingParams
num_gpu = 1  # adjust to the number of GPUs available

llm = LLM(
    model='MediaTek-Research/Breeze-7B-FC-v1_0',
    tensor_parallel_size=num_gpu,
    gpu_memory_utilization=0.7,
    dtype='half'
)

turn_end_token_id = 61876  # <|im_end|>
params = SamplingParams(
    temperature=0.01,
    top_p=0.01,
    max_tokens=4096,
    repetition_penalty=1.1,
    stop_token_ids=[turn_end_token_id]
)

def _inference(prompt, llm, params):
    return llm.generate(prompt, params)[0].outputs[0].text
```
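If vLLM is not available, the model can also be served with Hugging Face `transformers`. The sketch below mirrors the decoding settings above; names and settings are illustrative, and the later examples assume the vLLM-based `_inference` shown above.
```python
# Optional alternative (sketch): serve the model with Hugging Face transformers
# instead of vLLM. Later examples assume the vLLM-based _inference above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = 'MediaTek-Research/Breeze-7B-FC-v1_0'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map='auto'  # requires `accelerate`
)

def _inference_hf(prompt):
    inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=4096,
        do_sample=False,        # near-greedy decoding, matching temperature=0.01 above
        repetition_penalty=1.1,
        eos_token_id=61876,     # <|im_end|>, as in the vLLM example
    )
    # return only the newly generated text
    return tokenizer.decode(output_ids[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
```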
**Instruction following**
```python
from mtkresearch.llm.prompt import MRPromptV2
sys_prompt = ('You are a helpful AI assistant built by MediaTek Research. '
              'The user you are helping speaks Traditional Chinese and comes from Taiwan.')

prompt_engine = MRPromptV2()

conversations = [
    {"role": "system", "content": sys_prompt},
    {"role": "user", "content": "請問什麼是深度學習?"},
]
prompt = prompt_engine.get_prompt(conversations)
output_str = _inference(prompt, llm, params)
result = prompt_engine.parse_generated_str(output_str)
print(result)
# {'role': 'assistant',
# 'content': '深度學習(Deep Learning)是一種機器學習方法,它模仿人類大腦的神經網路結構來
# 處理複雜的數據和任務。在深度學習中,模型由多層人工神經元組成,每個神經元之間有
# 權重連接,並通過非線性轉換進行計算。這些層與層之間的相互作用使模型能夠學習複雜
# 的函數關係或模式,從而解決各種問題,如圖像識別、自然語言理解、語音辨識等。深度
# 學習通常需要大量的數據和強大的計算能力,因此經常使用圖形處理器(GPU)或特殊的
# 加速器來執行。'}
```
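To continue a multi-turn conversation, append the parsed assistant message and the next user turn to `conversations`, then rebuild the prompt and generate again. A minimal sketch (the follow-up question is illustrative):
```python
# Multi-turn sketch: append the assistant reply and a follow-up user turn,
# then rebuild the prompt with the same engine. (Follow-up question is illustrative.)
conversations.append(result)
conversations.append({"role": "user", "content": "請舉一個深度學習的實際應用。"})  # "Give a real-world application of deep learning."

prompt = prompt_engine.get_prompt(conversations)
output_str = _inference(prompt, llm, params)
print(prompt_engine.parse_generated_str(output_str))
```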
**Function Calling**
```python
import json
from mtkresearch.llm.prompt import MRPromptV2
functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"]
                }
            },
            "required": ["location"]
        }
    }
]

def fake_get_current_weather(location, unit=None):
    return {'temperature': 30}

mapping = {
    'get_current_weather': fake_get_current_weather
}
prompt_engine = MRPromptV2()
# stage 1: query
conversations = [
    {"role": "user", "content": "請問台北目前溫度是攝氏幾度?"},
]
prompt = prompt_engine.get_prompt(conversations, functions=functions)
output_str = _inference(prompt, llm, params)
result = prompt_engine.parse_generated_str(output_str)
print(result)
# {'role': 'assistant',
# 'tool_calls': [
# {'id': 'call_U9bYCBRAbF639uUqfwehwSbw', 'type': 'function',
# 'function': {'name': 'get_current_weather', 'arguments': '{"location": "台北, 台灣", "unit": "celsius"}'}}]}
# stage 2: execute called functions
conversations.append(result)
tool_call = result['tool_calls'][0]
func_name = tool_call['function']['name']
func = mapping[func_name]
arguments = json.loads(tool_call['function']['arguments'])
called_result = func(**arguments)
# stage 3: put executed results
conversations.append(
    {
        'role': 'tool',
        'tool_call_id': tool_call['id'],
        'name': func_name,
        'content': json.dumps(called_result)
    }
)
prompt = prompt_engine.get_prompt(conversations, functions=functions)
output_str2 = _inference(prompt, llm, params)
result2 = prompt_engine.parse_generated_str(output_str2)
print(result2)
# {'role': 'assistant', 'content': '台北目前的溫度是攝氏30度'}
```
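The example above handles a single tool call. When the model returns several tool calls in one turn (the Parallel cases in the benchmarks above), stages 2 and 3 generalize to a loop over `result['tool_calls']`. A minimal sketch using only the structures shown above:
```python
# Sketch: generalize stages 2-3 to any number of tool calls in one assistant turn.
def run_tool_calls(conversations, result, mapping):
    conversations.append(result)
    for tool_call in result.get('tool_calls', []):
        func = mapping[tool_call['function']['name']]
        arguments = json.loads(tool_call['function']['arguments'])
        conversations.append({
            'role': 'tool',
            'tool_call_id': tool_call['id'],
            'name': tool_call['function']['name'],
            'content': json.dumps(func(**arguments))
        })
    return conversations
```
After the loop, rebuild the prompt with `prompt_engine.get_prompt(conversations, functions=functions)` and generate the final answer exactly as in stage 3.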