---
license: apache-2.0
---
# Model Card for MediaTek Research Breeze-7B-FC-v1_0

## Performance
| Models | #Parameters | Organization | License | Function Calling? | Instruction Following? |
|---|---|---|---|---|---|
| Breeze-7B-Instruct-v1_0 | 7B | MediaTek Research | Apache 2.0 | No | Yes |
| Breeze-7B-FC-v1_0 | 7B | MediaTek Research | Apache 2.0 | Yes | Yes |
| Gorilla-OpenFunctions-v2 | 7B | Gorilla LLM | Apache 2.0 | Yes | No |
| GPT-3.5-Turbo-0125 | | OpenAI | Proprietary | Yes | Yes |
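The sketch below shows one way to query Breeze-7B-FC-v1_0 for a function call with `transformers`. It is a minimal example, not the documented usage: it assumes the model's chat template accepts the `tools` argument supported by recent `transformers` releases, and the tool definition (`get_weather`) is hypothetical. If the template does not accept `tools`, follow the prompt format given in the model's usage instructions instead.

```python
# Minimal sketch: ask Breeze-7B-FC-v1_0 to emit a function call.
# Assumptions: the chat template supports the `tools` argument; the
# `get_weather` tool below is a hypothetical example, not part of the model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MediaTek-Research/Breeze-7B-FC-v1_0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# One tool described in JSON-Schema style.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Taipei right now?"}]

# Render the prompt with the tool definitions, then generate.
inputs = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)

# Print only the newly generated tokens (the model's tool-call output).
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```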
### Evaluate function calling on the EN benchmark

Berkeley function-calling leaderboard
| Models | Overall | Irrelevance Detection | AST/Simple | AST/Multiple | AST/Parallel | AST/Parallel-Multiple | Exec/Simple | Exec/Multiple | Exec/Parallel | Exec/Parallel-Multiple |
|---|---|---|---|---|---|---|---|---|---|---|
| Breeze-7B-FC-v1_0 (FC) | 86.01 | 74.58 | 90.00 | 93.00 | 82.00 | 83.00 | 98.00 | 92.00 | 88.00 | 75.00 |
| Gorilla-OpenFunctions-v2 (FC) | 85.95 | 60.00 | 94.25 | 95.50 | 86.50 | 86.00 | 97.00 | 96.00 | 80.00 | 75.00 |
| GPT-3.5-Turbo-0125 (FC) | 72.77 | 4.58 | 87.75 | 90.50 | 88.50 | 82.50 | 91.00 | 82.00 | 78.00 | 52.50 |
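In the Berkeley leaderboard, the AST columns score whether the generated call matches the expected function name and acceptable argument values, while the Exec columns execute the generated call and compare the result; Simple/Multiple/Parallel refer to one call, choosing among several candidate functions, and several calls in one turn. The following is a minimal illustrative sketch of an AST-style check, not the official evaluation harness; the `get_weather` call and the accepted values are hypothetical.

```python
# Minimal sketch of an AST-style check (not the official BFCL harness):
# parse the generated call, then compare the function name and keyword
# argument values against a specification of acceptable values.
import ast

def ast_match(generated: str, expected_name: str, allowed_args: dict) -> bool:
    """True if `generated` is a single call to `expected_name` whose keyword
    arguments all take values listed as acceptable in `allowed_args`."""
    node = ast.parse(generated, mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        return False
    if node.func.id != expected_name:
        return False
    supplied = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    # Every expected argument must be present with an acceptable value.
    return all(supplied.get(name) in values for name, values in allowed_args.items())

# "Simple" category: one generated call checked against one specification.
print(ast_match('get_weather(city="Taipei")',
                "get_weather", {"city": ["Taipei", "Taipei City"]}))   # True
print(ast_match('get_weather(city="Kaohsiung")',
                "get_weather", {"city": ["Taipei", "Taipei City"]}))   # False
```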
### Evaluate function calling on the ZHTW benchmark

function-calling-leaderboard-for-zhtw
| Models | Overall | Irrelevance Detection | AST/Simple | AST/Multiple | AST/Parallel | AST/Parallel-Multiple | Exec/Simple | Exec/Multiple | Exec/Parallel | Exec/Parallel-Multiple |
|---|---|---|---|---|---|---|---|---|---|---|
| Breeze-7B-FC-v1_0 (FC) | 77.70 | 71.67 | 82.00 | 86.50 | 76.00 | 65.50 | 87.00 | 88.00 | 80.00 | 57.50 |
| Gorilla-OpenFunctions-v2 (FC) | 75.68 | 53.75 | 84.75 | 86.50 | 72.50 | 68.00 | 92.00 | 92.00 | 62.00 | 72.50 |
| GPT-3.5-Turbo-0125 (FC) | 66.15 | 7.50 | 83.75 | 83.50 | 73.00 | 65.50 | 88.00 | 84.00 | 72.00 | 40.00 |
### Evaluate instruction following on the ZHTW benchmark

MT-Bench-TC

| | Win | Tie | Lose |
|---|---|---|---|
| Breeze-7B-FC-v1_0 vs. Breeze-7B-Instruct-v1_0 | 42 (26.3%) | 71 (44.4%) | 47 (29.4%) |