chenxingphh committed
Commit b93cf43 · 1 Parent(s): 8b987c0
Update README.md

README.md CHANGED
@@ -59,17 +59,17 @@ pipeline_tag: text-generation
 We used [opencompass](https://opencompass.org.cn) to run 5-shot
 tests on the following general-domain datasets. Evaluation results for the other models are taken from the [opencompass-leaderboard](https://opencompass.org.cn/leaderboard-llm).
 
-| | C-Eval
-
-| **GPT-4** | 69.9
-| **ChatGPT** | 52.5
-| **Claude-1** | 52
-| **TigerBot-70B-Chat-V2** | 57.7
-| **WeMix-LLaMA2-70B** | 55.2
-| **LLaMA-2-70B-Chat** | 44.3
-| **Qwen-14B-Chat** | 71.7
-| **Baichuan-13B-Chat** |
-| **OrionStar-Yi-34B-Chat** | 77.71
+| | C-Eval | MMLU | CMMLU |
+|---------------------------|-----------|--------|-----------|
+| **GPT-4** | 69.9 | **83** | 71 |
+| **ChatGPT** | 52.5 | 69.1 | 53.9 |
+| **Claude-1** | 52 | 65.7 | - |
+| **TigerBot-70B-Chat-V2** | 57.7 | 65.9 | 59.9 |
+| **WeMix-LLaMA2-70B** | 55.2 | 71.3 | 56 |
+| **LLaMA-2-70B-Chat** | 44.3 | 63.8 | 43.3 |
+| **Qwen-14B-Chat** | 71.7 | 66.4 | 70 |
+| **Baichuan-13B-Chat** | 53.4 | 50.5 | 50.7 |
+| **OrionStar-Yi-34B-Chat** | **77.71** | 78.32 | **73.52** |
 
 # Model Inference
 