Update README.md
README.md
CHANGED
@@ -385,6 +385,8 @@ The prompt is the same as the [CLIcK paper](https://arxiv.org/abs/2403.06412) pr
385 |
- GPT-4-turbo: 2024-04-09 version
386 |
- GPT-3.5-turbo: 2023-06-13 version
387 |
388 |
| Benchmarks | Phi-3.5-Mini-Instruct | Phi-3.0-Mini-128k-Instruct (June2024) | Llama-3.1-8B-Instruct | GPT-4o | GPT-4o-mini | GPT-4-turbo | GPT-3.5-turbo |
389 |
|:-------------------------|------------------------:|--------------------------------:|------------------------:|---------:|--------------:|--------------:|----------------:|
390 |
| CLIcK | 42.99 | 29.12 | 47.82 | 80.46 | 68.5 | 72.82 | 50.98 |
@@ -393,7 +395,7 @@ The prompt is the same as the [CLIcK paper](https://arxiv.org/abs/2403.06412) pr
393 |
| KMMLU (5-shot) | 37.35 | 29.98 | 20.21 | 64.28 | 51.62 | 59.29 | 42.28 |
394 |
| KMMLU-HARD (0-shot, CoT) | 24 | 25.68 | 24.03 | 39.62 | 24.56 | 30.56 | 20.97 |
395 |
| KMMLU-HARD (5-shot) | 24.76 | 25.73 | 15.81 | 40.94 | 24.63 | 31.12 | 21.19 |
396 |
-
| Average | 35.62 | 29.99 | 29.29 | 62.54 | 50.08 | 56.74 | 39.61 |
397 |
398 |
#### CLIcK (Cultural and Linguistic Intelligence in Korean)
399 |
385 |
- GPT-4-turbo: 2024-04-09 version
386 |
- GPT-3.5-turbo: 2023-06-13 version
387 |
388 |
+
Averaged across the Korean benchmarks, Phi-3.5-Mini-Instruct, with only 3.8B parameters, outperforms Llama-3.1-8B-Instruct (35.62 vs. 29.29), although Llama-3.1-8B-Instruct still scores higher on CLIcK (47.82 vs. 42.99).
389 |
+
390 |
| Benchmarks | Phi-3.5-Mini-Instruct | Phi-3.0-Mini-128k-Instruct (June2024) | Llama-3.1-8B-Instruct | GPT-4o | GPT-4o-mini | GPT-4-turbo | GPT-3.5-turbo |
391 |
|:-------------------------|------------------------:|--------------------------------:|------------------------:|---------:|--------------:|--------------:|----------------:|
392 |
| CLIcK | 42.99 | 29.12 | 47.82 | 80.46 | 68.5 | 72.82 | 50.98 |
395 |
| KMMLU (5-shot) | 37.35 | 29.98 | 20.21 | 64.28 | 51.62 | 59.29 | 42.28 |
396 |
| KMMLU-HARD (0-shot, CoT) | 24 | 25.68 | 24.03 | 39.62 | 24.56 | 30.56 | 20.97 |
397 |
| KMMLU-HARD (5-shot) | 24.76 | 25.73 | 15.81 | 40.94 | 24.63 | 31.12 | 21.19 |
398 |
+
| **Average** | **35.62** | **29.99** | **29.29** | **62.54** | **50.08** | **56.74** | **39.61** |
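
The comparison behind the Average row can be checked with a few lines of Python. This is only a minimal sketch: the scores are copied from the Average row above, while the names `average_scores` and `score_gap` are hypothetical helpers introduced for illustration and are not part of the repository.

```python
# Average-row scores copied from the benchmark table above.
average_scores = {
    "Phi-3.5-Mini-Instruct": 35.62,
    "Phi-3.0-Mini-128k-Instruct (June2024)": 29.99,
    "Llama-3.1-8B-Instruct": 29.29,
    "GPT-4o": 62.54,
    "GPT-4o-mini": 50.08,
    "GPT-4-turbo": 56.74,
    "GPT-3.5-turbo": 39.61,
}


def score_gap(scores, model_a, model_b):
    """Return model_a's score minus model_b's score, rounded to 2 decimals."""
    return round(scores[model_a] - scores[model_b], 2)


# Gap between the two small open models on the averaged score.
gap = score_gap(average_scores, "Phi-3.5-Mini-Instruct", "Llama-3.1-8B-Instruct")

# Model names sorted from highest to lowest average score.
ranking = sorted(average_scores, key=average_scores.get, reverse=True)

print(f"Phi-3.5-Mini-Instruct leads Llama-3.1-8B-Instruct by {gap} points on average")
print("Ranking by average:", ranking)
```

Because the gap is computed from the averaged row only, it says nothing about individual benchmarks, where the ordering of the two models can differ.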
399 |
400 |
#### CLIcK (Cultural and Linguistic Intelligence in Korean)
401 |