nguyenbh committed on
Commit
ccf028f
1 Parent(s): bdf5de1

Update README.md

  1. README.md +3 -1
README.md CHANGED
@@ -385,6 +385,8 @@ The prompt is the same as the [CLIcK paper](https://arxiv.org/abs/2403.06412) pr
  - GPT-4-turbo: 2024-04-09 version
  - GPT-3.5-turbo: 2023-06-13 version

  | Benchmarks | Phi-3.5-Mini-Instruct | Phi-3.0-Mini-128k-Instruct (June2024) | Llama-3.1-8B-Instruct | GPT-4o | GPT-4o-mini | GPT-4-turbo | GPT-3.5-turbo |
  |:-------------------------|------------------------:|--------------------------------:|------------------------:|---------:|--------------:|--------------:|----------------:|
  | CLIcK | 42.99 | 29.12 | 47.82 | 80.46 | 68.5 | 72.82 | 50.98 |
@@ -393,7 +395,7 @@ The prompt is the same as the [CLIcK paper](https://arxiv.org/abs/2403.06412) pr
  | KMMLU (5-shot) | 37.35 | 29.98 | 20.21 | 64.28 | 51.62 | 59.29 | 42.28 |
  | KMMLU-HARD (0-shot, CoT) | 24 | 25.68 | 24.03 | 39.62 | 24.56 | 30.56 | 20.97 |
  | KMMLU-HARD (5-shot) | 24.76 | 25.73 | 15.81 | 40.94 | 24.63 | 31.12 | 21.19 |
- | Average | 35.62 | 29.99 | 29.29 | 62.54 | 50.08 | 56.74 | 39.61 |

  #### CLIcK (Cultural and Linguistic Intelligence in Korean)
 
  - GPT-4-turbo: 2024-04-09 version
  - GPT-3.5-turbo: 2023-06-13 version
 
+ Across the Korean benchmarks overall, Phi-3.5-Mini-Instruct, with only 3.8B parameters, outperforms Llama-3.1-8B-Instruct on average (35.62 vs. 29.29).
+
  | Benchmarks | Phi-3.5-Mini-Instruct | Phi-3.0-Mini-128k-Instruct (June2024) | Llama-3.1-8B-Instruct | GPT-4o | GPT-4o-mini | GPT-4-turbo | GPT-3.5-turbo |
  |:-------------------------|------------------------:|--------------------------------:|------------------------:|---------:|--------------:|--------------:|----------------:|
  | CLIcK | 42.99 | 29.12 | 47.82 | 80.46 | 68.5 | 72.82 | 50.98 |
 
  | KMMLU (5-shot) | 37.35 | 29.98 | 20.21 | 64.28 | 51.62 | 59.29 | 42.28 |
  | KMMLU-HARD (0-shot, CoT) | 24 | 25.68 | 24.03 | 39.62 | 24.56 | 30.56 | 20.97 |
  | KMMLU-HARD (5-shot) | 24.76 | 25.73 | 15.81 | 40.94 | 24.63 | 31.12 | 21.19 |
+ | **Average** | **35.62** | **29.99** | **29.29** | **62.54** | **50.08** | **56.74** | **39.61** |
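
The comparison behind the added summary sentence can be checked row by row. The following is a minimal sketch, not part of the README: the dictionary name `visible_scores` and the helper `compare` are hypothetical, and only the four benchmark rows visible in this diff are included. Rows elided between the two hunks are missing, so the means computed here will not reproduce the table's Average row.

```python
# Scores copied from the rows visible in this diff, as
# (Phi-3.5-Mini-Instruct, Llama-3.1-8B-Instruct).
# Assumption: rows hidden between the two hunks are NOT included, so the
# means below differ from the "Average" row in the table.
visible_scores = {
    "CLIcK": (42.99, 47.82),
    "KMMLU (5-shot)": (37.35, 20.21),
    "KMMLU-HARD (0-shot, CoT)": (24.0, 24.03),
    "KMMLU-HARD (5-shot)": (24.76, 15.81),
}


def compare(scores):
    """Return (phi_wins, llama_wins, phi_mean, llama_mean) over the given rows."""
    phi_wins = sum(1 for phi, llama in scores.values() if phi > llama)
    llama_wins = sum(1 for phi, llama in scores.values() if llama > phi)
    phi_mean = sum(phi for phi, _ in scores.values()) / len(scores)
    llama_mean = sum(llama for _, llama in scores.values()) / len(scores)
    return phi_wins, llama_wins, phi_mean, llama_mean


phi_wins, llama_wins, phi_mean, llama_mean = compare(visible_scores)
print(f"Phi-3.5-Mini-Instruct wins {phi_wins} rows, Llama-3.1-8B-Instruct wins {llama_wins} rows")
print(f"Mean over visible rows: {phi_mean:.2f} vs {llama_mean:.2f}")
```

On these four rows each model wins two, with Llama-3.1-8B-Instruct ahead on CLIcK and, narrowly, on KMMLU-HARD (0-shot, CoT), so the claim holds for the mean score and not for every individual benchmark.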
 
  #### CLIcK (Cultural and Linguistic Intelligence in Korean)