---
datasets:
- teknium/OpenHermes-2.5
---
This is a 4bpw (bits per weight) ExLlamaV2 quantization of [feeltheAGI/yi-super-9B](https://huggingface.co/feeltheAGI/yi-super-9B), produced with the default calibration dataset.
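
As a rough guide to what 4bpw means for storage, the sketch below estimates the size of the quantized weights. It assumes about 9 billion parameters (taken from the "9B" in the model name, not from a measured count) and treats 4bpw as the average over all weights. The function name `estimate_weight_bytes` is hypothetical, and the result excludes the KV cache, activations, and any runtime overhead.

```python
def estimate_weight_bytes(num_params: float, bits_per_weight: float) -> float:
    """Approximate storage for quantized weights, in bytes.

    Only the weights are counted: total bits divided by 8 bits per byte.
    """
    return num_params * bits_per_weight / 8


# Assumption: ~9e9 parameters, inferred from the "9B" in the model name.
ASSUMED_PARAMS = 9e9

quantized = estimate_weight_bytes(ASSUMED_PARAMS, 4.0)   # this 4bpw quantization
fp16 = estimate_weight_bytes(ASSUMED_PARAMS, 16.0)       # 16-bit weights, for comparison

print(f"4bpw weights: ~{quantized / 1e9:.1f} GB")
print(f"16-bit weights: ~{fp16 / 1e9:.1f} GB")
```

Actual memory use when running the model will be higher than the weight-only figure, depending on context length and cache settings.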

# Original model card

![1702046172090179.jpg](https://cdn-uploads.huggingface.co/production/uploads/65d1f383351255ba48a4f831/EdV6mhHGCv5w2BIC58vCm.jpeg)

YI-9B-Super

YI-9B-Super is a YI-9B model that has been further fine-tuned on the OpenHermes-2.5 dataset.


Results on selected benchmarks:

|                 Tasks                 |Version|     Filter     |n-shot|  Metric   | Value |   |Stderr|
|---------------------------------------|-------|----------------|------|-----------|------:|---|-----:|
|truthfulqa                             |N/A    |none            |     0|rouge1_max |47.1011|±  |0.8016|
|hellaswag                              |      1|none            |None  |acc        | 0.5758|±  |0.0049|
|                                       |       |none            |None  |acc_norm   | 0.7639|±  |0.0042|
|gsm8k_cot                              |      3|strict-match    |8     |exact_match| 0.5262|±  |0.0138|
|                                       |       |flexible-extract|8     |exact_match| 0.6027|±  |0.0135|
|gsm8k                                  |      3|strict-match    |5     |exact_match| 0.6073|±  |0.0135|
|                                       |       |flexible-extract|5     |exact_match| 0.6126|±  |0.0134|



|      Groups      |Version|Filter|n-shot|  Metric   | Value |   |Stderr|
|------------------|-------|------|------|-----------|------:|---|-----:|
|truthfulqa        |N/A    |none  |     0|rouge1_max |47.1011|±  |0.8016|
|                  |       |none  |     0|bleu_max   |21.9476|±  |0.7162|
|                  |       |none  |     0|rouge2_acc | 0.3293|±  |0.0165|
|                  |       |none  |     0|bleu_acc   | 0.3635|±  |0.0168|
|                  |       |none  |     0|rouge1_acc | 0.3892|±  |0.0171|
|                  |       |none  |     0|rougeL_acc | 0.3782|±  |0.0170|
|                  |       |none  |     0|bleu_diff  |-2.3953|±  |0.6292|
|                  |       |none  |     0|rouge2_diff|-4.6929|±  |0.9130|
|                  |       |none  |     0|rougeL_diff|-4.2677|±  |0.8034|
|                  |       |none  |     0|acc        | 0.4040|±  |0.0113|
|                  |       |none  |     0|rouge1_diff|-3.8975|±  |0.7966|
|                  |       |none  |     0|rougeL_max |43.7954|±  |0.8145|
|                  |       |none  |     0|rouge2_max |32.3573|±  |0.9094|
|mmlu              |N/A    |none  |     0|acc        | 0.6726|±  |0.0037|
| - humanities     |N/A    |none  |None  |acc        | 0.6043|±  |0.0067|
| - other          |N/A    |none  |None  |acc        | 0.7306|±  |0.0077|
| - social_sciences|N/A    |none  |None  |acc        | 0.7741|±  |0.0074|
| - stem           |N/A    |none  |None  |acc        | 0.6181|±  |0.0083|
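
The Stderr column can be read as an approximate confidence interval around each Value. The sketch below shows one way to do that, assuming a normal approximation (z = 1.96 for a 95% interval). The helper name `confidence_interval` is hypothetical, and the three rows are copied from the tables above.

```python
def confidence_interval(value: float, stderr: float, z: float = 1.96) -> tuple[float, float]:
    """Return (low, high) bounds of value +/- z * stderr."""
    half_width = z * stderr
    return value - half_width, value + half_width


# (value, stderr) pairs copied from the result tables above.
results = {
    "mmlu acc": (0.6726, 0.0037),
    "gsm8k strict-match exact_match": (0.6073, 0.0135),
    "hellaswag acc_norm": (0.7639, 0.0042),
}

for name, (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{name}: {value:.4f} (approx. 95% CI {low:.4f} to {high:.4f})")
```

Rows with larger standard errors, such as the gsm8k results, have correspondingly wider intervals, so small differences between those scores should be interpreted with care.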