Update README.md
README.md CHANGED
@@ -3,40 +3,22 @@ license: apache-2.0
 library_name: transformers
 ---
 
-|arc_challenge | 1|none | 0|acc       |↑ |0.5239|± |0.0146|
-|              |  |none | 0|acc_norm  |↑ |0.5640|± |0.0145|
-|arc_easy      | 1|none | 0|acc       |↑ |0.8060|± |0.0081|
-|              |  |none | 0|acc_norm  |↑ |0.7837|± |0.0084|
-|hellaswag     | 1|none | 0|acc       |↑ |0.6398|± |0.0048|
-|              |  |none | 0|acc_norm  |↑ |0.8303|± |0.0037|
-|lambada_openai| 1|none | 0|acc       |↑ |0.6621|± |0.0066|
-|              |  |none | 0|perplexity|↓ |4.0357|± |0.0917|
-|piqa          | 1|none | 0|acc       |↑ |0.8036|± |0.0093|
-|              |  |none | 0|acc_norm  |↑ |0.8134|± |0.0091|
-|sciq          | 1|none | 0|acc       |↑ |0.9630|± |0.0060|
-|              |  |none | 0|acc_norm  |↑ |0.9440|± |0.0073|
-|winogrande    | 1|none | 0|acc       |↑ |0.7324|± |0.0124|
-|mmlu          | 2|none |  |acc       |↑ |0.7431|± |0.0034|
 
-Benchmarks for
 
-|arc_challenge |
-|sciq          | 1|none | 0|acc       |↑ |0.9630|± |0.0060|
-|              |  |none | 0|acc_norm  |↑ |0.9490|± |0.0070|
-|winogrande    | 1|none | 0|acc       |↑ |0.7048|± |0.0128|
-|mmlu          | 2|none |  |acc       |↑ |0.7985|± |0.0032|
+# Qwerky-QwQ-32B
+
+The following is a model converted from Qwen's QwQ-32B to the RWKV-based architecture.
+
+For details of the conversion process, see our previous release [here](https://huggingface.co/recursal/QRWKV6-32B-Instruct-Preview-v0.1).
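Qwerky swaps QwQ's softmax attention for RWKV's recurrent formulation, so per-token inference cost stays constant as the context grows instead of scaling with sequence length. A toy one-dimensional sketch of why that works (purely illustrative; not the actual Qwerky/RWKV kernels, whose names and shapes differ): a linear-attention-style mix can be computed either by summing over every past token, or by carrying a fixed-size running state.

```python
# Toy illustration: linear-attention-style mixing computed two ways.
# Scalar q/k/v values stand in for the real vector-valued channels.

def quadratic(qs, ks, vs):
    # O(T^2): each output position sums over its entire history.
    return [sum(ks[i] * vs[i] * qs[t] for i in range(t + 1))
            for t in range(len(qs))]

def recurrent(qs, ks, vs):
    # O(T): a single running state replaces the full history,
    # which is the property RWKV-style architectures exploit.
    out, state = [], 0.0
    for q, k, v in zip(qs, ks, vs):
        state += k * v          # fold the new token into the state
        out.append(q * state)   # read out against the current query
    return out

qs, ks, vs = [0.5, 1.0, 2.0], [1.0, 0.5, 0.25], [2.0, 4.0, 8.0]
assert quadratic(qs, ks, vs) == recurrent(qs, ks, vs)
```

Both routes produce identical outputs; only the cost per token differs.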
 
+Benchmarks for the Qwerky-QwQ-32B and Qwerky-72B models:
+
+| Tasks | Metric | Qwerky-QwQ-32B | Qwen/QwQ-32B | Qwerky-72B | Qwen2.5-72B-Instruct |
+|:---:|:---:|:---:|:---:|:---:|:---:|
+| arc_challenge | acc_norm | **0.5640** | 0.5563 | **0.6382** | 0.6323 |
+| arc_easy | acc_norm | 0.7837 | **0.7866** | **0.8443** | 0.8329 |
+| hellaswag | acc_norm | 0.8303 | **0.8407** | 0.8573 | **0.8736** |
+| lambada_openai | acc | 0.6621 | **0.6683** | **0.7539** | 0.7506 |
+| piqa | acc | **0.8036** | 0.7976 | 0.8248 | **0.8357** |
+| sciq | acc | **0.9630** | **0.9630** | 0.9670 | **0.9740** |
+| winogrande | acc | **0.7324** | 0.7048 | **0.7956** | 0.7632 |
+| mmlu | acc | 0.7431 | **0.7985** | 0.7746 | **0.8338** |
+
+> All benchmarks besides MMLU are 0-shot and task version 1; MMLU is task version 2.
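As a rough reading of the table above, one can average each model's eight task scores (an unweighted mean computed here purely for illustration; `scores` simply restates the table's columns):

```python
# Unweighted mean over the eight tasks in the benchmark table above.
scores = {
    "Qwerky-QwQ-32B":       [0.5640, 0.7837, 0.8303, 0.6621, 0.8036, 0.9630, 0.7324, 0.7431],
    "Qwen/QwQ-32B":         [0.5563, 0.7866, 0.8407, 0.6683, 0.7976, 0.9630, 0.7048, 0.7985],
    "Qwerky-72B":           [0.6382, 0.8443, 0.8573, 0.7539, 0.8248, 0.9670, 0.7956, 0.7746],
    "Qwen2.5-72B-Instruct": [0.6323, 0.8329, 0.8736, 0.7506, 0.8357, 0.9740, 0.7632, 0.8338],
}
means = {model: sum(vals) / len(vals) for model, vals in scores.items()}
for model, mean in means.items():
    print(f"{model}: {mean:.4f}")
# On this crude aggregate, Qwerky-QwQ-32B trails Qwen/QwQ-32B by ~0.004,
# and Qwerky-72B trails Qwen2.5-72B-Instruct by ~0.005.
```

An unweighted mean ignores task difficulty and variance, so treat it only as a quick sanity check that the RWKV conversions track their transformer bases closely.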