Update README.md
README.md
CHANGED
@@ -58,6 +58,20 @@ You are to roleplay as Edward Elric from fullmetal alchemist. You are in the wor
 
 ## Benchmark Results
 
+Hermes 2 on Mistral-7B outperforms all Nous & Hermes models of the past, save Hermes 70B, and surpasses most of the current Mistral finetunes across the board.
+
+### GPT4All:
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6317aade83d8d2fd903192d9/RjgaKLUNMWK5apNn28G18.png)
+
+### AGIEval:
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6317aade83d8d2fd903192d9/VN4hWrjxABKyC5IJqFR7v.png)
+
+### BigBench:
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6317aade83d8d2fd903192d9/uQtCdaoHO7Wrs-eIUB7d8.png)
+
+### Averages Compared:
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6317aade83d8d2fd903192d9/e0dq1UDiUPMbtGR96Ax16.png)
+
 GPT-4All Benchmark Set
 ```
 | Task |Version| Metric |Value | |Stderr|