Update README.md
README.md CHANGED
@@ -89,7 +89,7 @@ Bitte erkläre mir, wie die Zusammenführung von Modellen durch bestehende Spitz
 ## Evaluation
 
 ### GPT4ALL:
-*Compared to
+*Compared to relevant German Closed and Open Source models*
 ![GPT4ALL diagram](https://vago-solutions.de/wp-content/uploads/2023/11/GPT4All.png "SauerkrautLM-7b-HerO GPT4ALL Diagram")
 
 ![GPT4ALL table](https://vago-solutions.de/wp-content/uploads/2023/11/GPT4All-Tabelle.png "SauerkrautLM-7b-HerO GPT4ALL Table")
@@ -104,10 +104,10 @@ Bitte erkläre mir, wie die Zusammenführung von Modellen durch bestehende Spitz
 **performed with newest Language Model Evaluation Harness*
 
 ### MMLU:
-*Compared to Grok0,Grok1,GPT3.5,GPT4*
+*Compared to Big Boy LLMs (Grok0,Grok1,GPT3.5,GPT4)*
 ![MMLU](https://vago-solutions.de/wp-content/uploads/2023/11/MMLU-Benchmark.png "SauerkrautLM-7b-HerO MMLU")
 ### TruthfulQA:
-*Compared to GPT3.5,GPT4*
+*Compared to OpenAI Models (GPT3.5,GPT4)*
 ![TruthfulQA](https://vago-solutions.de/wp-content/uploads/2023/11/Truthfulqa-Benchmark.png "SauerkrautLM-7b-HerO TruthfulQA")
 
 ### MT-Bench (German):
@@ -170,6 +170,7 @@ SauerkrautLM-3b-v1 2.581250
 open_llama_3b_v2 1.456250
 Llama-2-7b 1.181250
 ```
+**performed with the newest FastChat Version*
 ### MT-Bench (English):
 ![MT-Bench English Diagram](https://vago-solutions.de/wp-content/uploads/2023/11/MT-Bench-Englisch.png "SauerkrautLM-7b-HerO MT-Bench English Diagram")
 ```
@@ -197,7 +198,7 @@ SauerkrautLM-7b-HerO <--- 7.409375
 Mistral-7B-OpenOrca 6.915625
 neural-chat-7b-v3-1 6.812500
 ```
-
+**performed with the newest FastChat Version*
 
 ### Additional German Benchmark results:
 ![GermanBenchmarks](https://vago-solutions.de/wp-content/uploads/2023/11/German-benchmarks.png "SauerkrautLM-7b-HerO German Benchmarks")
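
The GPT4ALL, MMLU and TruthfulQA figures above are footnoted as *performed with newest Language Model Evaluation Harness*. A minimal sketch of such a run with the harness's Python API (lm-evaluation-harness v0.4+) follows; the model id (VAGOsolutions/SauerkrautLM-7b-HerO), task selection, dtype and batch size are illustrative assumptions, not the exact configuration behind the card.

```python
# Illustrative sketch only -- model id, task list, dtype and batch size are assumptions.
import lm_eval  # pip install lm-eval  (EleutherAI lm-evaluation-harness, v0.4+)

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=VAGOsolutions/SauerkrautLM-7b-HerO,dtype=bfloat16",
    tasks=[
        # tasks commonly averaged for a "GPT4ALL" score
        "arc_easy", "arc_challenge", "boolq", "hellaswag",
        "openbookqa", "piqa", "winogrande",
        # the two standalone benchmarks shown above
        "mmlu", "truthfulqa_mc2",
    ],
    batch_size=8,
    device="cuda:0",
)

# Print one metrics dict per task (accuracy, normalized accuracy, etc.).
for task, metrics in results["results"].items():
    print(task, metrics)
```

The GPT4ALL number is commonly reported as the mean accuracy over the seven tasks in the first group.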
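
Both MT-Bench tables are footnoted *performed with the newest FastChat Version*. The sketch below shows how MT-Bench scores are typically produced with FastChat's llm_judge scripts; the repository path, model id and parallelism value are assumptions, the judge defaults to GPT-4 (an OpenAI API key is required), and the German run would additionally need a German MT-Bench question set, which this sketch does not cover.

```python
# Illustrative sketch only -- paths, model id and parallelism are assumptions.
import subprocess

LLM_JUDGE_DIR = "FastChat/fastchat/llm_judge"      # cloned FastChat repository (assumed path)
MODEL_PATH = "VAGOsolutions/SauerkrautLM-7b-HerO"  # assumed Hugging Face repo id
MODEL_ID = "SauerkrautLM-7b-HerO"                  # label that appears in the result table

# 1) Generate the model's answers to the MT-Bench questions.
subprocess.run(
    ["python", "gen_model_answer.py", "--model-path", MODEL_PATH, "--model-id", MODEL_ID],
    cwd=LLM_JUDGE_DIR, check=True,
)

# 2) Let the judge model (GPT-4 by default) grade the answers.
subprocess.run(
    ["python", "gen_judgment.py", "--model-list", MODEL_ID, "--parallel", "2"],
    cwd=LLM_JUDGE_DIR, check=True,
)

# 3) Show the averaged scores, i.e. the numbers reported in the tables above.
subprocess.run(
    ["python", "show_result.py", "--model-list", MODEL_ID],
    cwd=LLM_JUDGE_DIR, check=True,
)
```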