Update README.md
README.md CHANGED
@@ -192,8 +192,9 @@ Average Score Comparison between OpenHermes-1 Llama-2 13B and OpenHermes-2 Mistr
 
 **HumanEval:**
 On code tasks, I first set out to make a hermes-2 coder, but found that it can also bring generalist improvements to the model, so I settled for slightly less code capability in exchange for maximum generalist capability. That said, code capabilities saw a decent jump alongside the overall capabilities of the model:
-Glaive performed HumanEval testing on Hermes-2.5 and found a score of
-
+Glaive performed HumanEval testing on Hermes-2.5 and found a score of:
+
+**50.7% @ Pass1**
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6317aade83d8d2fd903192d9/IeeZnGmEyK73ejq0fKEms.png)
 
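For context on the metric: Pass@1 on HumanEval is the fraction of problems for which a sampled completion passes the unit tests, and it is typically computed with the unbiased pass@k estimator from the HumanEval paper. A minimal sketch (the function name `pass_at_k` is illustrative, not from this repo):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of k
    samples passes, given n total generations of which c are correct.

    pass@k = 1 - C(n - c, k) / C(n, k)
    """
    if n - c < k:
        # Fewer than k incorrect samples exist, so any k-subset
        # must contain a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With n samples and k = 1 this reduces to the plain success rate c / n:
print(pass_at_k(10, 5, 1))  # 0.5
```

With one generation per problem, Pass@1 is simply the fraction of problems whose completion passes all tests, averaged over the benchmark.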