leaderboard-pr-bot committed
Commit 3e08bcc (1 parent: f209799)

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions
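
Once merged, the metadata this PR adds can be read back programmatically. The sketch below is illustrative only and is not part of the PR: it assumes the repo id `PulsarAI/MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp` taken from the leaderboard query URLs in the diff, downloads the model card with `huggingface_hub`, and parses its YAML front matter with `pyyaml`. Each printed line corresponds to one `metrics` entry in the diff below.

```python
# Illustrative sketch (not part of this PR): read the model-index metadata back
# from the Hub. The repo id is assumed from the leaderboard query URLs in the diff.
import yaml
from huggingface_hub import hf_hub_download

repo_id = "PulsarAI/MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp"  # assumed
readme_path = hf_hub_download(repo_id=repo_id, filename="README.md")

with open(readme_path, encoding="utf-8") as f:
    text = f.read()

# The card metadata sits between the leading pair of '---' markers.
front_matter = text.split("---", 2)[1]
meta = yaml.safe_load(front_matter)

# Walk the model-index structure that this PR edits and print each metric.
for result in meta["model-index"][0]["results"]:
    dataset = result.get("dataset", {}).get("name", "unknown dataset")
    for metric in result.get("metrics", []):
        print(f'{dataset}: {metric["type"]} = {metric["value"]}')
```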

Files changed (1):
  1. README.md (+22 -15)
README.md CHANGED

@@ -1,5 +1,7 @@
 ---
 license: apache-2.0
+tags:
+- merge
 model-index:
 - name: MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp
   results:
@@ -21,8 +23,7 @@ model-index:
       value: 64.59
       name: normalized accuracy
     source:
-      url: >-
-        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp
       name: Open LLM Leaderboard
   - task:
       type: text-generation
@@ -41,8 +42,7 @@ model-index:
       value: 85.39
       name: normalized accuracy
     source:
-      url: >-
-        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp
       name: Open LLM Leaderboard
   - task:
       type: text-generation
@@ -62,8 +62,7 @@ model-index:
       value: 64.27
       name: accuracy
     source:
-      url: >-
-        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp
       name: Open LLM Leaderboard
   - task:
       type: text-generation
@@ -81,8 +80,7 @@ model-index:
     - type: mc2
       value: 55.14
     source:
-      url: >-
-        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp
       name: Open LLM Leaderboard
   - task:
       type: text-generation
@@ -102,8 +100,7 @@ model-index:
       value: 79.64
       name: accuracy
     source:
-      url: >-
-        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp
       name: Open LLM Leaderboard
   - task:
       type: text-generation
@@ -123,11 +120,8 @@ model-index:
       value: 71.65
       name: accuracy
     source:
-      url: >-
-        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp
       name: Open LLM Leaderboard
-tags:
-- merge
 ---
 
 # MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp
@@ -155,4 +149,17 @@ parameters:
     - value: 0.5 # fallback for rest of tensors
 dtype: bfloat16
 
-```
+```
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_PulsarAI__MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp)
+
+| Metric                          |Value|
+|---------------------------------|----:|
+|Avg.                             |70.11|
+|AI2 Reasoning Challenge (25-Shot)|64.59|
+|HellaSwag (10-Shot)              |85.39|
+|MMLU (5-Shot)                    |64.27|
+|TruthfulQA (0-shot)              |55.14|
+|Winogrande (5-shot)              |79.64|
+|GSM8k (5-shot)                   |71.65|
+
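
As a quick, illustrative sanity check (not part of the PR), the `Avg.` row added in the table above matches the mean of the six per-task scores:

```python
# Illustrative check: the Avg. row equals the mean of the six task scores above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 64.59,
    "HellaSwag (10-Shot)": 85.39,
    "MMLU (5-Shot)": 64.27,
    "TruthfulQA (0-shot)": 55.14,
    "Winogrande (5-shot)": 79.64,
    "GSM8k (5-shot)": 71.65,
}
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # -> 70.11
```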