MaziyarPanahi committed on
Commit
5824620
1 Parent(s): 57d013f

Update README.md

Files changed (1)
  1. README.md +12 -14
README.md CHANGED
@@ -139,9 +139,19 @@ This model is suitable for a wide range of applications, including but not limit
139
  coming soon.
140
 
141
 
142
- # 🏆 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
143
 
144
- coming soon.
145
 
146
  # Prompt Template
147
 
@@ -186,16 +196,4 @@ model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/calme-2.2-qwen2.5-72
186
  # Ethical Considerations
187
 
188
  As with any large language model, users should be aware of potential biases and limitations. We recommend implementing appropriate safeguards and human oversight when deploying this model in production environments.
189
- # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
190
- Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__calme-2.2-qwen2.5-72b)
191
-
192
- | Metric |Value|
193
- |-------------------|----:|
194
- |Avg. |38.01|
195
- |IFEval (0-Shot) |84.77|
196
- |BBH (3-Shot) |61.80|
197
- |MATH Lvl 5 (4-Shot)| 3.63|
198
- |GPQA (0-shot) |14.54|
199
- |MuSR (0-shot) |12.02|
200
- |MMLU-PRO (5-shot) |51.31|
201
 
 
139
  coming soon.
140
 
141
 
142
+ # 🏆 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
143
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__calme-2.2-qwen2.5-72b)
144
+
145
+ | Metric |Value|
146
+ |-------------------|----:|
147
+ |Avg. |38.01|
148
+ |IFEval (0-Shot) |84.77|
149
+ |BBH (3-Shot) |61.80|
150
+ |MATH Lvl 5 (4-Shot)| 3.63|
151
+ |GPQA (0-shot) |14.54|
152
+ |MuSR (0-shot) |12.02|
153
+ |MMLU-PRO (5-shot) |51.31|
154
 
 
155
 
156
  # Prompt Template
157
 
 
196
  # Ethical Considerations
197
 
198
  As with any large language model, users should be aware of potential biases and limitations. We recommend implementing appropriate safeguards and human oversight when deploying this model in production environments.
199