Adding Evaluation Results

#1
Files changed (1)
  1. README.md +110 -2
README.md CHANGED
@@ -1,8 +1,103 @@
  ---
- license: mit
  language:
  - de
  - en
  ---
  ![SauerkrautLM-Phi-3-medium](https://vago-solutions.ai/wp-content/uploads/2024/06/SauerkrautLM-phi3-medium.png "SauerkrautLM-Phi-3-medium")
  ## VAGO solutions SauerkrautLM-Phi-3-medium
@@ -89,4 +184,17 @@ If you are interested in customized LLMs for business applications, please get i
  We are also keenly seeking support and investment for our startup, VAGO solutions where we continuously advance the development of robust language models designed to address a diverse range of purposes and requirements. If the prospect of collaboratively navigating future challenges excites you, we warmly invite you to reach out to us at [VAGO solutions](https://vago-solutions.ai/#Kontakt)

  ## Acknowledgement
- Many thanks to [unsloth](https://huggingface.co/unsloth/) and [Microsoft](https://huggingface.co/microsoft) for providing such valuable model to the Open-Source community.
  ---
  language:
  - de
  - en
+ license: mit
+ model-index:
+ - name: SauerkrautLM-Phi-3-medium
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: IFEval (0-Shot)
+       type: HuggingFaceH4/ifeval
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: inst_level_strict_acc and prompt_level_strict_acc
+       value: 44.09
+       name: strict accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Phi-3-medium
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BBH (3-Shot)
+       type: BBH
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc_norm
+       value: 49.63
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Phi-3-medium
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MATH Lvl 5 (4-Shot)
+       type: hendrycks/competition_math
+       args:
+         num_few_shot: 4
+     metrics:
+     - type: exact_match
+       value: 14.12
+       name: exact match
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Phi-3-medium
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GPQA (0-shot)
+       type: Idavidrein/gpqa
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 11.3
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Phi-3-medium
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MuSR (0-shot)
+       type: TAUR-Lab/MuSR
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 20.7
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Phi-3-medium
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU-PRO (5-shot)
+       type: TIGER-Lab/MMLU-Pro
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 40.72
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Phi-3-medium
+       name: Open LLM Leaderboard
  ---
  ![SauerkrautLM-Phi-3-medium](https://vago-solutions.ai/wp-content/uploads/2024/06/SauerkrautLM-phi3-medium.png "SauerkrautLM-Phi-3-medium")
  ## VAGO solutions SauerkrautLM-Phi-3-medium
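
Review note: every entry added under `model-index` has the same four-part shape (`task`, `dataset`, `metrics`, `source`). The sketch below mirrors that nesting as plain Python data, which may help when checking the indentation of the YAML. The helper `make_result` and the constant `LEADERBOARD_URL` are hypothetical names introduced only for this illustration; they are not part of the PR or of any library. Only two of the six entries are reproduced, and the MMLU-PRO entry additionally carries `config` and `split` keys that this sketch omits.

```python
# Illustrative only: mirrors the nesting of the model-index metadata in this diff.
# `make_result` and `LEADERBOARD_URL` are hypothetical names, not an existing API.

LEADERBOARD_URL = (
    "https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard"
    "?query=VAGOsolutions/SauerkrautLM-Phi-3-medium"
)


def make_result(dataset_name, dataset_type, num_few_shot,
                metric_type, value, metric_name):
    """Build one `results` entry with the task/dataset/metrics/source layout."""
    return {
        "task": {"type": "text-generation", "name": "Text Generation"},
        "dataset": {
            "name": dataset_name,
            "type": dataset_type,
            "args": {"num_few_shot": num_few_shot},
        },
        # `metrics` is a list, which is why the YAML uses "- type: ..." here.
        "metrics": [
            {"type": metric_type, "value": value, "name": metric_name},
        ],
        "source": {"url": LEADERBOARD_URL, "name": "Open LLM Leaderboard"},
    }


# `model-index` is itself a list with one item per model name.
model_index = [
    {
        "name": "SauerkrautLM-Phi-3-medium",
        "results": [
            make_result(
                "IFEval (0-Shot)", "HuggingFaceH4/ifeval", 0,
                "inst_level_strict_acc and prompt_level_strict_acc",
                44.09, "strict accuracy",
            ),
            make_result(
                "BBH (3-Shot)", "BBH", 3,
                "acc_norm", 49.63, "normalized accuracy",
            ),
        ],
    }
]

for result in model_index[0]["results"]:
    metric = result["metrics"][0]
    print(result["dataset"]["name"], metric["type"], metric["value"])
```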
 
  We are also keenly seeking support and investment for our startup, VAGO solutions where we continuously advance the development of robust language models designed to address a diverse range of purposes and requirements. If the prospect of collaboratively navigating future challenges excites you, we warmly invite you to reach out to us at [VAGO solutions](https://vago-solutions.ai/#Kontakt)

  ## Acknowledgement
+ Many thanks to [unsloth](https://huggingface.co/unsloth/) and [Microsoft](https://huggingface.co/microsoft) for providing such valuable model to the Open-Source community.
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_VAGOsolutions__SauerkrautLM-Phi-3-medium)
+
+ |      Metric       |Value|
+ |-------------------|----:|
+ |Avg.               |30.09|
+ |IFEval (0-Shot)    |44.09|
+ |BBH (3-Shot)       |49.63|
+ |MATH Lvl 5 (4-Shot)|14.12|
+ |GPQA (0-shot)      |11.30|
+ |MuSR (0-shot)      |20.70|
+ |MMLU-PRO (5-shot)  |40.72|
+
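
Review note: the `Avg.` row appears to be the unweighted mean of the six benchmark values in the table. The short check below is a sketch under that assumption; the diff itself does not state how the average is computed.

```python
# Sanity check, assuming "Avg." is the plain arithmetic mean of the six scores.
scores = {
    "IFEval (0-Shot)": 44.09,
    "BBH (3-Shot)": 49.63,
    "MATH Lvl 5 (4-Shot)": 14.12,
    "GPQA (0-shot)": 11.30,
    "MuSR (0-shot)": 20.70,
    "MMLU-PRO (5-shot)": 40.72,
}

average = sum(scores.values()) / len(scores)
print(f"Avg. {average:.2f}")  # prints "Avg. 30.09", matching the table
```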