Adding Evaluation Results

#1
by kwaabot - opened
Files changed (1)
  1. README.md +117 -8
README.md CHANGED
@@ -2,15 +2,110 @@
  license: llama3.1
  library_name: transformers
  tags:
- - moe
- - frankenmoe
- - merge
- - mergekit
  base_model:
- - Joseph717171/Llama-3.1-SuperNova-8B-Lite_TIES_with_Base
- - ArliAI/Llama-3.1-8B-ArliAI-RPMax-v1.2
- - rombodawg/rombos_Replete-Coder-Instruct-8b-Merged
- - 3rd-Degree-Burn/Llama-3.1-8B-Squareroot-v0
  ---

  # L3.1-Moe-4x8B-v0.2
@@ -71,3 +166,17 @@ experts:
  - "solve"
  - "count"
  ```
  license: llama3.1
  library_name: transformers
  tags:
+ - moe
+ - frankenmoe
+ - merge
+ - mergekit
  base_model:
+ - Joseph717171/Llama-3.1-SuperNova-8B-Lite_TIES_with_Base
+ - ArliAI/Llama-3.1-8B-ArliAI-RPMax-v1.2
+ - rombodawg/rombos_Replete-Coder-Instruct-8b-Merged
+ - 3rd-Degree-Burn/Llama-3.1-8B-Squareroot-v0
+ model-index:
+ - name: L3.1-Moe-4x8B-v0.2
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: IFEval (0-Shot)
+       type: HuggingFaceH4/ifeval
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: inst_level_strict_acc and prompt_level_strict_acc
+       value: 54.07
+       name: strict accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=moeru-ai/L3.1-Moe-4x8B-v0.2
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BBH (3-Shot)
+       type: BBH
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc_norm
+       value: 21.34
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=moeru-ai/L3.1-Moe-4x8B-v0.2
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MATH Lvl 5 (4-Shot)
+       type: hendrycks/competition_math
+       args:
+         num_few_shot: 4
+     metrics:
+     - type: exact_match
+       value: 5.29
+       name: exact match
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=moeru-ai/L3.1-Moe-4x8B-v0.2
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GPQA (0-shot)
+       type: Idavidrein/gpqa
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 2.24
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=moeru-ai/L3.1-Moe-4x8B-v0.2
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MuSR (0-shot)
+       type: TAUR-Lab/MuSR
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 2.29
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=moeru-ai/L3.1-Moe-4x8B-v0.2
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU-PRO (5-shot)
+       type: TIGER-Lab/MMLU-Pro
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 19.58
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=moeru-ai/L3.1-Moe-4x8B-v0.2
+       name: Open LLM Leaderboard
  ---
110
 
111
  # L3.1-Moe-4x8B-v0.2
 
  - "solve"
  - "count"
  ```
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_moeru-ai__L3.1-Moe-4x8B-v0.2)
+
+ | Metric              | Value |
+ |---------------------|------:|
+ | Avg.                | 17.47 |
+ | IFEval (0-Shot)     | 54.07 |
+ | BBH (3-Shot)        | 21.34 |
+ | MATH Lvl 5 (4-Shot) |  5.29 |
+ | GPQA (0-shot)       |  2.24 |
+ | MuSR (0-shot)       |  2.29 |
+ | MMLU-PRO (5-shot)   | 19.58 |
+
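
Note on the `Avg.` row: the PR does not say how it is computed. A minimal check, assuming it is the unweighted mean of the six benchmark scores in the table, reproduces the reported 17.47. The `scores` dictionary and the `mean_score` helper below are illustrative names, not part of the leaderboard tooling.

```python
# Scores copied from the results table added to README.md in this PR.
scores = {
    "IFEval (0-Shot)": 54.07,
    "BBH (3-Shot)": 21.34,
    "MATH Lvl 5 (4-Shot)": 5.29,
    "GPQA (0-shot)": 2.24,
    "MuSR (0-shot)": 2.29,
    "MMLU-PRO (5-shot)": 19.58,
}


def mean_score(values):
    """Unweighted arithmetic mean, rounded to two decimals (assumed method)."""
    values = list(values)
    return round(sum(values) / len(values), 2)


average = mean_score(scores.values())
print(f"Avg. {average:.2f}")  # prints "Avg. 17.47"
```

The six scores sum to 104.81, and 104.81 / 6 ≈ 17.468, which rounds to the 17.47 shown in the table.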