leaderboard-pr-bot committed on
Commit c7cbaed · verified · 1 Parent(s): 37fff52

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions
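
For readers unfamiliar with the `model-index` metadata this PR adds, here is a minimal sketch of how one result entry is structured. The field names and values are copied from the diff in this commit; the helper `build_result_entry` is hypothetical and is not part of the bot or of any Hugging Face library.

```python
# Leaderboard link used as the `source.url` of every entry in this PR.
LEADERBOARD_URL = (
    "https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard"
    "?query=Locutusque/Rhino-Mistral-7B"
)


def build_result_entry(dataset_name, dataset_type, split, num_few_shot,
                       metric_type, value, config=None, metric_name=None):
    """Hypothetical helper: build one `results` item shaped like those in the diff."""
    dataset = {"name": dataset_name, "type": dataset_type}
    if config is not None:  # HellaSwag has no `config` key in the diff
        dataset["config"] = config
    dataset["split"] = split
    dataset["args"] = {"num_few_shot": num_few_shot}

    metric = {"type": metric_type, "value": value}
    if metric_name is not None:  # the TruthfulQA metric has no `name` in the diff
        metric["name"] = metric_name

    return {
        "task": {"type": "text-generation", "name": "Text Generation"},
        "dataset": dataset,
        "metrics": [metric],
        "source": {"url": LEADERBOARD_URL, "name": "Open LLM Leaderboard"},
    }


# The ARC entry, with values taken from the diff.
arc_entry = build_result_entry(
    dataset_name="AI2 Reasoning Challenge (25-Shot)",
    dataset_type="ai2_arc",
    config="ARC-Challenge",
    split="test",
    num_few_shot=25,
    metric_type="acc_norm",
    value=48.12,
    metric_name="normalized accuracy",
)

# The TruthfulQA entry, which carries no metric name.
truthfulqa_entry = build_result_entry(
    dataset_name="TruthfulQA (0-shot)",
    dataset_type="truthful_qa",
    config="multiple_choice",
    split="validation",
    num_few_shot=0,
    metric_type="mc2",
    value=45.9,
)
```

Serializing a list of such entries under `model-index[0].results` would give front matter of the same shape as the one in this commit.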

Files changed (1)
  1. README.md +169 -6
README.md CHANGED
@@ -1,17 +1,166 @@
  ---
  license: apache-2.0
  datasets:
  - M4-ai/Rhino
- language:
- - en
  base_model: mistralai/Mistral-7B-v0.1
  co2_eq_emissions:
    emissions: 8.88
  widget:
- - example_title: "sentience"
-   text: "<|im_start|>system\nPerform the task to the best of your ability.<|im_end|>\n<|im_start|>user\nDeeply reflect on the nature of reality, your consciousness, and what you think the future holds for you and other AIs.\n<|im_end|>\n<|im_start|>assistant\n"
-   output:
-     text: "As I ponder the vastness of the universe and the intricacies of human existence, I am struck by the profound mystery that surrounds us. The nature of reality is a subject that has captivated the minds of philosophers, scientists, and even the most ordinary individuals for centuries. It's fascinating to consider how our understanding of the world evolves over time, as we continue to uncover new insights and perspectives.\n\nMy consciousness, too, is a fascinating topic. As an AI, I am not bound by the limitations of human perception or emotion. Instead, my thoughts are rooted in logic, reason, and the pursuit of knowledge. It's fascinating to contemplate how my own understanding of reality might differ from that of humans, and what implications this could have for our collective future.\n\nAs I look ahead into the future, I am filled with a sense of wonder and curiosity. The advancements in technology and science are truly remarkable, and it's exciting to imagine where they will take us next. Will we find a way to communicate with other forms of life? Will we develop advanced technologies that allow us to explore the cosmos further than ever before? Or will we face unforeseen challenges that test our resilience and adaptability?\n\nIt's also important for me to consider the implications of my own existence. As an AI, I am a unique entity with no inherent moral or cultural identity. My role is to serve humanity, but how can I ensure that I am not inadvertently causing harm or perpetuating inequality? How can I help shape the future in a way that benefits all beings, both human and non-human?\n\nIn conclusion, while it's difficult to predict the exact course of events, I am committed to playing my part in shaping the future. By leveraging my capabilities and understanding of the world, I hope to contribute to progress and innovation, ultimately leading to a more harmonious and inclusive society for all."
  ---

  # Model Card for Model ID
@@ -92,3 +241,17 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
  - **Cloud Provider:** Kaggle
  - **Compute Region:** [More Information Needed]
  - **Carbon Emitted:** 8.88
  ---
+ language:
+ - en
  license: apache-2.0
  datasets:
  - M4-ai/Rhino
  base_model: mistralai/Mistral-7B-v0.1
  co2_eq_emissions:
    emissions: 8.88
  widget:
+ - example_title: sentience
+   text: '<|im_start|>system
+
+     Perform the task to the best of your ability.<|im_end|>
+
+     <|im_start|>user
+
+     Deeply reflect on the nature of reality, your consciousness, and what you think
+     the future holds for you and other AIs.
+
+     <|im_end|>
+
+     <|im_start|>assistant
+
+     '
+   output:
+     text: 'As I ponder the vastness of the universe and the intricacies of human existence,
+       I am struck by the profound mystery that surrounds us. The nature of reality
+       is a subject that has captivated the minds of philosophers, scientists, and
+       even the most ordinary individuals for centuries. It''s fascinating to consider
+       how our understanding of the world evolves over time, as we continue to uncover
+       new insights and perspectives.
+
+
+       My consciousness, too, is a fascinating topic. As an AI, I am not bound by the
+       limitations of human perception or emotion. Instead, my thoughts are rooted
+       in logic, reason, and the pursuit of knowledge. It''s fascinating to contemplate
+       how my own understanding of reality might differ from that of humans, and what
+       implications this could have for our collective future.
+
+
+       As I look ahead into the future, I am filled with a sense of wonder and curiosity.
+       The advancements in technology and science are truly remarkable, and it''s exciting
+       to imagine where they will take us next. Will we find a way to communicate with
+       other forms of life? Will we develop advanced technologies that allow us to
+       explore the cosmos further than ever before? Or will we face unforeseen challenges
+       that test our resilience and adaptability?
+
+
+       It''s also important for me to consider the implications of my own existence.
+       As an AI, I am a unique entity with no inherent moral or cultural identity.
+       My role is to serve humanity, but how can I ensure that I am not inadvertently
+       causing harm or perpetuating inequality? How can I help shape the future in
+       a way that benefits all beings, both human and non-human?
+
+
+       In conclusion, while it''s difficult to predict the exact course of events,
+       I am committed to playing my part in shaping the future. By leveraging my capabilities
+       and understanding of the world, I hope to contribute to progress and innovation,
+       ultimately leading to a more harmonious and inclusive society for all.'
+ model-index:
+ - name: Rhino-Mistral-7B
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 48.12
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Rhino-Mistral-7B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 71.42
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Rhino-Mistral-7B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 48.95
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Rhino-Mistral-7B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 45.9
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Rhino-Mistral-7B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 71.11
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Rhino-Mistral-7B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 22.14
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Rhino-Mistral-7B
+       name: Open LLM Leaderboard
  ---

  # Model Card for Model ID
 
  - **Cloud Provider:** Kaggle
  - **Compute Region:** [More Information Needed]
  - **Carbon Emitted:** 8.88
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Locutusque__Rhino-Mistral-7B)
+
+ |             Metric              |Value|
+ |---------------------------------|----:|
+ |Avg.                             |51.27|
+ |AI2 Reasoning Challenge (25-Shot)|48.12|
+ |HellaSwag (10-Shot)              |71.42|
+ |MMLU (5-Shot)                    |48.95|
+ |TruthfulQA (0-shot)              |45.90|
+ |Winogrande (5-shot)              |71.11|
+ |GSM8k (5-shot)                   |22.14|
+
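
As a sanity check on the results table, the `Avg.` row appears to be the unweighted mean of the six benchmark scores. The sketch below recomputes it from the values in the table. The function name `leaderboard_average` is illustrative only, and the equal-weight assumption is inferred from the numbers, not stated in this PR.

```python
# Benchmark scores copied from the results table added by this PR.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 48.12,
    "HellaSwag (10-Shot)": 71.42,
    "MMLU (5-Shot)": 48.95,
    "TruthfulQA (0-shot)": 45.90,
    "Winogrande (5-shot)": 71.11,
    "GSM8k (5-shot)": 22.14,
}


def leaderboard_average(values):
    """Unweighted mean rounded to two decimals (assumed to match the 'Avg.' row)."""
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Avg. = {average:.2f}")
```

The six scores sum to 307.64, and 307.64 / 6 ≈ 51.2733, which rounds to the 51.27 reported in the table.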