Adding the Open Portuguese LLM Leaderboard Evaluation Results

#9
Files changed (1)
  1. README.md +19 -1
README.md CHANGED
@@ -13,9 +13,9 @@ tags:
  - preference
  - ultrafeedback
  - moe
+ base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
  datasets:
  - argilla/ultrafeedback-binarized-preferences-cleaned
- base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
  pipeline_tag: text-generation
  model-index:
  - name: notux-8x7b-v1
@@ -108,3 +108,21 @@ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-le
  |Winogrande (5-shot) |81.61|
  |GSM8k (5-shot) |61.64|

+
+ # Open Portuguese LLM Leaderboard Evaluation Results
+
+ Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/argilla/notux-8x7b-v1) and on the [πŸš€ Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+
+ |          Metric          |  Value |
+ |--------------------------|--------|
+ |Average                   |**73.1**|
+ |ENEM Challenge (No Images)|   70.96|
+ |BLUEX (No Images)         |   60.22|
+ |OAB Exams                 |   49.52|
+ |Assin2 RTE                |   92.66|
+ |Assin2 STS                |   82.40|
+ |FaQuAD NLI                |   79.85|
+ |HateBR Binary             |   77.91|
+ |PT Hate Speech Binary     |   73.30|
+ |tweetSentBR               |   71.08|
+
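
For convenience, the raw per-task result files linked above can also be fetched locally. The sketch below is an unofficial, minimal example (not part of this PR): it assumes the results for this model are stored as JSON files under `argilla/notux-8x7b-v1` in the `eduagarcia-temp/llm_pt_leaderboard_raw_results` dataset repository, and it only uses `huggingface_hub.snapshot_download` to pull that folder.

```python
# Minimal sketch (not part of this PR): download the raw Open PT LLM Leaderboard
# result files for argilla/notux-8x7b-v1 and preview their contents.
# Assumption: the results live as JSON files under argilla/notux-8x7b-v1
# in the eduagarcia-temp/llm_pt_leaderboard_raw_results dataset repo.
import json
from pathlib import Path

from huggingface_hub import snapshot_download

# Fetch only this model's folder from the raw-results dataset repository.
local_dir = snapshot_download(
    repo_id="eduagarcia-temp/llm_pt_leaderboard_raw_results",
    repo_type="dataset",
    allow_patterns=["argilla/notux-8x7b-v1/*"],
)

# Print a short preview of each downloaded JSON result file.
for path in sorted(Path(local_dir, "argilla/notux-8x7b-v1").rglob("*.json")):
    print(path.name)
    print(json.dumps(json.loads(path.read_text()), indent=2)[:500])
```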