Commit 41d1f5c
1 Parent(s): 7022940

Adding the Open Portuguese LLM Leaderboard Evaluation Results (#1)

- Adding the Open Portuguese LLM Leaderboard Evaluation Results (a06ba8fb77222a3d4102fe93daeb0d6118392694)
- Fixing some errors of the leaderboard evaluation results in the ModelCard yaml (b05e57809b567538709c97cff369e646f6fbb617)


Co-authored-by: Open PT LLM Leaderboard PR Bot <[email protected]>

Files changed (1)
  1. README.md +172 -9
README.md CHANGED
@@ -1,21 +1,21 @@
1
  ---
 
 
2
  license: apache-2.0
 
 
 
3
  datasets:
4
  - nicholasKluge/Pt-Corpus-Instruct
5
- language:
6
- - pt
7
  metrics:
8
  - perplexity
9
- library_name: transformers
10
  pipeline_tag: text-generation
11
- tags:
12
- - text-generation-inference
13
  widget:
14
- - text: "A PUCRS é uma universidade "
15
  example_title: Exemplo
16
- - text: "A muitos anos atrás, em uma galáxia muito distante, vivia uma raça de"
17
  example_title: Exemplo
18
- - text: "Em meio a um escândalo, a frente parlamentar pediu ao Senador Silva para"
19
  example_title: Exemplo
20
  inference:
21
  parameters:
@@ -30,6 +30,153 @@ co2_eq_emissions:
30
  training_type: pre-training
31
  geographical_location: Germany
32
  hardware_used: NVIDIA A100-SXM4-40GB
33
  ---
34
  # TeenyTinyLlama-460m
35
 
@@ -224,4 +371,20 @@ This repository was built as part of the RAIES ([Rede de Inteligência Artificia
224
 
225
  ## License
226
 
227
- TeenyTinyLlama-460m is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.
1
  ---
2
+ language:
3
+ - pt
4
  license: apache-2.0
5
+ library_name: transformers
6
+ tags:
7
+ - text-generation-inference
8
  datasets:
9
  - nicholasKluge/Pt-Corpus-Instruct
 
 
10
  metrics:
11
  - perplexity
 
12
  pipeline_tag: text-generation
 
 
13
  widget:
14
+ - text: 'A PUCRS é uma universidade '
15
  example_title: Exemplo
16
+ - text: A muitos anos atrás, em uma galáxia muito distante, vivia uma raça de
17
  example_title: Exemplo
18
+ - text: Em meio a um escândalo, a frente parlamentar pediu ao Senador Silva para
19
  example_title: Exemplo
20
  inference:
21
  parameters:
 
30
  training_type: pre-training
31
  geographical_location: Germany
32
  hardware_used: NVIDIA A100-SXM4-40GB
33
+ model-index:
34
+ - name: TeenyTinyLlama-460m
35
+ results:
36
+ - task:
37
+ type: text-generation
38
+ name: Text Generation
39
+ dataset:
40
+ name: ENEM Challenge (No Images)
41
+ type: eduagarcia/enem_challenge
42
+ split: train
43
+ args:
44
+ num_few_shot: 3
45
+ metrics:
46
+ - type: acc
47
+ value: 20.15
48
+ name: accuracy
49
+ source:
50
+ url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicholasKluge/TeenyTinyLlama-460m
51
+ name: Open Portuguese LLM Leaderboard
52
+ - task:
53
+ type: text-generation
54
+ name: Text Generation
55
+ dataset:
56
+ name: BLUEX (No Images)
57
+ type: eduagarcia-temp/BLUEX_without_images
58
+ split: train
59
+ args:
60
+ num_few_shot: 3
61
+ metrics:
62
+ - type: acc
63
+ value: 25.73
64
+ name: accuracy
65
+ source:
66
+ url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicholasKluge/TeenyTinyLlama-460m
67
+ name: Open Portuguese LLM Leaderboard
68
+ - task:
69
+ type: text-generation
70
+ name: Text Generation
71
+ dataset:
72
+ name: OAB Exams
73
+ type: eduagarcia/oab_exams
74
+ split: train
75
+ args:
76
+ num_few_shot: 3
77
+ metrics:
78
+ - type: acc
79
+ value: 27.02
80
+ name: accuracy
81
+ source:
82
+ url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicholasKluge/TeenyTinyLlama-460m
83
+ name: Open Portuguese LLM Leaderboard
84
+ - task:
85
+ type: text-generation
86
+ name: Text Generation
87
+ dataset:
88
+ name: Assin2 RTE
89
+ type: assin2
90
+ split: test
91
+ args:
92
+ num_few_shot: 15
93
+ metrics:
94
+ - type: f1_macro
95
+ value: 53.61
96
+ name: f1-macro
97
+ source:
98
+ url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicholasKluge/TeenyTinyLlama-460m
99
+ name: Open Portuguese LLM Leaderboard
100
+ - task:
101
+ type: text-generation
102
+ name: Text Generation
103
+ dataset:
104
+ name: Assin2 STS
105
+ type: eduagarcia/portuguese_benchmark
106
+ split: test
107
+ args:
108
+ num_few_shot: 15
109
+ metrics:
110
+ - type: pearson
111
+ value: 13.0
112
+ name: pearson
113
+ source:
114
+ url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicholasKluge/TeenyTinyLlama-460m
115
+ name: Open Portuguese LLM Leaderboard
116
+ - task:
117
+ type: text-generation
118
+ name: Text Generation
119
+ dataset:
120
+ name: FaQuAD NLI
121
+ type: ruanchaves/faquad-nli
122
+ split: test
123
+ args:
124
+ num_few_shot: 15
125
+ metrics:
126
+ - type: f1_macro
127
+ value: 46.41
128
+ name: f1-macro
129
+ source:
130
+ url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicholasKluge/TeenyTinyLlama-460m
131
+ name: Open Portuguese LLM Leaderboard
132
+ - task:
133
+ type: text-generation
134
+ name: Text Generation
135
+ dataset:
136
+ name: HateBR Binary
137
+ type: ruanchaves/hatebr
138
+ split: test
139
+ args:
140
+ num_few_shot: 25
141
+ metrics:
142
+ - type: f1_macro
143
+ value: 33.59
144
+ name: f1-macro
145
+ source:
146
+ url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicholasKluge/TeenyTinyLlama-460m
147
+ name: Open Portuguese LLM Leaderboard
148
+ - task:
149
+ type: text-generation
150
+ name: Text Generation
151
+ dataset:
152
+ name: PT Hate Speech Binary
153
+ type: hate_speech_portuguese
154
+ split: test
155
+ args:
156
+ num_few_shot: 25
157
+ metrics:
158
+ - type: f1_macro
159
+ value: 22.99
160
+ name: f1-macro
161
+ source:
162
+ url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicholasKluge/TeenyTinyLlama-460m
163
+ name: Open Portuguese LLM Leaderboard
164
+ - task:
165
+ type: text-generation
166
+ name: Text Generation
167
+ dataset:
168
+ name: tweetSentBR
169
+ type: eduagarcia-temp/tweetsentbr
170
+ split: test
171
+ args:
172
+ num_few_shot: 25
173
+ metrics:
174
+ - type: f1_macro
175
+ value: 17.28
176
+ name: f1-macro
177
+ source:
178
+ url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicholasKluge/TeenyTinyLlama-460m
179
+ name: Open Portuguese LLM Leaderboard
180
  ---
181
  # TeenyTinyLlama-460m
182
 
 
371
 
372
  ## License
373
 
374
+ TeenyTinyLlama-460m is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.
375
+ # [Open Portuguese LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
376
+ Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/nicholasKluge/TeenyTinyLlama-460m)
377
+
378
+ | Metric | Value |
379
+ |--------------------------|---------|
380
+ |Average |**28.86**|
381
+ |ENEM Challenge (No Images)| 20.15|
382
+ |BLUEX (No Images) | 25.73|
383
+ |OAB Exams | 27.02|
384
+ |Assin2 RTE | 53.61|
385
+ |Assin2 STS | 13|
386
+ |FaQuAD NLI | 46.41|
387
+ |HateBR Binary | 33.59|
388
+ |PT Hate Speech Binary | 22.99|
389
+ |tweetSentBR | 17.28|
390
+
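
The "Average" row in the table added above is consistent with a plain arithmetic mean of the nine per-task scores. The commit does not state how the leaderboard aggregates results, so the unweighted mean below is an assumption, and the function name `leaderboard_average` is hypothetical. A minimal sketch for checking the reported value:

```python
# Scores copied from the results table added to README.md in this commit.
scores = {
    "ENEM Challenge (No Images)": 20.15,
    "BLUEX (No Images)": 25.73,
    "OAB Exams": 27.02,
    "Assin2 RTE": 53.61,
    "Assin2 STS": 13.0,
    "FaQuAD NLI": 46.41,
    "HateBR Binary": 33.59,
    "PT Hate Speech Binary": 22.99,
    "tweetSentBR": 17.28,
}


def leaderboard_average(values):
    """Unweighted mean rounded to two decimals.

    Assumption: the leaderboard averages all tasks with equal weight;
    this is not confirmed by the commit itself.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Average over {len(scores)} tasks: {average}")
```

Under that assumption the computed value matches the 28.86 shown in the table, which also serves as a quick consistency check between the `model-index` values in the YAML header and the Markdown table.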