AkimfromParis committed on
Commit 067f637 · verified · 1 Parent(s): 6e6645b

Update About page datasets in English

Files changed (1)
  1. src/about.py +154 -6
src/about.py CHANGED
@@ -79,19 +79,167 @@ INTRODUCTION_TEXT = """
 
  This leaderboard was built by [LLM-Jp](https://llm-jp.nii.ac.jp/en/), a cross-organizational project for the research and development of Japanese large language models (LLMs). Organized by the National Institute of Informatics, LLM-jp aims to develop open-source and strong Japanese LLMs, and as of this writing, more than 1,500 participants from academia and industry are working together for this purpose.
 
- When you submit a model on the "Submit here!" page, it is automatically evaluated on a set of benchmarks. Before you submit it, please describe in details your LLM in the Hugging Face 's model card.
+ When you submit a model on the "Submit here!" page, it is automatically evaluated on a set of benchmarks. This Open Japanese LLM Leaderboard assesses the language understanding of Japanese LLMs with more than 52 benchmarks, from classical to modern NLP tasks such as natural language inference, question answering, machine translation, code generation, mathematical reasoning, and summarization.
 
- This Open Japanese LLM Leaderboard assesses language understanding, of Japanese LLMs with more than 52 benchmarks from classical to modern NLP tasks such as Natural language inference, Question Answering, Machine Translation, Code Generation, Mathematical reasoning, Summarization, etc.
-
- For more information about benchmarks, and datasets, please consult the "About" page or directly to the evaluation tool, [llm-jp-eval](https://github.com/llm-jp/llm-jp-eval).
-
- For more details, please refer to the website of [LLM-Jp](https://llm-jp.nii.ac.jp/en/)
+ For more information about benchmarks and datasets, please consult the "About" page. For more details, please refer to the website of [LLM-Jp](https://llm-jp.nii.ac.jp/en/).
 
  """
 
  # Which evaluations are you running? how can people reproduce what you have?
  LLM_BENCHMARKS_TEXT = f"""
  ## How it works
+ 📈 We evaluate Japanese large language models on 52 key benchmarks using our evaluation tool [llm-jp-eval](https://github.com/llm-jp/llm-jp-eval), a unified framework for evaluating Japanese LLMs across a variety of tasks.
+
+ Benchmarks:
+ NLI (Natural Language Inference)
+ ---
+
+ ### Jamp
+
+ Source: https://github.com/tomo-ut/temporalNLI_dataset
+ License: CC BY-SA 4.0
+
+ ### JaNLI
+
+ Source: https://github.com/verypluming/JaNLI
+ License: CC BY-SA 4.0
+
+ ### JNLI
+
+ Source: https://github.com/yahoojapan/JGLUE
+ License: CC BY-SA 4.0
+
+ ### JSeM
+
+ Source: https://github.com/DaisukeBekki/JSeM
+ License: BSD 3-Clause
+
+ ### JSICK
+
+ Source: https://github.com/verypluming/JSICK
+ License: CC BY-SA 4.0
+
+ QA (Question Answering)
+
+ ### JEMHopQA
+
+ Source: https://github.com/aiishii/JEMHopQA
+ License: CC BY-SA 4.0
+
+ ### NIILC
+
+ Source: https://github.com/mynlp/niilc-qa
+ License: CC BY-SA 4.0
+
+ ### JAQKET (AIO)
+
+ Source: https://www.nlp.ecei.tohoku.ac.jp/projects/jaqket/
+ License: CC BY-SA 4.0 (Other licenses are required for corporate usage)
+
+ RC (Reading Comprehension)
+
+ ### JSQuAD
+
+ Source: https://github.com/yahoojapan/JGLUE
+ License: CC BY-SA 4.0
+
+ MC (Multiple Choice question answering)
+
+ ### JCommonsenseMorality
+
+ Source: https://github.com/Language-Media-Lab/commonsense-moral-ja
+ License: MIT License
+
+ ### JCommonsenseQA
+
+ Source: https://github.com/yahoojapan/JGLUE
+ License: CC BY-SA 4.0
+
+ ### Kyoto University Commonsense Inference dataset (KUCI)
+
+ Source: https://github.com/ku-nlp/KUCI
+ License: CC BY-SA 4.0
+
+ EL (Entity Linking)
+
+ ### chABSA
+
+ Source: https://github.com/chakki-works/chABSA-dataset
+ License: CC BY 4.0
+
+ FA (Fundamental Analysis)
+
+ ### Wikipedia Annotated Corpus
+
+ Source: https://github.com/ku-nlp/WikipediaAnnotatedCorpus
+ License: CC BY-SA 4.0
+ List of tasks:
+
+ Reading Prediction
+ Named-entity recognition (NER)
+ Dependency Parsing
+ Predicate-argument structure analysis (PAS)
+ Coreference Resolution
+
+ MR (Mathematical Reasoning)
+
+ ### MAWPS
+
+ Source: https://github.com/nlp-waseda/chain-of-thought-ja-dataset
+ License: Apache-2.0
+
+ ### MGSM
+
+ Source: https://huggingface.co/datasets/juletxara/mgsm
+ License: MIT License
+
+ MT (Machine Translation)
+
+ ### Asian Language Treebank (ALT) - Parallel Corpus
+
+ Source: https://www2.nict.go.jp/astrec-att/member/mutiyama/ALT/index.html
+ License: CC BY 4.0
+
+ ### WikiCorpus (Japanese-English Bilingual Corpus of Wikipedia's articles about the city of Kyoto)
+
+ Source: https://alaginrc.nict.go.jp/WikiCorpus/
+ License: CC BY-SA 3.0 deed
+
+ STS (Semantic Textual Similarity)
+
+ This task is supported by llm-jp-eval, but it is not included in the evaluation score average.
+
+ ### JSTS
+
+ Source: https://github.com/yahoojapan/JGLUE
+ License: CC BY-SA 4.0
+
+ HE (Human Examination)
+
+ ### MMLU
+
+ Source: https://github.com/hendrycks/test
+ License: MIT License
+
+ ### JMMLU
+
+ Source: https://github.com/nlp-waseda/JMMLU
+ License: CC BY-SA 4.0 (3 tasks under the CC BY-NC-ND 4.0 license)
+
+ CG (Code Generation)
+
+ ### MBPP
+
+ Source: https://huggingface.co/datasets/llm-jp/mbpp-ja
+ License: CC BY 4.0
+
+ SUM (Summarization)
+
+ ### XL-Sum
+
+ Source: https://github.com/csebuetnlp/xl-sum
+ License: CC BY-NC-SA 4.0 (Due to the non-commercial license, this dataset will not be used, unless you specifically agree to the license and terms of use)
+
 
  ## Reproducibility
  To reproduce our results, here are the commands you can run: