Fill-Mask · Transformers · Safetensors · Japanese · modernbert
speed committed
Commit 5f6b1b9 · verified · 1 Parent(s): 4abd878

Update README.md

Files changed (1)
  1. README.md +14 -15
README.md CHANGED
@@ -75,22 +75,21 @@ For reference, Warner et al.'s ModernBERT uses 1.72T tokens for stage 1, 250B to
 
 ## Evaluation
 
-For the sentence classification task evaluation, the datasets JSTS, JNLI, and JCoLA from [JGLUE](https://aclanthology.org/2022.lrec-1.317/) were used. For the evaluation of the Zero-shot Sentence Retrieval task, the [miracl/miracl](https://huggingface.co/datasets/miracl/miracl) dataset (ja subset) was used.
-
+JSTS, JNLI, and JCoLA from [JGLUE](https://aclanthology.org/2022.lrec-1.317/) were used.
 Evaluation code can be found at https://github.com/speed1313/bert-eval
 
-| Model | JSTS (pearson) | JNLI(acc) | JCoLA(acc) | Avg(JGLUE) | miracl(recall@10) | Avg |
-|------------------------------------------------|--------|--------|---------|--------------|----------|--------|
-| tohoku-nlp/bert-base-japanese-v3 | 0.9196 | 0.9117 | 0.8798 | 0.9037 | 0.74 | 0.8628 |
-| sbintuitions/modernbert-ja-130m | 0.9159 | 0.9273 | 0.8682 | 0.9038 | 0.5069 | 0.8046 |
-| sbintuitions/modernbert-ja-310m | 0.9317 | 0.9326 | 0.8832 | 0.9158 | 0.6569 | 0.8511 |
-| llm-jp-modernbert-base-v3-stage1-500k | 0.9247 | 0.917 | 0.8555 | 0.8991 | 0.5515 | 0.8122 |
-| llm-jp-modernbert-base-v3-stage2-200k | 0.9238 | 0.9108 | 0.8439 | 0.8928 | 0.5384 | 0.8042 |
-| llm-jp-modernbert-base-v4-ja-stage1-100k | 0.9213 | 0.9182 | 0.8613 | 0.9003 | N/A | N/A |
-| llm-jp-modernbert-base-v4-ja-stage1-300k | 0.9199 | 0.9187 | 0.852 | 0.8969 | N/A | N/A |
-| llm-jp-modernbert-base-v4-ja-stage1-400k | 0.9214 | 0.9203 | 0.8555 | 0.8991 | N/A | N/A |
-| llm-jp-modernbert-base-v4-ja-stage1-500k | 0.9212 | 0.9195 | 0.8451 | 0.8953 | 0.6025 | 0.8221 |
-| llm-jp-modernbert-base-v4-ja-stage2-200k | 0.9177 | 0.9133 | 0.8439 | 0.8916 | 0.5739 | 0.8122 |
-
+| Model | JSTS (pearson) | JNLI (accuracy) | JCoLA (accuracy) | Avg |
+|-------------------------------------------------------|--------|--------|---------|--------------|
+| tohoku-nlp/bert-base-japanese-v3 | 0.920 | 0.912 | 0.880 | 0.904 |
+| sbintuitions/modernbert-ja-130m | 0.916 | 0.927 | 0.868 | 0.904 |
+| sbintuitions/modernbert-ja-310m | 0.932 | 0.933 | 0.883 | 0.916 |
+| speed/llm-jp-modernbert-base-v3-ja-stage1-500k | 0.925 | 0.917 | 0.856 | 0.899 |
+| speed/llm-jp-modernbert-base-v3-ja-stage2-200k | 0.924 | 0.911 | 0.844 | 0.893 |
+| speed/llm-jp-modernbert-base-v4-ja-stage1-100k | 0.921 | 0.918 | 0.861 | 0.900 |
+| speed/llm-jp-modernbert-base-v4-ja-stage1-200k | 0.920 | 0.927 | 0.850 | 0.899 |
+| speed/llm-jp-modernbert-base-v4-ja-stage1-300k | 0.920 | 0.919 | 0.852 | 0.897 |
+| speed/llm-jp-modernbert-base-v4-ja-stage1-400k | 0.921 | 0.920 | 0.856 | 0.899 |
+| speed/llm-jp-modernbert-base-v4-ja-stage1-500k | 0.921 | 0.919 | 0.845 | 0.895 |
+| speed/llm-jp-modernbert-base-v4-ja-stage2-200k | 0.918 | 0.913 | 0.844 | 0.892 |
 
 
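
As a quick sanity check on the updated table, here is a minimal sketch of how the reported scores combine, assuming JSTS is scored with Pearson correlation, JNLI and JCoLA with plain accuracy, and the Avg column is the unweighted mean of the three task scores. The actual evaluation pipeline is the bert-eval repository linked in the README; the helper functions below are illustrative only.

```python
# Minimal sketch, not the bert-eval code: assumes JSTS uses Pearson correlation,
# JNLI/JCoLA use accuracy, and "Avg" is the unweighted mean of the three scores.
from scipy.stats import pearsonr


def jsts_pearson(predicted: list[float], gold: list[float]) -> float:
    """Pearson correlation for the JSTS sentence-similarity (regression) task."""
    r, _p_value = pearsonr(predicted, gold)
    return r


def classification_accuracy(predicted: list[int], gold: list[int]) -> float:
    """Plain accuracy for the JNLI and JCoLA classification tasks."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def jglue_average(jsts: float, jnli: float, jcola: float) -> float:
    """Unweighted mean reported in the "Avg" column of the new table."""
    return (jsts + jnli + jcola) / 3


# Reproducing the Avg for the tohoku-nlp/bert-base-japanese-v3 row:
print(round(jglue_average(0.920, 0.912, 0.880), 3))  # 0.904
```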