For reference, Warner et al.'s ModernBERT uses 1.72T tokens for stage 1, 250B to…
## Evaluation

JSTS, JNLI, and JCoLA from [JGLUE](https://aclanthology.org/2022.lrec-1.317/) were used.

Evaluation code can be found at https://github.com/speed1313/bert-eval

| Model | JSTS (Pearson) | JNLI (accuracy) | JCoLA (accuracy) | Avg |
|------------------------------------------------|-------|-------|-------|-------|
| tohoku-nlp/bert-base-japanese-v3 | 0.920 | 0.912 | 0.880 | 0.904 |
| sbintuitions/modernbert-ja-130m | 0.916 | 0.927 | 0.868 | 0.904 |
| sbintuitions/modernbert-ja-310m | 0.932 | 0.933 | 0.883 | 0.916 |
| speed/llm-jp-modernbert-base-v3-ja-stage1-500k | 0.925 | 0.917 | 0.856 | 0.899 |
| speed/llm-jp-modernbert-base-v3-ja-stage2-200k | 0.924 | 0.911 | 0.844 | 0.893 |
| speed/llm-jp-modernbert-base-v4-ja-stage1-100k | 0.921 | 0.918 | 0.861 | 0.900 |
| speed/llm-jp-modernbert-base-v4-ja-stage1-200k | 0.920 | 0.927 | 0.850 | 0.899 |
| speed/llm-jp-modernbert-base-v4-ja-stage1-300k | 0.920 | 0.919 | 0.852 | 0.897 |
| speed/llm-jp-modernbert-base-v4-ja-stage1-400k | 0.921 | 0.920 | 0.856 | 0.899 |
| speed/llm-jp-modernbert-base-v4-ja-stage1-500k | 0.921 | 0.919 | 0.845 | 0.895 |
| speed/llm-jp-modernbert-base-v4-ja-stage2-200k | 0.918 | 0.913 | 0.844 | 0.892 |
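Each cell in the table is one metric per task: Pearson correlation for JSTS (a sentence-similarity regression task) and accuracy for JNLI and JCoLA, with Avg being their mean. As a rough illustration of how these metrics are computed (a plain-Python sketch with toy predictions, not the actual bert-eval implementation):

```python
import math

def pearson(xs, ys):
    # Pearson correlation: used for JSTS (predicted vs. gold similarity scores)
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def accuracy(preds, golds):
    # Classification accuracy: used for JNLI and JCoLA
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

# Toy predictions for illustration only (not outputs of the models above)
jsts = pearson([4.2, 1.0, 3.1, 0.5], [4.0, 1.2, 3.3, 0.4])
jnli = accuracy(["entailment", "neutral"], ["entailment", "contradiction"])
jcola = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
avg = (jsts + jnli + jcola) / 3
```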