Update README.md
README.md
CHANGED
@@ -12,7 +12,7 @@ widget:
# What is this?
-
A
# How to use
@@ -47,6 +47,6 @@ Initially, only the word token embeddings are trained using 1.000.000 samples. F
# Evaluation
-
-
# What is this?
+
A pre-trained BERT model (base version, ~110M parameters) for Danish NLP. The model was not pre-trained from scratch; it was adapted from the English version with a tokenizer trained on Danish text.
# How to use
# Evaluation
+
The pre-trained model was evaluated using [ScandEval](https://github.com/ScandEval/ScandEval).
+
| task | dataset | summary |
|:-------------------------|:-------------|:------------------------------------------------|
| sentiment-classification | swerec | mcc = 63.02, mcc_se = 2.16, macro_f1 = 62.2, macro_f1_se = 3.61 |
| sentiment-classification | angry-tweets | mcc = 47.21, mcc_se = 0.53, macro_f1 = 64.21, macro_f1_se = 0.53 |
| sentiment-classification | norec | mcc = 42.23, mcc_se = 8.69, macro_f1 = 57.24, macro_f1_se = 7.67 |
| named-entity-recognition | suc3 | micro_f1 = 50.03, micro_f1_se = 4.16, micro_f1_no_misc = 53.55, micro_f1_no_misc_se = 4.57 |
| named-entity-recognition | dane | micro_f1 = 76.44, micro_f1_se = 1.36, micro_f1_no_misc = 80.61, micro_f1_no_misc_se = 1.11 |
| named-entity-recognition | norne-nb | micro_f1 = 68.38, micro_f1_se = 1.72, micro_f1_no_misc = 73.08, micro_f1_no_misc_se = 1.66 |
| named-entity-recognition | norne-nn | micro_f1 = 60.45, micro_f1_se = 1.71, micro_f1_no_misc = 64.39, micro_f1_no_misc_se = 1.8 |
| linguistic-acceptability | scala-sv | mcc = 5.01, mcc_se = 5.41, macro_f1 = 49.46, macro_f1_se = 3.67 |
| linguistic-acceptability | scala-da | mcc = 54.74, mcc_se = 12.22, macro_f1 = 76.25, macro_f1_se = 6.09 |
| linguistic-acceptability | scala-nb | mcc = 19.18, mcc_se = 14.01, macro_f1 = 55.3, macro_f1_se = 8.85 |
| linguistic-acceptability | scala-nn | mcc = 5.72, mcc_se = 5.91, macro_f1 = 49.56, macro_f1_se = 3.73 |
| question-answering | scandiqa-da | em = 26.36, em_se = 1.17, f1 = 32.41, f1_se = 1.1 |
| question-answering | scandiqa-no | em = 26.14, em_se = 1.59, f1 = 32.02, f1_se = 1.59 |
| question-answering | scandiqa-sv | em = 26.38, em_se = 1.1, f1 = 32.33, f1_se = 1.05 |
| speed | speed | speed = 4.55, speed_se = 0.0 |
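Each summary cell in the evaluation results packs metric values and their standard errors (the `_se` entries) into one `name = value` string. The following is a minimal sketch of how such a cell could be parsed and turned into a rough interval. The helper names `parse_summary` and `approx_interval` are hypothetical, and the `value ± 1.96 * se` normal approximation is an assumption made here for illustration, not something stated by the README or by ScandEval.

```python
def parse_summary(summary: str) -> dict[str, float]:
    """Turn 'mcc = 63.02, mcc_se = 2.16' into {'mcc': 63.02, 'mcc_se': 2.16}."""
    metrics = {}
    for part in summary.split(","):
        name, _, value = part.partition("=")
        metrics[name.strip()] = float(value.strip())
    return metrics


def approx_interval(metrics: dict[str, float], name: str, z: float = 1.96):
    """Return (low, high) as value +/- z * standard error.

    Assumption: a normal approximation with z = 1.96 (roughly 95%).
    """
    value = metrics[name]
    se = metrics[f"{name}_se"]
    return value - z * se, value + z * se


# Summary string copied from the 'dane' named-entity-recognition row.
dane = parse_summary(
    "micro_f1 = 76.44, micro_f1_se = 1.36, "
    "micro_f1_no_misc = 80.61, micro_f1_no_misc_se = 1.11"
)
low, high = approx_interval(dane, "micro_f1")
print(f"micro_f1 = {dane['micro_f1']:.2f}, approx. interval {low:.2f} to {high:.2f}")
```

Read this way, rows with large standard errors relative to the score (for example `scala-sv`, with mcc = 5.01 and mcc_se = 5.41) have intervals that include zero under this approximation, so those scores are hard to distinguish from chance.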