---
library_name: transformers
tags: []
---
norbert3-small trained for NER on wikiann (fo/is), sucx3_ner (sv), dane (da) and norne (nb/nn).
A custom classification head is added on top of the encoder, together with a character-level CNN that contributes a small extra signal for the classification.
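
One way such a head could be wired up is sketched below. This is a minimal illustration, not the code used for this model: the class name `CharCNNTokenClassifier`, all layer sizes, the byte-level character vocabulary, and the choice to concatenate character features with the encoder's token states are assumptions. The random tensor stands in for the encoder output.

```python
import torch
import torch.nn as nn


class CharCNNTokenClassifier(nn.Module):
    """Hypothetical head: encoder token states + char-CNN features -> label logits."""

    def __init__(self, hidden_size, num_labels, char_vocab_size=256,
                 char_emb_dim=16, char_channels=32, kernel_size=3):
        super().__init__()
        # Index 0 is reserved for character padding.
        self.char_emb = nn.Embedding(char_vocab_size, char_emb_dim, padding_idx=0)
        self.char_conv = nn.Conv1d(char_emb_dim, char_channels,
                                   kernel_size, padding=kernel_size // 2)
        self.classifier = nn.Linear(hidden_size + char_channels, num_labels)

    def forward(self, token_states, char_ids):
        # token_states: (batch, seq_len, hidden_size), produced by the encoder
        # char_ids:     (batch, seq_len, max_chars), 0 = padding
        b, s, c = char_ids.shape
        x = self.char_emb(char_ids.reshape(b * s, c))   # (b*s, max_chars, emb)
        x = self.char_conv(x.transpose(1, 2))           # (b*s, channels, max_chars)
        x = torch.relu(x).max(dim=2).values             # max-pool over characters
        char_feats = x.reshape(b, s, -1)                # (batch, seq_len, channels)
        combined = torch.cat([token_states, char_feats], dim=-1)
        return self.classifier(combined)                # (batch, seq_len, num_labels)


# Toy shapes only; hidden_size=64 and num_labels=9 are placeholders.
head = CharCNNTokenClassifier(hidden_size=64, num_labels=9)
token_states = torch.randn(2, 5, 64)           # stand-in for encoder output
char_ids = torch.randint(1, 256, (2, 5, 12))   # stand-in for per-token characters
logits = head(token_states, char_ids)
```

Max-pooling over the character axis gives one fixed-size vector per token regardless of word length, which is why the character features can be concatenated directly with the token states.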

Results (raw evaluation log, one section per dataset and subset):
```text
Eval on wikiann - fo
index 0
tokens [Byrta, -, Aftur, og, aftur]
ner_tags [3, 0, 0, 0, 0]
subset fo
dataset wikiann
Name: 0, dtype: object
shape: (100, 5)
100%
5/5 [00:01<00:00, 3.92it/s]
Loss: 0.2276667356491089
O O
B-ORG B-ORG
B-ORG B-ORG
O O
O O
O O
O O
O O
O O
O O
Validation Loss: 0.26530784368515015
Validation Accuracy: 0.9228951181745751
              precision    recall  f1-score   support
         LOC       0.86      0.81      0.83       154
         ORG       0.67      0.73      0.70       125
         PER       0.87      0.91      0.89        79
   micro avg       0.79      0.80      0.80       358
   macro avg       0.80      0.82      0.81       358
weighted avg       0.79      0.80      0.80       358
________________________________________
Eval on wikiann - is
index 100
tokens [Beltaþyrill, ''Ceryle, alcyon, '', Sjaldséð]
ner_tags [5, 0, 0, 0, 0]
subset is
dataset wikiann
Name: 0, dtype: object
shape: (1000, 5)
100%
50/50 [00:10<00:00, 5.02it/s]
Loss: 0.22668001055717468
O O
B-LOC B-LOC
B-LOC B-LOC
B-LOC B-LOC
B-LOC B-LOC
B-LOC B-LOC
B-LOC B-LOC
O O
O O
O O
Validation Loss: 0.2526825902983546
Validation Accuracy: 0.9360383541181041
              precision    recall  f1-score   support
         LOC       0.84      0.85      0.84      1983
         ORG       0.81      0.80      0.80      1762
         PER       0.89      0.89      0.89      1020
   micro avg       0.84      0.84      0.84      4765
   macro avg       0.84      0.85      0.85      4765
weighted avg       0.84      0.84      0.84      4765
________________________________________
Eval on dane - default
index 1100
tokens [To, kendte, russiske, historikere, Andronik, ...
ner_tags [0, 0, 7, 0, 1, 2, 0, 1, 2, 0, 0, 0, 0, 5, 0, ...
subset default
dataset dane
Name: 0, dtype: object
shape: (565, 5)
100%
29/29 [00:06<00:00, 4.75it/s]
Loss: 0.12037135660648346
O O
O O
O O
O O
O O
B-MISC B-MISC
O O
O O
B-PER B-PER
B-PER B-PER
Validation Loss: 0.11113663488228259
Validation Accuracy: 0.972018408457994
              precision    recall  f1-score   support
         LOC       0.78      0.86      0.82       225
        MISC       0.72      0.52      0.61       333
         ORG       0.72      0.69      0.71       379
         PER       0.96      0.92      0.94       298
   micro avg       0.80      0.73      0.76      1235
   macro avg       0.80      0.75      0.77      1235
weighted avg       0.79      0.73      0.76      1235
________________________________________
Eval on norne - bokmaal-7
index 1665
tokens [Honnørordene, er, ", dristig, formspråk, ", ,...
ner_tags [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
subset bokmaal-7
dataset norne
Name: 0, dtype: object
shape: (1939, 5)
100%
97/97 [00:20<00:00, 4.56it/s]
Loss: 0.0011819382198154926
O O
O O
O O
O O
O O
O O
O O
O O
O O
O O
Validation Loss: 0.04194018930858649
Validation Accuracy: 0.9876322465792248
              precision    recall  f1-score   support
         LOC       0.85      0.90      0.87       498
        MISC       0.81      0.74      0.78       363
         ORG       0.77      0.83      0.80       499
         PER       0.93      0.96      0.95       845
   micro avg       0.86      0.88      0.87      2205
   macro avg       0.84      0.86      0.85      2205
weighted avg       0.86      0.88      0.87      2205
________________________________________
Eval on norne - nynorsk-7
index 3604
tokens [Den, er, mettande, og, smakfull, ,, og, det, ...
ner_tags [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
subset nynorsk-7
dataset norne
Name: 0, dtype: object
shape: (1511, 5)
100%
76/76 [00:15<00:00, 5.82it/s]
Loss: 0.0790824368596077
O O
O O
O O
O O
O O
O O
O O
O O
O O
O O
Validation Loss: 0.05325472676725583
Validation Accuracy: 0.9867293689853402
              precision    recall  f1-score   support
         LOC       0.77      0.91      0.84       365
        MISC       0.80      0.76      0.78       295
         ORG       0.83      0.82      0.82       397
         PER       0.98      0.95      0.97       664
   micro avg       0.87      0.88      0.87      1721
   macro avg       0.85      0.86      0.85      1721
weighted avg       0.87      0.88      0.87      1721
________________________________________
Eval on sucx3_ner - original_cased
index 5115
tokens [Just, i, dag, är, Saabs, företagsledning, där...
ner_tags [0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
subset original_cased
dataset sucx3_ner
Name: 0, dtype: object
shape: (14383, 5)
100%
720/720 [02:36<00:00, 5.02it/s]
Loss: 0.04177908971905708
Loss: 0.08230985613484489
Loss: 0.08399457804886486
Loss: 0.06163447560524267
Loss: 0.04787629511204947
Loss: 0.03949779063830233
Loss: 0.03397762095776484
Loss: 0.030040143460689266
O O
O O
O O
O O
O O
B-ORG B-ORG
B-ORG B-ORG
O O
O O
O O
Validation Loss: 0.02938824465528948
Validation Accuracy: 0.9919830972756728
              precision    recall  f1-score   support
         LOC       0.88      0.91      0.90      4202
        MISC       0.65      0.59      0.62      1899
         ORG       0.74      0.76      0.75      3015
         PER       0.92      0.93      0.92      5778
   micro avg       0.84      0.84      0.84     14894
   macro avg       0.80      0.80      0.80     14894
weighted avg       0.84      0.84      0.84     14894
________________________________________
```
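
To make the summary rows easier to read, the sketch below recomputes the macro and weighted F1 for the `wikiann - fo` report from its per-class rows. The helper names `macro_avg` and `weighted_avg` are illustrative, not part of any library. Because the inputs are the already-rounded values from the log, recomputed precision or recall can differ from the reported figure in the last digit.

```python
# Per-class rows copied from the "wikiann - fo" report:
# label -> (precision, recall, f1, support)
report = {
    "LOC": (0.86, 0.81, 0.83, 154),
    "ORG": (0.67, 0.73, 0.70, 125),
    "PER": (0.87, 0.91, 0.89, 79),
}

F1, SUPPORT = 2, 3  # column indices in each row


def macro_avg(rows, col):
    """Unweighted mean over classes: every class counts equally."""
    return sum(r[col] for r in rows.values()) / len(rows)


def weighted_avg(rows, col):
    """Mean over classes weighted by support (number of gold entities)."""
    total = sum(r[SUPPORT] for r in rows.values())
    return sum(r[col] * r[SUPPORT] for r in rows.values()) / total


macro_f1 = round(macro_avg(report, F1), 2)
weighted_f1 = round(weighted_avg(report, F1), 2)
print(macro_f1, weighted_f1)
```

The macro average treats the small PER class (79 entities) the same as LOC (154), while the weighted average is pulled toward the classes with more support, which is why the two rows differ slightly.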