osanseviero committed on
Commit
f83aa2e
1 Parent(s): 0d1f014

Update spaCy pipeline

LICENSES_SOURCES CHANGED
@@ -1,4 +1,4 @@
- # UD Greek GDT v2.5
+ # UD Greek GDT v2.8
 
  * Author: Prokopidis, Prokopis
  * URL: https://github.com/UniversalDependencies/UD_Greek-GDT
README.md CHANGED
@@ -4,7 +4,7 @@ tags:
4
  - token-classification
5
  language:
6
  - el
7
- license: cc-by-nc-sa-4.0
8
  model-index:
9
  - name: el_core_news_lg
10
  results:
@@ -14,47 +14,47 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.7644628099
18
  - name: NER Recall
19
  type: recall
20
- value: 0.7773109244
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.7708333333
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
  - name: POS Accuracy
29
  type: accuracy
30
- value: 0.9349701721
31
  - task:
32
  name: SENTER
33
  type: token-classification
34
  metrics:
35
  - name: SENTER Precision
36
  type: precision
37
- value: 0.9164619165
38
  - name: SENTER Recall
39
  type: recall
40
- value: 0.9255583127
41
  - name: SENTER F Score
42
  type: f_score
43
- value: 0.9209876543
44
  - task:
45
  name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
  - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
- value: 0.8808176273
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
  - name: Labeled Dependencies Accuracy
56
  type: accuracy
57
- value: 0.8808176273
58
  ---
59
  ### Details: https://spacy.io/models/el#el_core_news_lg
60
 
@@ -63,12 +63,12 @@ Greek pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, se
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `el_core_news_lg` |
66
- | **Version** | `3.1.0` |
67
- | **spaCy** | `>=3.1.0,<3.2.0` |
68
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
  | **Components** | `tok2vec`, `morphologizer`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 500000 keys, 500000 unique vectors (300 dimensions) |
71
- | **Sources** | [UD Greek GDT v2.5](https://github.com/UniversalDependencies/UD_Greek-GDT) (Prokopidis, Prokopis)<br />[Greek NER Corpus (Google Summer of Code 2018)](https://github.com/eellak/gsoc2018-spacy) (Giannis Daras)<br />[spaCy lookups data](https://github.com/explosion/spacy-lookups-data) (Explosion)<br />[Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)](https://spacy.io) (Explosion) |
72
  | **License** | `CC BY-NC-SA 3.0` |
73
  | **Author** | [Explosion](https://explosion.ai) |
74
 
@@ -92,15 +92,21 @@ Greek pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, se
92
  | Type | Score |
93
  | --- | --- |
94
  | `TOKEN_ACC` | 100.00 |
95
- | `TAG_ACC` | 93.50 |
96
- | `POS_ACC` | 96.49 |
97
- | `MORPH_ACC` | 91.68 |
98
- | `LEMMA_ACC` | 56.56 |
99
- | `DEP_UAS` | 88.08 |
100
- | `DEP_LAS` | 84.93 |
101
- | `SENTS_P` | 91.65 |
102
- | `SENTS_R` | 92.56 |
103
- | `SENTS_F` | 92.10 |
104
- | `ENTS_P` | 76.45 |
105
- | `ENTS_R` | 77.73 |
106
- | `ENTS_F` | 77.08 |
 
 
 
 
 
 
 
4
  - token-classification
5
  language:
6
  - el
7
+ license: cc-by-nc-sa-3.0
8
  model-index:
9
  - name: el_core_news_lg
10
  results:
 
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.7682926829
18
  - name: NER Recall
19
  type: recall
20
+ value: 0.7941176471
21
  - name: NER F Score
22
  type: f_score
23
+ value: 0.7809917355
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
  - name: POS Accuracy
29
  type: accuracy
30
+ value: 0.9335897057
31
  - task:
32
  name: SENTER
33
  type: token-classification
34
  metrics:
35
  - name: SENTER Precision
36
  type: precision
37
+ value: 0.9305210918
38
  - name: SENTER Recall
39
  type: recall
40
+ value: 0.9305210918
41
  - name: SENTER F Score
42
  type: f_score
43
+ value: 0.9305210918
44
  - task:
45
  name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
  - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
+ value: 0.8820360598
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
  - name: Labeled Dependencies Accuracy
56
  type: accuracy
57
+ value: 0.8820360598
58
  ---
59
  ### Details: https://spacy.io/models/el#el_core_news_lg
60
 
 
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `el_core_news_lg` |
66
+ | **Version** | `3.2.0` |
67
+ | **spaCy** | `>=3.2.0,<3.3.0` |
68
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
  | **Components** | `tok2vec`, `morphologizer`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 500000 keys, 500000 unique vectors (300 dimensions) |
71
+ | **Sources** | [UD Greek GDT v2.8](https://github.com/UniversalDependencies/UD_Greek-GDT) (Prokopidis, Prokopis)<br />[Greek NER Corpus (Google Summer of Code 2018)](https://github.com/eellak/gsoc2018-spacy) (Giannis Daras)<br />[spaCy lookups data](https://github.com/explosion/spacy-lookups-data) (Explosion)<br />[Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)](https://spacy.io) (Explosion) |
72
  | **License** | `CC BY-NC-SA 3.0` |
73
  | **Author** | [Explosion](https://explosion.ai) |
74
 
 
92
  | Type | Score |
93
  | --- | --- |
94
  | `TOKEN_ACC` | 100.00 |
95
+ | `TOKEN_P` | 99.90 |
96
+ | `TOKEN_R` | 99.95 |
97
+ | `TOKEN_F` | 99.93 |
98
+ | `SENTS_P` | 93.05 |
99
+ | `SENTS_R` | 93.05 |
100
+ | `SENTS_F` | 93.05 |
101
+ | `DEP_UAS` | 88.20 |
102
+ | `DEP_LAS` | 84.77 |
103
+ | `ENTS_P` | 76.83 |
104
+ | `ENTS_R` | 79.41 |
105
+ | `ENTS_F` | 78.10 |
106
+ | `POS_ACC` | 96.36 |
107
+ | `MORPH_ACC` | 91.42 |
108
+ | `MORPH_MICRO_P` | 96.32 |
109
+ | `MORPH_MICRO_R` | 96.24 |
110
+ | `MORPH_MICRO_F` | 96.28 |
111
+ | `TAG_ACC` | 93.36 |
112
+ | `LEMMA_ACC` | 56.46 |
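
The F-score rows in the updated table can be cross-checked against their precision and recall rows. The sketch below is not part of the commit; it assumes the scores are the usual harmonic mean of precision and recall, and the helper name `f_score` is made up for illustration.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 0.0 if total == 0 else 2 * precision * recall / total


# Values copied from the new README front matter in this diff.
ents_f = f_score(0.7682926829, 0.7941176471)   # NER precision / recall
sents_f = f_score(0.9305210918, 0.9305210918)  # SENTER precision / recall

print(f"ENTS_F  = {ents_f * 100:.2f}")
print(f"SENTS_F = {sents_f * 100:.2f}")
```

When precision and recall are equal, as for SENTER here, the F-score equals that same value, which is why all three SENTS rows read 93.05.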
accuracy.json CHANGED
@@ -1,252 +1,168 @@
1
  {
2
  "token_acc": 1.0,
3
- "tag_acc": 0.9349701721,
4
- "pos_acc": 0.9649460139,
5
- "morph_acc": 0.9168268994,
6
- "lemma_acc": 0.5655968052,
7
- "dep_uas": 0.8808176273,
8
- "dep_las": 0.8492774328,
9
- "sents_p": 0.9164619165,
10
- "sents_r": 0.9255583127,
11
- "sents_f": 0.9209876543,
12
- "speed": 3094.9893702595,
13
- "morph_per_feat": {
14
- "Abbr": {
15
- "p": 0.9638554217,
16
- "r": 0.8602150538,
17
- "f": 0.9090909091
18
- },
19
- "Case": {
20
- "p": 0.9394544245,
21
- "r": 0.9416472157,
22
- "f": 0.940549542
23
- },
24
- "Gender": {
25
- "p": 0.9451097804,
26
- "r": 0.9473157719,
27
- "f": 0.9462114904
28
- },
29
- "Number": {
30
- "p": 0.9825673534,
31
- "r": 0.9844110855,
32
- "f": 0.9834883553
33
- },
34
- "Aspect": {
35
- "p": 0.9562118126,
36
- "r": 0.9427710843,
37
- "f": 0.9494438827
38
- },
39
- "Mood": {
40
- "p": 0.9935414424,
41
- "r": 0.9924731183,
42
- "f": 0.993006993
43
- },
44
- "Person": {
45
- "p": 0.9859675037,
46
- "r": 0.978021978,
47
- "f": 0.9819786686
48
- },
49
- "Tense": {
50
- "p": 0.9868938401,
51
- "r": 0.9817470665,
52
- "f": 0.9843137255
53
- },
54
- "VerbForm": {
55
- "p": 0.9867617108,
56
- "r": 0.9728915663,
57
- "f": 0.9797775531
58
- },
59
- "Voice": {
60
- "p": 0.9765784114,
61
- "r": 0.9628514056,
62
- "f": 0.9696663296
63
- },
64
- "Definite": {
65
- "p": 0.9886685552,
66
- "r": 0.9977129788,
67
- "f": 0.9931701764
68
- },
69
- "PronType": {
70
- "p": 0.9858447489,
71
- "r": 0.9885531136,
72
- "f": 0.9871970736
73
- },
74
- "Foreign": {
75
- "p": 0.7928571429,
76
- "r": 0.6894409938,
77
- "f": 0.7375415282
78
- },
79
- "NumType": {
80
- "p": 0.9692307692,
81
- "r": 0.9219512195,
82
- "f": 0.945
83
- },
84
- "Poss": {
85
- "p": 0.9213483146,
86
- "r": 0.9213483146,
87
- "f": 0.9213483146
88
- },
89
- "Degree": {
90
- "p": 0.8666666667,
91
- "r": 0.6842105263,
92
- "f": 0.7647058824
93
- }
94
- },
95
  "dep_las_per_type": {
96
  "root": {
97
- "p": 0.8771498771,
98
- "r": 0.8858560794,
99
- "f": 0.8814814815
100
  },
101
  "nmod": {
102
- "p": 0.8143100511,
103
- "r": 0.8213058419,
104
- "f": 0.8177929855
105
  },
106
  "vocative": {
107
- "p": 1.0,
108
  "r": 0.5714285714,
109
- "f": 0.7272727273
110
  },
111
  "cc": {
112
- "p": 0.85,
113
  "r": 0.8473520249,
114
- "f": 0.848673947
115
  },
116
  "conj": {
117
- "p": 0.5,
118
- "r": 0.4862637363,
119
- "f": 0.4930362117
120
  },
121
  "aux": {
122
- "p": 0.9814814815,
123
  "r": 0.9742647059,
124
- "f": 0.9778597786
125
  },
126
  "advmod": {
127
- "p": 0.7733711048,
128
- "r": 0.7690140845,
129
- "f": 0.7711864407
130
  },
131
  "ccomp": {
132
- "p": 0.8260869565,
133
- "r": 0.8260869565,
134
- "f": 0.8260869565
135
  },
136
  "det": {
137
- "p": 0.9628232759,
138
- "r": 0.9711956522,
139
- "f": 0.966991342
140
  },
141
  "obj": {
142
- "p": 0.8652694611,
143
- "r": 0.8784194529,
144
- "f": 0.8717948718
145
  },
146
  "flat": {
147
- "p": 0.6788990826,
148
- "r": 0.7474747475,
149
- "f": 0.7115384615
150
  },
151
  "case": {
152
- "p": 0.9605963791,
153
- "r": 0.9636752137,
154
- "f": 0.9621333333
155
  },
156
  "amod": {
157
- "p": 0.925093633,
158
  "r": 0.8992718447,
159
- "f": 0.912
160
  },
161
  "obl": {
162
- "p": 0.7770992366,
163
- "r": 0.8015748031,
164
- "f": 0.7891472868
165
  },
166
  "acl:relcl": {
167
- "p": 0.7471264368,
168
- "r": 0.7065217391,
169
- "f": 0.7262569832
170
  },
171
  "mark": {
172
- "p": 0.9097222222,
173
- "r": 0.9160839161,
174
- "f": 0.9128919861
175
  },
176
  "nsubj:pass": {
177
- "p": 0.7865853659,
178
- "r": 0.7818181818,
179
- "f": 0.7841945289
180
  },
181
  "nsubj": {
182
- "p": 0.7690531178,
183
- "r": 0.7780373832,
184
- "f": 0.7735191638
185
  },
186
  "cop": {
187
- "p": 0.7857142857,
188
- "r": 0.7623762376,
189
- "f": 0.7738693467
190
  },
191
  "parataxis": {
192
- "p": 0.125,
193
- "r": 0.0588235294,
194
- "f": 0.08
195
  },
196
  "nummod": {
197
- "p": 0.8488372093,
198
- "r": 0.8795180723,
199
- "f": 0.8639053254
200
  },
201
  "advcl": {
202
- "p": 0.5041322314,
203
- "r": 0.5754716981,
204
- "f": 0.5374449339
205
  },
206
  "xcomp": {
207
- "p": 0.7972972973,
208
- "r": 0.7023809524,
209
- "f": 0.746835443
210
  },
211
  "csubj": {
212
- "p": 0.8823529412,
213
- "r": 0.6818181818,
214
- "f": 0.7692307692
215
  },
216
  "fixed": {
217
- "p": 0.3,
218
- "r": 0.4285714286,
219
- "f": 0.3529411765
220
  },
221
  "compound": {
222
  "p": 0.0,
223
  "r": 0.0,
224
  "f": 0.0
225
  },
226
- "appos": {
227
- "p": 0.3076923077,
228
- "r": 0.2448979592,
229
- "f": 0.2727272727
230
- },
231
  "acl": {
232
- "p": 0.7916666667,
233
- "r": 0.4318181818,
234
- "f": 0.5588235294
235
  },
236
  "csubj:pass": {
237
- "p": 1.0,
238
  "r": 0.8333333333,
239
- "f": 0.9090909091
240
  },
241
  "obl:agent": {
242
- "p": 0.8,
243
- "r": 0.48,
244
- "f": 0.6
245
- },
246
- "dep": {
247
- "p": 0.0,
248
- "r": 0.0,
249
- "f": 0.0
250
  },
251
  "orphan": {
252
  "p": 0.0,
@@ -254,9 +170,9 @@
254
  "f": 0.0
255
  },
256
  "iobj": {
257
- "p": 0.6666666667,
258
  "r": 1.0,
259
- "f": 0.8
260
  },
261
  "expl": {
262
  "p": 0.0,
@@ -264,39 +180,129 @@
264
  "f": 0.0
265
  }
266
  },
267
- "ents_p": 0.7644628099,
268
- "ents_r": 0.7773109244,
269
- "ents_f": 0.7708333333,
270
  "ents_per_type": {
271
- "PERSON": {
272
- "p": 0.9,
273
- "r": 0.84375,
274
- "f": 0.8709677419
275
- },
276
  "GPE": {
277
- "p": 0.806122449,
278
- "r": 0.908045977,
279
- "f": 0.8540540541
280
  },
281
  "ORG": {
282
- "p": 0.6351351351,
283
- "r": 0.661971831,
284
- "f": 0.6482758621
285
  },
286
- "PRODUCT": {
287
- "p": 0.4,
288
- "r": 0.25,
289
- "f": 0.3076923077
290
  },
291
  "EVENT": {
292
- "p": 0.6,
293
- "r": 0.5,
294
- "f": 0.5454545455
295
  },
296
  "LOC": {
297
  "p": 0.0,
298
  "r": 0.0,
299
  "f": 0.0
300
  }
301
- }
 
 
302
  }
 
1
  {
2
  "token_acc": 1.0,
3
+ "token_p": 0.9990295973,
4
+ "token_r": 0.9995068547,
5
+ "token_f": 0.9992604644,
6
+ "sents_p": 0.9305210918,
7
+ "sents_r": 0.9305210918,
8
+ "sents_f": 0.9305210918,
9
+ "dep_uas": 0.8820360598,
10
+ "dep_las": 0.8477352682,
11
  "dep_las_per_type": {
12
  "root": {
13
+ "p": 0.8933002481,
14
+ "r": 0.8933002481,
15
+ "f": 0.8933002481
16
  },
17
  "nmod": {
18
+ "p": 0.795,
19
+ "r": 0.8195876289,
20
+ "f": 0.807106599
21
  },
22
  "vocative": {
23
+ "p": 0.6666666667,
24
  "r": 0.5714285714,
25
+ "f": 0.6153846154
26
  },
27
  "cc": {
28
+ "p": 0.8473520249,
29
  "r": 0.8473520249,
30
+ "f": 0.8473520249
31
  },
32
  "conj": {
33
+ "p": 0.5391304348,
34
+ "r": 0.510989011,
35
+ "f": 0.5246826516
36
  },
37
  "aux": {
38
+ "p": 0.9778597786,
39
  "r": 0.9742647059,
40
+ "f": 0.9760589319
41
  },
42
  "advmod": {
43
+ "p": 0.7726027397,
44
+ "r": 0.7943661972,
45
+ "f": 0.7833333333
46
  },
47
  "ccomp": {
48
+ "p": 0.7777777778,
49
+ "r": 0.7101449275,
50
+ "f": 0.7424242424
51
  },
52
  "det": {
53
+ "p": 0.9606469003,
54
+ "r": 0.9684782609,
55
+ "f": 0.9645466847
56
  },
57
  "obj": {
58
+ "p": 0.8639053254,
59
+ "r": 0.8875379939,
60
+ "f": 0.8755622189
61
  },
62
  "flat": {
63
+ "p": 0.6315789474,
64
+ "r": 0.7272727273,
65
+ "f": 0.676056338
66
  },
67
  "case": {
68
+ "p": 0.9680511182,
69
+ "r": 0.9711538462,
70
+ "f": 0.9696
71
  },
72
  "amod": {
73
+ "p": 0.9297365119,
74
  "r": 0.8992718447,
75
+ "f": 0.9142504627
76
  },
77
  "obl": {
78
+ "p": 0.7926634769,
79
+ "r": 0.7826771654,
80
+ "f": 0.7876386688
81
  },
82
  "acl:relcl": {
83
+ "p": 0.7125748503,
84
+ "r": 0.6467391304,
85
+ "f": 0.6780626781
86
  },
87
  "mark": {
88
+ "p": 0.8928571429,
89
+ "r": 0.8741258741,
90
+ "f": 0.8833922261
91
  },
92
  "nsubj:pass": {
93
+ "p": 0.7784810127,
94
+ "r": 0.7454545455,
95
+ "f": 0.7616099071
96
  },
97
  "nsubj": {
98
+ "p": 0.7938388626,
99
+ "r": 0.7827102804,
100
+ "f": 0.7882352941
101
  },
102
  "cop": {
103
+ "p": 0.8163265306,
104
+ "r": 0.7920792079,
105
+ "f": 0.8040201005
106
  },
107
  "parataxis": {
108
+ "p": 0.5,
109
+ "r": 0.2941176471,
110
+ "f": 0.3703703704
111
  },
112
  "nummod": {
113
+ "p": 0.880952381,
114
+ "r": 0.8915662651,
115
+ "f": 0.8862275449
116
  },
117
  "advcl": {
118
+ "p": 0.4918032787,
119
+ "r": 0.5660377358,
120
+ "f": 0.5263157895
121
  },
122
  "xcomp": {
123
+ "p": 0.6746987952,
124
+ "r": 0.6666666667,
125
+ "f": 0.6706586826
126
  },
127
  "csubj": {
128
+ "p": 0.75,
129
+ "r": 0.5454545455,
130
+ "f": 0.6315789474
131
+ },
132
+ "appos": {
133
+ "p": 0.3043478261,
134
+ "r": 0.2857142857,
135
+ "f": 0.2947368421
136
+ },
137
+ "dep": {
138
+ "p": 0.0,
139
+ "r": 0.0,
140
+ "f": 0.0
141
  },
142
  "fixed": {
143
+ "p": 0.4705882353,
144
+ "r": 0.5714285714,
145
+ "f": 0.5161290323
146
  },
147
  "compound": {
148
  "p": 0.0,
149
  "r": 0.0,
150
  "f": 0.0
151
  },
 
 
 
 
 
152
  "acl": {
153
+ "p": 0.5625,
154
+ "r": 0.4090909091,
155
+ "f": 0.4736842105
156
  },
157
  "csubj:pass": {
158
+ "p": 0.8333333333,
159
  "r": 0.8333333333,
160
+ "f": 0.8333333333
161
  },
162
  "obl:agent": {
163
+ "p": 0.625,
164
+ "r": 0.4,
165
+ "f": 0.487804878
 
 
 
 
 
166
  },
167
  "orphan": {
168
  "p": 0.0,
 
170
  "f": 0.0
171
  },
172
  "iobj": {
173
+ "p": 1.0,
174
  "r": 1.0,
175
+ "f": 1.0
176
  },
177
  "expl": {
178
  "p": 0.0,
 
180
  "f": 0.0
181
  }
182
  },
183
+ "ents_p": 0.7682926829,
184
+ "ents_r": 0.7941176471,
185
+ "ents_f": 0.7809917355,
186
  "ents_per_type": {
 
 
 
 
 
187
  "GPE": {
188
+ "p": 0.0,
189
+ "r": 0.0,
190
+ "f": 0.0
191
  },
192
  "ORG": {
193
+ "p": 0.0,
194
+ "r": 0.0,
195
+ "f": 0.0
196
  },
197
+ "PERSON": {
198
+ "p": 0.0,
199
+ "r": 0.0,
200
+ "f": 0.0
201
  },
202
  "EVENT": {
203
+ "p": 0.0,
204
+ "r": 0.0,
205
+ "f": 0.0
206
  },
207
  "LOC": {
208
  "p": 0.0,
209
  "r": 0.0,
210
  "f": 0.0
211
+ },
212
+ "PRODUCT": {
213
+ "p": 0.0,
214
+ "r": 0.0,
215
+ "f": 0.0
216
+ }
217
+ },
218
+ "speed": 2288.1546505454,
219
+ "pos_acc": 0.9635655475,
220
+ "morph_acc": 0.9141645713,
221
+ "morph_micro_p": 0.9631816485,
222
+ "morph_micro_r": 0.9623978571,
223
+ "morph_micro_f": 0.9627895933,
224
+ "morph_per_feat": {
225
+ "Abbr": {
226
+ "p": 0.9863013699,
227
+ "r": 0.7741935484,
228
+ "f": 0.8674698795
229
+ },
230
+ "Case": {
231
+ "p": 0.9377600266,
232
+ "r": 0.9394798266,
233
+ "f": 0.9386191388
234
+ },
235
+ "Gender": {
236
+ "p": 0.9425861208,
237
+ "r": 0.9443147716,
238
+ "f": 0.9434496544
239
+ },
240
+ "Number": {
241
+ "p": 0.9801210026,
242
+ "r": 0.9821016166,
243
+ "f": 0.98111031
244
+ },
245
+ "Aspect": {
246
+ "p": 0.9585858586,
247
+ "r": 0.952811245,
248
+ "f": 0.9556898288
249
+ },
250
+ "Mood": {
251
+ "p": 0.9914255091,
252
+ "r": 0.9946236559,
253
+ "f": 0.9930220075
254
+ },
255
+ "Person": {
256
+ "p": 0.9824175824,
257
+ "r": 0.9824175824,
258
+ "f": 0.9824175824
259
+ },
260
+ "Tense": {
261
+ "p": 0.9830729167,
262
+ "r": 0.9843546284,
263
+ "f": 0.983713355
264
+ },
265
+ "VerbForm": {
266
+ "p": 0.9838383838,
267
+ "r": 0.9779116466,
268
+ "f": 0.9808660624
269
+ },
270
+ "Voice": {
271
+ "p": 0.9686868687,
272
+ "r": 0.9628514056,
273
+ "f": 0.9657603223
274
+ },
275
+ "Definite": {
276
+ "p": 0.9920409323,
277
+ "r": 0.9977129788,
278
+ "f": 0.9948688712
279
+ },
280
+ "PronType": {
281
+ "p": 0.9876768599,
282
+ "r": 0.9908424908,
283
+ "f": 0.9892571429
284
+ },
285
+ "Foreign": {
286
+ "p": 0.7328767123,
287
+ "r": 0.6645962733,
288
+ "f": 0.6970684039
289
+ },
290
+ "NumType": {
291
+ "p": 0.9893048128,
292
+ "r": 0.9024390244,
293
+ "f": 0.943877551
294
+ },
295
+ "Poss": {
296
+ "p": 0.9204545455,
297
+ "r": 0.9101123596,
298
+ "f": 0.9152542373
299
+ },
300
+ "Degree": {
301
+ "p": 0.8275862069,
302
+ "r": 0.6315789474,
303
+ "f": 0.7164179104
304
  }
305
+ },
306
+ "tag_acc": 0.9335897057,
307
+ "lemma_acc": 0.5646107578
308
  }
attribute_ruler/patterns CHANGED
Binary files a/attribute_ruler/patterns and b/attribute_ruler/patterns differ
 
config.cfg CHANGED
@@ -1,10 +1,8 @@
1
  [paths]
2
- train = "corpus/el-dep-news/train.spacy"
3
- dev = "corpus/el-dep-news/dev.spacy"
4
- vectors = "corpus/el_vectors"
5
- raw = null
6
  init_tok2vec = null
7
- vocab_data = null
8
 
9
  [system]
10
  gpu_allocator = null
@@ -24,6 +22,7 @@ tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
24
 
25
  [components.attribute_ruler]
26
  factory = "attribute_ruler"
 
27
  validate = false
28
 
29
  [components.lemmatizer]
@@ -31,9 +30,13 @@ factory = "lemmatizer"
31
  mode = "rule"
32
  model = null
33
  overwrite = false
 
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
 
 
 
37
 
38
  [components.morphologizer.model]
39
  @architectures = "spacy.Tagger.v1"
@@ -48,6 +51,7 @@ upstream = "tok2vec"
48
  factory = "ner"
49
  incorrect_spans_key = null
50
  moves = null
 
51
  update_with_oracle_cut_size = 100
52
 
53
  [components.ner.model]
@@ -65,8 +69,8 @@ nO = null
65
  [components.ner.model.tok2vec.embed]
66
  @architectures = "spacy.MultiHashEmbed.v2"
67
  width = 96
68
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
69
- rows = [5000,2500,2500,2500]
70
  include_static_vectors = true
71
 
72
  [components.ner.model.tok2vec.encode]
@@ -81,6 +85,7 @@ factory = "parser"
81
  learn_tokens = false
82
  min_action_freq = 30
83
  moves = null
 
84
  update_with_oracle_cut_size = 100
85
 
86
  [components.parser.model]
@@ -99,6 +104,8 @@ upstream = "tok2vec"
99
 
100
  [components.senter]
101
  factory = "senter"
 
 
102
 
103
  [components.senter.model]
104
  @architectures = "spacy.Tagger.v1"
@@ -110,8 +117,8 @@ nO = null
110
  [components.senter.model.tok2vec.embed]
111
  @architectures = "spacy.MultiHashEmbed.v2"
112
  width = 16
113
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
114
- rows = [1000,500,500,500]
115
  include_static_vectors = true
116
 
117
  [components.senter.model.tok2vec.encode]
@@ -130,8 +137,8 @@ factory = "tok2vec"
130
  [components.tok2vec.model.embed]
131
  @architectures = "spacy.MultiHashEmbed.v2"
132
  width = ${components.tok2vec.model.encode:width}
133
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
134
- rows = [5000,2500,2500,2500]
135
  include_static_vectors = true
136
 
137
  [components.tok2vec.model.encode]
@@ -145,22 +152,19 @@ maxout_pieces = 3
145
 
146
  [corpora.dev]
147
  @readers = "spacy.Corpus.v1"
148
- limit = 0
149
- max_length = 0
150
- path = ${paths:dev}
151
  gold_preproc = false
 
 
152
  augmenter = null
153
 
154
  [corpora.train]
155
  @readers = "spacy.Corpus.v1"
156
- path = ${paths:train}
157
- max_length = 5000
158
  gold_preproc = false
 
159
  limit = 0
160
-
161
- [corpora.train.augmenter]
162
- @augmenters = "spacy.lower_case.v1"
163
- level = 0.1
164
 
165
  [training]
166
  train_corpus = "corpora.train"
@@ -191,9 +195,8 @@ compound = 1.001
191
  t = 0.0
192
 
193
  [training.logger]
194
- @loggers = "spacy.WandbLogger.v1"
195
- project_name = "spacy-v3.0.0a2"
196
- remove_config_values = []
197
 
198
  [training.optimizer]
199
  @optimizers = "Adam.v1"
@@ -216,16 +219,17 @@ dep_las_per_type = null
216
  sents_p = null
217
  sents_r = null
218
  sents_f = 0.02
219
- lemma_acc = 0.33
220
- ents_f = 0.33
221
  ents_p = 0.0
222
  ents_r = 0.0
223
  ents_per_type = null
 
224
 
225
  [pretraining]
226
 
227
  [initialize]
228
- vocab_data = ${paths.vocab_data}
229
  vectors = ${paths.vectors}
230
  init_tok2vec = ${paths.init_tok2vec}
231
  before_init = null
 
1
  [paths]
2
+ train = null
3
+ dev = null
4
+ vectors = null
 
5
  init_tok2vec = null
 
6
 
7
  [system]
8
  gpu_allocator = null
 
22
 
23
  [components.attribute_ruler]
24
  factory = "attribute_ruler"
25
+ scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"}
26
  validate = false
27
 
28
  [components.lemmatizer]
 
30
  mode = "rule"
31
  model = null
32
  overwrite = false
33
+ scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
37
+ extend = false
38
+ overwrite = true
39
+ scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
40
 
41
  [components.morphologizer.model]
42
  @architectures = "spacy.Tagger.v1"
 
51
  factory = "ner"
52
  incorrect_spans_key = null
53
  moves = null
54
+ scorer = {"@scorers":"spacy.ner_scorer.v1"}
55
  update_with_oracle_cut_size = 100
56
 
57
  [components.ner.model]
 
69
  [components.ner.model.tok2vec.embed]
70
  @architectures = "spacy.MultiHashEmbed.v2"
71
  width = 96
72
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
73
+ rows = [5000,2500,2500,2500,100]
74
  include_static_vectors = true
75
 
76
  [components.ner.model.tok2vec.encode]
 
85
  learn_tokens = false
86
  min_action_freq = 30
87
  moves = null
88
+ scorer = {"@scorers":"spacy.parser_scorer.v1"}
89
  update_with_oracle_cut_size = 100
90
 
91
  [components.parser.model]
 
104
 
105
  [components.senter]
106
  factory = "senter"
107
+ overwrite = false
108
+ scorer = {"@scorers":"spacy.senter_scorer.v1"}
109
 
110
  [components.senter.model]
111
  @architectures = "spacy.Tagger.v1"
 
117
  [components.senter.model.tok2vec.embed]
118
  @architectures = "spacy.MultiHashEmbed.v2"
119
  width = 16
120
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
121
+ rows = [1000,500,500,500,50]
122
  include_static_vectors = true
123
 
124
  [components.senter.model.tok2vec.encode]
 
137
  [components.tok2vec.model.embed]
138
  @architectures = "spacy.MultiHashEmbed.v2"
139
  width = ${components.tok2vec.model.encode:width}
140
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
141
+ rows = [5000,2500,2500,2500,100]
142
  include_static_vectors = true
143
 
144
  [components.tok2vec.model.encode]
 
152
 
153
  [corpora.dev]
154
  @readers = "spacy.Corpus.v1"
155
+ path = ${paths.dev}
 
 
156
  gold_preproc = false
157
+ max_length = 0
158
+ limit = 0
159
  augmenter = null
160
 
161
  [corpora.train]
162
  @readers = "spacy.Corpus.v1"
163
+ path = ${paths.train}
 
164
  gold_preproc = false
165
+ max_length = 0
166
  limit = 0
167
+ augmenter = null
 
 
 
168
 
169
  [training]
170
  train_corpus = "corpora.train"
 
195
  t = 0.0
196
 
197
  [training.logger]
198
+ @loggers = "spacy.ConsoleLogger.v1"
199
+ progress_bar = false
 
200
 
201
  [training.optimizer]
202
  @optimizers = "Adam.v1"
 
219
  sents_p = null
220
  sents_r = null
221
  sents_f = 0.02
222
+ lemma_acc = 0.5
223
+ ents_f = 0.16
224
  ents_p = 0.0
225
  ents_r = 0.0
226
  ents_per_type = null
227
+ speed = 0.0
228
 
229
  [pretraining]
230
 
231
  [initialize]
232
+ vocab_data = null
233
  vectors = ${paths.vectors}
234
  init_tok2vec = ${paths.init_tok2vec}
235
  before_init = null
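
Each `MultiHashEmbed` block in the new config gains a fifth attribute, `SPACY`, and a fifth entry in `rows`, so the two lists stay paired one to one. The sketch below checks that pairing on a fragment copied from this diff. It is an illustration only: it uses the standard-library `configparser` rather than spaCy's own config loader, and the helper name `embed_tables` is hypothetical.

```python
import configparser
import json

# Fragment copied from the new side of config.cfg in this diff.
FRAGMENT = """
[components.ner.model.tok2vec.embed]
@architectures = "spacy.MultiHashEmbed.v2"
width = 96
attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
rows = [5000,2500,2500,2500,100]
include_static_vectors = true

[components.senter.model.tok2vec.embed]
@architectures = "spacy.MultiHashEmbed.v2"
width = 16
attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
rows = [1000,500,500,500,50]
include_static_vectors = true
"""


def embed_tables(text: str) -> dict:
    """Map each section that has attrs/rows to an {attr: rows} dict."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep key case as written
    parser.read_string(text)
    tables = {}
    for section in parser.sections():
        if parser.has_option(section, "attrs") and parser.has_option(section, "rows"):
            # The list values in this fragment happen to be valid JSON.
            attrs = json.loads(parser.get(section, "attrs"))
            rows = json.loads(parser.get(section, "rows"))
            if len(attrs) != len(rows):
                raise ValueError(f"{section}: {len(attrs)} attrs vs {len(rows)} rows")
            tables[section] = dict(zip(attrs, rows))
    return tables


tables = embed_tables(FRAGMENT)
for name, table in tables.items():
    print(name, table)
```

A mismatch in list lengths raises `ValueError`, which is the kind of slip this check is meant to catch when editing such a config by hand.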
el_core_news_lg-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:f4ada4425d07c8679c7ef7d5c7ab31bd611b35e44dc15c689933d6cbac5109b8
- size 569059474
+ oid sha256:ab4b91664ed28477187c802178bd81f07a30ee2c9c2f902e9b4ab24d9c249f82
+ size 569842890
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"el",
3
  "name":"core_news_lg",
4
- "version":"3.1.0",
5
  "description":"Greek pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-NC-SA 3.0",
10
- "spacy_version":">=3.1.0,<3.2.0",
11
- "spacy_git_version":"caba63b74",
12
  "vectors":{
13
  "width":300,
14
  "vectors":500000,
@@ -452,253 +452,169 @@
452
  ],
453
  "performance":{
454
  "token_acc":1.0,
455
- "tag_acc":0.9349701721,
456
- "pos_acc":0.9649460139,
457
- "morph_acc":0.9168268994,
458
- "lemma_acc":0.5655968052,
459
- "dep_uas":0.8808176273,
460
- "dep_las":0.8492774328,
461
- "sents_p":0.9164619165,
462
- "sents_r":0.9255583127,
463
- "sents_f":0.9209876543,
464
- "speed":3094.9893702595,
465
- "morph_per_feat":{
466
- "Abbr":{
467
- "p":0.9638554217,
468
- "r":0.8602150538,
469
- "f":0.9090909091
470
- },
471
- "Case":{
472
- "p":0.9394544245,
473
- "r":0.9416472157,
474
- "f":0.940549542
475
- },
476
- "Gender":{
477
- "p":0.9451097804,
478
- "r":0.9473157719,
479
- "f":0.9462114904
480
- },
481
- "Number":{
482
- "p":0.9825673534,
483
- "r":0.9844110855,
484
- "f":0.9834883553
485
- },
486
- "Aspect":{
487
- "p":0.9562118126,
488
- "r":0.9427710843,
489
- "f":0.9494438827
490
- },
491
- "Mood":{
492
- "p":0.9935414424,
493
- "r":0.9924731183,
494
- "f":0.993006993
495
- },
496
- "Person":{
497
- "p":0.9859675037,
498
- "r":0.978021978,
499
- "f":0.9819786686
500
- },
501
- "Tense":{
502
- "p":0.9868938401,
503
- "r":0.9817470665,
504
- "f":0.9843137255
505
- },
506
- "VerbForm":{
507
- "p":0.9867617108,
508
- "r":0.9728915663,
509
- "f":0.9797775531
510
- },
511
- "Voice":{
512
- "p":0.9765784114,
513
- "r":0.9628514056,
514
- "f":0.9696663296
515
- },
516
- "Definite":{
517
- "p":0.9886685552,
518
- "r":0.9977129788,
519
- "f":0.9931701764
520
- },
521
- "PronType":{
522
- "p":0.9858447489,
523
- "r":0.9885531136,
524
- "f":0.9871970736
525
- },
526
- "Foreign":{
527
- "p":0.7928571429,
528
- "r":0.6894409938,
529
- "f":0.7375415282
530
- },
531
- "NumType":{
532
- "p":0.9692307692,
533
- "r":0.9219512195,
534
- "f":0.945
535
- },
536
- "Poss":{
537
- "p":0.9213483146,
538
- "r":0.9213483146,
539
- "f":0.9213483146
540
- },
541
- "Degree":{
542
- "p":0.8666666667,
543
- "r":0.6842105263,
544
- "f":0.7647058824
545
- }
546
- },
547
  "dep_las_per_type":{
548
  "root":{
549
- "p":0.8771498771,
550
- "r":0.8858560794,
551
- "f":0.8814814815
552
  },
553
  "nmod":{
554
- "p":0.8143100511,
555
- "r":0.8213058419,
556
- "f":0.8177929855
557
  },
558
  "vocative":{
559
- "p":1.0,
560
  "r":0.5714285714,
561
- "f":0.7272727273
562
  },
563
  "cc":{
564
- "p":0.85,
565
  "r":0.8473520249,
566
- "f":0.848673947
567
  },
568
  "conj":{
569
- "p":0.5,
570
- "r":0.4862637363,
571
- "f":0.4930362117
572
  },
573
  "aux":{
574
- "p":0.9814814815,
575
  "r":0.9742647059,
576
- "f":0.9778597786
577
  },
578
  "advmod":{
579
- "p":0.7733711048,
580
- "r":0.7690140845,
581
- "f":0.7711864407
582
  },
583
  "ccomp":{
584
- "p":0.8260869565,
585
- "r":0.8260869565,
586
- "f":0.8260869565
587
  },
588
  "det":{
589
- "p":0.9628232759,
590
- "r":0.9711956522,
591
- "f":0.966991342
592
  },
593
  "obj":{
594
- "p":0.8652694611,
595
- "r":0.8784194529,
596
- "f":0.8717948718
597
  },
598
  "flat":{
599
- "p":0.6788990826,
600
- "r":0.7474747475,
601
- "f":0.7115384615
602
  },
603
  "case":{
604
- "p":0.9605963791,
605
- "r":0.9636752137,
606
- "f":0.9621333333
607
  },
608
  "amod":{
609
- "p":0.925093633,
610
  "r":0.8992718447,
611
- "f":0.912
612
  },
613
  "obl":{
614
- "p":0.7770992366,
615
- "r":0.8015748031,
616
- "f":0.7891472868
617
  },
618
  "acl:relcl":{
619
- "p":0.7471264368,
620
- "r":0.7065217391,
621
- "f":0.7262569832
622
  },
623
  "mark":{
624
- "p":0.9097222222,
625
- "r":0.9160839161,
626
- "f":0.9128919861
627
  },
628
  "nsubj:pass":{
629
- "p":0.7865853659,
630
- "r":0.7818181818,
631
- "f":0.7841945289
632
  },
633
  "nsubj":{
634
- "p":0.7690531178,
635
- "r":0.7780373832,
636
- "f":0.7735191638
637
  },
638
  "cop":{
639
- "p":0.7857142857,
640
- "r":0.7623762376,
641
- "f":0.7738693467
642
  },
643
  "parataxis":{
644
- "p":0.125,
645
- "r":0.0588235294,
646
- "f":0.08
647
  },
648
  "nummod":{
649
- "p":0.8488372093,
650
- "r":0.8795180723,
651
- "f":0.8639053254
652
  },
653
  "advcl":{
654
- "p":0.5041322314,
655
- "r":0.5754716981,
656
- "f":0.5374449339
657
  },
658
  "xcomp":{
659
- "p":0.7972972973,
660
- "r":0.7023809524,
661
- "f":0.746835443
662
  },
663
  "csubj":{
664
- "p":0.8823529412,
665
- "r":0.6818181818,
666
- "f":0.7692307692
667
  },
668
  "fixed":{
669
- "p":0.3,
670
- "r":0.4285714286,
671
- "f":0.3529411765
672
  },
673
  "compound":{
674
  "p":0.0,
675
  "r":0.0,
676
  "f":0.0
677
  },
678
- "appos":{
679
- "p":0.3076923077,
680
- "r":0.2448979592,
681
- "f":0.2727272727
682
- },
683
  "acl":{
684
- "p":0.7916666667,
685
- "r":0.4318181818,
686
- "f":0.5588235294
687
  },
688
  "csubj:pass":{
689
- "p":1.0,
690
  "r":0.8333333333,
691
- "f":0.9090909091
692
  },
693
  "obl:agent":{
694
- "p":0.8,
695
- "r":0.48,
696
- "f":0.6
697
- },
698
- "dep":{
699
- "p":0.0,
700
- "r":0.0,
701
- "f":0.0
702
  },
703
  "orphan":{
704
  "p":0.0,
@@ -706,9 +622,9 @@
706
  "f":0.0
707
  },
708
  "iobj":{
709
- "p":0.6666666667,
710
  "r":1.0,
711
- "f":0.8
712
  },
713
  "expl":{
714
  "p":0.0,
@@ -716,45 +632,135 @@
716
  "f":0.0
717
  }
718
  },
719
- "ents_p":0.7644628099,
720
- "ents_r":0.7773109244,
721
- "ents_f":0.7708333333,
722
  "ents_per_type":{
723
- "PERSON":{
724
- "p":0.9,
725
- "r":0.84375,
726
- "f":0.8709677419
727
- },
728
  "GPE":{
729
- "p":0.806122449,
730
- "r":0.908045977,
731
- "f":0.8540540541
732
  },
733
  "ORG":{
734
- "p":0.6351351351,
735
- "r":0.661971831,
736
- "f":0.6482758621
737
  },
738
- "PRODUCT":{
739
- "p":0.4,
740
- "r":0.25,
741
- "f":0.3076923077
742
  },
743
  "EVENT":{
744
- "p":0.6,
745
- "r":0.5,
746
- "f":0.5454545455
747
  },
748
  "LOC":{
749
  "p":0.0,
750
  "r":0.0,
751
  "f":0.0
 
 
 
 
 
752
  }
753
- }
754
  },
755
  "sources":[
756
  {
757
- "name":"UD Greek GDT v2.5",
758
  "url":"https://github.com/UniversalDependencies/UD_Greek-GDT",
759
  "license":"CC BY-NC-SA 3.0",
760
  "author":"Prokopidis, Prokopis"
 
1
  {
2
  "lang":"el",
3
  "name":"core_news_lg",
4
+ "version":"3.2.0",
5
  "description":"Greek pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"[email protected]",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-NC-SA 3.0",
10
+ "spacy_version":">=3.2.0,<3.3.0",
11
+ "spacy_git_version":"bb26550e2",
12
  "vectors":{
13
  "width":300,
14
  "vectors":500000,
 
452
  ],
453
  "performance":{
454
  "token_acc":1.0,
455
+ "token_p":0.9990295973,
456
+ "token_r":0.9995068547,
457
+ "token_f":0.9992604644,
458
+ "sents_p":0.9305210918,
459
+ "sents_r":0.9305210918,
460
+ "sents_f":0.9305210918,
461
+ "dep_uas":0.8820360598,
462
+ "dep_las":0.8477352682,
463
  "dep_las_per_type":{
464
  "root":{
465
+ "p":0.8933002481,
466
+ "r":0.8933002481,
467
+ "f":0.8933002481
468
  },
469
  "nmod":{
470
+ "p":0.795,
471
+ "r":0.8195876289,
472
+ "f":0.807106599
473
  },
474
  "vocative":{
475
+ "p":0.6666666667,
476
  "r":0.5714285714,
477
+ "f":0.6153846154
478
  },
479
  "cc":{
480
+ "p":0.8473520249,
481
  "r":0.8473520249,
482
+ "f":0.8473520249
483
  },
484
  "conj":{
485
+ "p":0.5391304348,
486
+ "r":0.510989011,
487
+ "f":0.5246826516
488
  },
489
  "aux":{
490
+ "p":0.9778597786,
491
  "r":0.9742647059,
492
+ "f":0.9760589319
493
  },
494
  "advmod":{
495
+ "p":0.7726027397,
496
+ "r":0.7943661972,
497
+ "f":0.7833333333
498
  },
499
  "ccomp":{
500
+ "p":0.7777777778,
501
+ "r":0.7101449275,
502
+ "f":0.7424242424
503
  },
504
  "det":{
505
+ "p":0.9606469003,
506
+ "r":0.9684782609,
507
+ "f":0.9645466847
508
  },
509
  "obj":{
510
+ "p":0.8639053254,
511
+ "r":0.8875379939,
512
+ "f":0.8755622189
513
  },
514
  "flat":{
515
+ "p":0.6315789474,
516
+ "r":0.7272727273,
517
+ "f":0.676056338
518
  },
519
  "case":{
520
+ "p":0.9680511182,
521
+ "r":0.9711538462,
522
+ "f":0.9696
523
  },
524
  "amod":{
525
+ "p":0.9297365119,
526
  "r":0.8992718447,
527
+ "f":0.9142504627
528
  },
529
  "obl":{
530
+ "p":0.7926634769,
531
+ "r":0.7826771654,
532
+ "f":0.7876386688
533
  },
534
  "acl:relcl":{
535
+ "p":0.7125748503,
536
+ "r":0.6467391304,
537
+ "f":0.6780626781
538
  },
539
  "mark":{
540
+ "p":0.8928571429,
541
+ "r":0.8741258741,
542
+ "f":0.8833922261
543
  },
544
  "nsubj:pass":{
545
+ "p":0.7784810127,
546
+ "r":0.7454545455,
547
+ "f":0.7616099071
548
  },
549
  "nsubj":{
550
+ "p":0.7938388626,
551
+ "r":0.7827102804,
552
+ "f":0.7882352941
553
  },
554
  "cop":{
555
+ "p":0.8163265306,
556
+ "r":0.7920792079,
557
+ "f":0.8040201005
558
  },
559
  "parataxis":{
560
+ "p":0.5,
561
+ "r":0.2941176471,
562
+ "f":0.3703703704
563
  },
564
  "nummod":{
565
+ "p":0.880952381,
566
+ "r":0.8915662651,
567
+ "f":0.8862275449
568
  },
569
  "advcl":{
570
+ "p":0.4918032787,
571
+ "r":0.5660377358,
572
+ "f":0.5263157895
573
  },
574
  "xcomp":{
575
+ "p":0.6746987952,
576
+ "r":0.6666666667,
577
+ "f":0.6706586826
578
  },
579
  "csubj":{
580
+ "p":0.75,
581
+ "r":0.5454545455,
582
+ "f":0.6315789474
583
+ },
584
+ "appos":{
585
+ "p":0.3043478261,
586
+ "r":0.2857142857,
587
+ "f":0.2947368421
588
+ },
589
+ "dep":{
590
+ "p":0.0,
591
+ "r":0.0,
592
+ "f":0.0
593
  },
594
  "fixed":{
595
+ "p":0.4705882353,
596
+ "r":0.5714285714,
597
+ "f":0.5161290323
598
  },
599
  "compound":{
600
  "p":0.0,
601
  "r":0.0,
602
  "f":0.0
603
  },
604
  "acl":{
605
+ "p":0.5625,
606
+ "r":0.4090909091,
607
+ "f":0.4736842105
608
  },
609
  "csubj:pass":{
610
+ "p":0.8333333333,
611
  "r":0.8333333333,
612
+ "f":0.8333333333
613
  },
614
  "obl:agent":{
615
+ "p":0.625,
616
+ "r":0.4,
617
+ "f":0.487804878
618
  },
619
  "orphan":{
620
  "p":0.0,
 
622
  "f":0.0
623
  },
624
  "iobj":{
625
+ "p":1.0,
626
  "r":1.0,
627
+ "f":1.0
628
  },
629
  "expl":{
630
  "p":0.0,
 
632
  "f":0.0
633
  }
634
  },
635
+ "ents_p":0.7682926829,
636
+ "ents_r":0.7941176471,
637
+ "ents_f":0.7809917355,
638
  "ents_per_type":{
639
  "GPE":{
640
+ "p":0.0,
641
+ "r":0.0,
642
+ "f":0.0
643
  },
644
  "ORG":{
645
+ "p":0.0,
646
+ "r":0.0,
647
+ "f":0.0
648
  },
649
+ "PERSON":{
650
+ "p":0.0,
651
+ "r":0.0,
652
+ "f":0.0
653
  },
654
  "EVENT":{
655
+ "p":0.0,
656
+ "r":0.0,
657
+ "f":0.0
658
  },
659
  "LOC":{
660
  "p":0.0,
661
  "r":0.0,
662
  "f":0.0
663
+ },
664
+ "PRODUCT":{
665
+ "p":0.0,
666
+ "r":0.0,
667
+ "f":0.0
668
  }
669
+ },
670
+ "speed":2288.1546505454,
671
+ "pos_acc":0.9635655475,
672
+ "morph_acc":0.9141645713,
673
+ "morph_micro_p":0.9631816485,
674
+ "morph_micro_r":0.9623978571,
675
+ "morph_micro_f":0.9627895933,
676
+ "morph_per_feat":{
677
+ "Abbr":{
678
+ "p":0.9863013699,
679
+ "r":0.7741935484,
680
+ "f":0.8674698795
681
+ },
682
+ "Case":{
683
+ "p":0.9377600266,
684
+ "r":0.9394798266,
685
+ "f":0.9386191388
686
+ },
687
+ "Gender":{
688
+ "p":0.9425861208,
689
+ "r":0.9443147716,
690
+ "f":0.9434496544
691
+ },
692
+ "Number":{
693
+ "p":0.9801210026,
694
+ "r":0.9821016166,
695
+ "f":0.98111031
696
+ },
697
+ "Aspect":{
698
+ "p":0.9585858586,
699
+ "r":0.952811245,
700
+ "f":0.9556898288
701
+ },
702
+ "Mood":{
703
+ "p":0.9914255091,
704
+ "r":0.9946236559,
705
+ "f":0.9930220075
706
+ },
707
+ "Person":{
708
+ "p":0.9824175824,
709
+ "r":0.9824175824,
710
+ "f":0.9824175824
711
+ },
712
+ "Tense":{
713
+ "p":0.9830729167,
714
+ "r":0.9843546284,
715
+ "f":0.983713355
716
+ },
717
+ "VerbForm":{
718
+ "p":0.9838383838,
719
+ "r":0.9779116466,
720
+ "f":0.9808660624
721
+ },
722
+ "Voice":{
723
+ "p":0.9686868687,
724
+ "r":0.9628514056,
725
+ "f":0.9657603223
726
+ },
727
+ "Definite":{
728
+ "p":0.9920409323,
729
+ "r":0.9977129788,
730
+ "f":0.9948688712
731
+ },
732
+ "PronType":{
733
+ "p":0.9876768599,
734
+ "r":0.9908424908,
735
+ "f":0.9892571429
736
+ },
737
+ "Foreign":{
738
+ "p":0.7328767123,
739
+ "r":0.6645962733,
740
+ "f":0.6970684039
741
+ },
742
+ "NumType":{
743
+ "p":0.9893048128,
744
+ "r":0.9024390244,
745
+ "f":0.943877551
746
+ },
747
+ "Poss":{
748
+ "p":0.9204545455,
749
+ "r":0.9101123596,
750
+ "f":0.9152542373
751
+ },
752
+ "Degree":{
753
+ "p":0.8275862069,
754
+ "r":0.6315789474,
755
+ "f":0.7164179104
756
+ }
757
+ },
758
+ "tag_acc":0.9335897057,
759
+ "lemma_acc":0.5646107578
760
  },
761
  "sources":[
762
  {
763
+ "name":"UD Greek GDT v2.8",
764
  "url":"https://github.com/UniversalDependencies/UD_Greek-GDT",
765
  "license":"CC BY-NC-SA 3.0",
766
  "author":"Prokopidis, Prokopis"
morphologizer/cfg CHANGED
@@ -1,4 +1,5 @@
1
  {
 
2
  "labels_morph":{
3
  "Case=Nom|Definite=Def|Gender=Fem|Number=Sing|POS=DET|PronType=Art":"Case=Nom|Definite=Def|Gender=Fem|Number=Sing|PronType=Art",
4
  "Foreign=Yes|POS=X":"Foreign=Yes",
@@ -714,5 +715,6 @@
714
  "Case=Gen|Gender=Fem|NumType=Ord|Number=Plur|POS=NUM":93,
715
  "Case=Dat|Definite=Def|Gender=Fem|Number=Sing|POS=DET|PronType=Art":90,
716
  "Case=Gen|Degree=Cmp|Gender=Masc|Number=Sing|POS=ADJ":84
717
- }
 
718
  }
 
1
  {
2
+ "extend":false,
3
  "labels_morph":{
4
  "Case=Nom|Definite=Def|Gender=Fem|Number=Sing|POS=DET|PronType=Art":"Case=Nom|Definite=Def|Gender=Fem|Number=Sing|PronType=Art",
5
  "Foreign=Yes|POS=X":"Foreign=Yes",
 
715
  "Case=Gen|Gender=Fem|NumType=Ord|Number=Plur|POS=NUM":93,
716
  "Case=Dat|Definite=Def|Gender=Fem|Number=Sing|POS=DET|PronType=Art":90,
717
  "Case=Gen|Degree=Cmp|Gender=Masc|Number=Sing|POS=ADJ":84
718
+ },
719
+ "overwrite":true
720
  }
morphologizer/model CHANGED
Binary files a/morphologizer/model and b/morphologizer/model differ
 
ner/model CHANGED
Binary files a/ner/model and b/ner/model differ
 
parser/model CHANGED
Binary files a/parser/model and b/parser/model differ
 
senter/cfg CHANGED
@@ -1,3 +1,3 @@
1
  {
2
-
3
  }
 
1
  {
2
+ "overwrite":false
3
  }
senter/model CHANGED
Binary files a/senter/model and b/senter/model differ
 
tok2vec/model CHANGED
Binary files a/tok2vec/model and b/tok2vec/model differ
 
tokenizer CHANGED
The diff for this file is too large to render. See raw diff
 
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:a04cb52774aa8a466834f48e1a6c99394f3fdea9f11e1f8981882edbc044edc4
3
- size 25442553
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1e43c51edcca429c5d1125f81e66c9902e9c0c7322b4c585b8f9c4f410202f99
3
+ size 30279107
vocab/vectors.cfg ADDED
@@ -0,0 +1,3 @@
 
1
+ {
2
+ "mode":"default"
3
+ }