osanseviero committed on
Commit
d79daad
1 Parent(s): dc55b25

Update spaCy pipeline

LICENSES_SOURCES CHANGED
@@ -1,4 +1,4 @@
1
- # UD French Sequoia v2.5
2
 
3
  * Author: Candito, Marie; Seddah, Djamé; Perrier, Guy; Guillaume, Bruno
4
  * URL: https://github.com/UniversalDependencies/UD_French-Sequoia
 
1
+ # UD French Sequoia v2.8
2
 
3
  * Author: Candito, Marie; Seddah, Djamé; Perrier, Guy; Guillaume, Bruno
4
  * URL: https://github.com/UniversalDependencies/UD_French-Sequoia
README.md CHANGED
@@ -4,7 +4,7 @@ tags:
4
  - token-classification
5
  language:
6
  - fr
7
- license: lgpllr
8
  model-index:
9
  - name: fr_core_news_lg
10
  results:
@@ -14,47 +14,47 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.8431266739
18
  - name: NER Recall
19
  type: recall
20
- value: 0.8443416997
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.8437337493
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
  - name: POS Accuracy
29
  type: accuracy
30
- value: 0.947357577
31
  - task:
32
  name: SENTER
33
  type: token-classification
34
  metrics:
35
  - name: SENTER Precision
36
  type: precision
37
- value: 0.8521126761
38
  - name: SENTER Recall
39
  type: recall
40
- value: 0.9029253272
41
  - name: SENTER F Score
42
  type: f_score
43
- value: 0.8715586104
44
  - task:
45
  name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
  - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
- value: 0.897142196
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
  - name: Labeled Dependencies Accuracy
56
  type: accuracy
57
- value: 0.897142196
58
  ---
59
  ### Details: https://spacy.io/models/fr#fr_core_news_lg
60
 
@@ -63,12 +63,12 @@ French pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, s
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `fr_core_news_lg` |
66
- | **Version** | `3.1.0` |
67
- | **spaCy** | `>=3.1.0,<3.2.0` |
68
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
  | **Components** | `tok2vec`, `morphologizer`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 500000 keys, 500000 unique vectors (300 dimensions) |
71
- | **Sources** | [UD French Sequoia v2.5](https://github.com/UniversalDependencies/UD_French-Sequoia) (Candito, Marie; Seddah, Djamé; Perrier, Guy; Guillaume, Bruno)<br />[WikiNER](https://figshare.com/articles/Learning_multilingual_named_entity_recognition_from_Wikipedia/5462500) (Joel Nothman, Nicky Ringland, Will Radford, Tara Murphy, James R Curran)<br />[spaCy lookups data](https://github.com/explosion/spacy-lookups-data) (Explosion)<br />[Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)](https://spacy.io) (Explosion) |
72
  | **License** | `LGPL-LR` |
73
  | **Author** | [Explosion](https://explosion.ai) |
74
 
@@ -76,11 +76,11 @@ French pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, s
76
 
77
  <details>
78
 
79
- <summary>View label scheme (240 labels for 4 components)</summary>
80
 
81
  | Component | Labels |
82
  | --- | --- |
83
- | **`morphologizer`** | `POS=PROPN`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Dem`, `Gender=Fem\|Number=Sing\|POS=NOUN`, `Number=Plur\|POS=PRON\|Person=1`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=SCONJ`, `POS=ADP`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Ord\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=NOUN`, `POS=PUNCT`, `Gender=Masc\|Number=Sing\|POS=PROPN`, `Number=Plur\|POS=ADJ`, `Gender=Masc\|Number=Plur\|POS=NOUN`, `Definite=Ind\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Number=Sing\|POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `POS=ADV`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Definite=Def\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PROPN`, `Definite=Def\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Card\|POS=NUM`, `Definite=Def\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Masc\|Number=Plur\|POS=ADJ`, `POS=CCONJ`, `Gender=Fem\|Number=Plur\|POS=NOUN`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Fem\|Number=Plur\|POS=ADJ`, `POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `POS=PRON\|PronType=Rel`, `Number=Sing\|POS=DET\|Poss=Yes`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=ADP\|PronType=Art`, `Definite=Def\|Number=Plur\|POS=ADP\|PronType=Art`, `Definite=Ind\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=VERB\|VerbForm=Inf`, `Gender=Fem\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3`, `Number=Plur\|POS=DET`, 
`Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Dem`, `POS=ADV\|PronType=Int`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Ind\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Masc\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Number=Plur\|POS=DET\|Poss=Yes`, `POS=AUX\|VerbForm=Inf`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=ADV\|Polarity=Neg`, `Definite=Ind\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3`, `POS=PRON\|Person=3\|Reflex=Yes`, `Gender=Masc\|POS=NOUN`, `POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=PRON\|Person=3`, `Number=Plur\|POS=NOUN`, `NumType=Ord\|Number=Sing\|POS=ADJ`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=AUX\|Tense=Pres\|VerbForm=Part`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Sing\|POS=PRON\|Person=3`, `Number=Sing\|POS=NOUN`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Gender=Fem\|NumType=Ord\|Number=Sing\|POS=ADJ`, `Number=Plur\|POS=PROPN`, `Number=Sing\|POS=PROPN`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=PRON\|PronType=Dem`, `Gender=Masc\|Number=Sing\|POS=DET`, `Gender=Fem\|Number=Sing\|POS=DET\|Poss=Yes`, `Gender=Masc\|POS=PRON`, `POS=NOUN`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON`, `Gender=Masc\|NumType=Ord\|Number=Plur\|POS=ADJ`, 
`Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Number=Sing\|POS=PRON`, `Number=Sing\|POS=PRON\|PronType=Dem`, `Mood=Ind\|POS=VERB\|VerbForm=Fin`, `Number=Plur\|POS=DET\|PronType=Dem`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=PRON`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Dem`, `Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Rel`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|NumType=Ord\|Number=Sing\|POS=ADJ`, `POS=PRON`, `POS=NUM`, `Gender=Fem\|POS=NOUN`, `Gender=Fem\|Number=Plur\|POS=PRON`, `Number=Plur\|POS=PRON\|Person=3`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Sing\|POS=PRON\|Person=1`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=INTJ`, `Number=Plur\|POS=PRON\|Person=2`, `NumType=Card\|POS=PRON`, `Definite=Ind\|Gender=Fem\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `NumType=Card\|POS=NOUN`, `POS=PRON\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3`, `Gender=Fem\|Number=Sing\|POS=DET`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=DET`, `Mood=Sub\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Definite=Ind\|Gender=Masc\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Cnd\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, 
`Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Dem`, `Gender=Masc\|Number=Plur\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON\|PronType=Dem`, `Number=Sing\|POS=DET`, `Gender=Masc\|NumType=Card\|Number=Plur\|POS=NOUN`, `Gender=Fem\|Number=Plur\|POS=PRON\|PronType=Dem`, `Mood=Ind\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|POS=PRON`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Fem\|Number=Sing\|POS=PRON\|PronType=Rel`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=X`, `POS=SYM`, `Mood=Imp\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=DET\|PronType=Int`, `POS=DET`, `Gender=Masc\|Number=Plur\|POS=PRON`, `POS=PART`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|POS=VERB\|Person=3\|VerbForm=Fin`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Masc\|Number=Plur\|POS=DET`, `Gender=Fem\|Number=Plur\|POS=PRON\|PronType=Rel`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Masc\|Number=Plur\|POS=PRON\|PronType=Rel`, `POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Fem\|NumType=Ord\|Number=Plur\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Fut\|VerbForm=Fin`, `Mood=Imp\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=2\|Reflex=Yes`, 
`Mood=Cnd\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=1\|Reflex=Yes`, `Gender=Masc\|NumType=Card\|Number=Sing\|POS=NOUN`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Number=Sing\|POS=PRON\|Person=1\|Reflex=Yes`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Imp\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Imp\|VerbForm=Fin`, `Gender=Fem\|POS=ADV`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=2\|Tense=Imp\|VerbForm=Fin`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Fem\|Number=Plur\|POS=PROPN`, `Gender=Masc\|NumType=Card\|POS=NUM` |
84
  | **`parser`** | `ROOT`, `acl`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux:pass`, `aux:tense`, `case`, `cc`, `ccomp`, `conj`, `cop`, `dep`, `det`, `expl:comp`, `expl:pass`, `expl:subj`, `fixed`, `flat:foreign`, `flat:name`, `iobj`, `mark`, `nmod`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl:agent`, `obl:arg`, `obl:mod`, `parataxis`, `punct`, `vocative`, `xcomp` |
85
  | **`senter`** | `I`, `S` |
86
  | **`ner`** | `LOC`, `MISC`, `ORG`, `PER` |
@@ -92,15 +92,21 @@ French pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, s
92
  | Type | Score |
93
  | --- | --- |
94
  | `TOKEN_ACC` | 99.90 |
95
- | `TAG_ACC` | 94.74 |
96
- | `POS_ACC` | 97.61 |
97
- | `MORPH_ACC` | 96.86 |
98
- | `LEMMA_ACC` | 90.91 |
99
- | `DEP_UAS` | 89.71 |
100
- | `DEP_LAS` | 85.95 |
101
- | `SENTS_P` | 85.21 |
102
- | `SENTS_R` | 90.29 |
103
- | `SENTS_F` | 87.16 |
104
- | `ENTS_P` | 84.31 |
105
- | `ENTS_R` | 84.43 |
106
- | `ENTS_F` | 84.37 |
 
 
 
 
 
 
 
4
  - token-classification
5
  language:
6
  - fr
7
+ license: lgpl-lr
8
  model-index:
9
  - name: fr_core_news_lg
10
  results:
 
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.843850032
18
  - name: NER Recall
19
  type: recall
20
+ value: 0.8442216084
21
  - name: NER F Score
22
  type: f_score
23
+ value: 0.8440357793
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
  - name: POS Accuracy
29
  type: accuracy
30
+ value: 0.947075496
31
  - task:
32
  name: SENTER
33
  type: token-classification
34
  metrics:
35
  - name: SENTER Precision
36
  type: precision
37
+ value: 0.8684210526
38
  - name: SENTER Recall
39
  type: recall
40
+ value: 0.9002309469
41
  - name: SENTER F Score
42
  type: f_score
43
+ value: 0.8746987952
44
  - task:
45
  name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
  - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
+ value: 0.9011981247
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
  - name: Labeled Dependencies Accuracy
56
  type: accuracy
57
+ value: 0.9011981247
58
  ---
59
  ### Details: https://spacy.io/models/fr#fr_core_news_lg
60
 
 
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `fr_core_news_lg` |
66
+ | **Version** | `3.2.0` |
67
+ | **spaCy** | `>=3.2.0,<3.3.0` |
68
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
  | **Components** | `tok2vec`, `morphologizer`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 500000 keys, 500000 unique vectors (300 dimensions) |
71
+ | **Sources** | [UD French Sequoia v2.8](https://github.com/UniversalDependencies/UD_French-Sequoia) (Candito, Marie; Seddah, Djamé; Perrier, Guy; Guillaume, Bruno)<br />[WikiNER](https://figshare.com/articles/Learning_multilingual_named_entity_recognition_from_Wikipedia/5462500) (Joel Nothman, Nicky Ringland, Will Radford, Tara Murphy, James R Curran)<br />[spaCy lookups data](https://github.com/explosion/spacy-lookups-data) (Explosion)<br />[Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)](https://spacy.io) (Explosion) |
72
  | **License** | `LGPL-LR` |
73
  | **Author** | [Explosion](https://explosion.ai) |
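
The `spaCy` row in the table above pins this pipeline release to `>=3.2.0,<3.3.0`. As a rough sketch of what that range means in practice, the stdlib-only helper below checks whether a given spaCy version string falls inside it. The names `parse_version` and `in_supported_range` are hypothetical and not part of spaCy; the sketch assumes plain `major.minor.patch` strings and does not handle pre-release suffixes.

```python
def parse_version(version: str) -> tuple:
    """Turn 'major.minor.patch' into a tuple of ints (no pre-release support)."""
    return tuple(int(part) for part in version.split("."))


def in_supported_range(version: str, lower: str = "3.2.0", upper: str = "3.3.0") -> bool:
    """Return True if lower <= version < upper, mirroring '>=3.2.0,<3.3.0'."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# Check a few illustrative version strings against the pinned range.
for candidate in ("3.1.4", "3.2.0", "3.2.6", "3.3.0"):
    print(candidate, in_supported_range(candidate))
```

For anything beyond a quick check, a dedicated version-specifier library such as `packaging` is likely the safer choice, since it also covers pre-release and post-release versions.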
74
 
 
76
 
77
  <details>
78
 
79
+ <summary>View label scheme (238 labels for 4 components)</summary>
80
 
81
  | Component | Labels |
82
  | --- | --- |
83
+ | **`morphologizer`** | `POS=PROPN`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Dem`, `Gender=Fem\|Number=Sing\|POS=NOUN`, `Number=Plur\|POS=PRON\|Person=1`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=SCONJ`, `POS=ADP`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Ord\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=NOUN`, `POS=PUNCT`, `Gender=Masc\|Number=Sing\|POS=PROPN`, `Number=Plur\|POS=ADJ`, `Gender=Masc\|Number=Plur\|POS=NOUN`, `Definite=Ind\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Number=Sing\|POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `POS=ADV`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Definite=Def\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PROPN`, `Definite=Def\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Card\|POS=NUM`, `Definite=Def\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Masc\|Number=Plur\|POS=ADJ`, `POS=CCONJ`, `Gender=Fem\|Number=Plur\|POS=NOUN`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Fem\|Number=Plur\|POS=ADJ`, `POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `POS=PRON\|PronType=Rel`, `Number=Sing\|POS=DET\|Poss=Yes`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=ADP\|PronType=Art`, `Definite=Def\|Number=Plur\|POS=ADP\|PronType=Art`, `Definite=Ind\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=VERB\|VerbForm=Inf`, `Gender=Fem\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3`, `Number=Plur\|POS=DET`, 
`Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Dem`, `POS=ADV\|PronType=Int`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Ind\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Masc\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Number=Plur\|POS=DET\|Poss=Yes`, `POS=AUX\|VerbForm=Inf`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=ADV\|Polarity=Neg`, `Definite=Ind\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3`, `POS=PRON\|Person=3\|Reflex=Yes`, `Gender=Masc\|POS=NOUN`, `POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=PRON\|Person=3`, `Number=Plur\|POS=NOUN`, `NumType=Ord\|Number=Sing\|POS=ADJ`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=AUX\|Tense=Pres\|VerbForm=Part`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Sing\|POS=PRON\|Person=3`, `Number=Sing\|POS=NOUN`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Gender=Fem\|NumType=Ord\|Number=Sing\|POS=ADJ`, `Number=Plur\|POS=PROPN`, `Number=Sing\|POS=PROPN`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=PRON\|PronType=Dem`, `Gender=Masc\|Number=Sing\|POS=DET`, `Gender=Fem\|Number=Sing\|POS=DET\|Poss=Yes`, `Gender=Masc\|POS=PRON`, `POS=NOUN`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON`, `Gender=Masc\|NumType=Ord\|Number=Plur\|POS=ADJ`, 
`Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Number=Sing\|POS=PRON`, `Number=Sing\|POS=PRON\|PronType=Dem`, `Mood=Ind\|POS=VERB\|VerbForm=Fin`, `Number=Plur\|POS=DET\|PronType=Dem`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=PRON`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Dem`, `Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Rel`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|NumType=Ord\|Number=Sing\|POS=ADJ`, `POS=PRON`, `POS=NUM`, `Gender=Fem\|POS=NOUN`, `Gender=Fem\|Number=Plur\|POS=PRON`, `Number=Plur\|POS=PRON\|Person=3`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Sing\|POS=PRON\|Person=1`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=INTJ`, `Number=Plur\|POS=PRON\|Person=2`, `NumType=Card\|POS=PRON`, `Definite=Ind\|Gender=Fem\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `NumType=Card\|POS=NOUN`, `POS=PRON\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3`, `Gender=Fem\|Number=Sing\|POS=DET`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=DET`, `Mood=Sub\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Definite=Ind\|Gender=Masc\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Cnd\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, 
`Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Dem`, `Gender=Masc\|Number=Plur\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON\|PronType=Dem`, `Number=Sing\|POS=DET`, `Gender=Masc\|NumType=Card\|Number=Plur\|POS=NOUN`, `Gender=Fem\|Number=Plur\|POS=PRON\|PronType=Dem`, `Mood=Ind\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|POS=PRON`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Fem\|Number=Sing\|POS=PRON\|PronType=Rel`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=X`, `POS=SYM`, `Mood=Imp\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=DET\|PronType=Int`, `POS=DET`, `Gender=Masc\|Number=Plur\|POS=PRON`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|POS=VERB\|Person=3\|VerbForm=Fin`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Masc\|Number=Plur\|POS=DET`, `Gender=Fem\|Number=Plur\|POS=PRON\|PronType=Rel`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Masc\|Number=Plur\|POS=PRON\|PronType=Rel`, `POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Fem\|NumType=Ord\|Number=Plur\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Fut\|VerbForm=Fin`, `Mood=Imp\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=2\|Reflex=Yes`, 
`Mood=Cnd\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=1\|Reflex=Yes`, `Gender=Masc\|NumType=Card\|Number=Sing\|POS=NOUN`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Number=Sing\|POS=PRON\|Person=1\|Reflex=Yes`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Imp\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=2\|Tense=Imp\|VerbForm=Fin`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Fem\|Number=Plur\|POS=PROPN`, `Gender=Masc\|NumType=Card\|POS=NUM` |
84
  | **`parser`** | `ROOT`, `acl`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux:pass`, `aux:tense`, `case`, `cc`, `ccomp`, `conj`, `cop`, `dep`, `det`, `expl:comp`, `expl:pass`, `expl:subj`, `fixed`, `flat:foreign`, `flat:name`, `iobj`, `mark`, `nmod`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl:agent`, `obl:arg`, `obl:mod`, `parataxis`, `punct`, `vocative`, `xcomp` |
85
  | **`senter`** | `I`, `S` |
86
  | **`ner`** | `LOC`, `MISC`, `ORG`, `PER` |
 
92
  | Type | Score |
93
  | --- | --- |
94
  | `TOKEN_ACC` | 99.90 |
95
+ | `TOKEN_P` | 98.44 |
96
+ | `TOKEN_R` | 98.96 |
97
+ | `TOKEN_F` | 98.70 |
98
+ | `POS_ACC` | 97.57 |
99
+ | `MORPH_ACC` | 96.96 |
100
+ | `MORPH_MICRO_P` | 98.91 |
101
+ | `MORPH_MICRO_R` | 98.19 |
102
+ | `MORPH_MICRO_F` | 98.55 |
103
+ | `SENTS_P` | 86.84 |
104
+ | `SENTS_R` | 90.02 |
105
+ | `SENTS_F` | 87.47 |
106
+ | `DEP_UAS` | 90.12 |
107
+ | `DEP_LAS` | 86.32 |
108
+ | `TAG_ACC` | 94.71 |
109
+ | `LEMMA_ACC` | 90.95 |
110
+ | `ENTS_P` | 84.39 |
111
+ | `ENTS_R` | 84.42 |
112
+ | `ENTS_F` | 84.40 |
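
The NER figures in this commit appear consistent with the F score being the harmonic mean of precision and recall, and with the table showing the raw values as percentages rounded to two decimals. The sketch below reproduces the updated `ENTS_P`, `ENTS_R`, and `ENTS_F` entries from the raw values in the model-index metadata. The helper names `f_score` and `as_percent` are illustrative assumptions, not spaCy functions.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def as_percent(value: float) -> str:
    """Format a 0..1 score the way the accuracy table shows it."""
    return f"{value * 100:.2f}"


# Raw NER precision and recall from the updated model-index metadata.
ner_p, ner_r = 0.843850032, 0.8442216084
ner_f = f_score(ner_p, ner_r)

print(as_percent(ner_p), as_percent(ner_r), as_percent(ner_f))
```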
accuracy.json CHANGED
@@ -1,60 +1,58 @@
1
  {
2
  "token_acc": 0.9989751998,
3
- "tag_acc": 0.947357577,
4
- "pos_acc": 0.9760997219,
5
- "morph_acc": 0.968569662,
6
- "lemma_acc": 0.9091282684,
7
- "dep_uas": 0.897142196,
8
- "dep_las": 0.8595222948,
9
- "sents_p": 0.8521126761,
10
- "sents_r": 0.9029253272,
11
- "sents_f": 0.8715586104,
12
- "speed": 5335.1974335443,
13
  "morph_per_feat": {
14
  "Definite": {
15
- "p": 0.9904901244,
16
- "r": 0.9876002918,
17
- "f": 0.9890430972
18
  },
19
  "Number": {
20
- "p": 0.9934725849,
21
- "r": 0.9872127502,
22
- "f": 0.9903327756
23
  },
24
  "PronType": {
25
- "p": 0.9967845659,
26
- "r": 0.9910485934,
27
  "f": 0.9939083039
28
  },
29
  "Gender": {
30
- "p": 0.9865040228,
31
- "r": 0.9821705426,
32
- "f": 0.9843325133
33
  },
34
  "Mood": {
35
- "p": 0.9711711712,
36
- "r": 0.9573712256,
37
- "f": 0.9642218247
38
  },
39
  "Person": {
40
- "p": 0.9872935197,
41
- "r": 0.9761306533,
42
- "f": 0.9816803538
43
  },
44
  "Tense": {
45
- "p": 0.9702868852,
46
- "r": 0.9673135853,
47
- "f": 0.968797954
48
  },
49
  "VerbForm": {
50
- "p": 0.9833610649,
51
- "r": 0.9784768212,
52
- "f": 0.9809128631
53
  },
54
  "NumType": {
55
- "p": 1.0,
56
- "r": 0.9620689655,
57
- "f": 0.9806678383
58
  },
59
  "Reflex": {
60
  "p": 1.0,
@@ -62,9 +60,9 @@
62
  "f": 1.0
63
  },
64
  "Voice": {
65
- "p": 0.9380530973,
66
  "r": 0.9464285714,
67
- "f": 0.9422222222
68
  },
69
  "Poss": {
70
  "p": 1.0,
@@ -77,171 +75,176 @@
77
  "f": 0.9940828402
78
  }
79
  },
 
 
 
 
 
80
  "dep_las_per_type": {
81
  "det": {
82
- "p": 0.9813614263,
83
- "r": 0.9774011299,
84
- "f": 0.9793772746
85
  },
86
  "nsubj": {
87
- "p": 0.8704156479,
88
- "r": 0.8578313253,
89
- "f": 0.8640776699
90
  },
91
  "aux:tense": {
92
- "p": 0.9365079365,
93
- "r": 0.944,
94
- "f": 0.9402390438
95
  },
96
  "root": {
97
- "p": 0.8558139535,
98
- "r": 0.8932038835,
99
- "f": 0.8741092637
100
  },
101
  "obj": {
102
- "p": 0.8504398827,
103
- "r": 0.8605341246,
104
- "f": 0.8554572271
105
  },
106
  "cc": {
107
- "p": 0.8801843318,
108
- "r": 0.8801843318,
109
- "f": 0.8801843318
110
  },
111
  "case": {
112
- "p": 0.9669142471,
113
- "r": 0.9754768392,
114
- "f": 0.9711766701
115
  },
116
  "obl:mod": {
117
- "p": 0.678807947,
118
- "r": 0.6101190476,
119
- "f": 0.6426332288
120
  },
121
  "nmod": {
122
- "p": 0.8216007715,
123
- "r": 0.8502994012,
124
- "f": 0.8357037764
125
  },
126
  "conj": {
127
- "p": 0.5275590551,
128
- "r": 0.5275590551,
129
- "f": 0.5275590551
130
  },
131
  "nummod": {
132
- "p": 0.8950617284,
133
- "r": 0.8630952381,
134
- "f": 0.8787878788
135
  },
136
  "amod": {
137
- "p": 0.9237918216,
138
- "r": 0.9069343066,
139
- "f": 0.9152854512
140
  },
141
  "acl": {
142
- "p": 0.702247191,
143
- "r": 0.7225433526,
144
- "f": 0.7122507123
145
  },
146
  "mark": {
147
- "p": 0.8711111111,
148
- "r": 0.8634361233,
149
- "f": 0.8672566372
150
  },
151
  "xcomp": {
152
- "p": 0.8698630137,
153
- "r": 0.8410596026,
154
- "f": 0.8552188552
155
  },
156
  "flat:name": {
157
- "p": 0.8888888889,
158
- "r": 0.9142857143,
159
- "f": 0.9014084507
160
  },
161
  "cop": {
162
- "p": 0.8602150538,
163
  "r": 0.8888888889,
164
- "f": 0.8743169399
165
  },
166
  "advmod": {
167
- "p": 0.8935483871,
168
- "r": 0.868338558,
169
- "f": 0.8807631161
170
  },
171
  "obl:arg": {
172
- "p": 0.7104072398,
173
- "r": 0.7136363636,
174
- "f": 0.7120181406
175
  },
176
  "appos": {
177
- "p": 0.5857142857,
178
- "r": 0.4939759036,
179
- "f": 0.5359477124
180
  },
181
  "nsubj:pass": {
182
- "p": 0.8846153846,
183
- "r": 0.8117647059,
184
- "f": 0.8466257669
185
  },
186
  "aux:pass": {
187
- "p": 0.954954955,
188
- "r": 0.9464285714,
189
- "f": 0.9506726457
190
  },
191
  "acl:relcl": {
192
- "p": 0.6987951807,
193
- "r": 0.6744186047,
194
- "f": 0.6863905325
195
  },
196
  "advcl": {
197
- "p": 0.5256410256,
198
- "r": 0.5256410256,
199
- "f": 0.5256410256
200
  },
201
  "fixed": {
202
- "p": 0.8539325843,
203
- "r": 0.7524752475,
204
- "f": 0.8
205
  },
206
  "dep": {
207
- "p": 0.3230769231,
208
- "r": 0.6774193548,
209
- "f": 0.4375
210
  },
211
  "expl:subj": {
212
- "p": 0.8,
213
- "r": 0.75,
214
- "f": 0.7741935484
215
  },
216
  "expl:comp": {
217
- "p": 0.6428571429,
218
- "r": 0.9,
219
- "f": 0.75
220
  },
221
  "expl:pass": {
222
- "p": 0.3333333333,
223
- "r": 0.1428571429,
224
- "f": 0.2
225
- },
226
- "obl:agent": {
227
- "p": 0.8085106383,
228
- "r": 0.9047619048,
229
- "f": 0.8539325843
230
  },
231
  "ccomp": {
232
- "p": 0.7547169811,
233
- "r": 0.7843137255,
234
- "f": 0.7692307692
 
 
 
 
 
235
  },
236
  "parataxis": {
237
- "p": 0.4333333333,
238
- "r": 0.4642857143,
239
- "f": 0.4482758621
240
  },
241
  "iobj": {
242
- "p": 0.75,
243
- "r": 0.48,
244
- "f": 0.5853658537
245
  },
246
  "nsubj:caus": {
247
  "p": 0.0,
@@ -264,9 +267,9 @@
264
  "f": 0.0
265
  },
266
  "vocative": {
267
- "p": 1.0,
268
  "r": 0.625,
269
- "f": 0.7692307692
270
  },
271
  "dislocated": {
272
  "p": 0.0,
@@ -274,9 +277,9 @@
274
  "f": 0.0
275
  },
276
  "flat:foreign": {
277
- "p": 0.6666666667,
278
- "r": 0.2857142857,
279
- "f": 0.4
280
  },
281
  "orphan": {
282
  "p": 0.0,
@@ -294,29 +297,32 @@
294
  "f": 0.0
295
  }
296
  },
297
- "ents_p": 0.8431266739,
298
- "ents_r": 0.8443416997,
299
- "ents_f": 0.8437337493,
 
 
300
  "ents_per_type": {
301
  "PER": {
302
- "p": 0.907275021,
303
- "r": 0.9274915813,
304
- "f": 0.9172719221
305
  },
306
  "LOC": {
307
- "p": 0.8427331441,
308
- "r": 0.8569396804,
309
- "f": 0.8497770403
310
  },
311
  "ORG": {
312
- "p": 0.7912621359,
313
- "r": 0.7776717557,
314
- "f": 0.7844080847
315
  },
316
  "MISC": {
317
- "p": 0.7418741081,
318
- "r": 0.6821694125,
319
- "f": 0.7107701656
320
  }
321
- }
 
322
  }
 
1
  {
2
  "token_acc": 0.9989751998,
3
+ "token_p": 0.9844389844,
4
+ "token_r": 0.9896058454,
5
+ "token_f": 0.9870156531,
6
+ "pos_acc": 0.9757279052,
7
+ "morph_acc": 0.9695876289,
8
+ "morph_micro_p": 0.9891371057,
9
+ "morph_micro_r": 0.9819056903,
10
+ "morph_micro_f": 0.9855081326,
 
 
11
  "morph_per_feat": {
12
  "Definite": {
13
+ "p": 0.9890510949,
14
+ "r": 0.9890510949,
15
+ "f": 0.9890510949
16
  },
17
  "Number": {
18
+ "p": 0.9946286349,
19
+ "r": 0.9885861561,
20
+ "f": 0.9915981904
21
  },
22
  "PronType": {
23
+ "p": 0.9961439589,
24
+ "r": 0.9916826615,
25
  "f": 0.9939083039
26
  },
27
  "Gender": {
28
+ "p": 0.9865979381,
29
+ "r": 0.9782775364,
30
+ "f": 0.9824201206
31
  },
32
  "Mood": {
33
+ "p": 0.9747292419,
34
+ "r": 0.9591474245,
35
+ "f": 0.9668755595
36
  },
37
  "Person": {
38
+ "p": 0.9923273657,
39
+ "r": 0.9761006289,
40
+ "f": 0.9841471148
41
  },
42
  "Tense": {
43
+ "p": 0.975308642,
44
+ "r": 0.9683350358,
45
+ "f": 0.9718093285
46
  },
47
  "VerbForm": {
48
+ "p": 0.9841534612,
49
+ "r": 0.9768211921,
50
+ "f": 0.9804736186
51
  },
52
  "NumType": {
53
+ "p": 0.9858657244,
54
+ "r": 0.95221843,
55
+ "f": 0.96875
56
  },
57
  "Reflex": {
58
  "p": 1.0,
 
60
  "f": 1.0
61
  },
62
  "Voice": {
63
+ "p": 0.9298245614,
64
  "r": 0.9464285714,
65
+ "f": 0.9380530973
66
  },
67
  "Poss": {
68
  "p": 1.0,
 
75
  "f": 0.9940828402
76
  }
77
  },
78
+ "sents_p": 0.8684210526,
79
+ "sents_r": 0.9002309469,
80
+ "sents_f": 0.8746987952,
81
+ "dep_uas": 0.9011981247,
82
+ "dep_las": 0.863160331,
83
  "dep_las_per_type": {
84
  "det": {
85
+ "p": 0.9798549557,
86
+ "r": 0.9814366425,
87
+ "f": 0.9806451613
88
  },
89
  "nsubj": {
90
+ "p": 0.896039604,
91
+ "r": 0.8722891566,
92
+ "f": 0.884004884
93
  },
94
  "aux:tense": {
95
+ "p": 0.9754098361,
96
+ "r": 0.952,
97
+ "f": 0.963562753
98
  },
99
  "root": {
100
+ "p": 0.8708133971,
101
+ "r": 0.8834951456,
102
+ "f": 0.8771084337
103
  },
104
  "obj": {
105
+ "p": 0.865497076,
106
+ "r": 0.8783382789,
107
+ "f": 0.8718703976
108
  },
109
  "cc": {
110
+ "p": 0.8986175115,
111
+ "r": 0.8986175115,
112
+ "f": 0.8986175115
113
  },
114
  "case": {
115
+ "p": 0.9708078751,
116
+ "r": 0.9741144414,
117
+ "f": 0.9724583475
118
  },
119
  "obl:mod": {
120
+ "p": 0.6719242902,
121
+ "r": 0.6358208955,
122
+ "f": 0.6533742331
123
  },
124
  "nmod": {
125
+ "p": 0.8217349857,
126
+ "r": 0.8611388611,
127
+ "f": 0.8409756098
128
  },
129
  "conj": {
130
+ "p": 0.5595238095,
131
+ "r": 0.5551181102,
132
+ "f": 0.557312253
133
  },
134
  "nummod": {
135
+ "p": 0.9299363057,
136
+ "r": 0.8639053254,
137
+ "f": 0.8957055215
138
  },
139
  "amod": {
140
+ "p": 0.9307116105,
141
+ "r": 0.9052823315,
142
+ "f": 0.917820868
143
  },
144
  "acl": {
145
+ "p": 0.7045454545,
146
+ "r": 0.7167630058,
147
+ "f": 0.7106017192
148
  },
149
  "mark": {
150
+ "p": 0.8928571429,
151
+ "r": 0.8810572687,
152
+ "f": 0.8869179601
153
  },
154
  "xcomp": {
155
+ "p": 0.8835616438,
156
+ "r": 0.8543046358,
157
+ "f": 0.8686868687
158
  },
159
  "flat:name": {
160
+ "p": 0.9047619048,
161
+ "r": 0.9047619048,
162
+ "f": 0.9047619048
163
  },
164
  "cop": {
165
+ "p": 0.8988764045,
166
  "r": 0.8888888889,
167
+ "f": 0.8938547486
168
  },
169
  "advmod": {
170
+ "p": 0.858044164,
171
+ "r": 0.8526645768,
172
+ "f": 0.8553459119
173
  },
174
  "obl:arg": {
175
+ "p": 0.6832579186,
176
+ "r": 0.6863636364,
177
+ "f": 0.6848072562
178
  },
179
  "appos": {
180
+ "p": 0.5421686747,
181
+ "r": 0.5421686747,
182
+ "f": 0.5421686747
183
  },
184
  "nsubj:pass": {
185
+ "p": 0.880952381,
186
+ "r": 0.8705882353,
187
+ "f": 0.875739645
188
  },
189
  "aux:pass": {
190
+ "p": 0.9557522124,
191
+ "r": 0.9642857143,
192
+ "f": 0.96
193
  },
194
  "acl:relcl": {
195
+ "p": 0.686746988,
196
+ "r": 0.6627906977,
197
+ "f": 0.674556213
198
  },
199
  "advcl": {
200
+ "p": 0.4810126582,
201
+ "r": 0.4871794872,
202
+ "f": 0.4840764331
203
  },
204
  "fixed": {
205
+ "p": 0.7956989247,
206
+ "r": 0.74,
207
+ "f": 0.7668393782
208
  },
209
  "dep": {
210
+ "p": 0.253968254,
211
+ "r": 0.5517241379,
212
+ "f": 0.347826087
213
  },
214
  "expl:subj": {
215
+ "p": 0.8125,
216
+ "r": 0.8125,
217
+ "f": 0.8125
218
  },
219
  "expl:comp": {
220
+ "p": 0.7,
221
+ "r": 0.9333333333,
222
+ "f": 0.8
223
  },
224
  "expl:pass": {
225
+ "p": 0.4,
226
+ "r": 0.2857142857,
227
+ "f": 0.3333333333
 
 
 
 
 
228
  },
229
  "ccomp": {
230
+ "p": 0.7,
231
+ "r": 0.6862745098,
232
+ "f": 0.6930693069
233
+ },
234
+ "obl:agent": {
235
+ "p": 0.8888888889,
236
+ "r": 0.7619047619,
237
+ "f": 0.8205128205
238
  },
239
  "parataxis": {
240
+ "p": 0.6,
241
+ "r": 0.4285714286,
242
+ "f": 0.5
243
  },
244
  "iobj": {
245
+ "p": 0.8235294118,
246
+ "r": 0.56,
247
+ "f": 0.6666666667
248
  },
249
  "nsubj:caus": {
250
  "p": 0.0,
 
267
  "f": 0.0
268
  },
269
  "vocative": {
270
+ "p": 0.8333333333,
271
  "r": 0.625,
272
+ "f": 0.7142857143
273
  },
274
  "dislocated": {
275
  "p": 0.0,
 
277
  "f": 0.0
278
  },
279
  "flat:foreign": {
280
+ "p": 1.0,
281
+ "r": 0.1428571429,
282
+ "f": 0.25
283
  },
284
  "orphan": {
285
  "p": 0.0,
 
297
  "f": 0.0
298
  }
299
  },
300
+ "tag_acc": 0.947075496,
301
+ "lemma_acc": 0.9094739005,
302
+ "ents_p": 0.843850032,
303
+ "ents_r": 0.8442216084,
304
+ "ents_f": 0.8440357793,
305
  "ents_per_type": {
306
  "PER": {
307
+ "p": 0.9113682873,
308
+ "r": 0.9253421222,
309
+ "f": 0.9183020478
310
  },
311
  "LOC": {
312
+ "p": 0.8441938007,
313
+ "r": 0.8578599515,
314
+ "f": 0.8509720119
315
  },
316
  "ORG": {
317
+ "p": 0.7981488775,
318
+ "r": 0.7734732824,
319
+ "f": 0.7856173677
320
  },
321
  "MISC": {
322
+ "p": 0.7300527786,
323
+ "r": 0.6856684648,
324
+ "f": 0.7071648748
325
  }
326
+ },
327
+ "speed": 4058.7135091792
328
  }
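
The `p`, `r`, and `f` triples in the per-type blocks above appear to follow the usual F1 relation, f = 2pr / (p + r). The following is a minimal sketch that checks this for a few triples copied from the new scores. The helper name `f1` is illustrative and is not part of spaCy.

```python
import math


def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


# (p, r, f) triples copied from the new per-type scores in this diff.
REPORTED = {
    "expl:comp": (0.7, 0.9333333333, 0.8),
    "parataxis": (0.6, 0.4285714286, 0.5),
    "flat:foreign": (1.0, 0.1428571429, 0.25),
    "PER": (0.9113682873, 0.9253421222, 0.9183020478),
}

for label, (p, r, f) in REPORTED.items():
    # The file stores values rounded to 10 decimals, so compare with a tolerance.
    assert math.isclose(f1(p, r), f, abs_tol=1e-6), label
```

The zero guard matters for labels such as `nsubj:caus` and `orphan`, where both precision and recall are reported as 0.0.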
attribute_ruler/patterns CHANGED
Binary files a/attribute_ruler/patterns and b/attribute_ruler/patterns differ
 
config.cfg CHANGED
@@ -1,10 +1,8 @@
1
  [paths]
2
- train = "corpus/fr-dep-news/train.spacy"
3
- dev = "corpus/fr-dep-news/dev.spacy"
4
- vectors = "corpus/fr_vectors"
5
- raw = null
6
  init_tok2vec = null
7
- vocab_data = null
8
 
9
  [system]
10
  gpu_allocator = null
@@ -24,6 +22,7 @@ tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
24
 
25
  [components.attribute_ruler]
26
  factory = "attribute_ruler"
 
27
  validate = false
28
 
29
  [components.lemmatizer]
@@ -31,9 +30,13 @@ factory = "lemmatizer"
31
  mode = "rule"
32
  model = null
33
  overwrite = false
 
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
 
 
 
37
 
38
  [components.morphologizer.model]
39
  @architectures = "spacy.Tagger.v1"
@@ -48,6 +51,7 @@ upstream = "tok2vec"
48
  factory = "ner"
49
  incorrect_spans_key = null
50
  moves = null
 
51
  update_with_oracle_cut_size = 100
52
 
53
  [components.ner.model]
@@ -65,8 +69,8 @@ nO = null
65
  [components.ner.model.tok2vec.embed]
66
  @architectures = "spacy.MultiHashEmbed.v2"
67
  width = 96
68
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
69
- rows = [5000,2500,2500,2500]
70
  include_static_vectors = true
71
 
72
  [components.ner.model.tok2vec.encode]
@@ -81,6 +85,7 @@ factory = "parser"
81
  learn_tokens = false
82
  min_action_freq = 30
83
  moves = null
 
84
  update_with_oracle_cut_size = 100
85
 
86
  [components.parser.model]
@@ -99,6 +104,8 @@ upstream = "tok2vec"
99
 
100
  [components.senter]
101
  factory = "senter"
 
 
102
 
103
  [components.senter.model]
104
  @architectures = "spacy.Tagger.v1"
@@ -110,8 +117,8 @@ nO = null
110
  [components.senter.model.tok2vec.embed]
111
  @architectures = "spacy.MultiHashEmbed.v2"
112
  width = 16
113
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
114
- rows = [1000,500,500,500]
115
  include_static_vectors = true
116
 
117
  [components.senter.model.tok2vec.encode]
@@ -130,8 +137,8 @@ factory = "tok2vec"
130
  [components.tok2vec.model.embed]
131
  @architectures = "spacy.MultiHashEmbed.v2"
132
  width = ${components.tok2vec.model.encode:width}
133
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
134
- rows = [5000,2500,2500,2500]
135
  include_static_vectors = true
136
 
137
  [components.tok2vec.model.encode]
@@ -145,22 +152,19 @@ maxout_pieces = 3
145
 
146
  [corpora.dev]
147
  @readers = "spacy.Corpus.v1"
148
- limit = 0
149
- max_length = 0
150
- path = ${paths:dev}
151
  gold_preproc = false
 
 
152
  augmenter = null
153
 
154
  [corpora.train]
155
  @readers = "spacy.Corpus.v1"
156
- path = ${paths:train}
157
- max_length = 5000
158
  gold_preproc = false
 
159
  limit = 0
160
-
161
- [corpora.train.augmenter]
162
- @augmenters = "spacy.lower_case.v1"
163
- level = 0.1
164
 
165
  [training]
166
  train_corpus = "corpora.train"
@@ -191,9 +195,8 @@ compound = 1.001
191
  t = 0.0
192
 
193
  [training.logger]
194
- @loggers = "spacy.WandbLogger.v1"
195
- project_name = "spacy-v3.0.0a2"
196
- remove_config_values = []
197
 
198
  [training.optimizer]
199
  @optimizers = "Adam.v1"
@@ -216,16 +219,17 @@ dep_las_per_type = null
216
  sents_p = null
217
  sents_r = null
218
  sents_f = 0.02
219
- lemma_acc = 0.33
220
- ents_f = 0.33
221
  ents_p = 0.0
222
  ents_r = 0.0
223
  ents_per_type = null
 
224
 
225
  [pretraining]
226
 
227
  [initialize]
228
- vocab_data = ${paths.vocab_data}
229
  vectors = ${paths.vectors}
230
  init_tok2vec = ${paths.init_tok2vec}
231
  before_init = null
 
1
  [paths]
2
+ train = null
3
+ dev = null
4
+ vectors = null
 
5
  init_tok2vec = null
 
6
 
7
  [system]
8
  gpu_allocator = null
 
22
 
23
  [components.attribute_ruler]
24
  factory = "attribute_ruler"
25
+ scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"}
26
  validate = false
27
 
28
  [components.lemmatizer]
 
30
  mode = "rule"
31
  model = null
32
  overwrite = false
33
+ scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
37
+ extend = false
38
+ overwrite = true
39
+ scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
40
 
41
  [components.morphologizer.model]
42
  @architectures = "spacy.Tagger.v1"
 
51
  factory = "ner"
52
  incorrect_spans_key = null
53
  moves = null
54
+ scorer = {"@scorers":"spacy.ner_scorer.v1"}
55
  update_with_oracle_cut_size = 100
56
 
57
  [components.ner.model]
 
69
  [components.ner.model.tok2vec.embed]
70
  @architectures = "spacy.MultiHashEmbed.v2"
71
  width = 96
72
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
73
+ rows = [5000,2500,2500,2500,100]
74
  include_static_vectors = true
75
 
76
  [components.ner.model.tok2vec.encode]
 
85
  learn_tokens = false
86
  min_action_freq = 30
87
  moves = null
88
+ scorer = {"@scorers":"spacy.parser_scorer.v1"}
89
  update_with_oracle_cut_size = 100
90
 
91
  [components.parser.model]
 
104
 
105
  [components.senter]
106
  factory = "senter"
107
+ overwrite = false
108
+ scorer = {"@scorers":"spacy.senter_scorer.v1"}
109
 
110
  [components.senter.model]
111
  @architectures = "spacy.Tagger.v1"
 
117
  [components.senter.model.tok2vec.embed]
118
  @architectures = "spacy.MultiHashEmbed.v2"
119
  width = 16
120
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
121
+ rows = [1000,500,500,500,50]
122
  include_static_vectors = true
123
 
124
  [components.senter.model.tok2vec.encode]
 
137
  [components.tok2vec.model.embed]
138
  @architectures = "spacy.MultiHashEmbed.v2"
139
  width = ${components.tok2vec.model.encode:width}
140
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
141
+ rows = [5000,2500,2500,2500,100]
142
  include_static_vectors = true
143
 
144
  [components.tok2vec.model.encode]
 
152
 
153
  [corpora.dev]
154
  @readers = "spacy.Corpus.v1"
155
+ path = ${paths.dev}
 
 
156
  gold_preproc = false
157
+ max_length = 0
158
+ limit = 0
159
  augmenter = null
160
 
161
  [corpora.train]
162
  @readers = "spacy.Corpus.v1"
163
+ path = ${paths.train}
 
164
  gold_preproc = false
165
+ max_length = 0
166
  limit = 0
167
+ augmenter = null
 
 
 
168
 
169
  [training]
170
  train_corpus = "corpora.train"
 
195
  t = 0.0
196
 
197
  [training.logger]
198
+ @loggers = "spacy.ConsoleLogger.v1"
199
+ progress_bar = false
 
200
 
201
  [training.optimizer]
202
  @optimizers = "Adam.v1"
 
219
  sents_p = null
220
  sents_r = null
221
  sents_f = 0.02
222
+ lemma_acc = 0.5
223
+ ents_f = 0.16
224
  ents_p = 0.0
225
  ents_r = 0.0
226
  ents_per_type = null
227
+ speed = 0.0
228
 
229
  [pretraining]
230
 
231
  [initialize]
232
+ vocab_data = null
233
  vectors = ${paths.vectors}
234
  init_tok2vec = ${paths.init_tok2vec}
235
  before_init = null
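
The embed blocks in this config now list a fifth attribute, `SPACY`, and each `rows` list gains a matching fifth entry. `spacy.MultiHashEmbed.v2` expects one `rows` entry per attribute, so the two lists have to stay the same length. Below is a minimal sketch of that consistency check using only the standard library. The helper `rows_per_attr` and the `EMBED_BLOCKS` dictionary are illustrative names, and the values are copied by hand from the diff rather than read from `config.cfg`.

```python
import json

# attrs/rows values copied from the updated embed blocks in config.cfg.
EMBED_BLOCKS = {
    "components.ner.model.tok2vec.embed": (
        '["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]',
        "[5000,2500,2500,2500,100]",
    ),
    "components.senter.model.tok2vec.embed": (
        '["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]',
        "[1000,500,500,500,50]",
    ),
    "components.tok2vec.model.embed": (
        '["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]',
        "[5000,2500,2500,2500,100]",
    ),
}


def rows_per_attr(attrs_value: str, rows_value: str) -> dict:
    """Pair each attribute with its table size; raise if the lengths differ."""
    attrs = json.loads(attrs_value)
    rows = json.loads(rows_value)
    if len(attrs) != len(rows):
        raise ValueError(f"{len(attrs)} attrs but {len(rows)} rows")
    return dict(zip(attrs, rows))


tables = {name: rows_per_attr(a, r) for name, (a, r) in EMBED_BLOCKS.items()}
```

Pairing the new five-entry `attrs` list with one of the old four-entry `rows` lists would raise `ValueError`, which is the mismatch this kind of check is meant to catch.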
fr_core_news_lg-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:693306a36120acaea3bd67cfeb9abe060a2204dab01a9c6025ab7d73d6ef33ca
- size 572128883
+ oid sha256:ec6139f61d34ec0fe6d0574c3cb1453f902bbf08ea0a9574f0ee838941e3029b
+ size 572929066
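
The wheel itself is stored in Git LFS, so the diff only shows the pointer file: a spec version, a SHA-256 object id, and the size in bytes. The following is a minimal sketch of reading such a pointer. The function name `parse_lfs_pointer` is hypothetical, and the pointer text is copied from the new side of this diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:ec6139f61d34ec0fe6d0574c3cb1453f902bbf08ea0a9574f0ee838941e3029b
size 572929066
"""

info = parse_lfs_pointer(NEW_POINTER)
```

To verify a downloaded wheel against the pointer, one could hash the file with `hashlib.sha256` and compare the hex digest and byte count with `info["digest"]` and `info["size"]`.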
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"fr",
3
  "name":"core_news_lg",
4
- "version":"3.1.0",
5
  "description":"French pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"LGPL-LR",
10
- "spacy_version":">=3.1.0,<3.2.0",
11
- "spacy_git_version":"caba63b74",
12
  "vectors":{
13
  "width":300,
14
  "vectors":500000,
@@ -173,7 +173,6 @@
173
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int",
174
  "POS=DET",
175
  "Gender=Masc|Number=Plur|POS=PRON",
176
- "POS=PART",
177
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin",
178
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin",
179
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass",
@@ -213,7 +212,6 @@
213
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin",
214
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin",
215
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin",
216
- "Gender=Fem|POS=ADV",
217
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin",
218
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part",
219
  "Gender=Fem|Number=Plur|POS=PROPN",
@@ -296,61 +294,59 @@
296
  ],
297
  "performance":{
298
  "token_acc":0.9989751998,
299
- "tag_acc":0.947357577,
300
- "pos_acc":0.9760997219,
301
- "morph_acc":0.968569662,
302
- "lemma_acc":0.9091282684,
303
- "dep_uas":0.897142196,
304
- "dep_las":0.8595222948,
305
- "sents_p":0.8521126761,
306
- "sents_r":0.9029253272,
307
- "sents_f":0.8715586104,
308
- "speed":5335.1974335443,
309
  "morph_per_feat":{
310
  "Definite":{
311
- "p":0.9904901244,
312
- "r":0.9876002918,
313
- "f":0.9890430972
314
  },
315
  "Number":{
316
- "p":0.9934725849,
317
- "r":0.9872127502,
318
- "f":0.9903327756
319
  },
320
  "PronType":{
321
- "p":0.9967845659,
322
- "r":0.9910485934,
323
  "f":0.9939083039
324
  },
325
  "Gender":{
326
- "p":0.9865040228,
327
- "r":0.9821705426,
328
- "f":0.9843325133
329
  },
330
  "Mood":{
331
- "p":0.9711711712,
332
- "r":0.9573712256,
333
- "f":0.9642218247
334
  },
335
  "Person":{
336
- "p":0.9872935197,
337
- "r":0.9761306533,
338
- "f":0.9816803538
339
  },
340
  "Tense":{
341
- "p":0.9702868852,
342
- "r":0.9673135853,
343
- "f":0.968797954
344
  },
345
  "VerbForm":{
346
- "p":0.9833610649,
347
- "r":0.9784768212,
348
- "f":0.9809128631
349
  },
350
  "NumType":{
351
- "p":1.0,
352
- "r":0.9620689655,
353
- "f":0.9806678383
354
  },
355
  "Reflex":{
356
  "p":1.0,
@@ -358,9 +354,9 @@
358
  "f":1.0
359
  },
360
  "Voice":{
361
- "p":0.9380530973,
362
  "r":0.9464285714,
363
- "f":0.9422222222
364
  },
365
  "Poss":{
366
  "p":1.0,
@@ -373,171 +369,176 @@
373
  "f":0.9940828402
374
  }
375
  },
 
 
 
 
 
376
  "dep_las_per_type":{
377
  "det":{
378
- "p":0.9813614263,
379
- "r":0.9774011299,
380
- "f":0.9793772746
381
  },
382
  "nsubj":{
383
- "p":0.8704156479,
384
- "r":0.8578313253,
385
- "f":0.8640776699
386
  },
387
  "aux:tense":{
388
- "p":0.9365079365,
389
- "r":0.944,
390
- "f":0.9402390438
391
  },
392
  "root":{
393
- "p":0.8558139535,
394
- "r":0.8932038835,
395
- "f":0.8741092637
396
  },
397
  "obj":{
398
- "p":0.8504398827,
399
- "r":0.8605341246,
400
- "f":0.8554572271
401
  },
402
  "cc":{
403
- "p":0.8801843318,
404
- "r":0.8801843318,
405
- "f":0.8801843318
406
  },
407
  "case":{
408
- "p":0.9669142471,
409
- "r":0.9754768392,
410
- "f":0.9711766701
411
  },
412
  "obl:mod":{
413
- "p":0.678807947,
414
- "r":0.6101190476,
415
- "f":0.6426332288
416
  },
417
  "nmod":{
418
- "p":0.8216007715,
419
- "r":0.8502994012,
420
- "f":0.8357037764
421
  },
422
  "conj":{
423
- "p":0.5275590551,
424
- "r":0.5275590551,
425
- "f":0.5275590551
426
  },
427
  "nummod":{
428
- "p":0.8950617284,
429
- "r":0.8630952381,
430
- "f":0.8787878788
431
  },
432
  "amod":{
433
- "p":0.9237918216,
434
- "r":0.9069343066,
435
- "f":0.9152854512
436
  },
437
  "acl":{
438
- "p":0.702247191,
439
- "r":0.7225433526,
440
- "f":0.7122507123
441
  },
442
  "mark":{
443
- "p":0.8711111111,
444
- "r":0.8634361233,
445
- "f":0.8672566372
446
  },
447
  "xcomp":{
448
- "p":0.8698630137,
449
- "r":0.8410596026,
450
- "f":0.8552188552
451
  },
452
  "flat:name":{
453
- "p":0.8888888889,
454
- "r":0.9142857143,
455
- "f":0.9014084507
456
  },
457
  "cop":{
458
- "p":0.8602150538,
459
  "r":0.8888888889,
460
- "f":0.8743169399
461
  },
462
  "advmod":{
463
- "p":0.8935483871,
464
- "r":0.868338558,
465
- "f":0.8807631161
466
  },
467
  "obl:arg":{
468
- "p":0.7104072398,
469
- "r":0.7136363636,
470
- "f":0.7120181406
471
  },
472
  "appos":{
473
- "p":0.5857142857,
474
- "r":0.4939759036,
475
- "f":0.5359477124
476
  },
477
  "nsubj:pass":{
478
- "p":0.8846153846,
479
- "r":0.8117647059,
480
- "f":0.8466257669
481
  },
482
  "aux:pass":{
483
- "p":0.954954955,
484
- "r":0.9464285714,
485
- "f":0.9506726457
486
  },
487
  "acl:relcl":{
488
- "p":0.6987951807,
489
- "r":0.6744186047,
490
- "f":0.6863905325
491
  },
492
  "advcl":{
493
- "p":0.5256410256,
494
- "r":0.5256410256,
495
- "f":0.5256410256
496
  },
497
  "fixed":{
498
- "p":0.8539325843,
499
- "r":0.7524752475,
500
- "f":0.8
501
  },
502
  "dep":{
503
- "p":0.3230769231,
504
- "r":0.6774193548,
505
- "f":0.4375
506
  },
507
  "expl:subj":{
508
- "p":0.8,
509
- "r":0.75,
510
- "f":0.7741935484
511
  },
512
  "expl:comp":{
513
- "p":0.6428571429,
514
- "r":0.9,
515
- "f":0.75
516
  },
517
  "expl:pass":{
518
- "p":0.3333333333,
519
- "r":0.1428571429,
520
- "f":0.2
521
- },
522
- "obl:agent":{
523
- "p":0.8085106383,
524
- "r":0.9047619048,
525
- "f":0.8539325843
526
  },
527
  "ccomp":{
528
- "p":0.7547169811,
529
- "r":0.7843137255,
530
- "f":0.7692307692
 
 
 
 
 
531
  },
532
  "parataxis":{
533
- "p":0.4333333333,
534
- "r":0.4642857143,
535
- "f":0.4482758621
536
  },
537
  "iobj":{
538
- "p":0.75,
539
- "r":0.48,
540
- "f":0.5853658537
541
  },
542
  "nsubj:caus":{
543
  "p":0.0,
@@ -560,9 +561,9 @@
560
  "f":0.0
561
  },
562
  "vocative":{
563
- "p":1.0,
564
  "r":0.625,
565
- "f":0.7692307692
566
  },
567
  "dislocated":{
568
  "p":0.0,
@@ -570,9 +571,9 @@
570
  "f":0.0
571
  },
572
  "flat:foreign":{
573
- "p":0.6666666667,
574
- "r":0.2857142857,
575
- "f":0.4
576
  },
577
  "orphan":{
578
  "p":0.0,
@@ -590,35 +591,38 @@
590
  "f":0.0
591
  }
592
  },
593
- "ents_p":0.8431266739,
594
- "ents_r":0.8443416997,
595
- "ents_f":0.8437337493,
 
 
596
  "ents_per_type":{
597
  "PER":{
598
- "p":0.907275021,
599
- "r":0.9274915813,
600
- "f":0.9172719221
601
  },
602
  "LOC":{
603
- "p":0.8427331441,
604
- "r":0.8569396804,
605
- "f":0.8497770403
606
  },
607
  "ORG":{
608
- "p":0.7912621359,
609
- "r":0.7776717557,
610
- "f":0.7844080847
611
  },
612
  "MISC":{
613
- "p":0.7418741081,
614
- "r":0.6821694125,
615
- "f":0.7107701656
616
  }
617
- }
 
618
  },
619
  "sources":[
620
  {
621
- "name":"UD French Sequoia v2.5",
622
  "url":"https://github.com/UniversalDependencies/UD_French-Sequoia",
623
  "license":"LGPL-LR",
624
  "author":"Candito, Marie; Seddah, Djam\u00e9; Perrier, Guy; Guillaume, Bruno"
 
1
  {
2
  "lang":"fr",
3
  "name":"core_news_lg",
4
+ "version":"3.2.0",
5
  "description":"French pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"LGPL-LR",
10
+ "spacy_version":">=3.2.0,<3.3.0",
11
+ "spacy_git_version":"bb26550e2",
12
  "vectors":{
13
  "width":300,
14
  "vectors":500000,
 
173
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int",
174
  "POS=DET",
175
  "Gender=Masc|Number=Plur|POS=PRON",
 
176
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin",
177
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin",
178
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass",
 
212
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin",
213
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin",
214
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin",
 
215
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin",
216
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part",
217
  "Gender=Fem|Number=Plur|POS=PROPN",
 
294
  ],
295
  "performance":{
296
  "token_acc":0.9989751998,
297
+ "token_p":0.9844389844,
298
+ "token_r":0.9896058454,
299
+ "token_f":0.9870156531,
300
+ "pos_acc":0.9757279052,
301
+ "morph_acc":0.9695876289,
302
+ "morph_micro_p":0.9891371057,
303
+ "morph_micro_r":0.9819056903,
304
+ "morph_micro_f":0.9855081326,
 
 
305
  "morph_per_feat":{
306
  "Definite":{
307
+ "p":0.9890510949,
308
+ "r":0.9890510949,
309
+ "f":0.9890510949
310
  },
311
  "Number":{
312
+ "p":0.9946286349,
313
+ "r":0.9885861561,
314
+ "f":0.9915981904
315
  },
316
  "PronType":{
317
+ "p":0.9961439589,
318
+ "r":0.9916826615,
319
  "f":0.9939083039
320
  },
321
  "Gender":{
322
+ "p":0.9865979381,
323
+ "r":0.9782775364,
324
+ "f":0.9824201206
325
  },
326
  "Mood":{
327
+ "p":0.9747292419,
328
+ "r":0.9591474245,
329
+ "f":0.9668755595
330
  },
331
  "Person":{
332
+ "p":0.9923273657,
333
+ "r":0.9761006289,
334
+ "f":0.9841471148
335
  },
336
  "Tense":{
337
+ "p":0.975308642,
338
+ "r":0.9683350358,
339
+ "f":0.9718093285
340
  },
341
  "VerbForm":{
342
+ "p":0.9841534612,
343
+ "r":0.9768211921,
344
+ "f":0.9804736186
345
  },
346
  "NumType":{
347
+ "p":0.9858657244,
348
+ "r":0.95221843,
349
+ "f":0.96875
350
  },
351
  "Reflex":{
352
  "p":1.0,
 
354
  "f":1.0
355
  },
356
  "Voice":{
357
+ "p":0.9298245614,
358
  "r":0.9464285714,
359
+ "f":0.9380530973
360
  },
361
  "Poss":{
362
  "p":1.0,
 
369
  "f":0.9940828402
370
  }
371
  },
372
+ "sents_p":0.8684210526,
373
+ "sents_r":0.9002309469,
374
+ "sents_f":0.8746987952,
375
+ "dep_uas":0.9011981247,
376
+ "dep_las":0.863160331,
377
  "dep_las_per_type":{
378
  "det":{
379
+ "p":0.9798549557,
380
+ "r":0.9814366425,
381
+ "f":0.9806451613
382
  },
383
  "nsubj":{
384
+ "p":0.896039604,
385
+ "r":0.8722891566,
386
+ "f":0.884004884
387
  },
388
  "aux:tense":{
389
+ "p":0.9754098361,
390
+ "r":0.952,
391
+ "f":0.963562753
392
  },
393
  "root":{
394
+ "p":0.8708133971,
395
+ "r":0.8834951456,
396
+ "f":0.8771084337
397
  },
398
  "obj":{
399
+ "p":0.865497076,
400
+ "r":0.8783382789,
401
+ "f":0.8718703976
402
  },
403
  "cc":{
404
+ "p":0.8986175115,
405
+ "r":0.8986175115,
406
+ "f":0.8986175115
407
  },
408
  "case":{
409
+ "p":0.9708078751,
410
+ "r":0.9741144414,
411
+ "f":0.9724583475
412
  },
413
  "obl:mod":{
414
+ "p":0.6719242902,
415
+ "r":0.6358208955,
416
+ "f":0.6533742331
417
  },
418
  "nmod":{
419
+ "p":0.8217349857,
420
+ "r":0.8611388611,
421
+ "f":0.8409756098
422
  },
423
  "conj":{
424
+ "p":0.5595238095,
425
+ "r":0.5551181102,
426
+ "f":0.557312253
427
  },
428
  "nummod":{
429
+ "p":0.9299363057,
430
+ "r":0.8639053254,
431
+ "f":0.8957055215
432
  },
433
  "amod":{
434
+ "p":0.9307116105,
435
+ "r":0.9052823315,
436
+ "f":0.917820868
437
  },
438
  "acl":{
439
+ "p":0.7045454545,
440
+ "r":0.7167630058,
441
+ "f":0.7106017192
442
  },
443
  "mark":{
444
+ "p":0.8928571429,
445
+ "r":0.8810572687,
446
+ "f":0.8869179601
447
  },
448
  "xcomp":{
449
+ "p":0.8835616438,
450
+ "r":0.8543046358,
451
+ "f":0.8686868687
452
  },
453
  "flat:name":{
454
+ "p":0.9047619048,
455
+ "r":0.9047619048,
456
+ "f":0.9047619048
457
  },
458
  "cop":{
459
+ "p":0.8988764045,
460
  "r":0.8888888889,
461
+ "f":0.8938547486
462
  },
463
  "advmod":{
464
+ "p":0.858044164,
465
+ "r":0.8526645768,
466
+ "f":0.8553459119
467
  },
468
  "obl:arg":{
469
+ "p":0.6832579186,
470
+ "r":0.6863636364,
471
+ "f":0.6848072562
472
  },
473
  "appos":{
474
+ "p":0.5421686747,
475
+ "r":0.5421686747,
476
+ "f":0.5421686747
477
  },
478
  "nsubj:pass":{
479
+ "p":0.880952381,
480
+ "r":0.8705882353,
481
+ "f":0.875739645
482
  },
483
  "aux:pass":{
484
+ "p":0.9557522124,
485
+ "r":0.9642857143,
486
+ "f":0.96
487
  },
488
  "acl:relcl":{
489
+ "p":0.686746988,
490
+ "r":0.6627906977,
491
+ "f":0.674556213
492
  },
493
  "advcl":{
494
+ "p":0.4810126582,
495
+ "r":0.4871794872,
496
+ "f":0.4840764331
497
  },
498
  "fixed":{
499
+ "p":0.7956989247,
500
+ "r":0.74,
501
+ "f":0.7668393782
502
  },
503
  "dep":{
504
+ "p":0.253968254,
505
+ "r":0.5517241379,
506
+ "f":0.347826087
507
  },
508
  "expl:subj":{
509
+ "p":0.8125,
510
+ "r":0.8125,
511
+ "f":0.8125
512
  },
513
  "expl:comp":{
514
+ "p":0.7,
515
+ "r":0.9333333333,
516
+ "f":0.8
517
  },
518
  "expl:pass":{
519
+ "p":0.4,
520
+ "r":0.2857142857,
521
+ "f":0.3333333333
 
 
 
 
 
522
  },
523
  "ccomp":{
524
+ "p":0.7,
525
+ "r":0.6862745098,
526
+ "f":0.6930693069
527
+ },
528
+ "obl:agent":{
529
+ "p":0.8888888889,
530
+ "r":0.7619047619,
531
+ "f":0.8205128205
532
  },
533
  "parataxis":{
534
+ "p":0.6,
535
+ "r":0.4285714286,
536
+ "f":0.5
537
  },
538
  "iobj":{
539
+ "p":0.8235294118,
540
+ "r":0.56,
541
+ "f":0.6666666667
542
  },
543
  "nsubj:caus":{
544
  "p":0.0,
 
561
  "f":0.0
562
  },
563
  "vocative":{
564
+ "p":0.8333333333,
565
  "r":0.625,
566
+ "f":0.7142857143
567
  },
568
  "dislocated":{
569
  "p":0.0,
 
571
  "f":0.0
572
  },
573
  "flat:foreign":{
574
+ "p":1.0,
575
+ "r":0.1428571429,
576
+ "f":0.25
577
  },
578
  "orphan":{
579
  "p":0.0,
 
591
  "f":0.0
592
  }
593
  },
594
+ "tag_acc":0.947075496,
595
+ "lemma_acc":0.9094739005,
596
+ "ents_p":0.843850032,
597
+ "ents_r":0.8442216084,
598
+ "ents_f":0.8440357793,
599
  "ents_per_type":{
600
  "PER":{
601
+ "p":0.9113682873,
602
+ "r":0.9253421222,
603
+ "f":0.9183020478
604
  },
605
  "LOC":{
606
+ "p":0.8441938007,
607
+ "r":0.8578599515,
608
+ "f":0.8509720119
609
  },
610
  "ORG":{
611
+ "p":0.7981488775,
612
+ "r":0.7734732824,
613
+ "f":0.7856173677
614
  },
615
  "MISC":{
616
+ "p":0.7300527786,
617
+ "r":0.6856684648,
618
+ "f":0.7071648748
619
  }
620
+ },
621
+ "speed":4058.7135091792
622
  },
623
  "sources":[
624
  {
625
+ "name":"UD French Sequoia v2.8",
626
  "url":"https://github.com/UniversalDependencies/UD_French-Sequoia",
627
  "license":"LGPL-LR",
628
  "author":"Candito, Marie; Seddah, Djam\u00e9; Perrier, Guy; Guillaume, Bruno"
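
The `spacy_version` field moves from `>=3.1.0,<3.2.0` to `>=3.2.0,<3.3.0`, so the updated package targets the spaCy 3.2.x series. Below is a rough sketch of how such a constraint can be evaluated for plain `x.y.z` versions. The helpers `parse_version` and `satisfies` are illustrative only; a real installer would more likely rely on `packaging.specifiers.SpecifierSet`, which also handles pre-releases and other specifier forms.

```python
import operator

# Two-character operators come first so ">=" is not mistaken for ">".
_OPS = [
    (">=", operator.ge),
    ("<=", operator.le),
    ("==", operator.eq),
    (">", operator.gt),
    ("<", operator.lt),
]


def parse_version(version: str) -> tuple:
    """Turn '3.2.0' into (3, 2, 0); only plain numeric versions are handled."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, spec: str) -> bool:
    """Return True if `version` meets every comma-separated clause in `spec`."""
    current = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in _OPS:
            if clause.startswith(symbol):
                if not compare(current, parse_version(clause[len(symbol):])):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


NEW_SPEC = ">=3.2.0,<3.3.0"  # copied from the updated meta.json
```

With this sketch, `satisfies("3.2.4", NEW_SPEC)` holds, while `3.1.0` and `3.3.0` both fall outside the range.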
morphologizer/cfg CHANGED
@@ -1,4 +1,5 @@
1
  {
 
2
  "labels_morph":{
3
  "POS=PROPN":"",
4
  "Gender=Fem|Number=Sing|POS=DET|PronType=Dem":"Gender=Fem|Number=Sing|PronType=Dem",
@@ -153,7 +154,6 @@
153
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int":"Gender=Fem|Number=Plur|PronType=Int",
154
  "POS=DET":"",
155
  "Gender=Masc|Number=Plur|POS=PRON":"Gender=Masc|Number=Plur",
156
- "POS=PART":"",
157
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
158
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin":"Mood=Ind|Person=3|VerbForm=Fin",
159
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":"Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
@@ -193,7 +193,6 @@
193
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":"Mood=Imp|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
194
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
195
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=2|Tense=Imp|VerbForm=Fin",
196
- "Gender=Fem|POS=ADV":"Gender=Fem",
197
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=2|Tense=Imp|VerbForm=Fin",
198
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":"Number=Plur|Tense=Past|VerbForm=Part",
199
  "Gender=Fem|Number=Plur|POS=PROPN":"Gender=Fem|Number=Plur",
@@ -353,7 +352,6 @@
353
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int":90,
354
  "POS=DET":90,
355
  "Gender=Masc|Number=Plur|POS=PRON":95,
356
- "POS=PART":94,
357
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":87,
358
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin":100,
359
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":100,
@@ -393,10 +391,10 @@
393
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":100,
394
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin":87,
395
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin":100,
396
- "Gender=Fem|POS=ADV":86,
397
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin":87,
398
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":100,
399
  "Gender=Fem|Number=Plur|POS=PROPN":96,
400
  "Gender=Masc|NumType=Card|POS=NUM":93
401
- }
 
402
  }
 
1
  {
2
+ "extend":false,
3
  "labels_morph":{
4
  "POS=PROPN":"",
5
  "Gender=Fem|Number=Sing|POS=DET|PronType=Dem":"Gender=Fem|Number=Sing|PronType=Dem",
 
154
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int":"Gender=Fem|Number=Plur|PronType=Int",
155
  "POS=DET":"",
156
  "Gender=Masc|Number=Plur|POS=PRON":"Gender=Masc|Number=Plur",
 
157
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
158
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin":"Mood=Ind|Person=3|VerbForm=Fin",
159
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":"Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
 
193
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":"Mood=Imp|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
194
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
195
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=2|Tense=Imp|VerbForm=Fin",
 
196
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=2|Tense=Imp|VerbForm=Fin",
197
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":"Number=Plur|Tense=Past|VerbForm=Part",
198
  "Gender=Fem|Number=Plur|POS=PROPN":"Gender=Fem|Number=Plur",
 
352
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int":90,
353
  "POS=DET":90,
354
  "Gender=Masc|Number=Plur|POS=PRON":95,
 
355
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":87,
356
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin":100,
357
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":100,
 
391
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":100,
392
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin":87,
393
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin":100,
 
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin":87,
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":100,
  "Gender=Fem|Number=Plur|POS=PROPN":96,
  "Gender=Masc|NumType=Card|POS=NUM":93
+ },
+ "overwrite":true
  }
morphologizer/model CHANGED
Binary files a/morphologizer/model and b/morphologizer/model differ
 
ner/model CHANGED
Binary files a/ner/model and b/ner/model differ
 
parser/model CHANGED
Binary files a/parser/model and b/parser/model differ
 
parser/moves CHANGED
@@ -1 +1 @@
- ��moves��{"0":{"":25247},"1":{"":21688},"2":{"case":7258,"det":6062,"nsubj":1972,"punct":1645,"advmod":1210,"cc":1205,"mark":1051,"aux:tense":673,"amod":662,"nummod":595,"aux:pass":544,"obl:mod":483,"nsubj:pass":425,"cop":365,"expl:comp":204,"obj":170,"expl:subj":163,"iobj":139,"advcl":123,"nmod":92,"expl:pass":40,"vocative":35,"dep":0},"3":{"nmod":5132,"punct":3954,"amod":2083,"conj":1517,"obj":1410,"obl:mod":1184,"obl:arg":1078,"acl":782,"xcomp":739,"flat:name":657,"advmod":562,"fixed":418,"appos":408,"acl:relcl":365,"advcl":306,"ccomp":238,"obl:agent":206,"dep":138,"nummod":117,"parataxis":92,"nsubj":75,"flat:foreign":63},"4":{"ROOT":2219}}�cfg��neg_key�
 
+ ��moves��{"0":{"":25255},"1":{"":21680},"2":{"case":7258,"det":6062,"nsubj":1982,"punct":1645,"advmod":1210,"cc":1205,"mark":1051,"aux:tense":673,"amod":662,"nummod":595,"aux:pass":544,"obl:mod":483,"nsubj:pass":425,"cop":365,"expl:comp":204,"obj":170,"expl:subj":164,"iobj":139,"advcl":123,"nmod":92,"expl:pass":40,"vocative":35,"dep":0},"3":{"nmod":5132,"punct":3954,"amod":2083,"conj":1517,"obj":1410,"obl:mod":1184,"obl:arg":1078,"acl":782,"xcomp":739,"flat:name":657,"advmod":562,"fixed":409,"appos":408,"acl:relcl":365,"advcl":306,"ccomp":238,"obl:agent":206,"dep":138,"nummod":117,"parataxis":92,"nsubj":75,"flat:foreign":63},"4":{"ROOT":2219}}�cfg��neg_key�
senter/cfg CHANGED
@@ -1,3 +1,3 @@
  {
-
  }
 
  {
+ "overwrite":false
  }
senter/model CHANGED
Binary files a/senter/model and b/senter/model differ
 
tok2vec/model CHANGED
Binary files a/tok2vec/model and b/tok2vec/model differ
 
tokenizer CHANGED
The diff for this file is too large to render. See raw diff
 
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:66010bb54a8e9c0a19c43ac74caebe515c456d921dbded4ad4e10e559d554de5
- size 8159993
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:960cc1797e729519ed569b69bfca8107e6be1c051008015967e4636720ecfc67
+ size 10419856
vocab/vectors.cfg ADDED
@@ -0,0 +1,3 @@
 
 
 
 
+ {
+ "mode":"default"
+ }