EC2 Default User committed on
Commit
5f710e2
1 Parent(s): d79daad

Update spaCy pipeline

.gitattributes CHANGED
@@ -19,3 +19,5 @@
19
  *strings.json filter=lfs diff=lfs merge=lfs -text
20
  vectors filter=lfs diff=lfs merge=lfs -text
21
  model filter=lfs diff=lfs merge=lfs -text
19
  *strings.json filter=lfs diff=lfs merge=lfs -text
20
  vectors filter=lfs diff=lfs merge=lfs -text
21
  model filter=lfs diff=lfs merge=lfs -text
22
+ *key2row filter=lfs diff=lfs merge=lfs -text
23
+ *tokenizer filter=lfs diff=lfs merge=lfs -text
LICENSES_SOURCES CHANGED
@@ -105,6 +105,8 @@ END OF TERMS AND CONDITIONS```
105
  * License: CC BY 4.0
106
 
107
  ```
108
  By exercising the Licensed Rights (defined below), You accept and agree to be bound by the terms and conditions of this Creative Commons Attribution 4.0 International Public License ("Public License"). To the extent this Public License may be interpreted as a contract, You are granted the Licensed Rights in consideration of Your acceptance of these terms and conditions, and the Licensor grants You such rights in consideration of benefits the Licensor receives from making the Licensed Material available under these terms and conditions.
109
 
110
  Section 1 – Definitions.
 
105
  * License: CC BY 4.0
106
 
107
  ```
108
+ Creative Commons Attribution 4.0 International Public License
109
+
110
  By exercising the Licensed Rights (defined below), You accept and agree to be bound by the terms and conditions of this Creative Commons Attribution 4.0 International Public License ("Public License"). To the extent this Public License may be interpreted as a contract, You are granted the Licensed Rights in consideration of Your acceptance of these terms and conditions, and the Licensor grants You such rights in consideration of benefits the Licensor receives from making the Licensed Material available under these terms and conditions.
111
 
112
  Section 1 – Definitions.
README.md CHANGED
@@ -14,47 +14,62 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.843850032
18
  - name: NER Recall
19
  type: recall
20
- value: 0.8442216084
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.8440357793
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
- - name: POS Accuracy
29
  type: accuracy
30
- value: 0.947075496
31
  - task:
32
- name: SENTER
33
  type: token-classification
34
  metrics:
35
- - name: SENTER Precision
36
- type: precision
37
- value: 0.8684210526
38
- - name: SENTER Recall
39
- type: recall
40
- value: 0.9002309469
41
- - name: SENTER F Score
42
- type: f_score
43
- value: 0.8746987952
44
  - task:
45
- name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
- - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
- value: 0.9011981247
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
- - name: Labeled Dependencies Accuracy
56
- type: accuracy
57
- value: 0.9011981247
58
  ---
59
  ### Details: https://spacy.io/models/fr#fr_core_news_lg
60
 
@@ -63,8 +78,8 @@ French pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, s
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `fr_core_news_lg` |
66
- | **Version** | `3.2.0` |
67
- | **spaCy** | `>=3.2.0,<3.3.0` |
68
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
  | **Components** | `tok2vec`, `morphologizer`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 500000 keys, 500000 unique vectors (300 dimensions) |
@@ -76,13 +91,12 @@ French pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, s
76
 
77
  <details>
78
 
79
- <summary>View label scheme (238 labels for 4 components)</summary>
80
 
81
  | Component | Labels |
82
  | --- | --- |
83
  | **`morphologizer`** | `POS=PROPN`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Dem`, `Gender=Fem\|Number=Sing\|POS=NOUN`, `Number=Plur\|POS=PRON\|Person=1`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=SCONJ`, `POS=ADP`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Ord\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=NOUN`, `POS=PUNCT`, `Gender=Masc\|Number=Sing\|POS=PROPN`, `Number=Plur\|POS=ADJ`, `Gender=Masc\|Number=Plur\|POS=NOUN`, `Definite=Ind\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Number=Sing\|POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `POS=ADV`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Definite=Def\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PROPN`, `Definite=Def\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Card\|POS=NUM`, `Definite=Def\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Masc\|Number=Plur\|POS=ADJ`, `POS=CCONJ`, `Gender=Fem\|Number=Plur\|POS=NOUN`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Fem\|Number=Plur\|POS=ADJ`, `POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `POS=PRON\|PronType=Rel`, `Number=Sing\|POS=DET\|Poss=Yes`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=ADP\|PronType=Art`, `Definite=Def\|Number=Plur\|POS=ADP\|PronType=Art`, `Definite=Ind\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=VERB\|VerbForm=Inf`, `Gender=Fem\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3`, `Number=Plur\|POS=DET`, 
`Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Dem`, `POS=ADV\|PronType=Int`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Ind\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Masc\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Number=Plur\|POS=DET\|Poss=Yes`, `POS=AUX\|VerbForm=Inf`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=ADV\|Polarity=Neg`, `Definite=Ind\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3`, `POS=PRON\|Person=3\|Reflex=Yes`, `Gender=Masc\|POS=NOUN`, `POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=PRON\|Person=3`, `Number=Plur\|POS=NOUN`, `NumType=Ord\|Number=Sing\|POS=ADJ`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=AUX\|Tense=Pres\|VerbForm=Part`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Sing\|POS=PRON\|Person=3`, `Number=Sing\|POS=NOUN`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Gender=Fem\|NumType=Ord\|Number=Sing\|POS=ADJ`, `Number=Plur\|POS=PROPN`, `Number=Sing\|POS=PROPN`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=PRON\|PronType=Dem`, `Gender=Masc\|Number=Sing\|POS=DET`, `Gender=Fem\|Number=Sing\|POS=DET\|Poss=Yes`, `Gender=Masc\|POS=PRON`, `POS=NOUN`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON`, `Gender=Masc\|NumType=Ord\|Number=Plur\|POS=ADJ`, 
`Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Number=Sing\|POS=PRON`, `Number=Sing\|POS=PRON\|PronType=Dem`, `Mood=Ind\|POS=VERB\|VerbForm=Fin`, `Number=Plur\|POS=DET\|PronType=Dem`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=PRON`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Dem`, `Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Rel`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|NumType=Ord\|Number=Sing\|POS=ADJ`, `POS=PRON`, `POS=NUM`, `Gender=Fem\|POS=NOUN`, `Gender=Fem\|Number=Plur\|POS=PRON`, `Number=Plur\|POS=PRON\|Person=3`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Sing\|POS=PRON\|Person=1`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=INTJ`, `Number=Plur\|POS=PRON\|Person=2`, `NumType=Card\|POS=PRON`, `Definite=Ind\|Gender=Fem\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `NumType=Card\|POS=NOUN`, `POS=PRON\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3`, `Gender=Fem\|Number=Sing\|POS=DET`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=DET`, `Mood=Sub\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Definite=Ind\|Gender=Masc\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Cnd\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, 
`Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Dem`, `Gender=Masc\|Number=Plur\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON\|PronType=Dem`, `Number=Sing\|POS=DET`, `Gender=Masc\|NumType=Card\|Number=Plur\|POS=NOUN`, `Gender=Fem\|Number=Plur\|POS=PRON\|PronType=Dem`, `Mood=Ind\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|POS=PRON`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Fem\|Number=Sing\|POS=PRON\|PronType=Rel`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=X`, `POS=SYM`, `Mood=Imp\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=DET\|PronType=Int`, `POS=DET`, `Gender=Masc\|Number=Plur\|POS=PRON`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|POS=VERB\|Person=3\|VerbForm=Fin`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Masc\|Number=Plur\|POS=DET`, `Gender=Fem\|Number=Plur\|POS=PRON\|PronType=Rel`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Masc\|Number=Plur\|POS=PRON\|PronType=Rel`, `POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Fem\|NumType=Ord\|Number=Plur\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Fut\|VerbForm=Fin`, `Mood=Imp\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=2\|Reflex=Yes`, 
`Mood=Cnd\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=1\|Reflex=Yes`, `Gender=Masc\|NumType=Card\|Number=Sing\|POS=NOUN`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Number=Sing\|POS=PRON\|Person=1\|Reflex=Yes`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Imp\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=2\|Tense=Imp\|VerbForm=Fin`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Fem\|Number=Plur\|POS=PROPN`, `Gender=Masc\|NumType=Card\|POS=NUM` |
84
  | **`parser`** | `ROOT`, `acl`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux:pass`, `aux:tense`, `case`, `cc`, `ccomp`, `conj`, `cop`, `dep`, `det`, `expl:comp`, `expl:pass`, `expl:subj`, `fixed`, `flat:foreign`, `flat:name`, `iobj`, `mark`, `nmod`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl:agent`, `obl:arg`, `obl:mod`, `parataxis`, `punct`, `vocative`, `xcomp` |
85
- | **`senter`** | `I`, `S` |
86
  | **`ner`** | `LOC`, `MISC`, `ORG`, `PER` |
87
 
88
  </details>
@@ -95,18 +109,18 @@ French pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, s
95
  | `TOKEN_P` | 98.44 |
96
  | `TOKEN_R` | 98.96 |
97
  | `TOKEN_F` | 98.70 |
98
- | `POS_ACC` | 97.57 |
99
- | `MORPH_ACC` | 96.96 |
100
- | `MORPH_MICRO_P` | 98.91 |
101
- | `MORPH_MICRO_R` | 98.19 |
102
- | `MORPH_MICRO_F` | 98.55 |
103
- | `SENTS_P` | 86.84 |
104
- | `SENTS_R` | 90.02 |
105
- | `SENTS_F` | 87.47 |
106
- | `DEP_UAS` | 90.12 |
107
- | `DEP_LAS` | 86.32 |
108
- | `TAG_ACC` | 94.71 |
109
- | `LEMMA_ACC` | 90.95 |
110
- | `ENTS_P` | 84.39 |
111
- | `ENTS_R` | 84.42 |
112
- | `ENTS_F` | 84.40 |
 
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.8424958383
18
  - name: NER Recall
19
  type: recall
20
+ value: 0.8407589768
21
  - name: NER F Score
22
  type: f_score
23
+ value: 0.8416265115
24
+ - task:
25
+ name: TAG
26
+ type: token-classification
27
+ metrics:
28
+ - name: TAG (XPOS) Accuracy
29
+ type: accuracy
30
+ value: 0.9457899619
31
  - task:
32
  name: POS
33
  type: token-classification
34
  metrics:
35
+ - name: POS (UPOS) Accuracy
36
  type: accuracy
37
+ value: 0.9745439555
38
  - task:
39
+ name: MORPH
40
  type: token-classification
41
  metrics:
42
+ - name: Morph (UFeats) Accuracy
43
+ type: accuracy
44
+ value: 0.9674226804
45
  - task:
46
+ name: LEMMA
47
  type: token-classification
48
  metrics:
49
+ - name: Lemma Accuracy
50
  type: accuracy
51
+ value: 0.9082408549
52
+ - task:
53
+ name: UNLABELED_DEPENDENCIES
54
+ type: token-classification
55
+ metrics:
56
+ - name: Unlabeled Attachment Score (UAS)
57
+ type: f_score
58
+ value: 0.8971464953
59
  - task:
60
  name: LABELED_DEPENDENCIES
61
  type: token-classification
62
  metrics:
63
+ - name: Labeled Attachment Score (LAS)
64
+ type: f_score
65
+ value: 0.8595070015
66
+ - task:
67
+ name: SENTS
68
+ type: token-classification
69
+ metrics:
70
+ - name: Sentences F-Score
71
+ type: f_score
72
+ value: 0.8701923077
73
  ---
74
  ### Details: https://spacy.io/models/fr#fr_core_news_lg
75
 
 
78
  | Feature | Description |
79
  | --- | --- |
80
  | **Name** | `fr_core_news_lg` |
81
+ | **Version** | `3.3.0` |
82
+ | **spaCy** | `>=3.3.0.dev0,<3.4.0` |
83
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
84
  | **Components** | `tok2vec`, `morphologizer`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
85
  | **Vectors** | 500000 keys, 500000 unique vectors (300 dimensions) |
 
91
 
92
  <details>
93
 
94
+ <summary>View label scheme (236 labels for 3 components)</summary>
95
 
96
  | Component | Labels |
97
  | --- | --- |
98
  | **`morphologizer`** | `POS=PROPN`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Dem`, `Gender=Fem\|Number=Sing\|POS=NOUN`, `Number=Plur\|POS=PRON\|Person=1`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=SCONJ`, `POS=ADP`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Ord\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=NOUN`, `POS=PUNCT`, `Gender=Masc\|Number=Sing\|POS=PROPN`, `Number=Plur\|POS=ADJ`, `Gender=Masc\|Number=Plur\|POS=NOUN`, `Definite=Ind\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Number=Sing\|POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `POS=ADV`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Definite=Def\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PROPN`, `Definite=Def\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Card\|POS=NUM`, `Definite=Def\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Masc\|Number=Plur\|POS=ADJ`, `POS=CCONJ`, `Gender=Fem\|Number=Plur\|POS=NOUN`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Fem\|Number=Plur\|POS=ADJ`, `POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `POS=PRON\|PronType=Rel`, `Number=Sing\|POS=DET\|Poss=Yes`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=ADP\|PronType=Art`, `Definite=Def\|Number=Plur\|POS=ADP\|PronType=Art`, `Definite=Ind\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=VERB\|VerbForm=Inf`, `Gender=Fem\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3`, `Number=Plur\|POS=DET`, 
`Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Dem`, `POS=ADV\|PronType=Int`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Ind\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Masc\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Number=Plur\|POS=DET\|Poss=Yes`, `POS=AUX\|VerbForm=Inf`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=ADV\|Polarity=Neg`, `Definite=Ind\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3`, `POS=PRON\|Person=3\|Reflex=Yes`, `Gender=Masc\|POS=NOUN`, `POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=PRON\|Person=3`, `Number=Plur\|POS=NOUN`, `NumType=Ord\|Number=Sing\|POS=ADJ`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=AUX\|Tense=Pres\|VerbForm=Part`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Sing\|POS=PRON\|Person=3`, `Number=Sing\|POS=NOUN`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Gender=Fem\|NumType=Ord\|Number=Sing\|POS=ADJ`, `Number=Plur\|POS=PROPN`, `Number=Sing\|POS=PROPN`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=PRON\|PronType=Dem`, `Gender=Masc\|Number=Sing\|POS=DET`, `Gender=Fem\|Number=Sing\|POS=DET\|Poss=Yes`, `Gender=Masc\|POS=PRON`, `POS=NOUN`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON`, `Gender=Masc\|NumType=Ord\|Number=Plur\|POS=ADJ`, 
`Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Number=Sing\|POS=PRON`, `Number=Sing\|POS=PRON\|PronType=Dem`, `Mood=Ind\|POS=VERB\|VerbForm=Fin`, `Number=Plur\|POS=DET\|PronType=Dem`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=PRON`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Dem`, `Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Rel`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|NumType=Ord\|Number=Sing\|POS=ADJ`, `POS=PRON`, `POS=NUM`, `Gender=Fem\|POS=NOUN`, `Gender=Fem\|Number=Plur\|POS=PRON`, `Number=Plur\|POS=PRON\|Person=3`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Sing\|POS=PRON\|Person=1`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=INTJ`, `Number=Plur\|POS=PRON\|Person=2`, `NumType=Card\|POS=PRON`, `Definite=Ind\|Gender=Fem\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `NumType=Card\|POS=NOUN`, `POS=PRON\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3`, `Gender=Fem\|Number=Sing\|POS=DET`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=DET`, `Mood=Sub\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Definite=Ind\|Gender=Masc\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Cnd\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, 
`Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Dem`, `Gender=Masc\|Number=Plur\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON\|PronType=Dem`, `Number=Sing\|POS=DET`, `Gender=Masc\|NumType=Card\|Number=Plur\|POS=NOUN`, `Gender=Fem\|Number=Plur\|POS=PRON\|PronType=Dem`, `Mood=Ind\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|POS=PRON`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Fem\|Number=Sing\|POS=PRON\|PronType=Rel`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=X`, `POS=SYM`, `Mood=Imp\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=DET\|PronType=Int`, `POS=DET`, `Gender=Masc\|Number=Plur\|POS=PRON`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|POS=VERB\|Person=3\|VerbForm=Fin`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Masc\|Number=Plur\|POS=DET`, `Gender=Fem\|Number=Plur\|POS=PRON\|PronType=Rel`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Masc\|Number=Plur\|POS=PRON\|PronType=Rel`, `POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Fem\|NumType=Ord\|Number=Plur\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Fut\|VerbForm=Fin`, `Mood=Imp\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=2\|Reflex=Yes`, 
`Mood=Cnd\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=1\|Reflex=Yes`, `Gender=Masc\|NumType=Card\|Number=Sing\|POS=NOUN`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Number=Sing\|POS=PRON\|Person=1\|Reflex=Yes`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Imp\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=2\|Tense=Imp\|VerbForm=Fin`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Fem\|Number=Plur\|POS=PROPN`, `Gender=Masc\|NumType=Card\|POS=NUM` |
99
  | **`parser`** | `ROOT`, `acl`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux:pass`, `aux:tense`, `case`, `cc`, `ccomp`, `conj`, `cop`, `dep`, `det`, `expl:comp`, `expl:pass`, `expl:subj`, `fixed`, `flat:foreign`, `flat:name`, `iobj`, `mark`, `nmod`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl:agent`, `obl:arg`, `obl:mod`, `parataxis`, `punct`, `vocative`, `xcomp` |
 
100
  | **`ner`** | `LOC`, `MISC`, `ORG`, `PER` |
101
 
102
  </details>
 
109
  | `TOKEN_P` | 98.44 |
110
  | `TOKEN_R` | 98.96 |
111
  | `TOKEN_F` | 98.70 |
112
+ | `POS_ACC` | 97.45 |
113
+ | `MORPH_ACC` | 96.74 |
114
+ | `MORPH_MICRO_P` | 98.79 |
115
+ | `MORPH_MICRO_R` | 98.09 |
116
+ | `MORPH_MICRO_F` | 98.44 |
117
+ | `SENTS_P` | 86.19 |
118
+ | `SENTS_R` | 88.91 |
119
+ | `SENTS_F` | 87.02 |
120
+ | `DEP_UAS` | 89.71 |
121
+ | `DEP_LAS` | 85.95 |
122
+ | `TAG_ACC` | 94.58 |
123
+ | `LEMMA_ACC` | 90.82 |
124
+ | `ENTS_P` | 84.25 |
125
+ | `ENTS_R` | 84.08 |
126
+ | `ENTS_F` | 84.16 |
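
As a sanity check on the updated NER figures above, the reported F-score can be recomputed from precision and recall. The sketch below assumes the NER F-score is the plain harmonic mean of the two; the helper name `f_score` is illustrative and not part of spaCy. The check is made only for the NER row, since the diff does not say how the other rows are aggregated.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (illustrative helper)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the updated NER metrics in this commit.
ents_p = 0.8424958383
ents_r = 0.8407589768
ents_f_reported = 0.8416265115

ents_f = f_score(ents_p, ents_r)
print(f"recomputed F: {ents_f:.4f}, reported F: {ents_f_reported:.4f}")
```

Both values round to the `84.16` shown for `ENTS_F` in the README table.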
accuracy.json CHANGED
@@ -3,56 +3,56 @@
3
  "token_p": 0.9844389844,
4
  "token_r": 0.9896058454,
5
  "token_f": 0.9870156531,
6
- "pos_acc": 0.9757279052,
7
- "morph_acc": 0.9695876289,
8
- "morph_micro_p": 0.9891371057,
9
- "morph_micro_r": 0.9819056903,
10
- "morph_micro_f": 0.9855081326,
11
  "morph_per_feat": {
12
  "Definite": {
13
- "p": 0.9890510949,
14
- "r": 0.9890510949,
15
- "f": 0.9890510949
16
  },
17
  "Number": {
18
- "p": 0.9946286349,
19
- "r": 0.9885861561,
20
- "f": 0.9915981904
21
  },
22
  "PronType": {
23
- "p": 0.9961439589,
24
- "r": 0.9916826615,
25
- "f": 0.9939083039
26
  },
27
  "Gender": {
28
- "p": 0.9865979381,
29
- "r": 0.9782775364,
30
- "f": 0.9824201206
31
  },
32
  "Mood": {
33
- "p": 0.9747292419,
34
- "r": 0.9591474245,
35
- "f": 0.9668755595
36
  },
37
  "Person": {
38
- "p": 0.9923273657,
39
- "r": 0.9761006289,
40
- "f": 0.9841471148
41
  },
42
  "Tense": {
43
- "p": 0.975308642,
44
- "r": 0.9683350358,
45
- "f": 0.9718093285
46
  },
47
  "VerbForm": {
48
- "p": 0.9841534612,
49
- "r": 0.9768211921,
50
- "f": 0.9804736186
51
  },
52
  "NumType": {
53
- "p": 0.9858657244,
54
- "r": 0.95221843,
55
- "f": 0.96875
56
  },
57
  "Reflex": {
58
  "p": 1.0,
@@ -60,9 +60,9 @@
60
  "f": 1.0
61
  },
62
  "Voice": {
63
- "p": 0.9298245614,
64
- "r": 0.9464285714,
65
- "f": 0.9380530973
66
  },
67
  "Poss": {
68
  "p": 1.0,
@@ -71,180 +71,180 @@
71
  },
72
  "Polarity": {
73
  "p": 1.0,
74
- "r": 0.9882352941,
75
- "f": 0.9940828402
76
  }
77
  },
78
- "sents_p": 0.8684210526,
79
- "sents_r": 0.9002309469,
80
- "sents_f": 0.8746987952,
81
- "dep_uas": 0.9011981247,
82
- "dep_las": 0.863160331,
83
  "dep_las_per_type": {
84
  "det": {
85
- "p": 0.9798549557,
86
- "r": 0.9814366425,
87
- "f": 0.9806451613
88
  },
89
  "nsubj": {
90
- "p": 0.896039604,
91
- "r": 0.8722891566,
92
- "f": 0.884004884
93
  },
94
  "aux:tense": {
95
- "p": 0.9754098361,
96
  "r": 0.952,
97
- "f": 0.963562753
98
  },
99
  "root": {
100
- "p": 0.8708133971,
101
- "r": 0.8834951456,
102
- "f": 0.8771084337
103
  },
104
  "obj": {
105
- "p": 0.865497076,
106
- "r": 0.8783382789,
107
- "f": 0.8718703976
108
  },
109
  "cc": {
110
- "p": 0.8986175115,
111
- "r": 0.8986175115,
112
- "f": 0.8986175115
113
  },
114
  "case": {
115
- "p": 0.9708078751,
116
- "r": 0.9741144414,
117
- "f": 0.9724583475
118
  },
119
  "obl:mod": {
120
- "p": 0.6719242902,
121
- "r": 0.6358208955,
122
- "f": 0.6533742331
123
  },
124
  "nmod": {
125
- "p": 0.8217349857,
126
- "r": 0.8611388611,
127
- "f": 0.8409756098
128
  },
129
  "conj": {
130
- "p": 0.5595238095,
131
  "r": 0.5551181102,
132
- "f": 0.557312253
133
  },
134
  "nummod": {
135
- "p": 0.9299363057,
136
- "r": 0.8639053254,
137
- "f": 0.8957055215
138
  },
139
  "amod": {
140
- "p": 0.9307116105,
141
- "r": 0.9052823315,
142
- "f": 0.917820868
143
  },
144
  "acl": {
145
- "p": 0.7045454545,
146
- "r": 0.7167630058,
147
- "f": 0.7106017192
148
  },
149
  "mark": {
150
- "p": 0.8928571429,
151
- "r": 0.8810572687,
152
- "f": 0.8869179601
153
  },
154
  "xcomp": {
155
- "p": 0.8835616438,
156
- "r": 0.8543046358,
157
- "f": 0.8686868687
158
  },
159
  "flat:name": {
160
- "p": 0.9047619048,
161
- "r": 0.9047619048,
162
- "f": 0.9047619048
163
  },
164
  "cop": {
165
- "p": 0.8988764045,
166
- "r": 0.8888888889,
167
- "f": 0.8938547486
168
  },
169
  "advmod": {
170
- "p": 0.858044164,
171
- "r": 0.8526645768,
172
- "f": 0.8553459119
173
  },
174
  "obl:arg": {
175
- "p": 0.6832579186,
176
- "r": 0.6863636364,
177
- "f": 0.6848072562
178
  },
179
  "appos": {
180
- "p": 0.5421686747,
181
- "r": 0.5421686747,
182
- "f": 0.5421686747
183
  },
184
  "nsubj:pass": {
185
- "p": 0.880952381,
186
  "r": 0.8705882353,
187
- "f": 0.875739645
188
  },
189
  "aux:pass": {
190
- "p": 0.9557522124,
191
- "r": 0.9642857143,
192
- "f": 0.96
193
  },
194
  "acl:relcl": {
195
- "p": 0.686746988,
196
- "r": 0.6627906977,
197
- "f": 0.674556213
198
  },
199
  "advcl": {
200
- "p": 0.4810126582,
201
- "r": 0.4871794872,
202
- "f": 0.4840764331
203
  },
204
  "fixed": {
205
- "p": 0.7956989247,
206
- "r": 0.74,
207
- "f": 0.7668393782
208
  },
209
  "dep": {
210
- "p": 0.253968254,
211
- "r": 0.5517241379,
212
- "f": 0.347826087
213
  },
214
  "expl:subj": {
215
- "p": 0.8125,
216
- "r": 0.8125,
217
- "f": 0.8125
218
  },
219
  "expl:comp": {
220
- "p": 0.7,
221
- "r": 0.9333333333,
222
- "f": 0.8
223
  },
224
  "expl:pass": {
225
- "p": 0.4,
226
- "r": 0.2857142857,
227
- "f": 0.3333333333
228
  },
229
  "ccomp": {
230
- "p": 0.7,
231
- "r": 0.6862745098,
232
- "f": 0.6930693069
233
- },
234
- "obl:agent": {
235
- "p": 0.8888888889,
236
- "r": 0.7619047619,
237
- "f": 0.8205128205
238
  },
239
  "parataxis": {
240
- "p": 0.6,
241
- "r": 0.4285714286,
242
- "f": 0.5
243
  },
244
  "iobj": {
245
- "p": 0.8235294118,
246
- "r": 0.56,
247
- "f": 0.6666666667
 
 
 
 
 
248
  },
249
  "nsubj:caus": {
250
  "p": 0.0,
@@ -267,9 +267,9 @@
267
  "f": 0.0
268
  },
269
  "vocative": {
270
- "p": 0.8333333333,
271
  "r": 0.625,
272
- "f": 0.7142857143
273
  },
274
  "dislocated": {
275
  "p": 0.0,
@@ -277,9 +277,9 @@
277
  "f": 0.0
278
  },
279
  "flat:foreign": {
280
- "p": 1.0,
281
  "r": 0.1428571429,
282
- "f": 0.25
283
  },
284
  "orphan": {
285
  "p": 0.0,
@@ -297,32 +297,32 @@
297
  "f": 0.0
298
  }
299
  },
300
- "tag_acc": 0.947075496,
301
- "lemma_acc": 0.9094739005,
302
- "ents_p": 0.843850032,
303
- "ents_r": 0.8442216084,
304
- "ents_f": 0.8440357793,
305
  "ents_per_type": {
306
  "PER": {
307
- "p": 0.9113682873,
308
- "r": 0.9253421222,
309
- "f": 0.9183020478
310
  },
311
  "LOC": {
312
- "p": 0.8441938007,
313
- "r": 0.8578599515,
314
- "f": 0.8509720119
315
  },
316
  "ORG": {
317
- "p": 0.7981488775,
318
- "r": 0.7734732824,
319
- "f": 0.7856173677
320
  },
321
  "MISC": {
322
- "p": 0.7300527786,
323
- "r": 0.6856684648,
324
- "f": 0.7071648748
325
  }
326
  },
327
- "speed": 4058.7135091792
328
  }
 
3
  "token_p": 0.9844389844,
4
  "token_r": 0.9896058454,
5
  "token_f": 0.9870156531,
6
+ "pos_acc": 0.9745439555,
7
+ "morph_acc": 0.9674226804,
8
+ "morph_micro_p": 0.9879126273,
9
+ "morph_micro_r": 0.9809309126,
10
+ "morph_micro_f": 0.984409391,
11
  "morph_per_feat": {
12
  "Definite": {
13
+ "p": 0.9904831625,
14
+ "r": 0.9875912409,
15
+ "f": 0.9890350877
16
  },
17
  "Number": {
18
+ "p": 0.9933345677,
19
+ "r": 0.9876656848,
20
+ "f": 0.9904920151
21
  },
22
  "PronType": {
23
+ "p": 0.995492595,
24
+ "r": 0.9891234805,
25
+ "f": 0.9922978177
26
  },
27
  "Gender": {
28
+ "p": 0.9823212913,
29
+ "r": 0.9798108868,
30
+ "f": 0.9810644831
31
  },
32
  "Mood": {
33
+ "p": 0.9764065336,
34
+ "r": 0.9555950266,
35
+ "f": 0.9658886894
36
  },
37
  "Person": {
38
+ "p": 0.9910141207,
39
+ "r": 0.9710691824,
40
+ "f": 0.9809402795
41
  },
42
  "Tense": {
43
+ "p": 0.9741735537,
44
+ "r": 0.9632277835,
45
+ "f": 0.9686697483
46
  },
47
  "VerbForm": {
48
+ "p": 0.9840871022,
49
+ "r": 0.9726821192,
50
+ "f": 0.9783513739
51
  },
52
  "NumType": {
53
+ "p": 1.0,
54
+ "r": 0.9624573379,
55
+ "f": 0.9808695652
56
  },
57
  "Reflex": {
58
  "p": 1.0,
 
60
  "f": 1.0
61
  },
62
  "Voice": {
63
+ "p": 0.9316239316,
64
+ "r": 0.9732142857,
65
+ "f": 0.9519650655
66
  },
67
  "Poss": {
68
  "p": 1.0,
 
71
  },
72
  "Polarity": {
73
  "p": 1.0,
74
+ "r": 0.9764705882,
75
+ "f": 0.9880952381
76
  }
77
  },
78
+ "sents_p": 0.8619047619,
79
+ "sents_r": 0.8890685142,
80
+ "sents_f": 0.8701923077,
81
+ "dep_uas": 0.8971464953,
82
+ "dep_las": 0.8595070015,
83
  "dep_las_per_type": {
84
  "det": {
85
+ "p": 0.9846029173,
86
+ "r": 0.98062954,
87
+ "f": 0.9826122119
88
  },
89
  "nsubj": {
90
+ "p": 0.8823529412,
91
+ "r": 0.8674698795,
92
+ "f": 0.8748481166
93
  },
94
  "aux:tense": {
95
+ "p": 0.9224806202,
96
  "r": 0.952,
97
+ "f": 0.937007874
98
  },
99
  "root": {
100
+ "p": 0.8747044917,
101
+ "r": 0.8980582524,
102
+ "f": 0.8862275449
103
  },
104
  "obj": {
105
+ "p": 0.8868501529,
106
+ "r": 0.8605341246,
107
+ "f": 0.8734939759
108
  },
109
  "cc": {
110
+ "p": 0.9049773756,
111
+ "r": 0.9216589862,
112
+ "f": 0.9132420091
113
  },
114
  "case": {
115
+ "p": 0.9722598106,
116
+ "r": 0.9788828338,
117
+ "f": 0.9755600815
118
  },
119
  "obl:mod": {
120
+ "p": 0.6437125749,
121
+ "r": 0.6417910448,
122
+ "f": 0.6427503737
123
  },
124
  "nmod": {
125
+ "p": 0.8200192493,
126
+ "r": 0.8511488511,
127
+ "f": 0.8352941176
128
  },
129
  "conj": {
130
+ "p": 0.561752988,
131
  "r": 0.5551181102,
132
+ "f": 0.5584158416
133
  },
134
  "nummod": {
135
+ "p": 0.903030303,
136
+ "r": 0.8816568047,
137
+ "f": 0.8922155689
138
  },
139
  "amod": {
140
+ "p": 0.9291044776,
141
+ "r": 0.9071038251,
142
+ "f": 0.9179723502
143
  },
144
  "acl": {
145
+ "p": 0.683908046,
146
+ "r": 0.6878612717,
147
+ "f": 0.6858789625
148
  },
149
  "mark": {
150
+ "p": 0.8755555556,
151
+ "r": 0.8678414097,
152
+ "f": 0.8716814159
153
  },
154
  "xcomp": {
155
+ "p": 0.8671328671,
156
+ "r": 0.821192053,
157
+ "f": 0.843537415
158
  },
159
  "flat:name": {
160
+ "p": 0.93,
161
+ "r": 0.8857142857,
162
+ "f": 0.9073170732
163
  },
164
  "cop": {
165
+ "p": 0.8681318681,
166
+ "r": 0.8777777778,
167
+ "f": 0.8729281768
168
  },
169
  "advmod": {
170
+ "p": 0.8444444444,
171
+ "r": 0.8338557994,
172
+ "f": 0.8391167192
173
  },
174
  "obl:arg": {
175
+ "p": 0.7242990654,
176
+ "r": 0.7045454545,
177
+ "f": 0.7142857143
178
  },
179
  "appos": {
180
+ "p": 0.4698795181,
181
+ "r": 0.4698795181,
182
+ "f": 0.4698795181
183
  },
184
  "nsubj:pass": {
185
+ "p": 0.8604651163,
186
  "r": 0.8705882353,
187
+ "f": 0.865497076
188
  },
189
  "aux:pass": {
190
+ "p": 0.9304347826,
191
+ "r": 0.9553571429,
192
+ "f": 0.9427312775
193
  },
194
  "acl:relcl": {
195
+ "p": 0.6419753086,
196
+ "r": 0.6046511628,
197
+ "f": 0.622754491
198
  },
199
  "advcl": {
200
+ "p": 0.4659090909,
201
+ "r": 0.5256410256,
202
+ "f": 0.4939759036
203
  },
204
  "fixed": {
205
+ "p": 0.7604166667,
206
+ "r": 0.73,
207
+ "f": 0.7448979592
208
  },
209
  "dep": {
210
+ "p": 0.2033898305,
211
+ "r": 0.4137931034,
212
+ "f": 0.2727272727
213
  },
214
  "expl:subj": {
215
+ "p": 0.78125,
216
+ "r": 0.78125,
217
+ "f": 0.78125
218
  },
219
  "expl:comp": {
220
+ "p": 0.6341463415,
221
+ "r": 0.8666666667,
222
+ "f": 0.7323943662
223
  },
224
  "expl:pass": {
225
+ "p": 0.3333333333,
226
+ "r": 0.1428571429,
227
+ "f": 0.2
228
  },
229
  "ccomp": {
230
+ "p": 0.8,
231
+ "r": 0.7058823529,
232
+ "f": 0.75
 
 
 
 
 
233
  },
234
  "parataxis": {
235
+ "p": 0.6923076923,
236
+ "r": 0.3214285714,
237
+ "f": 0.4390243902
238
  },
239
  "iobj": {
240
+ "p": 0.8,
241
+ "r": 0.48,
242
+ "f": 0.6
243
+ },
244
+ "obl:agent": {
245
+ "p": 0.9210526316,
246
+ "r": 0.8333333333,
247
+ "f": 0.875
248
  },
249
  "nsubj:caus": {
250
  "p": 0.0,
 
267
  "f": 0.0
268
  },
269
  "vocative": {
270
+ "p": 1.0,
271
  "r": 0.625,
272
+ "f": 0.7692307692
273
  },
274
  "dislocated": {
275
  "p": 0.0,
 
277
  "f": 0.0
278
  },
279
  "flat:foreign": {
280
+ "p": 0.5,
281
  "r": 0.1428571429,
282
+ "f": 0.2222222222
283
  },
284
  "orphan": {
285
  "p": 0.0,
 
297
  "f": 0.0
298
  }
299
  },
300
+ "tag_acc": 0.9457899619,
301
+ "lemma_acc": 0.9082408549,
302
+ "ents_p": 0.8424958383,
303
+ "ents_r": 0.8407589768,
304
+ "ents_f": 0.8416265115,
305
  "ents_per_type": {
306
  "PER": {
307
+ "p": 0.9084447572,
308
+ "r": 0.9249122304,
309
+ "f": 0.9166045372
310
  },
311
  "LOC": {
312
+ "p": 0.84611891,
313
+ "r": 0.8572324939,
314
+ "f": 0.8516394465
315
  },
316
  "ORG": {
317
+ "p": 0.7897670715,
318
+ "r": 0.7570610687,
319
+ "f": 0.7730683036
320
  },
321
  "MISC": {
322
+ "p": 0.7238526382,
323
+ "r": 0.6760460709,
324
+ "f": 0.6991330569
325
  }
326
  },
327
+ "speed": 4250.2217767942
328
  }
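Note on the scores above: each per-label entry reports precision `p`, recall `r`, and `f`. The listed `f` values are consistent with the usual F1 definition, the harmonic mean of `p` and `r`. A minimal sketch of that check follows; the helper name `f1` is introduced here for illustration only, and the sample values are copied from the updated `acl` and `ccomp` entries.

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


# (p, r, f) triples copied from the updated scores above.
samples = {
    "acl": (0.683908046, 0.6878612717, 0.6858789625),
    "ccomp": (0.8, 0.7058823529, 0.75),
}

for label, (p, r, f) in samples.items():
    # Reported values are rounded to 10 decimals, so compare with a tolerance.
    assert abs(f1(p, r) - f) < 1e-6, label
```

The zero guard mirrors labels such as `nsubj:caus`, where `p`, `r`, and `f` are all reported as 0.0.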
attribute_ruler/patterns CHANGED
Binary files a/attribute_ruler/patterns and b/attribute_ruler/patterns differ
 
config.cfg CHANGED
@@ -39,8 +39,9 @@ overwrite = true
39
  scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
40
 
41
  [components.morphologizer.model]
42
- @architectures = "spacy.Tagger.v1"
43
  nO = null
 
44
 
45
  [components.morphologizer.model.tok2vec]
46
  @architectures = "spacy.Tok2VecListener.v1"
@@ -70,7 +71,7 @@ nO = null
70
  @architectures = "spacy.MultiHashEmbed.v2"
71
  width = 96
72
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
73
- rows = [5000,2500,2500,2500,100]
74
  include_static_vectors = true
75
 
76
  [components.ner.model.tok2vec.encode]
@@ -108,8 +109,9 @@ overwrite = false
108
  scorer = {"@scorers":"spacy.senter_scorer.v1"}
109
 
110
  [components.senter.model]
111
- @architectures = "spacy.Tagger.v1"
112
  nO = null
 
113
 
114
  [components.senter.model.tok2vec]
115
  @architectures = "spacy.Tok2Vec.v2"
@@ -138,7 +140,7 @@ factory = "tok2vec"
138
  @architectures = "spacy.MultiHashEmbed.v2"
139
  width = ${components.tok2vec.model.encode:width}
140
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
141
- rows = [5000,2500,2500,2500,100]
142
  include_static_vectors = true
143
 
144
  [components.tok2vec.model.encode]
@@ -175,7 +177,7 @@ dropout = 0.1
175
  accumulate_gradient = 1
176
  patience = 5000
177
  max_epochs = 0
178
- max_steps = 0
179
  eval_frequency = 1000
180
  frozen_components = []
181
  before_to_disk = null
 
39
  scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
40
 
41
  [components.morphologizer.model]
42
+ @architectures = "spacy.Tagger.v2"
43
  nO = null
44
+ normalize = false
45
 
46
  [components.morphologizer.model.tok2vec]
47
  @architectures = "spacy.Tok2VecListener.v1"
 
71
  @architectures = "spacy.MultiHashEmbed.v2"
72
  width = 96
73
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
74
+ rows = [5000,1000,2500,2500,50]
75
  include_static_vectors = true
76
 
77
  [components.ner.model.tok2vec.encode]
 
109
  scorer = {"@scorers":"spacy.senter_scorer.v1"}
110
 
111
  [components.senter.model]
112
+ @architectures = "spacy.Tagger.v2"
113
  nO = null
114
+ normalize = false
115
 
116
  [components.senter.model.tok2vec]
117
  @architectures = "spacy.Tok2Vec.v2"
 
140
  @architectures = "spacy.MultiHashEmbed.v2"
141
  width = ${components.tok2vec.model.encode:width}
142
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
143
+ rows = [5000,1000,2500,2500,50]
144
  include_static_vectors = true
145
 
146
  [components.tok2vec.model.encode]
 
177
  accumulate_gradient = 1
178
  patience = 5000
179
  max_epochs = 0
180
+ max_steps = 100000
181
  eval_frequency = 1000
182
  frozen_components = []
183
  before_to_disk = null
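Note on the config changes above: the values in `config.cfg` are written as JSON literals inside INI-style sections. spaCy loads this file through its own config system, which also resolves `${...}` references; the sketch below is only a standard-library approximation for inspecting a fragment, and it does not resolve those references. The helper name `section_values` is hypothetical, and the section header used for the `rows` setting is an assumption because the diff does not show it.

```python
import configparser
import json

# Fragment copied from the updated config.cfg shown above.
# Assumption: the header of the embed section is not visible in the diff,
# so "[example.embed]" is a placeholder name.
FRAGMENT = """
[components.morphologizer.model]
@architectures = "spacy.Tagger.v2"
nO = null
normalize = false

[example.embed]
attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
rows = [5000,1000,2500,2500,50]
"""

parser = configparser.ConfigParser(interpolation=None)
parser.optionxform = str  # keep key case, e.g. "nO" instead of "no"
parser.read_string(FRAGMENT)


def section_values(name: str) -> dict:
    """Decode every value in one section as a JSON literal."""
    return {key: json.loads(raw) for key, raw in parser[name].items()}


model = section_values("components.morphologizer.model")
embed = section_values("example.embed")
print(model)
print(dict(zip(embed["attrs"], embed["rows"])))
```

Pairing `attrs` with `rows` makes it easier to see that the new values shrink the `PREFIX` table from 2500 to 1000 rows and the `SPACY` table from 100 to 50 rows.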
fr_core_news_lg-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:ec6139f61d34ec0fe6d0574c3cb1453f902bbf08ea0a9574f0ee838941e3029b
3
- size 572929066
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d4107106f2f768c3de950e39e870c1ef61b4d5959f30f906caed4f8ad3e9a472
3
+ size 571826408
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"fr",
3
  "name":"core_news_lg",
4
- "version":"3.2.0",
5
  "description":"French pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"[email protected]",
8
  "url":"https://explosion.ai",
9
  "license":"LGPL-LR",
10
- "spacy_version":">=3.2.0,<3.3.0",
11
- "spacy_git_version":"bb26550e2",
12
  "vectors":{
13
  "width":300,
14
  "vectors":500000,
@@ -255,10 +255,6 @@
255
  "vocative",
256
  "xcomp"
257
  ],
258
- "senter":[
259
- "I",
260
- "S"
261
- ],
262
  "attribute_ruler":[
263
 
264
  ],
@@ -297,56 +293,56 @@
297
  "token_p":0.9844389844,
298
  "token_r":0.9896058454,
299
  "token_f":0.9870156531,
300
- "pos_acc":0.9757279052,
301
- "morph_acc":0.9695876289,
302
- "morph_micro_p":0.9891371057,
303
- "morph_micro_r":0.9819056903,
304
- "morph_micro_f":0.9855081326,
305
  "morph_per_feat":{
306
  "Definite":{
307
- "p":0.9890510949,
308
- "r":0.9890510949,
309
- "f":0.9890510949
310
  },
311
  "Number":{
312
- "p":0.9946286349,
313
- "r":0.9885861561,
314
- "f":0.9915981904
315
  },
316
  "PronType":{
317
- "p":0.9961439589,
318
- "r":0.9916826615,
319
- "f":0.9939083039
320
  },
321
  "Gender":{
322
- "p":0.9865979381,
323
- "r":0.9782775364,
324
- "f":0.9824201206
325
  },
326
  "Mood":{
327
- "p":0.9747292419,
328
- "r":0.9591474245,
329
- "f":0.9668755595
330
  },
331
  "Person":{
332
- "p":0.9923273657,
333
- "r":0.9761006289,
334
- "f":0.9841471148
335
  },
336
  "Tense":{
337
- "p":0.975308642,
338
- "r":0.9683350358,
339
- "f":0.9718093285
340
  },
341
  "VerbForm":{
342
- "p":0.9841534612,
343
- "r":0.9768211921,
344
- "f":0.9804736186
345
  },
346
  "NumType":{
347
- "p":0.9858657244,
348
- "r":0.95221843,
349
- "f":0.96875
350
  },
351
  "Reflex":{
352
  "p":1.0,
@@ -354,9 +350,9 @@
354
  "f":1.0
355
  },
356
  "Voice":{
357
- "p":0.9298245614,
358
- "r":0.9464285714,
359
- "f":0.9380530973
360
  },
361
  "Poss":{
362
  "p":1.0,
@@ -365,180 +361,180 @@
365
  },
366
  "Polarity":{
367
  "p":1.0,
368
- "r":0.9882352941,
369
- "f":0.9940828402
370
  }
371
  },
372
- "sents_p":0.8684210526,
373
- "sents_r":0.9002309469,
374
- "sents_f":0.8746987952,
375
- "dep_uas":0.9011981247,
376
- "dep_las":0.863160331,
377
  "dep_las_per_type":{
378
  "det":{
379
- "p":0.9798549557,
380
- "r":0.9814366425,
381
- "f":0.9806451613
382
  },
383
  "nsubj":{
384
- "p":0.896039604,
385
- "r":0.8722891566,
386
- "f":0.884004884
387
  },
388
  "aux:tense":{
389
- "p":0.9754098361,
390
  "r":0.952,
391
- "f":0.963562753
392
  },
393
  "root":{
394
- "p":0.8708133971,
395
- "r":0.8834951456,
396
- "f":0.8771084337
397
  },
398
  "obj":{
399
- "p":0.865497076,
400
- "r":0.8783382789,
401
- "f":0.8718703976
402
  },
403
  "cc":{
404
- "p":0.8986175115,
405
- "r":0.8986175115,
406
- "f":0.8986175115
407
  },
408
  "case":{
409
- "p":0.9708078751,
410
- "r":0.9741144414,
411
- "f":0.9724583475
412
  },
413
  "obl:mod":{
414
- "p":0.6719242902,
415
- "r":0.6358208955,
416
- "f":0.6533742331
417
  },
418
  "nmod":{
419
- "p":0.8217349857,
420
- "r":0.8611388611,
421
- "f":0.8409756098
422
  },
423
  "conj":{
424
- "p":0.5595238095,
425
  "r":0.5551181102,
426
- "f":0.557312253
427
  },
428
  "nummod":{
429
- "p":0.9299363057,
430
- "r":0.8639053254,
431
- "f":0.8957055215
432
  },
433
  "amod":{
434
- "p":0.9307116105,
435
- "r":0.9052823315,
436
- "f":0.917820868
437
  },
438
  "acl":{
439
- "p":0.7045454545,
440
- "r":0.7167630058,
441
- "f":0.7106017192
442
  },
443
  "mark":{
444
- "p":0.8928571429,
445
- "r":0.8810572687,
446
- "f":0.8869179601
447
  },
448
  "xcomp":{
449
- "p":0.8835616438,
450
- "r":0.8543046358,
451
- "f":0.8686868687
452
  },
453
  "flat:name":{
454
- "p":0.9047619048,
455
- "r":0.9047619048,
456
- "f":0.9047619048
457
  },
458
  "cop":{
459
- "p":0.8988764045,
460
- "r":0.8888888889,
461
- "f":0.8938547486
462
  },
463
  "advmod":{
464
- "p":0.858044164,
465
- "r":0.8526645768,
466
- "f":0.8553459119
467
  },
468
  "obl:arg":{
469
- "p":0.6832579186,
470
- "r":0.6863636364,
471
- "f":0.6848072562
472
  },
473
  "appos":{
474
- "p":0.5421686747,
475
- "r":0.5421686747,
476
- "f":0.5421686747
477
  },
478
  "nsubj:pass":{
479
- "p":0.880952381,
480
  "r":0.8705882353,
481
- "f":0.875739645
482
  },
483
  "aux:pass":{
484
- "p":0.9557522124,
485
- "r":0.9642857143,
486
- "f":0.96
487
  },
488
  "acl:relcl":{
489
- "p":0.686746988,
490
- "r":0.6627906977,
491
- "f":0.674556213
492
  },
493
  "advcl":{
494
- "p":0.4810126582,
495
- "r":0.4871794872,
496
- "f":0.4840764331
497
  },
498
  "fixed":{
499
- "p":0.7956989247,
500
- "r":0.74,
501
- "f":0.7668393782
502
  },
503
  "dep":{
504
- "p":0.253968254,
505
- "r":0.5517241379,
506
- "f":0.347826087
507
  },
508
  "expl:subj":{
509
- "p":0.8125,
510
- "r":0.8125,
511
- "f":0.8125
512
  },
513
  "expl:comp":{
514
- "p":0.7,
515
- "r":0.9333333333,
516
- "f":0.8
517
  },
518
  "expl:pass":{
519
- "p":0.4,
520
- "r":0.2857142857,
521
- "f":0.3333333333
522
  },
523
  "ccomp":{
524
- "p":0.7,
525
- "r":0.6862745098,
526
- "f":0.6930693069
527
- },
528
- "obl:agent":{
529
- "p":0.8888888889,
530
- "r":0.7619047619,
531
- "f":0.8205128205
532
  },
533
  "parataxis":{
534
- "p":0.6,
535
- "r":0.4285714286,
536
- "f":0.5
537
  },
538
  "iobj":{
539
- "p":0.8235294118,
540
- "r":0.56,
541
- "f":0.6666666667
 
 
 
 
 
542
  },
543
  "nsubj:caus":{
544
  "p":0.0,
@@ -561,9 +557,9 @@
561
  "f":0.0
562
  },
563
  "vocative":{
564
- "p":0.8333333333,
565
  "r":0.625,
566
- "f":0.7142857143
567
  },
568
  "dislocated":{
569
  "p":0.0,
@@ -571,9 +567,9 @@
571
  "f":0.0
572
  },
573
  "flat:foreign":{
574
- "p":1.0,
575
  "r":0.1428571429,
576
- "f":0.25
577
  },
578
  "orphan":{
579
  "p":0.0,
@@ -591,34 +587,34 @@
591
  "f":0.0
592
  }
593
  },
594
- "tag_acc":0.947075496,
595
- "lemma_acc":0.9094739005,
596
- "ents_p":0.843850032,
597
- "ents_r":0.8442216084,
598
- "ents_f":0.8440357793,
599
  "ents_per_type":{
600
  "PER":{
601
- "p":0.9113682873,
602
- "r":0.9253421222,
603
- "f":0.9183020478
604
  },
605
  "LOC":{
606
- "p":0.8441938007,
607
- "r":0.8578599515,
608
- "f":0.8509720119
609
  },
610
  "ORG":{
611
- "p":0.7981488775,
612
- "r":0.7734732824,
613
- "f":0.7856173677
614
  },
615
  "MISC":{
616
- "p":0.7300527786,
617
- "r":0.6856684648,
618
- "f":0.7071648748
619
  }
620
  },
621
- "speed":4058.7135091792
622
  },
623
  "sources":[
624
  {
 
1
  {
2
  "lang":"fr",
3
  "name":"core_news_lg",
4
+ "version":"3.3.0",
5
  "description":"French pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"[email protected]",
8
  "url":"https://explosion.ai",
9
  "license":"LGPL-LR",
10
+ "spacy_version":">=3.3.0.dev0,<3.4.0",
11
+ "spacy_git_version":"849bef2de",
12
  "vectors":{
13
  "width":300,
14
  "vectors":500000,
 
255
  "vocative",
256
  "xcomp"
257
  ],
 
 
 
 
258
  "attribute_ruler":[
259
 
260
  ],
 
293
  "token_p":0.9844389844,
294
  "token_r":0.9896058454,
295
  "token_f":0.9870156531,
296
+ "pos_acc":0.9745439555,
297
+ "morph_acc":0.9674226804,
298
+ "morph_micro_p":0.9879126273,
299
+ "morph_micro_r":0.9809309126,
300
+ "morph_micro_f":0.984409391,
301
  "morph_per_feat":{
302
  "Definite":{
303
+ "p":0.9904831625,
304
+ "r":0.9875912409,
305
+ "f":0.9890350877
306
  },
307
  "Number":{
308
+ "p":0.9933345677,
309
+ "r":0.9876656848,
310
+ "f":0.9904920151
311
  },
312
  "PronType":{
313
+ "p":0.995492595,
314
+ "r":0.9891234805,
315
+ "f":0.9922978177
316
  },
317
  "Gender":{
318
+ "p":0.9823212913,
319
+ "r":0.9798108868,
320
+ "f":0.9810644831
321
  },
322
  "Mood":{
323
+ "p":0.9764065336,
324
+ "r":0.9555950266,
325
+ "f":0.9658886894
326
  },
327
  "Person":{
328
+ "p":0.9910141207,
329
+ "r":0.9710691824,
330
+ "f":0.9809402795
331
  },
332
  "Tense":{
333
+ "p":0.9741735537,
334
+ "r":0.9632277835,
335
+ "f":0.9686697483
336
  },
337
  "VerbForm":{
338
+ "p":0.9840871022,
339
+ "r":0.9726821192,
340
+ "f":0.9783513739
341
  },
342
  "NumType":{
343
+ "p":1.0,
344
+ "r":0.9624573379,
345
+ "f":0.9808695652
346
  },
347
  "Reflex":{
348
  "p":1.0,
 
350
  "f":1.0
351
  },
352
  "Voice":{
353
+ "p":0.9316239316,
354
+ "r":0.9732142857,
355
+ "f":0.9519650655
356
  },
357
  "Poss":{
358
  "p":1.0,
 
361
  },
362
  "Polarity":{
363
  "p":1.0,
364
+ "r":0.9764705882,
365
+ "f":0.9880952381
366
  }
367
  },
368
+ "sents_p":0.8619047619,
369
+ "sents_r":0.8890685142,
370
+ "sents_f":0.8701923077,
371
+ "dep_uas":0.8971464953,
372
+ "dep_las":0.8595070015,
373
  "dep_las_per_type":{
374
  "det":{
375
+ "p":0.9846029173,
376
+ "r":0.98062954,
377
+ "f":0.9826122119
378
  },
379
  "nsubj":{
380
+ "p":0.8823529412,
381
+ "r":0.8674698795,
382
+ "f":0.8748481166
383
  },
384
  "aux:tense":{
385
+ "p":0.9224806202,
386
  "r":0.952,
387
+ "f":0.937007874
388
  },
389
  "root":{
390
+ "p":0.8747044917,
391
+ "r":0.8980582524,
392
+ "f":0.8862275449
393
  },
394
  "obj":{
395
+ "p":0.8868501529,
396
+ "r":0.8605341246,
397
+ "f":0.8734939759
398
  },
399
  "cc":{
400
+ "p":0.9049773756,
401
+ "r":0.9216589862,
402
+ "f":0.9132420091
403
  },
404
  "case":{
405
+ "p":0.9722598106,
406
+ "r":0.9788828338,
407
+ "f":0.9755600815
408
  },
409
  "obl:mod":{
410
+ "p":0.6437125749,
411
+ "r":0.6417910448,
412
+ "f":0.6427503737
413
  },
414
  "nmod":{
415
+ "p":0.8200192493,
416
+ "r":0.8511488511,
417
+ "f":0.8352941176
418
  },
419
  "conj":{
420
+ "p":0.561752988,
421
  "r":0.5551181102,
422
+ "f":0.5584158416
423
  },
424
  "nummod":{
425
+ "p":0.903030303,
426
+ "r":0.8816568047,
427
+ "f":0.8922155689
428
  },
429
  "amod":{
430
+ "p":0.9291044776,
431
+ "r":0.9071038251,
432
+ "f":0.9179723502
433
  },
434
  "acl":{
435
+ "p":0.683908046,
436
+ "r":0.6878612717,
437
+ "f":0.6858789625
438
  },
439
  "mark":{
440
+ "p":0.8755555556,
441
+ "r":0.8678414097,
442
+ "f":0.8716814159
443
  },
444
  "xcomp":{
445
+ "p":0.8671328671,
446
+ "r":0.821192053,
447
+ "f":0.843537415
448
  },
449
  "flat:name":{
450
+ "p":0.93,
451
+ "r":0.8857142857,
452
+ "f":0.9073170732
453
  },
454
  "cop":{
455
+ "p":0.8681318681,
456
+ "r":0.8777777778,
457
+ "f":0.8729281768
458
  },
459
  "advmod":{
460
+ "p":0.8444444444,
461
+ "r":0.8338557994,
462
+ "f":0.8391167192
463
  },
464
  "obl:arg":{
465
+ "p":0.7242990654,
466
+ "r":0.7045454545,
467
+ "f":0.7142857143
468
  },
469
  "appos":{
470
+ "p":0.4698795181,
471
+ "r":0.4698795181,
472
+ "f":0.4698795181
473
  },
474
  "nsubj:pass":{
475
+ "p":0.8604651163,
476
  "r":0.8705882353,
477
+ "f":0.865497076
478
  },
479
  "aux:pass":{
480
+ "p":0.9304347826,
481
+ "r":0.9553571429,
482
+ "f":0.9427312775
483
  },
484
  "acl:relcl":{
485
+ "p":0.6419753086,
486
+ "r":0.6046511628,
487
+ "f":0.622754491
488
  },
489
  "advcl":{
490
+ "p":0.4659090909,
491
+ "r":0.5256410256,
492
+ "f":0.4939759036
493
  },
494
  "fixed":{
495
+ "p":0.7604166667,
496
+ "r":0.73,
497
+ "f":0.7448979592
498
  },
499
  "dep":{
500
+ "p":0.2033898305,
501
+ "r":0.4137931034,
502
+ "f":0.2727272727
503
  },
504
  "expl:subj":{
505
+ "p":0.78125,
506
+ "r":0.78125,
507
+ "f":0.78125
508
  },
509
  "expl:comp":{
510
+ "p":0.6341463415,
511
+ "r":0.8666666667,
512
+ "f":0.7323943662
513
  },
514
  "expl:pass":{
515
+ "p":0.3333333333,
516
+ "r":0.1428571429,
517
+ "f":0.2
518
  },
519
  "ccomp":{
520
+ "p":0.8,
521
+ "r":0.7058823529,
522
+ "f":0.75
 
 
 
 
 
523
  },
524
  "parataxis":{
525
+ "p":0.6923076923,
526
+ "r":0.3214285714,
527
+ "f":0.4390243902
528
  },
529
  "iobj":{
530
+ "p":0.8,
531
+ "r":0.48,
532
+ "f":0.6
533
+ },
534
+ "obl:agent":{
535
+ "p":0.9210526316,
536
+ "r":0.8333333333,
537
+ "f":0.875
538
  },
539
  "nsubj:caus":{
540
  "p":0.0,
 
557
  "f":0.0
558
  },
559
  "vocative":{
560
+ "p":1.0,
561
  "r":0.625,
562
+ "f":0.7692307692
563
  },
564
  "dislocated":{
565
  "p":0.0,
 
567
  "f":0.0
568
  },
569
  "flat:foreign":{
570
+ "p":0.5,
571
  "r":0.1428571429,
572
+ "f":0.2222222222
573
  },
574
  "orphan":{
575
  "p":0.0,
 
587
  "f":0.0
588
  }
589
  },
590
+ "tag_acc":0.9457899619,
591
+ "lemma_acc":0.9082408549,
592
+ "ents_p":0.8424958383,
593
+ "ents_r":0.8407589768,
594
+ "ents_f":0.8416265115,
595
  "ents_per_type":{
596
  "PER":{
597
+ "p":0.9084447572,
598
+ "r":0.9249122304,
599
+ "f":0.9166045372
600
  },
601
  "LOC":{
602
+ "p":0.84611891,
603
+ "r":0.8572324939,
604
+ "f":0.8516394465
605
  },
606
  "ORG":{
607
+ "p":0.7897670715,
608
+ "r":0.7570610687,
609
+ "f":0.7730683036
610
  },
611
  "MISC":{
612
+ "p":0.7238526382,
613
+ "r":0.6760460709,
614
+ "f":0.6991330569
615
  }
616
  },
617
+ "speed":4250.2217767942
618
  },
619
  "sources":[
620
  {
morphologizer/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:e8d2be042fd235a4c9504e62256764658b05e4c9acb04da98f455f41de6b5227
3
- size 76433
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:78ad060331a930372daa8dce4acd41b0753b20835e4cee9c7919ec0b1309cacb
3
+ size 76485
ner/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:62e96bb7c90a432aaeceb54b56a34c799a9067b3a25b976eaa16734feab7e2f7
3
- size 7091792
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1a5e1779a1f459969e4075de48031e46e0a9167dfb717f4ae9fb6ef590cf4905
3
+ size 6496592
parser/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:e22b464c9877cd55d1284b65859a35df5ca640c6f3644353a1d0a5b1911c7f34
3
  size 304828
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9e776d762e856efddbff68e9d84a33d690dbaed07d3191e9d8186d24a345bcdb
3
  size 304828
parser/moves CHANGED
@@ -1 +1 @@
1
- ��moves��{"0":{"":25255},"1":{"":21680},"2":{"case":7258,"det":6062,"nsubj":1982,"punct":1645,"advmod":1210,"cc":1205,"mark":1051,"aux:tense":673,"amod":662,"nummod":595,"aux:pass":544,"obl:mod":483,"nsubj:pass":425,"cop":365,"expl:comp":204,"obj":170,"expl:subj":164,"iobj":139,"advcl":123,"nmod":92,"expl:pass":40,"vocative":35,"dep":0},"3":{"nmod":5132,"punct":3954,"amod":2083,"conj":1517,"obj":1410,"obl:mod":1184,"obl:arg":1078,"acl":782,"xcomp":739,"flat:name":657,"advmod":562,"fixed":409,"appos":408,"acl:relcl":365,"advcl":306,"ccomp":238,"obl:agent":206,"dep":138,"nummod":117,"parataxis":92,"nsubj":75,"flat:foreign":63},"4":{"ROOT":2219}}�cfg��neg_key�
 
1
+ ��moves��{"0":{"":25345},"1":{"":21571},"2":{"case":7318,"det":6066,"nsubj":1969,"punct":1660,"cc":1214,"advmod":1209,"mark":1055,"aux:tense":673,"amod":664,"nummod":609,"aux:pass":546,"obl:mod":480,"nsubj:pass":420,"cop":366,"expl:comp":204,"obj":170,"expl:subj":165,"iobj":139,"advcl":123,"nmod":92,"expl:pass":40,"vocative":35,"dep":0},"3":{"nmod":4995,"punct":4040,"amod":2051,"conj":1514,"obj":1405,"obl:mod":1188,"obl:arg":1070,"acl":785,"xcomp":739,"flat:name":622,"advmod":564,"fixed":413,"appos":412,"acl:relcl":368,"advcl":306,"ccomp":238,"obl:agent":203,"dep":142,"nummod":124,"parataxis":95,"nsubj":76,"flat:foreign":59},"4":{"ROOT":2231}}�cfg��neg_key�
senter/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:67384ea49c3a155ab1e76bb200b16ab582952bb3afc0fa61ba7de5c45611e836
3
- size 219901
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:89e920872f6b2b236a7b15c8512c211271cffa56153a7cba51a95e43900707a4
3
+ size 219953
tok2vec/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:b0db6a263e9e98c1ddbb79012aad2e62cb056179274078e49ee9cd0f1c581966
3
- size 6960804
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6790a9d5cfdb564c69805f3db8569322d57796b9876032eb7d30d6537b43def8
3
+ size 6365604
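Note on the size changes above: the `tok2vec/model` and `ner/model` pointers shrink by the same number of bytes. One plausible explanation, offered as a back-of-the-envelope check and not a confirmed cause, is the `rows` change in `config.cfg`. The sketch assumes float32 weights and an embedding width of 96 for both components; `width = 96` is shown for the `ner` embed block, while the `tok2vec` width is an interpolated value that the diff does not spell out.

```python
OLD_ROWS = [5000, 2500, 2500, 2500, 100]
NEW_ROWS = [5000, 1000, 2500, 2500, 50]
WIDTH = 96           # shown for the ner embed block; assumed for tok2vec
BYTES_PER_FLOAT = 4  # assumption: float32 weights

# Rows removed across all hashed embedding tables.
removed_rows = sum(OLD_ROWS) - sum(NEW_ROWS)
expected_saving = removed_rows * WIDTH * BYTES_PER_FLOAT

# Sizes copied from the Git LFS pointer diffs above.
tok2vec_saving = 6960804 - 6365604
ner_saving = 7091792 - 6496592

print(removed_rows, expected_saving, tok2vec_saving, ner_saving)
# prints: 1550 595200 595200 595200
```

Under these assumptions the predicted saving matches both observed size differences exactly.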
tokenizer CHANGED
The diff for this file is too large to render.
 
vocab/key2row CHANGED
Binary files a/vocab/key2row and b/vocab/key2row differ