initial commit
README.md CHANGED
@@ -3,17 +3,17 @@ tags:
 - flair
 - token-classification
 - sequence-tagger-model
-language:
+language: nl
 datasets:
 - conll2003
 inference: false
 ---

-##
+## Dutch NER in Flair (large model)

-This is the large 4-class NER model for
+This is the large 4-class NER model for Dutch that ships with [Flair](https://github.com/flairNLP/flair/).

-F1-Score: **
+F1-Score: **95,25** (CoNLL-03 Dutch)

 **! This model only works with Flair version 0.8 (will be released in the next few days) !**

@@ -39,10 +39,10 @@ from flair.data import Sentence
 from flair.models import SequenceTagger

 # load tagger
-tagger = SequenceTagger.load("flair/ner-
+tagger = SequenceTagger.load("flair/ner-dutch-large")

 # make example sentence
-sentence = Sentence("George Washington ging
+sentence = Sentence("George Washington ging naar Washington")

 # predict NER tags
 tagger.predict(sentence)
@@ -64,7 +64,7 @@ Span [1,2]: "George Washington" [− Labels: PER (1.0)]
 Span [5]: "Washington" [− Labels: LOC (1.0)]
 ```

-So, the entities "*George Washington*" (labeled as a **person**) and "*Washington*" (labeled as a **location**) are found in the sentence "*George Washington ging
+So, the entities "*George Washington*" (labeled as a **person**) and "*Washington*" (labeled as a **location**) are found in the sentence "*George Washington ging naar Washington*".


 ---
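As an aside, the two hunks above add the model card's usage snippet in fragments. Here is a minimal, self-contained sketch of that flow; the closing loop over `sentence.get_spans('ner')`, which produces the `Span [...]` lines quoted above, is assumed from Flair's standard 0.8 API rather than shown in this diff.

```python
# Minimal usage sketch; requires Flair 0.8 (see the note above).
from flair.data import Sentence
from flair.models import SequenceTagger

# load tagger
tagger = SequenceTagger.load("flair/ner-dutch-large")

# make example sentence
sentence = Sentence("George Washington ging naar Washington")

# predict NER tags
tagger.predict(sentence)

# print the sentence with its predicted tags
print(sentence)

# iterate over entities and print each span with its label
# (assumed API, not part of the hunks above)
for entity in sentence.get_spans('ner'):
    print(entity)
```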
@@ -77,9 +77,9 @@ The following Flair script was used to train this model:
 import torch

 # 1. get the corpus
-from flair.datasets import
+from flair.datasets import CONLL_03_DUTCH

-corpus =
+corpus = CONLL_03_DUTCH()

 # 2. what tag do we want to predict?
 tag_type = 'ner'
@@ -119,7 +119,7 @@ trainer = ModelTrainer(tagger, corpus, optimizer=torch.optim.AdamW)
 # 7. run training with XLM parameters (20 epochs, small LR)
 from torch.optim.lr_scheduler import OneCycleLR

-trainer.train('resources/taggers/ner-
+trainer.train('resources/taggers/ner-dutch-large',
               learning_rate=5.0e-6,
               mini_batch_size=4,
               mini_batch_chunk_size=1,
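The training hunks above only show fragments of the script (steps 1, 2 and 7). Below is a hedged end-to-end sketch of how those fragments typically fit together in a Flair 0.8 fine-tuning run; steps 3 to 6, the `xlm-roberta-large` embedding choice, and every `trainer.train()` argument after `mini_batch_chunk_size` are assumptions consistent with the "XLM parameters" comment, not content of this commit.

```python
# Hedged sketch of the full training script the hunks above come from.
# Anything not visible in the diff is marked as assumed below.
import torch
from flair.datasets import CONLL_03_DUTCH
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer
from torch.optim.lr_scheduler import OneCycleLR

# 1. get the corpus (shown in the diff)
corpus = CONLL_03_DUTCH()

# 2. what tag do we want to predict? (shown in the diff)
tag_type = 'ner'

# 3. make the tag dictionary from the corpus (assumed step)
tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)

# 4. initialize fine-tunable transformer embeddings (assumed: XLM-R large,
#    consistent with the "XLM parameters" comment in the diff)
embeddings = TransformerWordEmbeddings(
    model='xlm-roberta-large',
    layers='-1',
    subtoken_pooling='first',
    fine_tune=True,
    use_context=True,
)

# 5. initialize a bare-bones sequence tagger (assumed: no CRF, no RNN,
#    no reprojection, as is typical when fine-tuning transformer embeddings)
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=tag_dictionary,
    tag_type=tag_type,
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# 6. initialize trainer with AdamW (shown in the hunk header above)
trainer = ModelTrainer(tagger, corpus, optimizer=torch.optim.AdamW)

# 7. fine-tune with a small learning rate and OneCycleLR (arguments beyond
#    mini_batch_chunk_size are assumed, e.g. the 20 epochs mentioned above)
trainer.train('resources/taggers/ner-dutch-large',
              learning_rate=5.0e-6,
              mini_batch_size=4,
              mini_batch_chunk_size=1,
              max_epochs=20,
              scheduler=OneCycleLR,
              embeddings_storage_mode='none',
              weight_decay=0.,
              )
```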