---
library_name: transformers
license: apache-2.0
base_model: bert-base-uncased
tags:
- generated_from_keras_callback
model-index:
- name: huseyincenik/conll_ner_with_bert
  results: []
datasets:
- tner/conll2003
language:
- en
metrics:
- accuracy
pipeline_tag: token-classification
---


# huseyincenik/conll_ner_with_bert

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the CoNLL-2003 dataset for Named Entity Recognition (NER). 

## Model description

The model is `bert-base-uncased` with a token-classification head, fine-tuned on CoNLL-2003, a standard benchmark for NER. Because the base model is uncased, input text is lowercased during tokenization.

## Intended uses & limitations

### Intended Uses

- **Named Entity Recognition**: This model is designed to identify and classify named entities in text into categories such as location (LOC), organization (ORG), person (PER), and miscellaneous (MISC).

### Limitations

- **Domain Specificity**: The model was fine-tuned on the CoNLL-2003 dataset, which consists of news articles. It may not generalize well to other domains or types of text not represented in the training data.
- **Subword Tokens**: The model may occasionally tag subword tokens as entities, requiring post-processing to handle these cases.
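
The post-processing mentioned above can be sketched as follows. This is a minimal illustration, not the author's code: it assumes WordPiece continuation tokens are marked with a leading `##` (as in BERT tokenizers) and that each word should keep the label predicted for its first piece. The function name `merge_wordpieces` and the sample predictions are hypothetical.

```python
from typing import List, Tuple


def merge_wordpieces(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Merge WordPiece continuation tokens ("##...") into whole words.

    Each merged word keeps the label predicted for its first piece.
    """
    words: List[Tuple[str, str]] = []
    for token, label in zip(tokens, labels):
        if token.startswith("##") and words:
            # Continuation piece: append its text to the previous word
            # and discard its own label.
            prev_word, prev_label = words[-1]
            words[-1] = (prev_word + token[2:], prev_label)
        else:
            words.append((token, label))
    return words


# Hypothetical token-level predictions, for illustration only.
tokens = ["hu", "##sey", "##in", "lives", "in", "istanbul"]
labels = ["B-PER", "B-PER", "I-PER", "O", "O", "B-LOC"]
merged = merge_wordpieces(tokens, labels)
print(merged)
```

When using the `transformers` pipeline, passing an `aggregation_strategy` such as `"simple"` or `"first"` is an alternative way to group subword predictions.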

## Training and evaluation data

- **Training Dataset**: CoNLL-2003
- **Training Evaluation Metrics**:

| Label   | Precision | Recall | F1-Score | Support |
|---------|-----------|--------|----------|---------|
| B-PER   | 0.98      | 0.98   | 0.98     | 11273   |
| I-PER   | 0.98      | 0.99   | 0.99     | 9323    |
| B-ORG   | 0.88      | 0.92   | 0.90     | 10447   |
| I-ORG   | 0.81      | 0.92   | 0.86     | 5137    |
| B-LOC   | 0.86      | 0.94   | 0.90     | 9621    |
| I-LOC   | 1.00      | 0.08   | 0.14     | 1267    |
| B-MISC  | 0.81      | 0.73   | 0.77     | 4793    |
| I-MISC  | 0.83      | 0.36   | 0.50     | 1329    |
| **Micro Avg** | **0.90** | **0.90** | **0.90** | **53190** |
| **Macro Avg** | **0.89** | **0.74** | **0.75** | **53190** |
| **Weighted Avg** | **0.90** | **0.90** | **0.89** | **53190** |


- **Validation Evaluation Metrics**:

| Label   | Precision | Recall | F1-Score | Support |
|---------|-----------|--------|----------|---------|
| B-PER   | 0.97      | 0.98   | 0.97     | 3018    |
| I-PER   | 0.98      | 0.98   | 0.98     | 2741    |
| B-ORG   | 0.86      | 0.91   | 0.88     | 2056    |
| I-ORG   | 0.77      | 0.81   | 0.79     | 900     |
| B-LOC   | 0.86      | 0.94   | 0.90     | 2618    |
| I-LOC   | 1.00      | 0.10   | 0.18     | 281     |
| B-MISC  | 0.77      | 0.74   | 0.76     | 1231    |
| I-MISC  | 0.77      | 0.34   | 0.48     | 390     |
| **Micro Avg** | **0.90** | **0.89** | **0.89** | **13235** |
| **Macro Avg** | **0.87** | **0.73** | **0.74** | **13235** |
| **Weighted Avg** | **0.90** | **0.89** | **0.88** | **13235** |


- **Test Evaluation Metrics**:

| Label   | Precision | Recall | F1-Score | Support |
|---------|-----------|--------|----------|---------|
| B-PER   | 0.96      | 0.95   | 0.96     | 2714    |
| I-PER   | 0.98      | 0.99   | 0.98     | 2487    |
| B-ORG   | 0.81      | 0.87   | 0.84     | 2588    |
| I-ORG   | 0.74      | 0.87   | 0.80     | 1050    |
| B-LOC   | 0.81      | 0.90   | 0.85     | 2121    |
| I-LOC   | 0.89      | 0.12   | 0.22     | 276     |
| B-MISC  | 0.75      | 0.67   | 0.71     | 996     |
| I-MISC  | 0.85      | 0.49   | 0.62     | 241     |
| **Micro Avg** | **0.87** | **0.88** | **0.87** | **12473** |
| **Macro Avg** | **0.85** | **0.73** | **0.75** | **12473** |
| **Weighted Avg** | **0.87** | **0.88** | **0.86** | **12473** |
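
The gap between the macro and weighted averages comes from how each is computed: the macro average treats every label equally, so the low recall on I-LOC and I-MISC pulls it down, while the weighted average scales each label by its support. The sketch below recomputes both for recall from the test table. It uses the rounded per-label values shown in the table, so the results are approximate; the helper names are illustrative.

```python
# Per-label (recall, support) pairs copied from the test table.
test_recall = {
    "B-PER": (0.95, 2714),
    "I-PER": (0.99, 2487),
    "B-ORG": (0.87, 2588),
    "I-ORG": (0.87, 1050),
    "B-LOC": (0.90, 2121),
    "I-LOC": (0.12, 276),
    "B-MISC": (0.67, 996),
    "I-MISC": (0.49, 241),
}


def macro_average(rows):
    """Unweighted mean: every label counts equally."""
    return sum(score for score, _ in rows.values()) / len(rows)


def weighted_average(rows):
    """Mean weighted by support: frequent labels dominate."""
    total_support = sum(support for _, support in rows.values())
    return sum(score * support for score, support in rows.values()) / total_support


macro_recall = macro_average(test_recall)
weighted_recall = weighted_average(test_recall)
print(f"macro recall: {macro_recall:.2f}, weighted recall: {weighted_recall:.2f}")
```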




## Training procedure

### Training Hyperparameters

- **Optimizer**: AdamWeightDecay
  - Learning Rate: 2e-05
  - Decay Schedule: PolynomialDecay
  - Warmup Steps: 0.1
  - Weight Decay Rate: 0.01

- **Training Precision**: float32

### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 0.1016     | 0.0254          | 0     |
| 0.0228     | 0.0180          | 1     |

### Optimizer Details

```python
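from transformers import create_optimizer

# `tokenized_conll` is the tokenized CoNLL-2003 dataset from the training script (not shown here).
batch_size = 32
num_train_epochs = 2
num_train_steps = (len(tokenized_conll["train"]) // batch_size) * num_train_epochs

# Build the AdamWeightDecay optimizer and its learning-rate schedule.
optimizer, lr_schedule = create_optimizer(
    init_lr=2e-5,
    num_train_steps=num_train_steps,
    weight_decay_rate=0.01,
    # Value used for this run. `num_warmup_steps` is a step count, so for a
    # 10% warmup pass int(0.1 * num_train_steps) instead.
    num_warmup_steps=0.1,
)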
```
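
To get a feel for the decay schedule, the following sketch evaluates a linear (power 1.0) polynomial decay in plain Python. It assumes the schedule decays to a final learning rate of 0 and ignores warmup. The training-set size of 14,000 is a hypothetical stand-in for `len(tokenized_conll["train"])`, and `linear_decay_lr` is an illustrative helper, not part of `transformers`.

```python
def linear_decay_lr(step: int, init_lr: float, decay_steps: int, end_lr: float = 0.0) -> float:
    """Polynomial decay with power 1.0: the rate falls linearly from init_lr to end_lr."""
    step = min(step, decay_steps)  # hold at end_lr once decay is finished
    return (init_lr - end_lr) * (1 - step / decay_steps) + end_lr


batch_size = 32
num_train_epochs = 2
num_examples = 14_000  # hypothetical training-set size, for illustration only
num_train_steps = (num_examples // batch_size) * num_train_epochs

# Learning rate at the start, midpoint, and end of training.
for step in (0, num_train_steps // 2, num_train_steps):
    print(step, linear_decay_lr(step, 2e-5, num_train_steps))
```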

## How to Use

### Using a Pipeline

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Option 1: high-level pipeline that handles tokenization and label decoding.
pipe = pipeline("token-classification", model="huseyincenik/conll_ner_with_bert")

# Option 2: load the tokenizer and model directly for custom inference.
tokenizer = AutoTokenizer.from_pretrained("huseyincenik/conll_ner_with_bert")
model = AutoModelForTokenClassification.from_pretrained("huseyincenik/conll_ner_with_bert")
```

The model predicts the following labels:

Abbreviation|Description
-|-
O|Outside of a named entity
B-MISC|Beginning of a miscellaneous entity
I-MISC|Inside a miscellaneous entity
B-PER|Beginning of a person’s name
I-PER|Inside a person’s name
B-ORG|Beginning of an organization
I-ORG|Inside an organization
B-LOC|Beginning of a location
I-LOC|Inside a location
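
As an illustration of how these labels combine into entities, the sketch below groups a sequence of word-level tags into spans. It is a minimal example under stated assumptions, not the author's decoding code: every `B-` tag starts a new span, an `I-` tag of the same type extends the current span, and a stray `I-` tag with no open span of that type is treated leniently as a new span. The function name `bio_to_spans` is hypothetical.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Group BIO tags into (entity_type, start, end) spans, with end exclusive."""
    spans: List[Tuple[str, int, int]] = []
    current: Optional[str] = None  # type of the span currently open, if any
    start = 0
    # A trailing "O" sentinel flushes a span that runs to the end of the input.
    for i, tag in enumerate(tags + ["O"]):
        prefix, _, etype = tag.partition("-")
        continues = prefix == "I" and etype == current
        if current is not None and not continues:
            spans.append((current, start, i))
            current = None
        if prefix == "B" or (prefix == "I" and current is None):
            current, start = etype, i
    return spans


example_tags = ["B-PER", "I-PER", "O", "B-LOC", "B-LOC"]
example_spans = bio_to_spans(example_tags)
print(example_spans)
```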


### CoNLL-2003 English Dataset Statistics
This dataset was derived from the Reuters corpus, which consists of Reuters news stories. The CoNLL-2003 paper describes how the dataset was created.

#### Number of entities per type
Dataset|LOC|MISC|ORG|PER
-|-|-|-|-
Train|7140|3438|6321|6600
Dev|1837|922|1341|1842
Test|1668|702|1661|1617

#### Number of articles, sentences, and tokens per dataset
Dataset |Articles |Sentences |Tokens
-|-|-|-
Train |946 |14,987 |203,621
Dev |216 |3,466 |51,362
Test |231 |3,684 |46,435

### Framework versions

- Transformers 4.45.0.dev0
- TensorFlow 2.17.0
- Datasets 2.21.0
- Tokenizers 0.19.1