---
library_name: transformers
tags:
- bert
- ner
license: apache-2.0
datasets:
- eriktks/conll2003
base_model:
- google-bert/bert-base-uncased
pipeline_tag: token-classification
language:
- en

model-index:
- name: bert-named-entity-recognition
  results:
  - task:
      type: token-classification
      name: Token Classification
    dataset:
      name: conll2003
      type: conll2003
      config: conll2003
      split: test
    metrics:
    - name: Precision
      type: precision
      value: 0.8992
      verified: true
    - name: Recall
      type: recall
      value: 0.9115
      verified: true
    - name: F1
      type: f1
      value: 0.9053
      verified: true
    - name: Loss
      type: loss
      value: 0.040937
      verified: true
---

# Model Card for BERT Named Entity Recognition

### Model Description

This is a fine-tuned version of `google-bert/bert-base-uncased`, trained to perform Named Entity Recognition on English text input.

- **Developed by:** [Sartaj](https://huggingface.co/sartajbhuvaji)
- **Finetuned from model:** `google-bert/bert-base-uncased`
- **Language(s):** English
- **License:** apache-2.0
- **Framework:** Hugging Face Transformers

### Model Sources 

- **Repository:** [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased)
- **Paper:** [BERT-paper](https://huggingface.co/papers/1810.04805)

## Uses

The model can be used to recognize named entities (persons, locations, organizations, and miscellaneous entities) in text.

## Usage

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

tokenizer = AutoTokenizer.from_pretrained("sartajbhuvaji/bert-named-entity-recognition")
model = AutoModelForTokenClassification.from_pretrained("sartajbhuvaji/bert-named-entity-recognition")

nlp = pipeline("ner", model=model, tokenizer=tokenizer)
example = "My name is Wolfgang and I live in Berlin"

ner_results = nlp(example)
print(ner_results)
```

```json
[
  {
    "end": 19,
    "entity": "B-PER",
    "index": 4,
    "score": 0.99633455,
    "start": 11,
    "word": "wolfgang"
  },
  {
    "end": 40,
    "entity": "B-LOC",
    "index": 9,
    "score": 0.9987465,
    "start": 34,
    "word": "berlin"
  }
]
```

## Training Details

- **Dataset:** [eriktks/conll2003](https://huggingface.co/datasets/eriktks/conll2003)

| Abbreviation | Description |
|---|---|
| O | Outside of a named entity |
| B-MISC | Beginning of a miscellaneous entity right after another miscellaneous entity |
| I-MISC | Miscellaneous entity |
| B-PER | Beginning of a person's name right after another person's name |
| I-PER | Person's name |
| B-ORG | Beginning of an organization right after another organization |
| I-ORG | Organization |
| B-LOC | Beginning of a location right after another location |
| I-LOC | Location |
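As the table shows, CoNLL-2003 uses IOB tags: `I-` marks tokens inside an entity, and `B-` only appears when a new entity of the same type directly follows another. A minimal sketch of how such tag sequences group into entity spans (illustrative helper, not part of this repository):

```python
def iob_to_spans(tokens, tags):
    """Group IOB-tagged tokens into (entity_type, text) spans.

    A B- tag always starts a new span; an I- tag starts one only when
    no span of the same type is already open (the IOB1 convention used
    in the table above); O closes any open span.
    """
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag == "O":
            if current:
                spans.append(current)
                current = None
            continue
        prefix, etype = tag.split("-", 1)
        if prefix == "B" or current is None or current[0] != etype:
            if current:
                spans.append(current)
            current = (etype, [token])
        else:
            current[1].append(token)
    if current:
        spans.append(current)
    return [(etype, " ".join(words)) for etype, words in spans]


tokens = ["My", "name", "is", "Wolfgang", "and", "I", "live", "in", "Berlin"]
tags = ["O", "O", "O", "I-PER", "O", "O", "O", "O", "I-LOC"]
print(iob_to_spans(tokens, tags))  # [('PER', 'Wolfgang'), ('LOC', 'Berlin')]
```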


### Training Procedure

- Full model fine-tune
- Epochs: 5

#### Training Loss Curves

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6354695712edd0ed5dc46b04/vVra4giLk3EPjXo48Sbax.png)


## Trainer
- global_step: 4390
- train_loss: 0.040937909830132485
- train_runtime: 206.3611
- train_samples_per_second: 340.205
- train_steps_per_second: 21.273
- total_flos: 1702317283240608.0
- epoch: 5.0
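The reported step count is consistent with the CoNLL-2003 training split (14,041 sentences) and a per-device batch size of 16; note the batch size is an inference, not stated in this card:

```python
import math

train_sentences = 14_041  # CoNLL-2003 train split (sentence count)
batch_size = 16           # assumed; not stated in this card
epochs = 5

steps_per_epoch = math.ceil(train_sentences / batch_size)
print(steps_per_epoch * epochs)  # 4390, matching global_step above
```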

## Evaluation

- Precision: 0.8992
- Recall: 0.9115
- F1 Score: 0.9053

### Classification Report

| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| LOC | 0.91 | 0.93 | 0.92 | 1668 |
| MISC | 0.76 | 0.81 | 0.78 | 702 |
| ORG | 0.87 | 0.88 | 0.88 | 1661 |
| PER | 0.98 | 0.97 | 0.97 | 1617 |
| **Micro Avg** | 0.90 | 0.91 | 0.91 | 5648 |
| **Macro Avg** | 0.88 | 0.90 | 0.89 | 5648 |
| **Weighted Avg** | 0.90 | 0.91 | 0.91 | 5648 |

- Evaluation dataset: [eriktks/conll2003](https://huggingface.co/datasets/eriktks/conll2003) (test split)
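The macro and weighted averages in the report can be reproduced from the per-class rows (a quick consistency check; values rounded to two decimals as in the table):

```python
# Per-class (precision, recall, f1, support) rows from the classification report
classes = {
    "LOC":  (0.91, 0.93, 0.92, 1668),
    "MISC": (0.76, 0.81, 0.78, 702),
    "ORG":  (0.87, 0.88, 0.88, 1661),
    "PER":  (0.98, 0.97, 0.97, 1617),
}

total_support = sum(row[3] for row in classes.values())
# Macro average: unweighted mean over classes
macro_precision = sum(row[0] for row in classes.values()) / len(classes)
# Weighted average: mean weighted by each class's support
weighted_precision = sum(row[0] * row[3] for row in classes.values()) / total_support

print(round(macro_precision, 2))     # 0.88
print(round(weighted_precision, 2))  # 0.9
```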