|
--- |
|
library_name: transformers |
|
tags: |
|
- bert |
|
- ner |
|
license: apache-2.0 |
|
datasets: |
|
- eriktks/conll2003 |
|
base_model: |
|
- google-bert/bert-base-uncased |
|
pipeline_tag: token-classification |
|
language: |
|
- en |
|
|
|
model-index:
- name: bert-named-entity-recognition
  results:
  - task:
      type: token-classification
      name: Token Classification
    dataset:
      name: conll2003
      type: eriktks/conll2003
      config: conll2003
      split: test
    metrics:
    - name: Precision
      type: precision
      value: 0.8992
      verified: true
    - name: Recall
      type: recall
      value: 0.9115
      verified: true
    - name: F1
      type: f1
      value: 0.9053
      verified: true
    - name: Loss
      type: loss
      value: 0.040937
      verified: true
|
--- |
|
|
|
# Model Card for Bert Named Entity Recognition |
|
|
|
### Model Description |
|
|
|
This is a fine-tuned version of `google-bert/bert-base-uncased`, designed to perform Named Entity Recognition on input text.
|
|
|
- **Developed by:** [Sartaj](https://huggingface.co/sartajbhuvaji) |
|
- **Finetuned from model:** `google-bert/bert-base-uncased` |
|
- **Language(s):** English |
|
- **License:** apache-2.0 |
|
- **Framework:** Hugging Face Transformers |
|
|
|
### Model Sources |
|
|
|
- **Base model:** [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased)
- **Paper:** [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://huggingface.co/papers/1810.04805)
|
|
|
## Uses |
|
|
|
The model recognizes named entities (persons, organizations, locations, and miscellaneous entities) in English text.
|
|
|
## Usage |
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load the fine-tuned model and its tokenizer from the Hub
tokenizer = AutoTokenizer.from_pretrained("sartajbhuvaji/bert-named-entity-recognition")
model = AutoModelForTokenClassification.from_pretrained("sartajbhuvaji/bert-named-entity-recognition")

# Build a token-classification (NER) pipeline
nlp = pipeline("ner", model=model, tokenizer=tokenizer)

example = "My name is Wolfgang and I live in Berlin"
ner_results = nlp(example)
print(ner_results)
```
|
|
|
```json |
|
[ |
|
{ |
|
"end": 19, |
|
"entity": "B-PER", |
|
"index": 4, |
|
"score": 0.99633455, |
|
"start": 11, |
|
"word": "wolfgang" |
|
}, |
|
{ |
|
"end": 40, |
|
"entity": "B-LOC", |
|
"index": 9, |
|
"score": 0.9987465, |
|
"start": 34, |
|
"word": "berlin" |
|
} |
|
] |
|
``` |
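The raw pipeline output is per-token. To merge tokens into whole entity spans, you can pass `aggregation_strategy="simple"` to the pipeline, or group the IOB tags yourself. A minimal post-processing sketch (the `group_entities` helper is hypothetical, written against the output shape shown above):

```python
def group_entities(tokens):
    """Merge consecutive IOB-tagged tokens into entity spans.

    `tokens` is a list of dicts shaped like the pipeline output above:
    each has "entity" (e.g. "B-PER"/"I-PER"), "word", "start", "end".
    """
    groups = []
    for tok in tokens:
        prefix, _, label = tok["entity"].partition("-")
        # Start a new span on a "B-" tag or when the label changes.
        if prefix == "B" or not groups or groups[-1]["label"] != label:
            groups.append({"label": label, "words": [tok["word"]],
                           "start": tok["start"], "end": tok["end"]})
        else:
            groups[-1]["words"].append(tok["word"])
            groups[-1]["end"] = tok["end"]
    return [{"label": g["label"], "text": " ".join(g["words"]),
             "start": g["start"], "end": g["end"]} for g in groups]

raw = [
    {"end": 19, "entity": "B-PER", "index": 4, "score": 0.9963, "start": 11, "word": "wolfgang"},
    {"end": 40, "entity": "B-LOC", "index": 9, "score": 0.9987, "start": 34, "word": "berlin"},
]
print(group_entities(raw))
# → [{'label': 'PER', 'text': 'wolfgang', 'start': 11, 'end': 19},
#    {'label': 'LOC', 'text': 'berlin', 'start': 34, 'end': 40}]
```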
|
|
|
## Training Details |
|
|
|
- **Dataset** : [eriktks/conll2003](https://huggingface.co/datasets/eriktks/conll2003) |
|
|
|
| Abbreviation | Description | |
|
|---|---| |
|
| O | Outside of a named entity | |
|
| B-MISC | Beginning of a miscellaneous entity right after another miscellaneous entity | |
|
| I-MISC | Miscellaneous entity | |
|
| B-PER | Beginning of a person's name right after another person's name | |
|
| I-PER | Person's name | |
|
| B-ORG | Beginning of an organization right after another organization | |
|
| I-ORG | Organization | |
|
| B-LOC | Beginning of a location right after another location | |
|
| I-LOC | Location | |
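The nine tags above map to integer class ids in the dataset's `ner_tags` feature. A sketch of the mapping, assuming the standard `conll2003` label order (verify against the `id2label` entry in this model's `config.json`):

```python
# Standard CoNLL-2003 label order used by the HF conll2003 dataset
# (assumption -- confirm against the model config's id2label mapping).
LABELS = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG",
          "B-LOC", "I-LOC", "B-MISC", "I-MISC"]

id2label = dict(enumerate(LABELS))
label2id = {label: i for i, label in enumerate(LABELS)}

print(id2label[1])        # B-PER
print(label2id["I-LOC"])  # 6
```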
|
|
|
|
|
### Training Procedure |
|
|
|
- Full model fine-tune (all parameters updated)
- Epochs: 5
|
|
|
#### Training Loss Curves |
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6354695712edd0ed5dc46b04/vVra4giLk3EPjXo48Sbax.png) |
|
|
|
|
|
## Trainer |
|
- global_step: 4390 |
|
- training_loss: 0.040937909830132485 |
|
- train_runtime: 206.3611 |
|
- train_samples_per_second: 340.205 |
|
- train_steps_per_second: 21.273 |
|
- total_flos: 1702317283240608.0 |
|
|
- epoch: 5.0 |
|
|
|
## Evaluation |
|
|
|
- Precision: 0.8992 |
|
- Recall: 0.9115 |
|
- F1 Score: 0.9053 |
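The F1 score is the harmonic mean of precision and recall, so it can be checked directly from the two numbers above:

```python
precision = 0.8992
recall = 0.9115

# F1 = harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.9053
```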
|
|
|
### Classification Report |
|
|
|
| Class | Precision | Recall | F1-Score | Support | |
|
|---|---|---|---|---| |
|
| LOC | 0.91 | 0.93 | 0.92 | 1668 | |
|
| MISC | 0.76 | 0.81 | 0.78 | 702 | |
|
| ORG | 0.87 | 0.88 | 0.88 | 1661 | |
|
| PER | 0.98 | 0.97 | 0.97 | 1617 | |
|
| **Micro Avg** | 0.90 | 0.91 | 0.91 | 5648 | |
|
| **Macro Avg** | 0.88 | 0.90 | 0.89 | 5648 | |
|
| **Weighted Avg** | 0.90 | 0.91 | 0.91 | 5648 | |
|
|
|
- **Evaluation dataset:** [eriktks/conll2003](https://huggingface.co/datasets/eriktks/conll2003) (test split)
|
|