---
library_name: transformers
tags:
- bert
- ner
license: apache-2.0
datasets:
- eriktks/conll2003
base_model:
- google-bert/bert-base-uncased
pipeline_tag: token-classification
language:
- en

model-index:
- name: bert-named-entity-recognition
  results:
  - task:
      type: token-classification
      name: Token Classification
    dataset:
      name: conll2003
      type: conll2003
      config: conll2003
      split: test
    metrics:
    - name: Precision
      type: precision
      value: 0.8992
      verified: true
    - name: Recall
      type: recall
      value: 0.9115
      verified: true
    - name: F1
      type: f1
      value: 0.9053
      verified: true
    - name: Loss
      type: loss
      value: 0.040937
      verified: true
---

# Model Card for BERT Named Entity Recognition

### Model Description

This is a fine-tuned version of `google-bert/bert-base-uncased`, trained to perform Named Entity Recognition on English text input.

- **Developed by:** [Sartaj](https://huggingface.co/sartajbhuvaji)
- **Finetuned from model:** `google-bert/bert-base-uncased`
- **Language(s):** English
- **License:** apache-2.0
- **Framework:** Hugging Face Transformers

### Model Sources 

- **Repository:** [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased)
- **Paper:** [BERT-paper](https://huggingface.co/papers/1810.04805)

## Uses

The model can be used to recognize named entities (persons, locations, organizations, and miscellaneous entities) in text.

## Usage

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

tokenizer = AutoTokenizer.from_pretrained("sartajbhuvaji/bert-named-entity-recognition")
model = AutoModelForTokenClassification.from_pretrained("sartajbhuvaji/bert-named-entity-recognition")

nlp = pipeline("ner", model=model, tokenizer=tokenizer)
example = "My name is Wolfgang and I live in Berlin"

ner_results = nlp(example)
print(ner_results)
```

```json
[
  {
    "end": 19,
    "entity": "B-PER",
    "index": 4,
    "score": 0.99633455,
    "start": 11,
    "word": "wolfgang"
  },
  {
    "end": 40,
    "entity": "B-LOC",
    "index": 9,
    "score": 0.9987465,
    "start": 34,
    "word": "berlin"
  }
]
```

## Training Details

- **Dataset:** [eriktks/conll2003](https://huggingface.co/datasets/eriktks/conll2003)

| Abbreviation | Description |
|---|---|
| O | Outside of a named entity |
| B-MISC | Beginning of a miscellaneous entity right after another miscellaneous entity |
| I-MISC | Miscellaneous entity |
| B-PER | Beginning of a person's name right after another person's name |
| I-PER | Person's name |
| B-ORG | Beginning of an organization right after another organization |
| I-ORG | Organization |
| B-LOC | Beginning of a location right after another location |
| I-LOC | Location |
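As the table shows, CoNLL-2003 uses IOB tags: `I-` marks tokens inside an entity, and `B-` only appears when a new entity of the same type directly follows another. A minimal sketch of how such tag sequences group into entity spans (illustrative helper, not part of this repository):

```python
def iob_to_spans(tokens, tags):
    """Group IOB-tagged tokens into (entity_type, text) spans.

    A B- tag always starts a new span; an I- tag starts one only when
    no span of the same type is already open (the IOB1 convention used
    in the table above); O closes any open span.
    """
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag == "O":
            if current:
                spans.append(current)
                current = None
            continue
        prefix, etype = tag.split("-", 1)
        if prefix == "B" or current is None or current[0] != etype:
            if current:
                spans.append(current)
            current = (etype, [token])
        else:
            current[1].append(token)
    if current:
        spans.append(current)
    return [(etype, " ".join(words)) for etype, words in spans]


tokens = ["My", "name", "is", "Wolfgang", "and", "I", "live", "in", "Berlin"]
tags = ["O", "O", "O", "I-PER", "O", "O", "O", "O", "I-LOC"]
print(iob_to_spans(tokens, tags))  # [('PER', 'Wolfgang'), ('LOC', 'Berlin')]
```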


### Training Procedure

- Full model fine-tune
- Epochs: 5

#### Training Loss Curves

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6354695712edd0ed5dc46b04/vVra4giLk3EPjXo48Sbax.png)


## Trainer
- global_step: 4390
- train_loss: 0.040937909830132485
- train_runtime: 206.3611
- train_samples_per_second: 340.205
- train_steps_per_second: 21.273
- total_flos: 1702317283240608.0
- epoch: 5.0
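The reported step count is consistent with the CoNLL-2003 training split (14,041 sentences) and a per-device batch size of 16; note the batch size is an inference, not stated in this card:

```python
import math

train_sentences = 14_041  # CoNLL-2003 train split (sentence count)
batch_size = 16           # assumed; not stated in this card
epochs = 5

steps_per_epoch = math.ceil(train_sentences / batch_size)
print(steps_per_epoch * epochs)  # 4390, matching global_step above
```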

## Evaluation

- Precision: 0.8992
- Recall: 0.9115
- F1 Score: 0.9053

### Classification Report

| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| LOC | 0.91 | 0.93 | 0.92 | 1668 |
| MISC | 0.76 | 0.81 | 0.78 | 702 |
| ORG | 0.87 | 0.88 | 0.88 | 1661 |
| PER | 0.98 | 0.97 | 0.97 | 1617 |
| **Micro Avg** | 0.90 | 0.91 | 0.91 | 5648 |
| **Macro Avg** | 0.88 | 0.90 | 0.89 | 5648 |
| **Weighted Avg** | 0.90 | 0.91 | 0.91 | 5648 |

- Evaluation dataset: [eriktks/conll2003](https://huggingface.co/datasets/eriktks/conll2003) (test split)
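The macro and weighted averages in the report can be reproduced from the per-class rows (a quick consistency check; values rounded to two decimals as in the table):

```python
# Per-class (precision, recall, f1, support) rows from the classification report
classes = {
    "LOC":  (0.91, 0.93, 0.92, 1668),
    "MISC": (0.76, 0.81, 0.78, 702),
    "ORG":  (0.87, 0.88, 0.88, 1661),
    "PER":  (0.98, 0.97, 0.97, 1617),
}

total_support = sum(row[3] for row in classes.values())
# Macro average: unweighted mean over classes
macro_precision = sum(row[0] for row in classes.values()) / len(classes)
# Weighted average: mean weighted by each class's support
weighted_precision = sum(row[0] * row[3] for row in classes.values()) / total_support

print(round(macro_precision, 2))     # 0.88
print(round(weighted_precision, 2))  # 0.9
```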