---
library_name: transformers
tags:
- bert
- ner
license: apache-2.0
datasets:
- eriktks/conll2003
base_model:
- google-bert/bert-base-uncased
pipeline_tag: token-classification
language:
- en
model-index:
- name: bert-named-entity-recognition
  results:
  - task:
      type: token-classification
      name: Token Classification
    dataset:
      name: conll2003
      type: conll2003
      config: conll2003
      split: test
    metrics:
    - name: Precision
      type: precision
      value: 0.8992
      verified: true
    - name: Recall
      type: recall
      value: 0.9115
      verified: true
    - name: F1
      type: f1
      value: 0.9053
      verified: true
    - name: Loss
      type: loss
      value: 0.040937
      verified: true
---
# Model Card for BERT Named Entity Recognition
### Model Description
This is a fine-tuned version of `google-bert/bert-base-uncased` for token classification, designed to perform Named Entity Recognition on input text.
- **Developed by:** [Sartaj](https://huggingface.co/sartajbhuvaji)
- **Finetuned from model:** `google-bert/bert-base-uncased`
- **Language(s):** English
- **License:** apache-2.0
- **Framework:** Hugging Face Transformers
### Model Sources
- **Repository:** [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased)
- **Paper:** [BERT-paper](https://huggingface.co/papers/1810.04805)
## Uses
The model recognizes named entities (persons, organizations, locations, and miscellaneous entities) in English text.
## Usage
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load the fine-tuned model and its tokenizer from the Hub
tokenizer = AutoTokenizer.from_pretrained("sartajbhuvaji/bert-named-entity-recognition")
model = AutoModelForTokenClassification.from_pretrained("sartajbhuvaji/bert-named-entity-recognition")

# Build a token-classification (NER) pipeline
nlp = pipeline("ner", model=model, tokenizer=tokenizer)

example = "My name is Wolfgang and I live in Berlin"
ner_results = nlp(example)
print(ner_results)
```
```json
[
{
"end": 19,
"entity": "B-PER",
"index": 4,
"score": 0.99633455,
"start": 11,
"word": "wolfgang"
},
{
"end": 40,
"entity": "B-LOC",
"index": 9,
"score": 0.9987465,
"start": 34,
"word": "berlin"
}
]
```
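The pipeline emits one record per token with a BIO-prefixed label. A minimal post-processing sketch (using the example output above, with scores abbreviated; the `group_entities` helper is illustrative, not part of the model) that strips the B-/I- prefix and collects words by entity type:

```python
# Example pipeline output from above (scores abbreviated)
ner_results = [
    {"end": 19, "entity": "B-PER", "index": 4, "score": 0.9963, "start": 11, "word": "wolfgang"},
    {"end": 40, "entity": "B-LOC", "index": 9, "score": 0.9987, "start": 34, "word": "berlin"},
]

def group_entities(results):
    """Collect recognized words under their entity type (PER, LOC, ...)."""
    grouped = {}
    for r in results:
        label = r["entity"].split("-", 1)[-1]  # strip the B-/I- prefix
        grouped.setdefault(label, []).append(r["word"])
    return grouped

print(group_entities(ner_results))  # {'PER': ['wolfgang'], 'LOC': ['berlin']}
```

Alternatively, passing `aggregation_strategy="simple"` to `pipeline(...)` makes Transformers itself merge subword tokens into whole entities.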
## Training Details
- **Dataset:** [eriktks/conll2003](https://huggingface.co/datasets/eriktks/conll2003)

The dataset labels tokens with the following tags:

| Abbreviation | Description |
|---|---|
| O | Outside of a named entity |
| B-MISC | Beginning of a miscellaneous entity right after another miscellaneous entity |
| I-MISC | Miscellaneous entity |
| B-PER | Beginning of a person's name right after another person's name |
| I-PER | Person's name |
| B-ORG | Beginning of an organization right after another organization |
| I-ORG | Organization |
| B-LOC | Beginning of a location right after another location |
| I-LOC | Location |
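To illustrate the tag scheme above, here is a minimal sketch (a hypothetical helper, assuming every entity opens with a B- tag) that turns a token/tag sequence into entity spans:

```python
def bio_to_spans(tokens, tags):
    """Group a BIO-tagged token sequence into (entity_type, text) spans.

    A B- tag starts a new span; an I- tag of the same type extends the
    current span; O (or a type mismatch) closes it.
    """
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append(current)
            current = (tag[2:], [token])  # open a new span
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(token)      # extend the open span
        else:
            if current:
                spans.append(current)
            current = None                # O tag closes any open span
    if current:
        spans.append(current)
    return [(etype, " ".join(words)) for etype, words in spans]

tokens = ["EU", "rejects", "German", "call", "to", "boycott", "British", "lamb"]
tags   = ["B-ORG", "O", "B-MISC", "O", "O", "O", "B-MISC", "O"]
print(bio_to_spans(tokens, tags))
# [('ORG', 'EU'), ('MISC', 'German'), ('MISC', 'British')]
```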
### Training Procedure
- Full model fine-tune (all parameters updated)
- Epochs: 5
#### Training Loss Curves
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6354695712edd0ed5dc46b04/vVra4giLk3EPjXo48Sbax.png)
## Trainer
- global_step: 4390
- training_loss: 0.040937909830132485
- train_runtime: 206.3611
- train_samples_per_second: 340.205
- train_steps_per_second: 21.273
- total_flos: 1702317283240608.0
- train_loss: 0.040937909830132485
- epoch: 5.0
## Evaluation
- Precision: 0.8992
- Recall: 0.9115
- F1 Score: 0.9053
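The F1 score is the harmonic mean of precision and recall, which the figures above can be checked against:

```python
precision, recall = 0.8992, 0.9115

# Harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.9053
```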
### Classification Report
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| LOC | 0.91 | 0.93 | 0.92 | 1668 |
| MISC | 0.76 | 0.81 | 0.78 | 702 |
| ORG | 0.87 | 0.88 | 0.88 | 1661 |
| PER | 0.98 | 0.97 | 0.97 | 1617 |
| **Micro Avg** | 0.90 | 0.91 | 0.91 | 5648 |
| **Macro Avg** | 0.88 | 0.90 | 0.89 | 5648 |
| **Weighted Avg** | 0.90 | 0.91 | 0.91 | 5648 |
- Evaluation dataset: [eriktks/conll2003](https://huggingface.co/datasets/eriktks/conll2003) (test split)