---
license: apache-2.0
datasets:
- eriktks/conll2003
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
base_model:
- distilbert/distilbert-base-cased
---

# DistilBERT Base Cased Fine-Tuned on CoNLL2003 for English Named Entity Recognition (NER)

This model is a fine-tuned version of [DistilBERT-base-cased](https://huggingface.co/distilbert/distilbert-base-cased) on the [CoNLL2003](https://huggingface.co/datasets/eriktks/conll2003) dataset for Named Entity Recognition (NER) in English. The CoNLL2003 dataset contains four types of named entities: Person (PER), Location (LOC), Organization (ORG), and Miscellaneous (MISC).

## Model Details
- Model Architecture: DistilBERT (a distilled, lighter version of BERT)
- Pre-trained Base Model: distilbert/distilbert-base-cased
- Dataset: CoNLL2003 (NER task)
- Languages: English
- Fine-tuned for: Named Entity Recognition (NER)
- Entities recognized (the underlying tag set can be checked with the sketch after this list):
  - PER: Person
  - LOC: Location
  - ORG: Organization
  - MISC: Miscellaneous entities
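
The four entity types above are encoded as token-level labels. A quick way to confirm the exact tag set used by this checkpoint's classification head is to inspect the model configuration. This is a minimal sketch; the labels listed in the comment are the standard CoNLL2003 BIO scheme and are an assumption, not taken from this card.

```python
from transformers import AutoConfig

# Print the token-level label mapping of the fine-tuned classifier head.
# For CoNLL2003 this is typically the BIO scheme over the four entity types:
# O, B-PER, I-PER, B-ORG, I-ORG, B-LOC, I-LOC, B-MISC, I-MISC.
config = AutoConfig.from_pretrained(
    "MrRobson9/distilbert-base-cased-finetuned-conll2003-english-ner"
)
print(config.id2label)
```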

## Use Cases
This model is ideal for tasks that require identifying and classifying named entities within English text, such as:

- Information extraction from unstructured text
- Content classification and tagging
- Automated text summarization
- Question answering systems with a focus on entity recognition

## How to Use
To use this model in your code, you can load it via Hugging Face’s Transformers library:

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

# Load the fine-tuned tokenizer and token-classification model from the Hub.
tokenizer = AutoTokenizer.from_pretrained("MrRobson9/distilbert-base-cased-finetuned-conll2003-english-ner")
model = AutoModelForTokenClassification.from_pretrained("MrRobson9/distilbert-base-cased-finetuned-conll2003-english-ner")

# Build a NER pipeline and run it on a sample sentence.
nlp_ner = pipeline("ner", model=model, tokenizer=tokenizer)
result = nlp_ner("John lives in New York and works for the United Nations.")
print(result)
```
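
The raw `"ner"` pipeline returns one prediction per sub-word token, so an entity such as "New York" may come back split into pieces. If you prefer whole entity spans, recent Transformers releases accept an aggregation strategy; the snippet below is a sketch assuming such a release is installed.

```python
from transformers import pipeline

# Same model, but with sub-word tokens grouped into whole entity spans.
nlp_ner_grouped = pipeline(
    "ner",
    model="MrRobson9/distilbert-base-cased-finetuned-conll2003-english-ner",
    aggregation_strategy="simple",
)
print(nlp_ner_grouped("John lives in New York and works for the United Nations."))
# Each grouped prediction is a dict with keys such as
# 'entity_group', 'score', 'word', 'start', and 'end'.
```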

## Performance
| accuracy | precision | recall | f1-score |
|:--------:|:---------:|:------:|:--------:|
|  0.987   |   0.937   | 0.941  |  0.939   |
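
The card does not state which split or evaluation script produced these numbers. For CoNLL-style NER, entity-level precision, recall, and F1 are commonly computed with the `seqeval` package; the sketch below uses toy tag sequences purely to illustrate the metric calls, not to reproduce the scores above.

```python
# pip install seqeval
from seqeval.metrics import accuracy_score, precision_score, recall_score, f1_score

# Toy gold and predicted BIO tag sequences (one inner list per sentence).
y_true = [["B-PER", "O", "O", "B-LOC", "I-LOC", "O", "O", "B-ORG", "I-ORG", "O"]]
y_pred = [["B-PER", "O", "O", "B-LOC", "I-LOC", "O", "O", "B-ORG", "O", "O"]]

print("accuracy :", accuracy_score(y_true, y_pred))   # token-level accuracy
print("precision:", precision_score(y_true, y_pred))  # entity-level precision
print("recall   :", recall_score(y_true, y_pred))     # entity-level recall
print("f1       :", f1_score(y_true, y_pred))         # entity-level F1
```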

## License
This model is released under the Apache 2.0 license, in line with the DistilBERT-base-cased base model. Please also ensure compliance with the CoNLL2003 dataset's terms when using this model.