MrRobson9 committed on
Commit
1d92511
·
verified ·
1 Parent(s): 14b6881

Update README.md

Files changed (1)
  1. README.md +61 -3
README.md CHANGED
@@ -1,3 +1,61 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ datasets:
4
+ - eriktks/conll2003
5
+ language:
6
+ - en
7
+ metrics:
8
+ - accuracy
9
+ - precision
10
+ - recall
11
+ - f1
12
+ base_model:
13
+ - distilbert/distilbert-base-cased
14
+ ---
15
+
16
+ # DistilBERT Base Cased Fine-Tuned on CoNLL2003 for English Named Entity Recognition (NER)
17
+
18
+ This model is a fine-tuned version of [DistilBERT-base-cased](https://huggingface.co/distilbert/distilbert-base-cased) on the [CoNLL2003](https://huggingface.co/datasets/eriktks/conll2003) dataset for Named Entity Recognition (NER) in English. The CoNLL2003 dataset contains four types of named entities: Person (PER), Location (LOC), Organization (ORG), and Miscellaneous (MISC).
19
+
20
+ ## Model Details
21
+ - Model Architecture: DistilBERT (a distilled, smaller version of BERT, Bidirectional Encoder Representations from Transformers)
22
+ - Pre-trained Base Model: distilbert-base-cased
23
+ - Dataset: CoNLL2003 (NER task)
24
+ - Languages: English
25
+ - Fine-tuned for: Named Entity Recognition (NER)
26
+ - Entities recognized:
27
+   - PER: Person
28
+   - LOC: Location
29
+   - ORG: Organization
30
+   - MISC: Miscellaneous entities
31
+
32
+ ## Use Cases
33
+ This model is suited to tasks that require identifying and classifying named entities within English text, such as:
34
+
35
+ - Information extraction from unstructured text
36
+ - Content classification and tagging
37
+ - Automated text summarization
38
+ - Question answering systems with a focus on entity recognition
39
+
40
+ ## How to Use
41
+ To use this model in your code, you can load it via Hugging Face’s Transformers library:
42
+
43
+ ```python
44
+ from transformers import AutoTokenizer, AutoModelForTokenClassification
45
+ from transformers import pipeline
46
+
47
+ tokenizer = AutoTokenizer.from_pretrained("MrRobson9/distilbert-base-cased-finetuned-conll2003-english-ner")
48
+ model = AutoModelForTokenClassification.from_pretrained("MrRobson9/distilbert-base-cased-finetuned-conll2003-english-ner")
49
+
50
+ nlp_ner = pipeline("ner", model=model, tokenizer=tokenizer)
51
+ result = nlp_ner("John lives in New York and works for the United Nations.")
52
+ print(result)
53
+ ```
54
+
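The pipeline above returns one prediction per token, so a multi-word entity such as "New York" arrives as separate pieces. The sketch below shows one way to merge those pieces into whole entities. It assumes the predictions carry BIO-style labels (`B-PER`, `I-LOC`, and so on) and the `entity`, `index`, and `word` keys; the helper name `group_entities` and the `sample` list are hypothetical and hand-written for illustration, not real output from this model.

```python
def group_entities(tokens):
    """Merge adjacent B-/I- tagged tokens into (label, text) entity spans."""
    spans = []
    for tok in tokens:
        # "B-PER" -> prefix "B", label "PER" (assumes BIO-style label names)
        prefix, _, label = tok["entity"].partition("-")
        word = tok["word"]
        is_subword = word.startswith("##")  # WordPiece continuation piece
        continues_previous = bool(
            spans
            and spans[-1]["label"] == label
            and spans[-1]["last_index"] + 1 == tok["index"]
            and (prefix == "I" or is_subword)
        )
        if continues_previous:
            # Subword pieces attach directly; whole words get a space
            spans[-1]["text"] += word[2:] if is_subword else " " + word
            spans[-1]["last_index"] = tok["index"]
        else:
            spans.append({"label": label, "text": word, "last_index": tok["index"]})
    return [(span["label"], span["text"]) for span in spans]


# Hand-written sample in the shape of token-level pipeline output (illustrative only)
sample = [
    {"entity": "B-PER", "index": 1, "word": "John"},
    {"entity": "B-LOC", "index": 4, "word": "New"},
    {"entity": "I-LOC", "index": 5, "word": "York"},
    {"entity": "B-ORG", "index": 10, "word": "United"},
    {"entity": "I-ORG", "index": 11, "word": "Nations"},
]

print(group_entities(sample))
```

In practice, passing `aggregation_strategy="simple"` to `pipeline("ner", ...)` asks the Transformers library to do a similar grouping itself, which is usually preferable to a hand-rolled helper.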
55
+ ## Performance
56
+ |accuracy |precision |recall |f1-score|
57
+ |:-------:|:--------:|:-----:|:------:|
58
+ | 0.987 | 0.937 | 0.941 | 0.939 |
59
+
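As a quick consistency check on the table, the F1 score can be recomputed from precision and recall. This sketch assumes the reported F1 is the harmonic mean of the reported precision and recall; the card does not state which evaluation split or averaging method was used, and the function name `f1_from` is hypothetical.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the performance table
f1 = f1_from(0.937, 0.941)
print(round(f1, 3))  # 0.939
```

The recomputed value matches the reported F1 to three decimal places.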
60
+ ## License
61
+ This model is released under the Apache 2.0 license, as declared in the metadata above. The DistilBERT base model and the CoNLL2003 dataset carry their own terms; please ensure compliance with all respective licenses when using this model.