colette-exe committed on
Commit 55da8b9 · verified · 1 Parent(s): c15ac55

Update README.md

Files changed (1)
  1. README.md +31 -3
README.md CHANGED
@@ -11,6 +11,9 @@ metrics:
11
  model-index:
12
  - name: distilbert-finetuned-ner-for-articles
13
  results: []
 
 
 
14
  ---
15
 
16
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -28,11 +31,36 @@ It achieves the following results on the evaluation set:
28
 
29
  ## Model description
30
 
31
- More information needed
 
32
 
33
  ## Intended uses & limitations
34
 
35
- More information needed
 
36
 
37
  ## Training and evaluation data
38
 
@@ -67,4 +95,4 @@ The following hyperparameters were used during training:
67
  - Transformers 4.40.1
68
  - Pytorch 2.2.1+cu121
69
  - Datasets 2.19.0
70
- - Tokenizers 0.19.1
 
11
  model-index:
12
  - name: distilbert-finetuned-ner-for-articles
13
  results: []
14
+ language:
15
+ - en
16
+ library_name: transformers
17
  ---
18
 
19
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 
31
 
32
  ## Model description
33
 
34
+ DistilBERT fine-tuned to detect crime, accident, and natural disaster occurrences.
35
+
36
+ Tags (IOBES/BIOES tagging format):
37
+ - O: not an entity
38
+ - S-CRIME
39
+ - S-CRIMINAL
40
+ - S-VICTIM
41
+ - S-SUSPECT
42
+ - S-TIMEDATE: a date containing any one, two, or all three of month, day, and year
43
+ - S-TIMEWORD: words signifying time (last, weekend, earlier, week, today, etc.)
44
+ - S-TIMEDAY: days of the week
45
+ - S-TIMEDAYPART: morning, afternoon, evening, night
46
+ - S-TIMENUM: 4:31, 6:30, etc.
47
+ - S-TIMEMISC: New Year, Christmas, etc.
48
+ - S-LOC: location word (mentioned alone)
49
+ - B-LOC: beginning (part of a series of location names mentioned)
50
+ - I-LOC: inside
51
+ - E-LOC: end (the last location word specified)
52
+ - S-LOCWORD: junction, island, street, etc.
53
+ - S-LOCDIR: north, south, etc.
54
+ - S-ACCIDENT
55
+ - S-NATDISAS: type of natural disaster
56
+ - S-OTHEROCC: other occurrences (rarely labeled in the dataset)
57
+
58
+ The dataset used has size 502. It was manually annotated with Doccano (a free NER annotation tool), starting from the dataset in the paper "MN-DS: A Multilabeled News Dataset for News Articles Hierarchical Classification".
59
 
60
  ## Intended uses & limitations
61
 
62
+ - Needs a bigger dataset.
63
+ - More training is highly recommended.
64
 
65
  ## Training and evaluation data
66
 
 
95
  - Transformers 4.40.1
96
  - Pytorch 2.2.1+cu121
97
  - Datasets 2.19.0
98
+ - Tokenizers 0.19.1
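
The IOBES/BIOES tags listed in the model description mark single-word entities with `S-` and multi-word locations with `B-LOC`, `I-LOC`, `E-LOC`. As a rough illustration of how such a tag sequence could be grouped back into entity spans, here is a minimal sketch. It is not part of this commit: the function name `decode_bioes` and the example sentence are hypothetical, and the tags are written by hand rather than produced by the model.

```python
from typing import List, Tuple


def decode_bioes(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIOES-tagged tokens into (entity_type, text) spans.

    Hypothetical helper for illustration; not taken from the model repository.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type = None
    for token, tag in zip(tokens, tags):
        # "S-LOC" -> ("S", "-", "LOC"); "O" -> ("O", "", "")
        prefix, _, label = tag.partition("-")
        if prefix == "S":
            # Single-token entity: emit it immediately.
            spans.append((label, token))
            current_tokens, current_type = [], None
        elif prefix == "B":
            # Start of a multi-token entity.
            current_tokens, current_type = [token], label
        elif prefix == "I" and current_type == label:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        elif prefix == "E" and current_type == label:
            # End of the open entity: emit the joined text.
            current_tokens.append(token)
            spans.append((label, " ".join(current_tokens)))
            current_tokens, current_type = [], None
        else:
            # "O" or a malformed sequence: drop any unfinished span.
            current_tokens, current_type = [], None
    return spans


# Made-up sentence with hand-written tags, only to show the grouping.
tokens = ["Crash", "near", "San", "Jose", "City", "on", "Monday"]
tags = ["S-ACCIDENT", "O", "B-LOC", "I-LOC", "E-LOC", "O", "S-TIMEDAY"]
print(decode_bioes(tokens, tags))
```

The sketch silently discards spans that never reach an `E-` tag. Whether that is the right policy for this model's predictions is an open choice, since a model trained on a small dataset may well emit incomplete sequences.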