Commit 7eb8862 by lisaterumi
1 parent: 023447b

Update README.md

Files changed (1)
  1. README.md +102 -0
@@ -13,5 +13,107 @@ datasets:
  ## A Biomedical Pos-Tagger for English
  Trained with the GENIA corpus.
+ Eval:
+ ```
+               precision    recall  f1-score   support
+
+            0       0.98      1.00      0.99       263
+            3       0.93      1.00      0.97        14
+            5       1.00      1.00      1.00         8
+            6       0.99      0.99      0.99       169
+            7       1.00      1.00      1.00       203
+            8       0.99      1.00      1.00       195
+            9       0.95      0.78      0.85        98
+           10       0.83      1.00      0.91         5
+           11       0.96      0.97      0.96       532
+           12       1.00      1.00      1.00       252
+           13       0.99      0.98      0.99      1575
+           14       0.95      0.95      0.95       133
+           15       0.89      0.89      0.89         9
+           16       1.00      1.00      1.00         3
+           18       0.99      1.00      0.99        69
+           19       1.00      0.95      0.98        22
+           20       0.99      1.00      1.00       395
+           22       1.00      1.00      1.00      1328
+           23       1.00      1.00      1.00       987
+           24       1.00      1.00      1.00         6
+           25       0.00      0.00      0.00         0
+           26       1.00      1.00      1.00       620
+           27       0.00      0.00      0.00         1
+           28       1.00      1.00      1.00        39
+           29       0.98      0.99      0.98      5674
+           30       0.97      0.96      0.96      2075
+           31       1.00      0.71      0.83         7
+           32       1.00      0.80      0.89         5
+           33       1.00      1.00      1.00        58
+           34       1.00      1.00      1.00         2
+           35       0.96      0.96      0.96       336
+           37       0.99      1.00      1.00      1579
+           38       1.00      1.00      1.00      1446
+           39       1.00      0.98      0.99        57
+
+     accuracy                           0.99     18165
+    macro avg       0.92      0.91      0.91     18165
+ weighted avg       0.99      0.99      0.99     18165
+
+ F1: 0.985267446136761 Accuracy: 0.9853564547206166
+ ```
+
+ Tags:
+ ```
+ {0: 'VBD',
+  1: 'N',
+  2: 'XT',
+  3: 'JJS',
+  4: 'E2A',
+  5: 'WRB',
+  6: 'VB',
+  7: 'TO',
+  8: 'VBP',
+  9: 'FW',
+  10: 'EX',
+  11: 'VBN',
+  12: 'VBZ',
+  13: 'NNS',
+  14: 'VBG',
+  15: 'RBR',
+  16: 'WP',
+  17: 'CT',
+  18: 'PRP',
+  19: 'JJR',
+  20: 'CC',
+  21: 'NNPS',
+  22: 'CD',
+  23: 'DT',
+  24: 'NNP',
+  25: 'PDT',
+  26: 'LS',
+  27: 'PP',
+  28: 'PRP$',
+  29: 'NN',
+  30: 'JJ',
+  31: 'RP',
+  32: 'RBS',
+  33: 'MD',
+  34: 'WP$',
+  35: 'RB',
+  36: 'SYM',
+  37: 'IN',
+  38: 'PUNCT',
+  39: 'WDT',
+  40: 'POS',
+  41: '<pad>'}
+ ```
+
+ Parameters:
+ ```
+ nepochs = 30 (stop at 18th)
+ batch_size = 32
+ batch_status = 32
+ learning_rate = 1e-5
+ early_stop = 3
+ max_length = 200
+ checkpoint: dmis-lab/biobert-base-cased-v1.2
+ ```

  See more in: https://github.com/lisaterumi/postagger-bio-english
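
The Eval block added in this commit reports scores by numeric class id, while the Tags block gives the id-to-tag mapping. A minimal sketch of joining the two follows. It uses only values copied from this commit; the helper names `name_rows` and `tags_missing_from_report` are hypothetical and are not taken from the linked repository.

```python
# Id-to-tag mapping copied from the "Tags" block of this commit.
ID2TAG = {
    0: "VBD", 1: "N", 2: "XT", 3: "JJS", 4: "E2A", 5: "WRB", 6: "VB",
    7: "TO", 8: "VBP", 9: "FW", 10: "EX", 11: "VBN", 12: "VBZ", 13: "NNS",
    14: "VBG", 15: "RBR", 16: "WP", 17: "CT", 18: "PRP", 19: "JJR", 20: "CC",
    21: "NNPS", 22: "CD", 23: "DT", 24: "NNP", 25: "PDT", 26: "LS", 27: "PP",
    28: "PRP$", 29: "NN", 30: "JJ", 31: "RP", 32: "RBS", 33: "MD", 34: "WP$",
    35: "RB", 36: "SYM", 37: "IN", 38: "PUNCT", 39: "WDT", 40: "POS",
    41: "<pad>",
}

# Per-class support (class id -> token count) copied from the "Eval" block.
SUPPORT = {
    0: 263, 3: 14, 5: 8, 6: 169, 7: 203, 8: 195, 9: 98, 10: 5, 11: 532,
    12: 252, 13: 1575, 14: 133, 15: 9, 16: 3, 18: 69, 19: 22, 20: 395,
    22: 1328, 23: 987, 24: 6, 25: 0, 26: 620, 27: 1, 28: 39, 29: 5674,
    30: 2075, 31: 7, 32: 5, 33: 58, 34: 2, 35: 336, 37: 1579, 38: 1446,
    39: 57,
}


def name_rows(id2tag, support):
    """Re-key the per-class support by tag name instead of class id."""
    return {id2tag[class_id]: count for class_id, count in support.items()}


def tags_missing_from_report(id2tag, support):
    """List the tags whose class id has no row in the report, in id order."""
    return [id2tag[i] for i in sorted(id2tag) if i not in support]


if __name__ == "__main__":
    by_tag = name_rows(ID2TAG, SUPPORT)
    # Print the most frequent tags first.
    for tag, count in sorted(by_tag.items(), key=lambda kv: -kv[1])[:5]:
        print(f"{tag:>6} {count}")
    print("total support:", sum(SUPPORT.values()))
    print("no report row:", tags_missing_from_report(ID2TAG, SUPPORT))
```

With these values, the per-class supports sum to the reported total of 18165, and eight tags (`N`, `XT`, `E2A`, `CT`, `NNPS`, `SYM`, `POS`, `<pad>`) have no row in the report.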