Using different model for Values from Pop papers
README.md
CHANGED
@@ -121,56 +121,56 @@ The output for different modes and error_formats is:
|
|
121 |
```
|
122 |
|
123 |
#### Values from Popular Papers
|
124 |
-
|
125 |
-
[WNUT-17](https://huggingface.co/datasets/wnut_17)
|
126 |
|
127 |
-
| |
|
128 |
-
|
129 |
-
|
|
130 |
-
|
|
131 |
-
|
|
132 |
-
| seqeval
|
133 |
-
| seqeval relaxed | 0.2803 | 0.4124 | 0.0412 | 0.4105 | 0.0 | 0.1985 | 0.0 |
|
134 |
|
135 |
The traditional count of evaluation parameters would be:
|
136 |
|
137 |
-
| |
|
138 |
-
|
139 |
-
| TP |
|
140 |
-
| FP |
|
141 |
-
| FN |
|
142 |
|
143 |
-
While the fair evaluation parameter count
|
144 |
|
145 |
-
| |
|
146 |
-
|
147 |
-
| TP |
|
148 |
-
| FP |
|
149 |
-
| FN |
|
150 |
-
|
|
151 |
-
|
|
152 |
-
| LBE |
|
153 |
|
154 |
Thus, the ratio of each fair error parameter to the total number of errors (`error_format='error_ratio'`) is:
|
155 |
|
156 |
-
| |
|
157 |
-
|
158 |
-
|
|
159 |
-
|
|
160 |
-
|
|
161 |
-
| LE |
|
162 |
-
|
|
|
|
163 |
|
164 |
And the ratio of each fair parameter with respect to the total number of entities (`error_format='entity_ratio'`) is:
|
165 |
|
166 |
-
| |
|
167 |
-
|
168 |
-
| TP |
|
169 |
-
| FP |
|
170 |
-
| FN |
|
171 |
-
|
|
172 |
-
|
|
173 |
-
| LBE |
|
174 |
|
175 |
## Limitations and Bias
|
176 |
The metric is restricted to the input schemes admitted by seqeval. For example, the application does not support numerical
|
|
|
121 |
```
|
122 |
|
123 |
#### Values from Popular Papers
|
124 |
+
Computing the evaluation metrics on the output of [this model](https://huggingface.co/muhtasham/bert-small-finetuned-wnut17-ner) for the test split of the [WNUT-17 dataset](https://huggingface.co/datasets/wnut_17), we obtain the following F1 scores:
|
126 |
|
127 |
+
| | overall | location | group | person | creative work | corporation | product |
|-----------------|---------:|----------:|--------:|--------:|---------------:|-------------:|---------:|
| traditional | 0.3471 | 0.5254 | 0.0213 | 0.5489 | 0.0 | 0.0238 | 0.0 |
| fair | 0.3717 | 0.5826 | 0.0235 | 0.5835 | 0.0 | 0.0289 | 0.0 |
| seqeval strict | 0.3471 | 0.5254 | 0.0213 | 0.5489 | 0.0 | 0.0238 | 0.0 |
| seqeval relaxed | 0.3383 | 0.4944 | 0.0203 | 0.5462 | 0.0 | 0.0238 | 0.0 |
|
|
|
133 |
|
134 |
The traditional count of evaluation parameters would be:
|
135 |
|
136 |
+
| | overall | location | group | person | creative work | corporation | product |
|----|---------:|----------:|-------:|--------:|---------------:|-------------:|---------:|
| TP | 255 | 67 | 2 | 185 | 0 | 1 | 0 |
| FP | 135 | 38 | 20 | 60 | 0 | 17 | 0 |
| FN | 824 | 83 | 163 | 244 | 142 | 65 | 127 |
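As a quick sanity check, the traditional counts above can be turned back into F1 scores with the standard formula F1 = 2·TP / (2·TP + FP + FN). The sketch below is not part of the metric's code; the dictionary layout and the helper name `f1_from_counts` are illustrative assumptions. For the labels shown, the results agree with the `traditional` row of the F1 table to within one unit in the fourth decimal place.

```python
# Traditional counts copied from the table above (a subset of the columns).
counts = {
    "overall": {"TP": 255, "FP": 135, "FN": 824},
    "location": {"TP": 67, "FP": 38, "FN": 83},
    "person": {"TP": 185, "FP": 60, "FN": 244},
    "corporation": {"TP": 1, "FP": 17, "FN": 65},
}


def f1_from_counts(tp, fp, fn):
    """Standard F1 = 2*TP / (2*TP + FP + FN); 0.0 when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical helper output: one F1 value per label.
f1_scores = {
    label: f1_from_counts(c["TP"], c["FP"], c["FN"]) for label, c in counts.items()
}

for label, score in f1_scores.items():
    print(f"{label:12s} {score:.4f}")
```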
|
141 |
|
142 |
+
While the fair evaluation parameter count is:
|
143 |
|
144 |
+
| | overall | location | group | person | creative work | corporation | product |
|-----|---------:|----------:|-------:|--------:|---------------:|-------------:|---------:|
| TP | 255 | 67 | 2 | 185 | 0 | 1 | 0 |
| FP | 31 | 10 | 3 | 16 | 0 | 2 | 0 |
| FN | 725 | 71 | 135 | 233 | 120 | 54 | 112 |
| LE | 47 | 4 | 18 | 2 | 6 | 7 | 10 |
| BE | 30 | 10 | 4 | 13 | 0 | 3 | 0 |
| LBE | 29 | 1 | 6 | 0 | 16 | 1 | 5 |
|
152 |
|
153 |
Thus, the ratio of each fair error parameter to the total number of errors (`error_format='error_ratio'`) is:
|
154 |
|
155 |
+
| | overall | location | group | person | creative work | corporation | product |
|-----|---------:|----------:|--------:|--------:|---------------:|-------------:|---------:|
| TP | 255 | 67 | 2 | 185 | 0 | 1 | 0 |
| FP | 3,60% | 1,16% | 0,35% | 1,86% | 0,00% | 0,23% | 0,00% |
| FN | 84,11% | 8,24% | 15,66% | 27,03% | 13,92% | 6,26% | 12,99% |
| LE | 5,45% | 0,46% | 2,09% | 0,23% | 0,70% | 0,81% | 1,16% |
| BE | 3,48% | 1,16% | 0,46% | 1,51% | 0,00% | 0,35% | 0,00% |
| LBE | 3,36% | 0,12% | 0,70% | 0,00% | 1,86% | 0,12% | 0,58% |
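The percentages above can be reproduced from the fair counts with a few lines of arithmetic. This is a minimal sketch, not the metric's implementation: it assumes that every cell, including the per-label ones, is divided by the overall number of errors (FP + FN + LE + BE + LBE). That assumption is inferred from the numbers in the tables, and the variable names are illustrative.

```python
# Fair error counts copied from the fair evaluation table (two columns shown).
fair_counts = {
    "overall": {"FP": 31, "FN": 725, "LE": 47, "BE": 30, "LBE": 29},
    "location": {"FP": 10, "FN": 71, "LE": 4, "BE": 10, "LBE": 1},
}
ERROR_KEYS = ("FP", "FN", "LE", "BE", "LBE")

# Assumed denominator: the overall number of errors across all error types.
total_errors = sum(fair_counts["overall"][key] for key in ERROR_KEYS)

# Ratio of each error count to the overall error total, per label.
error_ratio = {
    label: {key: label_counts[key] / total_errors for key in ERROR_KEYS}
    for label, label_counts in fair_counts.items()
}

for label, ratios in error_ratio.items():
    formatted = ", ".join(f"{key}={value:.2%}" for key, value in ratios.items())
    print(f"{label}: {formatted}")
```

Under this assumption the overall column sums to 100%, while each label column sums to that label's share of all errors.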
|
163 |
|
164 |
And the ratio of each fair parameter with respect to the total number of entities (`error_format='entity_ratio'`) is:
|
165 |
|
166 |
+
| | overall | location | group | person | creative work | corporation | product |
|-----|---------:|----------:|--------:|--------:|---------------:|-------------:|---------:|
| TP | 22,83% | 6,00% | 0,18% | 16,56% | 0,00% | 0,09% | 0,00% |
| FP | 2,78% | 0,90% | 0,27% | 1,43% | 0,00% | 0,18% | 0,00% |
| FN | 64,91% | 6,36% | 12,09% | 20,86% | 10,74% | 4,83% | 10,03% |
| LE | 4,21% | 0,36% | 1,61% | 0,18% | 0,54% | 0,63% | 0,90% |
| BE | 2,69% | 0,90% | 0,36% | 1,16% | 0,00% | 0,27% | 0,00% |
| LBE | 2,60% | 0,09% | 0,54% | 0,00% | 1,43% | 0,09% | 0,45% |
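The entity ratios can be checked the same way. In this sketch the "total number of entities" is assumed to be the overall sum of all fair parameters (TP + FP + FN + LE + BE + LBE). This reading is inferred from the table values rather than stated in the README, and the names used are illustrative.

```python
# Fair parameter counts copied from the fair evaluation table (two columns shown).
fair_counts = {
    "overall": {"TP": 255, "FP": 31, "FN": 725, "LE": 47, "BE": 30, "LBE": 29},
    "person": {"TP": 185, "FP": 16, "FN": 233, "LE": 2, "BE": 13, "LBE": 0},
}

# Assumed denominator: every overall parameter count, true positives included.
total_entities = sum(fair_counts["overall"].values())

# Ratio of each parameter count to the assumed entity total, per label.
entity_ratio = {
    label: {key: value / total_entities for key, value in label_counts.items()}
    for label, label_counts in fair_counts.items()
}

for label, ratios in entity_ratio.items():
    formatted = ", ".join(f"{key}={value:.2%}" for key, value in ratios.items())
    print(f"{label}: {formatted}")
```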
|
174 |
|
175 |
## Limitations and Bias
|
176 |
The metric is restricted to the input schemes admitted by seqeval. For example, the application does not support numerical
|