illorca committed on
Commit 3b9499b · 1 Parent(s): 6a56d7d

Using different model for Values from Pop papers

Files changed (1)
  1. README.md +38 -38
README.md CHANGED
@@ -121,56 +121,56 @@ The output for different modes and error_formats is:
  ```
 
  #### Values from Popular Papers
- A basic [DistilBERT model](https://huggingface.co/docs/transformers/model_doc/distilbert) downstream-trained on the
- [WNUT-17](https://huggingface.co/datasets/wnut_17) dataset sheds the following F1 scores. Seqeval is shown for comparison.

- | | Overall | Location | Group | Person | Creative Work | Corporation | Product |
- |-----------------|---------|----------|--------|--------|---------------|-------------|---------|
- | Traditional | 0.2803 | 0.4124 | 0.0412 | 0.4105 | 0.0 | 0.1985 | 0.0 |
- | Fair | 0.3199 | 0.5247 | 0.0459 | 0.4643 | 0.0 | 0.2666 | 0.0 |
- | Weighted | 0.3842 | 0.5638 | 0.0681 | 0.5676 | 0.0 | 0.2910 | 0.0 |
- | seqeval strict | 0.2222 | 0.3425 | 0.0413 | 0.3598 | 0.0 | 0.0408 | 0.0 |
- | seqeval relaxed | 0.2803 | 0.4124 | 0.0412 | 0.4105 | 0.0 | 0.1985 | 0.0 |

  The traditional count of evaluation parameters would be:

- | | Overall | Location | Group | Person | Creative Work | Corporation | Product |
- |----|---------|----------|-------|--------|---------------|-------------|---------|
- | TP | 211 | 53 | 4 | 140 | 0 | 14 | 0 |
- | FP | 353 | 42 | 42 | 174 | 1 | 70 | 0 |
- | FN | 730 | 144 | 144 | 228 | 116 | 43 | 114 |

- While the fair evaluation parameter count (`error_format='count'`) is:

- | | Overall | Location | Group | Person | Creative Work | Corporation | Product |
- |-----|---------|----------|-------|--------|---------------|-------------|---------|
- | TP | 211 | 53 | 4 | 140 | 0 | 0 | 0 |
- | FP | 125 | 9 | 21 | 62 | 1 | 32 | 0 |
- | FN | 544 | 59 | 115 | 153 | 95 | 34 | 88 |
- | BE | 105 | 11 | 4 | 87 | 0 | 3 | 0 |
- | LE | 66 | 7 | 20 | 12 | 7 | 6 | 14 |
- | LBE | 57 | 10 | 6 | 9 | 15 | 2 | 15 |

  Thus, ratio of each fair error parameter with respect to the total number of errors (`error_format='error_ratio'`) is:

- | | Overall | Location | Group | Person | Creative Work | Corporation | Product |
- |-----|---------|----------|--------|--------|---------------|-------------|---------|
- | FP | 13,94% | 1,00% | 2,34% | 6,91% | 0,11% | 3,57% | 0,00% |
- | FN | 60,65% | 6,58% | 12,82% | 17,06% | 10,59% | 3,79% | 9,81% |
- | BE | 11,71% | 1,23% | 0,45% | 9,70% | 0,00% | 0,33% | 0,00% |
- | LE | 7,36% | 0,78% | 2,23% | 1,34% | 0,78% | 0,67% | 1,56% |
- | LBE | 6,35% | 1,11% | 0,67% | 1,00% | 1,67% | 0,22% | 1,67% |

  And the ratio of each fair parameter with respect to the total number of entities (`error_format='entity_ratio'`) is:

- | | Overall | Location | Group | Person | Creative Work | Corporation | Product |
- |-----|---------|----------|--------|--------|---------------|-------------|---------|
- | TP | 19,04% | 4,78% | 0,36% | 12,64% | 0,00% | 0,00% | 0,00% |
- | FP | 11,28% | 0,81% | 1,90% | 5,60% | 0,09% | 2,89% | 0,00% |
- | FN | 49,10% | 5,32% | 10,38% | 13,81% | 8,57% | 3,07% | 7,94% |
- | BE | 9,48% | 0,99% | 0,36% | 7,85% | 0,00% | 0,27% | 0,00% |
- | LE | 5,96% | 0,63% | 1,81% | 1,08% | 0,63% | 0,54% | 1,26% |
- | LBE | 5,14% | 0,90% | 0,54% | 0,81% | 1,35% | 0,18% | 1,35% |

  ## Limitations and Bias
  The metric is restricted to the input schemes admitted by seqeval. For example, the application does not support numerical
 
  ```
 
  #### Values from Popular Papers
+ Computing the evaluation metrics on the results from [this model](https://huggingface.co/muhtasham/bert-small-finetuned-wnut17-ner)
+ run on the test split of [WNUT-17 dataset](https://huggingface.co/datasets/wnut_17), we obtain the following F1-Scores:

+ | | overall | location | group | person | creative work | corporation | product |
+ |-----------------|---------:|----------:|--------:|--------:|---------------:|-------------:|---------:|
+ | traditional | 0.3471 | 0.5254 | 0.0213 | 0.5489 | 0.0 | 0.0238 | 0.0 |
+ | fair | 0.3717 | 0.5826 | 0.0235 | 0.5835 | 0.0 | 0.0289 | 0.0 |
+ | seqeval strict | 0.3471 | 0.5254 | 0.0213 | 0.5489 | 0.0 | 0.0238 | 0.0 |
+ | seqeval relaxed | 0.3383 | 0.4944 | 0.0203 | 0.5462 | 0.0 | 0.0238 | 0.0 |
 
  The traditional count of evaluation parameters would be:

+ | | overall | location | group | person | creative work | corporation | product |
+ |----|---------:|----------:|-------:|--------:|---------------:|-------------:|---------:|
+ | TP | 255 | 67 | 2 | 185 | 0 | 1 | 0 |
+ | FP | 135 | 38 | 20 | 60 | 0 | 17 | 0 |
+ | FN | 824 | 83 | 163 | 244 | 142 | 65 | 127 |
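
As a quick sanity check, the traditional F1 scores can be recomputed from these counts. The sketch below assumes the standard definition F1 = 2·TP / (2·TP + FP + FN); the helper name `f1_from_counts` is hypothetical and not part of the metric's API.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Standard F1 from true positives, false positives and false negatives."""
    denominator = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0.0 when there is nothing to score.
    return 2 * tp / denominator if denominator else 0.0


# Counts copied from the traditional count table (overall and location columns).
overall_f1 = f1_from_counts(tp=255, fp=135, fn=824)
location_f1 = f1_from_counts(tp=67, fp=38, fn=83)
print(f"overall: {overall_f1:.4f}, location: {location_f1:.4f}")
```

Both values agree with the `traditional` row of the F1 table to within 0.0001; the last digit can differ because the table appears to truncate rather than round.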
 
+ While the fair evaluation parameter count is:

+ | | overall | location | group | person | creative work | corporation | product |
+ |-----|---------:|----------:|-------:|--------:|---------------:|-------------:|---------:|
+ | TP | 255 | 67 | 2 | 185 | 0 | 1 | 0 |
+ | FP | 31 | 10 | 3 | 16 | 0 | 2 | 0 |
+ | FN | 725 | 71 | 135 | 233 | 120 | 54 | 112 |
+ | LE | 47 | 4 | 18 | 2 | 6 | 7 | 10 |
+ | BE | 30 | 10 | 4 | 13 | 0 | 3 | 0 |
+ | LBE | 29 | 1 | 6 | 0 | 16 | 1 | 5 |
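
The `fair` row of the F1 table can be reproduced from these counts if each labeling or boundary error (LE, BE, LBE) is counted once in the F1 denominator, next to FP and FN. This formula is inferred from the numbers shown in this section rather than taken from the metric's source code, so treat it as an assumption; `fair_f1_from_counts` is a hypothetical helper.

```python
def fair_f1_from_counts(tp: int, fp: int, fn: int, le: int, be: int, lbe: int) -> float:
    """F1 with LE, BE and LBE each added once to the denominator (assumed formula)."""
    denominator = 2 * tp + fp + fn + le + be + lbe
    # Convention assumed here: F1 is 0.0 when there is nothing to score.
    return 2 * tp / denominator if denominator else 0.0


# Counts copied from the fair count table (overall and location columns).
overall_fair_f1 = fair_f1_from_counts(tp=255, fp=31, fn=725, le=47, be=30, lbe=29)
location_fair_f1 = fair_f1_from_counts(tp=67, fp=10, fn=71, le=4, be=10, lbe=1)
print(f"overall: {overall_fair_f1:.4f}, location: {location_fair_f1:.4f}")
```

Under this assumption the results match the `fair` row to within 0.0001 for the overall and location columns.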
 
  Thus, ratio of each fair error parameter with respect to the total number of errors (`error_format='error_ratio'`) is:

+ | | overall | location | group | person | creative work | corporation | product |
+ |-----|---------:|----------:|--------:|--------:|---------------:|-------------:|---------:|
+ | TP | 255 | 67 | 2 | 185 | 0 | 1 | 0 |
+ | FP | 3,60% | 1,16% | 0,35% | 1,86% | 0,00% | 0,23% | 0,00% |
+ | FN | 84,11% | 8,24% | 15,66% | 27,03% | 13,92% | 6,26% | 12,99% |
+ | LE | 5,45% | 0,46% | 2,09% | 0,23% | 0,70% | 0,81% | 1,16% |
+ | BE | 3,48% | 1,16% | 0,46% | 1,51% | 0,00% | 0,35% | 0,00% |
+ | LBE | 3,36% | 0,12% | 0,70% | 0,00% | 1,86% | 0,12% | 0,58% |
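
The overall error ratios follow from the fair counts by dividing each error category by the sum of all error categories. The sketch below is a plain recomputation, not the metric's implementation; the variable names are illustrative. TP is not an error category, so it is left out of the ratio.

```python
# Overall fair error counts copied from the count table earlier in this section.
fair_errors = {"FP": 31, "FN": 725, "LE": 47, "BE": 30, "LBE": 29}

# Total number of errors is the sum over all five error categories.
total_errors = sum(fair_errors.values())

# Share of each error category in the total number of errors.
error_ratio = {name: count / total_errors for name, count in fair_errors.items()}

for name, ratio in error_ratio.items():
    print(f"{name}: {ratio:.2%}")
```

The per-type columns of the table appear to use the same overall denominator; for example, the 10 location FPs divided by 862 total errors give about 1.16%.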
 
  And the ratio of each fair parameter with respect to the total number of entities (`error_format='entity_ratio'`) is:

+ | | overall | location | group | person | creative work | corporation | product |
+ |-----|---------:|----------:|--------:|--------:|---------------:|-------------:|---------:|
+ | TP | 22,83% | 6,00% | 0,18% | 16,56% | 0,00% | 0,09% | 0,00% |
+ | FP | 2,78% | 0,90% | 0,27% | 1,43% | 0,00% | 0,18% | 0,00% |
+ | FN | 64,91% | 6,36% | 12,09% | 20,86% | 10,74% | 4,83% | 10,03% |
+ | LE | 4,21% | 0,36% | 1,61% | 0,18% | 0,54% | 0,63% | 0,90% |
+ | BE | 2,69% | 0,90% | 0,36% | 1,16% | 0,00% | 0,27% | 0,00% |
+ | LBE | 2,60% | 0,09% | 0,54% | 0,00% | 1,43% | 0,09% | 0,45% |
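
These percentages are consistent with dividing every fair count by the sum of all overall outcomes (TP plus the five error categories). That reading of "total number of entities" is an assumption drawn from the numbers in this section, and `entity_ratio` is a hypothetical helper, not the metric's API.

```python
# Fair counts copied from the count table earlier in this section.
overall = {"TP": 255, "FP": 31, "FN": 725, "LE": 47, "BE": 30, "LBE": 29}
location = {"TP": 67, "FP": 10, "FN": 71, "LE": 4, "BE": 10, "LBE": 1}

# Assumption: the entity total is the sum of all overall outcomes.
total_entities = sum(overall.values())


def entity_ratio(counts: dict, total: int) -> dict:
    """Divide each count by the shared entity total."""
    return {name: count / total for name, count in counts.items()}


overall_ratio = entity_ratio(overall, total_entities)
# Per-type columns use the same overall denominator, not a per-type total.
location_ratio = entity_ratio(location, total_entities)

print(f"overall TP: {overall_ratio['TP']:.2%}, location TP: {location_ratio['TP']:.2%}")
```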
 
  ## Limitations and Bias
  The metric is restricted to the input schemes admitted by seqeval. For example, the application does not support numerical