Update README.md
README.md
CHANGED
@@ -3,7 +3,7 @@ language: en
|
|
3 |
thumbnail: https://huggingface.co/front/thumbnails/google.png
|
4 |
license: apache-2.0
|
5 |
base_model:
|
6 |
-
-
|
7 |
pipeline_tag: text-classification
|
8 |
library_name: transformers
|
9 |
metrics:
|
@@ -14,17 +14,13 @@ datasets:
|
|
14 |
- Mozilla/autofill_dataset
|
15 |
---
|
16 |
|
17 |
-
##
|
18 |
|
19 |
-
This is
|
20 |
|
21 |
-
|
22 |
-
[TinyBert](https://huggingface.co/google/bert_uncased_L-2_H-128_A-2)
|
23 |
-
checkpoint.
|
24 |
|
25 |
-
|
26 |
-
|
27 |
-
## How to use TinyBert in `transformers`
|
28 |
|
29 |
```python
|
30 |
from transformers import pipeline
|
@@ -44,7 +40,7 @@ print(
|
|
44 |
```python
|
45 |
HyperParameters: {
|
46 |
'learning_rate': 0.000082,
|
47 |
-
'num_train_epochs':
|
48 |
'weight_decay': 0.1,
|
49 |
'per_device_train_batch_size': 32,
|
50 |
}
|
@@ -55,40 +51,31 @@ More information on how the model was trained can be found here: https://github.
|
|
55 |
# Model Performance
|
56 |
```
|
57 |
Test Performance:
|
58 |
-
Precision: 0.
|
59 |
-
Recall: 0.
|
60 |
-
F1: 0.
|
61 |
|
62 |
precision recall f1-score support
|
63 |
|
64 |
-
CC Expiration 1.000 0.
|
65 |
-
CC Expiration Month 0.
|
66 |
-
CC Expiration Year 0.
|
67 |
-
CC Name 0.
|
68 |
-
CC Number 0.
|
69 |
-
CC Payment Type 0.
|
70 |
-
CC Security Code 0.
|
71 |
CC Type 0.917 0.786 0.846 14
|
72 |
-
Confirm Password 0.
|
73 |
-
Email 0.
|
74 |
-
First Name 0.
|
75 |
Form 0.974 0.974 0.974 39
|
76 |
-
Last Name 0.
|
77 |
-
New Password 0.
|
78 |
-
Other 0.
|
79 |
Phone 1.000 0.667 0.800 3
|
80 |
-
Zip Code 0.
|
81 |
-
|
82 |
-
accuracy 0.967 1846
|
83 |
-
macro avg 0.923 0.907 0.910 1846
|
84 |
-
weighted avg 0.968 0.967 0.967 1846
|
85 |
-
```
|
86 |
|
87 |
-
|
88 |
-
|
89 |
-
|
90 |
-
author={Turc, Iulia and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
|
91 |
-
journal={arXiv preprint arXiv:1908.08962v2 },
|
92 |
-
year={2019}
|
93 |
-
}
|
94 |
```
|
|
|
3 |
thumbnail: https://huggingface.co/front/thumbnails/google.png
|
4 |
license: apache-2.0
|
5 |
base_model:
|
6 |
+
- cross-encoder/ms-marco-TinyBERT-L-2-v2
|
7 |
pipeline_tag: text-classification
|
8 |
library_name: transformers
|
9 |
metrics:
|
|
|
14 |
- Mozilla/autofill_dataset
|
15 |
---
|
16 |
|
17 |
+
## Cross-Encoder for MS MARCO with TinyBERT
|
18 |
|
19 |
+
This is a fine-tuned version of the [cross-encoder/ms-marco-TinyBERT-L-2](https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2) checkpoint.
|
20 |
|
21 |
+
It was fine-tuned on HTML tags and labels using [Fathom](https://mozilla.github.io/fathom/commands/label.html).
|
|
|
|
|
22 |
|
23 |
+
## How to use this model in `transformers`
|
|
|
|
|
24 |
|
25 |
```python
|
26 |
from transformers import pipeline
|
|
|
40 |
```python
|
41 |
HyperParameters: {
|
42 |
'learning_rate': 0.000082,
|
43 |
+
'num_train_epochs': 71,
|
44 |
'weight_decay': 0.1,
|
45 |
'per_device_train_batch_size': 32,
|
46 |
}
|
|
|
51 |
# Model Performance
|
52 |
```
|
53 |
Test Performance:
|
54 |
+
Precision: 0.9653
|
55 |
+
Recall: 0.9648
|
56 |
+
F1: 0.9644
|
57 |
|
58 |
precision recall f1-score support
|
59 |
|
60 |
+
CC Expiration 1.000 0.625 0.769 16
|
61 |
+
CC Expiration Month 0.919 0.944 0.932 36
|
62 |
+
CC Expiration Year 0.897 0.946 0.921 37
|
63 |
+
CC Name 0.938 0.968 0.952 31
|
64 |
+
CC Number 0.926 1.000 0.962 50
|
65 |
+
CC Payment Type 0.903 0.867 0.884 75
|
66 |
+
CC Security Code 0.975 0.951 0.963 41
|
67 |
CC Type 0.917 0.786 0.846 14
|
68 |
+
Confirm Password 0.911 0.895 0.903 57
|
69 |
+
Email 0.933 0.959 0.946 73
|
70 |
+
First Name 0.833 1.000 0.909 5
|
71 |
Form 0.974 0.974 0.974 39
|
72 |
+
Last Name 0.667 0.800 0.727 5
|
73 |
+
New Password 0.929 0.938 0.933 97
|
74 |
+
Other 0.985 0.985 0.985 1235
|
75 |
Phone 1.000 0.667 0.800 3
|
76 |
+
Zip Code 0.909 0.938 0.923 32
|
|
|
|
|
|
|
|
|
|
|
77 |
|
78 |
+
accuracy 0.965 1846
|
79 |
+
macro avg 0.919 0.897 0.902 1846
|
80 |
+
weighted avg 0.965 0.965 0.964 1846
|
|
|
|
|
|
|
|
|
81 |
```
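The per-class report above can be sanity-checked by hand. The sketch below is not part of the training code: it copies the rows from the updated report and assumes the usual definitions (F1 as the harmonic mean of precision and recall, macro average as the unweighted mean over classes). The names `ROWS`, `f1_score`, and `macro_average` are illustrative only. Because the inputs are already rounded to three decimals, the recomputed values are only expected to agree with the report to within rounding.

```python
# Rows copied from the updated report: (label, precision, recall, f1, support).
ROWS = [
    ("CC Expiration", 1.000, 0.625, 0.769, 16),
    ("CC Expiration Month", 0.919, 0.944, 0.932, 36),
    ("CC Expiration Year", 0.897, 0.946, 0.921, 37),
    ("CC Name", 0.938, 0.968, 0.952, 31),
    ("CC Number", 0.926, 1.000, 0.962, 50),
    ("CC Payment Type", 0.903, 0.867, 0.884, 75),
    ("CC Security Code", 0.975, 0.951, 0.963, 41),
    ("CC Type", 0.917, 0.786, 0.846, 14),
    ("Confirm Password", 0.911, 0.895, 0.903, 57),
    ("Email", 0.933, 0.959, 0.946, 73),
    ("First Name", 0.833, 1.000, 0.909, 5),
    ("Form", 0.974, 0.974, 0.974, 39),
    ("Last Name", 0.667, 0.800, 0.727, 5),
    ("New Password", 0.929, 0.938, 0.933, 97),
    ("Other", 0.985, 0.985, 0.985, 1235),
    ("Phone", 1.000, 0.667, 0.800, 3),
    ("Zip Code", 0.909, 0.938, 0.923, 32),
]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(values):
    """Unweighted mean over classes, as assumed for the 'macro avg' row."""
    values = list(values)
    return sum(values) / len(values)


# Total support should match the count shown in the summary rows.
total_support = sum(row[4] for row in ROWS)

# Macro averages recomputed from the rounded per-class values.
macro_precision = macro_average(row[1] for row in ROWS)
macro_recall = macro_average(row[2] for row in ROWS)

# Largest gap between a recomputed F1 and the reported F1 for any class.
max_f1_gap = max(abs(f1_score(p, r) - f1) for _, p, r, f1, _ in ROWS)

print(f"total support:   {total_support}")
print(f"macro precision: {macro_precision:.3f}")
print(f"macro recall:    {macro_recall:.3f}")
print(f"max F1 gap:      {max_f1_gap:.4f}")
```

Classes with very small support (for example Phone, First Name, and Last Name) move the macro average as much as the 1235-example Other class does, which is one reason the macro figures sit below the weighted ones.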
|