---
language: en
thumbnail: https://huggingface.co/front/thumbnails/google.png
license: apache-2.0
base_model:
- cross-encoder/ms-marco-TinyBERT-L-2-v2
pipeline_tag: text-classification
library_name: transformers
metrics:
- f1
- precision
- recall
datasets:
- Mozilla/autofill_dataset
---

## Cross-Encoder for MS Marco with TinyBERT

This model is a fine-tuned version of the [cross-encoder/ms-marco-TinyBERT-L-2](https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2) checkpoint.

It was fine-tuned on HTML tags and labels generated using [Fathom](https://mozilla.github.io/fathom/commands/label.html). Given an HTML element, it predicts one of the field types listed under Model Performance (for example, CC Number, Email, or Zip Code).

## How to use this model in `transformers`

```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub
classifier = pipeline(
    "text-classification",
    model="Mozilla/tinybert-uncased-autofill"
)

# Classify one HTML element; the result holds the predicted label and its score
print(
    classifier('<input class="cc-number" placeholder="Enter credit card number..." />')
)
```
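
When several form elements need labels at once, a small wrapper around the pipeline call can help. The sketch below is illustrative only: `label_fields`, `min_score`, and `fake_classifier` are hypothetical names that are not part of this model or of `transformers`, and the scores returned by the stand-in are made up so that the example runs offline. It assumes the classifier returns one `{"label": ..., "score": ...}` dictionary per input, which is what a `text-classification` pipeline does when given a list of strings.

```python
from typing import Any, Callable, Dict, List, Optional

# One pipeline prediction is assumed to look like {"label": str, "score": float}
Prediction = Dict[str, Any]


def label_fields(
    classifier: Callable[[List[str]], List[Prediction]],
    html_snippets: List[str],
    min_score: float = 0.9,  # hypothetical cut-off, tune it on your own data
) -> List[Optional[str]]:
    """Return one label per snippet, or None when the score is below min_score."""
    predictions = classifier(html_snippets)
    return [
        p["label"] if p["score"] >= min_score else None
        for p in predictions
    ]


def fake_classifier(snippets: List[str]) -> List[Prediction]:
    # Stand-in for the real pipeline so the sketch runs without a download.
    # Labels and scores here are invented for illustration.
    return [
        {"label": "CC Number", "score": 0.98}
        if "cc-number" in s
        else {"label": "Other", "score": 0.55}
        for s in snippets
    ]


fields = [
    '<input class="cc-number" placeholder="Enter credit card number..." />',
    '<input name="q" placeholder="Search" />',
]
print(label_fields(fake_classifier, fields))
```

To use it with the real model, pass the `classifier` object created in the snippet above in place of `fake_classifier`. Returning `None` for low-confidence predictions is one possible design choice; it lets the caller fall back to other heuristics rather than trust an uncertain label.
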

## Model Training Info

```python
# Hyperparameters used for fine-tuning
hyperparameters = {
    'learning_rate': 0.000082,
    'num_train_epochs': 71,
    'weight_decay': 0.1,
    'per_device_train_batch_size': 32,
}
```

More information on how the model was trained can be found here: https://github.com/mozilla/smart_autofill

## Model Performance

```
Test Performance:
Precision: 0.9653
Recall:    0.9648
F1:        0.9644

                     precision    recall  f1-score   support

      CC Expiration      1.000     0.625     0.769        16
CC Expiration Month      0.919     0.944     0.932        36
 CC Expiration Year      0.897     0.946     0.921        37
            CC Name      0.938     0.968     0.952        31
          CC Number      0.926     1.000     0.962        50
    CC Payment Type      0.903     0.867     0.884        75
   CC Security Code      0.975     0.951     0.963        41
            CC Type      0.917     0.786     0.846        14
   Confirm Password      0.911     0.895     0.903        57
              Email      0.933     0.959     0.946        73
         First Name      0.833     1.000     0.909         5
               Form      0.974     0.974     0.974        39
          Last Name      0.667     0.800     0.727         5
       New Password      0.929     0.938     0.933        97
              Other      0.985     0.985     0.985      1235
              Phone      1.000     0.667     0.800         3
           Zip Code      0.909     0.938     0.923        32

           accuracy                          0.965      1846
          macro avg      0.919     0.897     0.902      1846
       weighted avg      0.965     0.965     0.964      1846
```
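
The report gives two averages, and the gap between them (macro F1 0.902 versus weighted F1 0.964) comes from class imbalance: `Other` accounts for 1235 of the 1846 test samples. As a rough illustration, the sketch below recomputes both averages from the per-class F1 scores and supports in the report above. The helper names `macro_avg` and `weighted_avg` are chosen here for illustration. Because the inputs are already rounded to three decimals, the recomputed values may differ from the reported ones in the last digit.

```python
# (label, f1-score, support) copied from the classification report above
ROWS = [
    ("CC Expiration", 0.769, 16),
    ("CC Expiration Month", 0.932, 36),
    ("CC Expiration Year", 0.921, 37),
    ("CC Name", 0.952, 31),
    ("CC Number", 0.962, 50),
    ("CC Payment Type", 0.884, 75),
    ("CC Security Code", 0.963, 41),
    ("CC Type", 0.846, 14),
    ("Confirm Password", 0.903, 57),
    ("Email", 0.946, 73),
    ("First Name", 0.909, 5),
    ("Form", 0.974, 39),
    ("Last Name", 0.727, 5),
    ("New Password", 0.933, 97),
    ("Other", 0.985, 1235),
    ("Phone", 0.800, 3),
    ("Zip Code", 0.923, 32),
]


def macro_avg(rows):
    """Unweighted mean: every class counts equally, however rare it is."""
    return sum(f1 for _, f1, _ in rows) / len(rows)


def weighted_avg(rows):
    """Support-weighted mean: large classes such as 'Other' dominate."""
    total = sum(support for _, _, support in rows)
    return sum(f1 * support for _, f1, support in rows) / total


print(f"total support: {sum(n for _, _, n in ROWS)}")
print(f"macro F1:      {macro_avg(ROWS):.3f}")
print(f"weighted F1:   {weighted_avg(ROWS):.3f}")
```

The macro average is the more informative number for the small classes (for example `Phone` with 3 samples or `Last Name` with 5), since a weak score there barely moves the weighted average.
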