---
language: en
thumbnail: https://huggingface.co/front/thumbnails/google.png
license: apache-2.0
base_model:
- cross-encoder/ms-marco-TinyBERT-L-2-v2
pipeline_tag: text-classification
library_name: transformers
metrics:
- f1
- precision
- recall
datasets:
- Mozilla/autofill_dataset
---

## Cross-Encoder for MS Marco with TinyBERT

This is a fine-tuned version of the model checkpointed at [cross-encoder/ms-marco-TinyBERT-L-2](https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2).

It was fine-tuned on HTML tags and labels generated using [Fathom](https://mozilla.github.io/fathom/commands/label.html).

## How to use this model in `transformers`

```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline(
    "text-classification",
    model="Mozilla/tinybert-uncased-autofill"
)

# Pass the text to classify as a string.
print(
    classifier('')
)
```

## Model Training Info

```python
# Hyperparameters used for fine-tuning.
hyperparameters = {
    'learning_rate': 0.000082,
    'num_train_epochs': 71,
    'weight_decay': 0.1,
    'per_device_train_batch_size': 32,
}
```

See https://github.com/mozilla/smart_autofill for more information on how the model was trained.

## Model Performance

```
Test Performance:
Precision: 0.9653
Recall: 0.9648
F1: 0.9644

                     precision    recall  f1-score   support

      CC Expiration      1.000     0.625     0.769        16
CC Expiration Month      0.919     0.944     0.932        36
 CC Expiration Year      0.897     0.946     0.921        37
            CC Name      0.938     0.968     0.952        31
          CC Number      0.926     1.000     0.962        50
    CC Payment Type      0.903     0.867     0.884        75
   CC Security Code      0.975     0.951     0.963        41
            CC Type      0.917     0.786     0.846        14
   Confirm Password      0.911     0.895     0.903        57
              Email      0.933     0.959     0.946        73
         First Name      0.833     1.000     0.909         5
               Form      0.974     0.974     0.974        39
          Last Name      0.667     0.800     0.727         5
       New Password      0.929     0.938     0.933        97
              Other      0.985     0.985     0.985      1235
              Phone      1.000     0.667     0.800         3
           Zip Code      0.909     0.938     0.923        32

           accuracy                          0.965      1846
          macro avg      0.919     0.897     0.902      1846
       weighted avg      0.965     0.965     0.964      1846
```
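
The gap between the macro and weighted averages in the report above comes from class imbalance: `Other` accounts for 1235 of the 1846 test examples, so it dominates the weighted score, while small classes such as `Phone` and `Last Name` pull the macro score down. The sketch below recomputes both F1 averages from the per-class rows. It is only an illustration: the helper names are hypothetical, and because the inputs are the three-decimal values printed in the report, the results match the reported averages only up to rounding.

```python
# Per-class (f1-score, support) pairs copied from the classification report above.
report = {
    "CC Expiration": (0.769, 16),
    "CC Expiration Month": (0.932, 36),
    "CC Expiration Year": (0.921, 37),
    "CC Name": (0.952, 31),
    "CC Number": (0.962, 50),
    "CC Payment Type": (0.884, 75),
    "CC Security Code": (0.963, 41),
    "CC Type": (0.846, 14),
    "Confirm Password": (0.903, 57),
    "Email": (0.946, 73),
    "First Name": (0.909, 5),
    "Form": (0.974, 39),
    "Last Name": (0.727, 5),
    "New Password": (0.933, 97),
    "Other": (0.985, 1235),
    "Phone": (0.800, 3),
    "Zip Code": (0.923, 32),
}


def macro_average(rows):
    """Unweighted mean of the per-class scores (every class counts equally)."""
    scores = [score for score, _ in rows.values()]
    return sum(scores) / len(scores)


def weighted_average(rows):
    """Mean of the per-class scores weighted by each class's support."""
    total = sum(support for _, support in rows.values())
    return sum(score * support for score, support in rows.values()) / total


total_support = sum(support for _, support in report.values())
macro_f1 = macro_average(report)
weighted_f1 = weighted_average(report)

print(f"total support: {total_support}")
print(f"macro F1:      {macro_f1:.3f}")
print(f"weighted F1:   {weighted_f1:.3f}")
```

When judging the model for a specific field type, the per-class row is more informative than either average, especially for classes with single-digit support.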