---
license: mit
language:
- en
---

# BERT-Tiny (uncased)

This is the smallest of the 24 smaller BERT models (English only, uncased, trained with WordPiece masking) released by [google-research/bert](https://github.com/google-research/bert).

These BERT models were released as TensorFlow checkpoints; this is the version converted to PyTorch. More information can be found in [google-research/bert](https://github.com/google-research/bert) or [lyeoni/convert-tf-to-pytorch](https://github.com/lyeoni/convert-tf-to-pytorch).

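Since the checkpoint is in PyTorch format, it can presumably be loaded with the Hugging Face `transformers` library. The sketch below is a minimal example under that assumption: the function name `embed_sentence` and its `model_path` argument are illustrative and not part of this repository, and `model_path` must point to wherever the converted checkpoint, config, and vocabulary actually live.

```python
def embed_sentence(model_path, text):
    """Return the final-layer hidden states for `text` as a torch.Tensor.

    `model_path` is an assumption: a local directory or hub identifier
    that holds the converted PyTorch weights, config, and vocabulary.
    """
    # Imported inside the function so that defining it does not require
    # torch or transformers to be installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModel.from_pretrained(model_path)
    model.eval()  # disable dropout for inference

    # Tokenize into PyTorch tensors and run a forward pass without gradients.
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Shape: (batch_size, sequence_length, hidden_size)
    return outputs.last_hidden_state
```

Calling `embed_sentence("path/to/bert-tiny", "Hello, world!")` with a real path in place of the placeholder should return one hidden-state vector per input token.
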
## Evaluation

Here are the evaluation scores (F1/Accuracy) on the MRPC task.

|Model|MRPC (F1/Accuracy)|
|-|:-:|
|BERT-Tiny|81.22/68.38|
|BERT-Mini|81.43/69.36|
|BERT-Small|81.41/70.34|
|BERT-Medium|83.33/73.53|
|BERT-Base|85.62/78.19|

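To make the two numbers in each table cell concrete, here is a minimal sketch of how F1 and accuracy are commonly computed for a binary paraphrase task such as MRPC. It assumes label `1` marks a paraphrase pair and is treated as the positive class. The labels and predictions are made-up toy data, and this is not necessarily the exact script used to produce the scores above.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): 1 = paraphrase, 0 = not a paraphrase.
labels = [1, 1, 1, 0, 0, 1]
preds = [1, 1, 0, 0, 1, 1]

f1 = f1_score(labels, preds)
acc = accuracy(labels, preds)
print(f"{100 * f1:.2f}/{100 * acc:.2f}")  # same F1/Accuracy layout as the table
```

Because F1 counts only the positive class, it can sit well above accuracy when most pairs are paraphrases, which is the pattern the table shows.
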
### References

```
@article{turc2019,
  title={Well-Read Students Learn Better: On the Importance of Pre-training Compact Models},
  author={Turc, Iulia and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
  journal={arXiv preprint arXiv:1908.08962v2},
  year={2019}
}
```