BERT L6-H768 (uncased)

One of the mini BERT models from https://arxiv.org/abs/1908.08962 that the HF team didn't convert. This checkpoint was converted with the original conversion script.

See the original Google repo: google-research/bert

Note: it's not clear whether these checkpoints have undergone knowledge distillation.

Model variants

Usage

See other BERT model cards, e.g. https://huggingface.co/bert-base-uncased
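A minimal masked-token sketch, assuming the `transformers` library (with a PyTorch backend) is installed and that this checkpoint loads with a masked-LM head the same way bert-base-uncased does. The helper name `top_predictions` is made up for illustration; the repo ID is the one shown on this page.

```python
MODEL_ID = "gaunernst/bert-L6-H768-uncased"  # repo ID as shown on this page


def top_predictions(text, k=5):
    """Return the k most likely fillers for the [MASK] token in `text`.

    Hypothetical helper, not part of any library. transformers is imported
    lazily so that merely defining the helper does not require it.
    """
    from transformers import pipeline

    # Assumption: the checkpoint includes masked-LM weights; if it does not,
    # transformers will warn that the head is randomly initialised.
    fill_mask = pipeline("fill-mask", model=MODEL_ID)
    return [(r["token_str"], r["score"]) for r in fill_mask(text, top_k=k)]
```

Calling `top_predictions("Paris is the [MASK] of France.")` downloads the weights on first use and returns a list of (token, score) pairs.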

Citation

@article{turc2019,
  title={Well-Read Students Learn Better: On the Importance of Pre-training Compact Models},
  author={Turc, Iulia and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
  journal={arXiv preprint arXiv:1908.08962v2},
  year={2019}
}
Model size: 67.6M params (Safetensors; tensor types I64, F32)
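The 67.6M parameter figure can be cross-checked from the model name (6 layers, hidden size 768). This is a rough sketch under assumptions not stated on this page: the standard uncased BERT vocabulary of 30,522 tokens, 512 positions, 2 segment types, a feed-forward size of 4x the hidden size, and a checkpoint that stores the encoder plus the masked-LM and next-sentence heads with the output embedding tied to the input embedding. The function `bert_param_count` is a hypothetical helper.

```python
def bert_param_count(layers, hidden, vocab_size=30522, max_positions=512,
                     type_vocab=2, with_pretraining_heads=True):
    """Count weights in a standard BERT layout (all defaults are assumptions)."""

    def linear(n_in, n_out):
        # Weight matrix plus bias vector.
        return n_in * n_out + n_out

    layer_norm = 2 * hidden   # scale and shift
    ffn = 4 * hidden          # assumed feed-forward width

    # Token, position and segment embeddings, followed by one LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm

    # Q, K, V and output projections, then the two feed-forward layers,
    # with a LayerNorm after the attention block and after the feed-forward block.
    per_layer = (4 * linear(hidden, hidden) + layer_norm
                 + linear(hidden, ffn) + linear(ffn, hidden) + layer_norm)

    pooler = linear(hidden, hidden)
    total = embeddings + layers * per_layer + pooler

    if with_pretraining_heads:
        # Masked-LM transform + LayerNorm + output bias (decoder weight tied),
        # and the two-class next-sentence classifier.
        mlm_head = linear(hidden, hidden) + layer_norm + vocab_size
        nsp_head = linear(hidden, 2)
        total += mlm_head + nsp_head

    return total


print(f"{bert_param_count(6, 768) / 1e6:.1f}M")
```

Under these assumptions the count rounds to 67.6M, which agrees with the reported model size; the encoder alone (`with_pretraining_heads=False`) comes to about 67.0M.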