This is a pre-trained PyTorch model, obtained by converting the TensorFlow checkpoint found in the official Google BERT repository.

This is one of the smaller pre-trained BERT variants, together with bert-mini, bert-small, and bert-medium. They were introduced in the study Well-Read Students Learn Better: On the Importance of Pre-training Compact Models (arXiv) and ported to Hugging Face for the study Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics (arXiv). These models are intended to be fine-tuned on a downstream task.
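Since the checkpoint is meant as a starting point for fine-tuning, a minimal sketch of loading it with the Hugging Face transformers library may be useful. The helper name `load_for_classification` and the default of three labels (as in three-way NLI) are illustrative assumptions, not part of the original card.

```python
MODEL_ID = "prajjwal1/bert-tiny"  # Hub repository name of this model


def load_for_classification(num_labels: int = 3):
    """Return (tokenizer, model) as a starting point for fine-tuning.

    num_labels=3 is an illustrative default (e.g. three-way NLI),
    not a value prescribed by the model card.
    """
    # Imported inside the function so that defining it does not require
    # transformers; calling it does, and it downloads the weights.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # The classification head on top of the encoder is newly initialised
    # and has to be trained on the downstream task.
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_ID, num_labels=num_labels
    )
    return tokenizer, model
```

Calling `load_for_classification()` returns a model whose encoder weights come from the checkpoint while the classification head is randomly initialised, so its predictions are meaningful only after fine-tuning.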

If you use the model, please consider citing both papers:

@misc{bhargava2021generalization,
      title={Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics}, 
      author={Prajjwal Bhargava and Aleksandr Drozd and Anna Rogers},
      year={2021},
      eprint={2110.01518},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@article{DBLP:journals/corr/abs-1908-08962,
  author    = {Iulia Turc and
               Ming{-}Wei Chang and
               Kenton Lee and
               Kristina Toutanova},
  title     = {Well-Read Students Learn Better: The Impact of Student Initialization
               on Knowledge Distillation},
  journal   = {CoRR},
  volume    = {abs/1908.08962},
  year      = {2019},
  url       = {http://arxiv.org/abs/1908.08962},
  eprinttype = {arXiv},
  eprint    = {1908.08962},
  timestamp = {Thu, 29 Aug 2019 16:32:34 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-1908-08962.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Config of this model:
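For a sense of scale, here is a rough parameter count for a BERT encoder. It is a sketch under stated assumptions: the bert-tiny settings listed in the Google BERT repository (2 layers, hidden size 128), a feed-forward size of 4 times the hidden size, the 30522-token uncased vocabulary, 512 positions, and 2 token types. The function name `bert_param_count` is hypothetical.

```python
def bert_param_count(layers, hidden, vocab=30522, max_pos=512, type_vocab=2):
    """Count the parameters of a BERT encoder with pooler (no task head).

    All default sizes are assumptions taken from the standard uncased
    BERT setup, not values stated in this model card.
    """
    inter = 4 * hidden          # feed-forward (intermediate) size
    layer_norm = 2 * hidden     # LayerNorm weight + bias

    # Word, position and token-type embeddings, followed by a LayerNorm.
    embeddings = (vocab + max_pos + type_vocab) * hidden + layer_norm
    # Query, key, value and output projections (weights + biases), then LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Two dense layers (hidden -> inter -> hidden), then LayerNorm.
    feed_forward = (hidden * inter + inter) + (inter * hidden + hidden) + layer_norm
    # Dense layer applied to the [CLS] representation.
    pooler = hidden * hidden + hidden

    return embeddings + layers * (attention + feed_forward) + pooler


print(bert_param_count(layers=2, hidden=128))
```

Under these assumptions the count comes to about 4.4 million parameters, most of them in the embedding matrix rather than in the two Transformer layers.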

Other models to check out:

The original implementation and more information can be found in this GitHub repository.

Twitter: @prajjwal_1
