---
license: cc-by-4.0
language:
- sw
---
BERT medium (cased) model trained on a subset of 125M tokens of cc100-Swahili for our work [Scaling Laws for BERT in Low-Resource Settings](https://youtu.be/dQw4w9WgXcQ) at ACL2023 Findings.
The model has 51M parameters (8 layers) and a vocabulary size of 50K.
It was trained for 500K steps with a sequence length of 512 tokens.
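
As a rough sanity check on the 51M figure, the sketch below counts the parameters of a BERT encoder with 8 layers and a 50K vocabulary. Only the layer count, vocabulary size, and 512-token sequence length come from this card. The hidden size (512), feed-forward size (2048), and two token-type embeddings are assumptions taken from the usual BERT-medium configuration, and the function name `bert_param_count` is hypothetical. Under those assumptions the encoder plus pooler comes to about 51.3M parameters, in line with the number reported above.

```python
def bert_param_count(
    vocab_size: int = 50_000,      # from the model card
    layers: int = 8,               # from the model card
    max_positions: int = 512,      # from the model card (sequence length)
    hidden: int = 512,             # ASSUMPTION: standard BERT-medium hidden size
    intermediate: int = 2048,      # ASSUMPTION: standard 4x feed-forward size
    type_vocab: int = 2,           # ASSUMPTION: standard segment embeddings
) -> int:
    """Approximate parameter count of a BERT encoder (with pooler, no MLM head)."""
    # Word, position and token-type embeddings, plus one LayerNorm (weight + bias).
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + 2 * hidden

    # Self-attention: Q, K, V and output projections (weights + biases) and a LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + 2 * hidden

    # Feed-forward block: up-projection, down-projection and a LayerNorm.
    feed_forward = (
        (hidden * intermediate + intermediate)
        + (intermediate * hidden + hidden)
        + 2 * hidden
    )

    # Pooler: a single dense layer over the first token's representation.
    pooler = hidden * hidden + hidden

    return embeddings + layers * (attention + feed_forward) + pooler


if __name__ == "__main__":
    total = bert_param_count()
    print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```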
Authors
-----------
Gorka Urbizu [1], Iñaki San Vicente [1], Xabier Saralegi [1], Rodrigo Agerri [2] and Aitor Soroa [2]

Affiliation of the authors:

[1] Orai NLP Technologies

[2] HiTZ Center - Ixa, University of the Basque Country UPV/EHU

Licensing
-------------
Copyright (C) by Orai NLP Technologies.
The model is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
To view a copy of this license, visit [http://creativecommons.org/licenses/by/4.0/](https://creativecommons.org/licenses/by/4.0/deed.eu).
Acknowledgements
-------------------
If you use this model, please cite the following paper:
- G. Urbizu, I. San Vicente, X. Saralegi, R. Agerri, A. Soroa. Scaling Laws for BERT in Low-Resource Settings. Findings of the Association for Computational Linguistics: ACL 2023. July 2023. Toronto, Canada.
Contact information
-----------------------
Gorka Urbizu, Iñaki San Vicente: {g.urbizu,i.sanvicente}@orai.eus