---
language: wo
tags:
- bert
- language-model
- wo
- wolof
---
# Soraberta: Unsupervised Language Model Pre-training for Wolof
**bert-base-wolof** is a BERT-base model pretrained on the Wolof language.
## Soraberta models
| Model name | Number of layers | Attention Heads | Embedding Dimension | Total Parameters |
| :------: | :---: | :---: | :---: | :---: |
| `bert-base` | 6 | 12 | 514 | 56,931,622 |
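To check the layer count and parameter total reported above against the released checkpoint, you can inspect the model directly. A minimal sketch, assuming the `abdouaziiz/bert-base-wolof` checkpoint used in the usage example below:

```python
from transformers import AutoModelForMaskedLM

# Load the pretrained Soraberta checkpoint
model = AutoModelForMaskedLM.from_pretrained('abdouaziiz/bert-base-wolof')

# These should match the table above
print(model.config.num_hidden_layers)              # expected: 6
print(sum(p.numel() for p in model.parameters()))  # expected: 56,931,622
```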
## Using Soraberta with Hugging Face's Transformers
```python
>>> from transformers import pipeline
>>> unmasker = pipeline('fill-mask', model='abdouaziiz/bert-base-wolof')
>>> unmasker("kuy yoot du [MASK].")
[{'sequence': '[CLS] kuy yoot du seqet. [SEP]',
'score': 0.09505125880241394,
'token': 13578},
{'sequence': '[CLS] kuy yoot du daw. [SEP]',
'score': 0.08882280439138412,
'token': 679},
{'sequence': '[CLS] kuy yoot du yoot. [SEP]',
'score': 0.057790059596300125,
'token': 5117},
{'sequence': '[CLS] kuy yoot du seqat. [SEP]',
'score': 0.05671025067567825,
'token': 4992},
{'sequence': '[CLS] kuy yoot du yaqu. [SEP]',
'score': 0.0469999685883522,
'token': 1735}]
```
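Beyond the fill-mask pipeline, the checkpoint can also be loaded directly to extract token embeddings. A minimal sketch, assuming the standard `AutoTokenizer`/`AutoModel` interfaces (the example sentence is taken from the pipeline output above):

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('abdouaziiz/bert-base-wolof')
model = AutoModel.from_pretrained('abdouaziiz/bert-base-wolof')

# Encode a Wolof sentence and run a forward pass without gradients
inputs = tokenizer("kuy yoot du daw.", return_tensors='pt')
with torch.no_grad():
    outputs = model(**inputs)

# One vector per token: shape (batch, sequence_length, hidden_size)
print(outputs.last_hidden_state.shape)
```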
## Training data
The data sources are [Bible OT](http://biblewolof.com/), [WOLOF-ONLINE](http://www.wolof-online.com/), and [ALFFA_PUBLIC](https://github.com/getalp/ALFFA_PUBLIC/tree/master/ASR/WOLOF).
## Contact
Please contact [email protected] for any questions, feedback, or requests.