---
language: twi
license: mit
---
## TwiBERT
## Model Description
TwiBERT is a language model pretrained on Twi, the most widely spoken language in Ghana, West Africa.
The model has 61 million parameters, with 6 attention heads, a hidden size of 768, and a feed-forward size of 3,072.
It was trained on the Asante Twi Bible together with a crowdsourced dataset.
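As a quick sanity check, you can inspect these hyperparameters from the published config. A minimal sketch, assuming the checkpoint exposes a standard BERT-style config:
```python
>>> from transformers import AutoConfig
>>> config = AutoConfig.from_pretrained("sakrah/twibert")
>>> config.num_attention_heads  # expected: 6
>>> config.hidden_size          # expected: 768
>>> config.intermediate_size    # expected: 3072 (feed-forward size)
```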
## Limitations
The model was trained on a very small dataset (about 5 MB), which makes it difficult for the model
to learn complex contextual embeddings that generalize well. In addition, because the training data is
dominated by the Bible, the model may carry a strong religious bias.
## How to use it
You can adapt TwiBERT to a downstream task by fine-tuning it.
The example below shows how to load the pretrained model and tokenizer, here for token classification:
```python
>>> from transformers import AutoTokenizer, AutoModelForTokenClassification
>>> # load the pretrained tokenizer and model from the Hugging Face Hub
>>> tokenizer = AutoTokenizer.from_pretrained("sakrah/twibert")
>>> model = AutoModelForTokenClassification.from_pretrained("sakrah/twibert")
```
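From there, fine-tuning follows the standard `transformers` token-classification recipe. Below is a minimal sketch of a single training step; the label set and input sentence are hypothetical placeholders, and the all-"O" labels are purely for illustration:
```python
>>> import torch
>>> from transformers import AutoTokenizer, AutoModelForTokenClassification
>>> # hypothetical tag set; replace with the labels of your downstream task
>>> label_list = ["O", "B-PER", "I-PER"]
>>> tokenizer = AutoTokenizer.from_pretrained("sakrah/twibert")
>>> model = AutoModelForTokenClassification.from_pretrained(
...     "sakrah/twibert", num_labels=len(label_list)
... )
>>> inputs = tokenizer("Kofi kɔ fie", return_tensors="pt")
>>> # all-"O" labels for illustration only; supply real token-level tags in practice
>>> labels = torch.zeros_like(inputs["input_ids"])
>>> loss = model(**inputs, labels=labels).loss
>>> loss.backward()  # from here, plug into your usual PyTorch or Trainer loop
```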