---
language:
- multilingual
- ny
- kg
- kmb
- rw
- ln
- lua
- lg
- nso
- rn
- st
- sw
- ss
- ts
- tn
- tum
- umb
- xh
- zu
- fr
- en
license: apache-2.0
---

### How to use

You can use this model directly with a pipeline for masked language modeling:

```python
>>> from transformers import pipeline
>>> unmasker = pipeline('fill-mask', model='nairaxo/toumbert')
>>> unmasker("rais wa [MASK] ya tanzania.")
```
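
For a single-mask input, the `fill-mask` pipeline returns a list of dictionaries with the keys `score`, `token`, `token_str` and `sequence`. The sketch below shows one way to reduce such a list to the top candidates. The helper name `summarize_predictions` is hypothetical, and the entries in `example` are placeholder values that only mimic the output structure; they are not predictions from this model.

```python
def summarize_predictions(predictions, top_k=3):
    """Return (token_str, score) pairs for the top_k highest-scoring candidates."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [(p["token_str"], round(p["score"], 4)) for p in ranked[:top_k]]


# Placeholder entries (NOT real model output), shaped like fill-mask results.
example = [
    {"score": 0.1, "token": 2, "token_str": "candidate_b", "sequence": "placeholder b"},
    {"score": 0.7, "token": 1, "token_str": "candidate_a", "sequence": "placeholder a"},
    {"score": 0.05, "token": 3, "token_str": "candidate_c", "sequence": "placeholder c"},
]

top_two = summarize_predictions(example, top_k=2)
```

With a loaded pipeline, the same helper could be applied as `summarize_predictions(unmasker("rais wa [MASK] ya tanzania."))`.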

Here is how to use this model to get the features of a given text in PyTorch:

```python
from transformers import BertTokenizer, BertModel
tokenizer = BertTokenizer.from_pretrained('nairaxo/toumbert')
model = BertModel.from_pretrained("nairaxo/toumbert")
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
```

and in TensorFlow:

```python
from transformers import BertTokenizer, TFBertModel
tokenizer = BertTokenizer.from_pretrained('nairaxo/toumbert')
model = TFBertModel.from_pretrained("nairaxo/toumbert")
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='tf')
output = model(encoded_input)
```
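
Both feature-extraction snippets above yield one vector per token in `output.last_hidden_state`. If a single vector per text is needed, one common option (an assumption here, not something this model card prescribes) is to average the token vectors while ignoring padding. The following is a minimal NumPy sketch; the helper name `mean_pool` and the toy arrays are illustrative only.

```python
import numpy as np


def mean_pool(last_hidden_state, attention_mask):
    """Average token vectors over the sequence axis, skipping padded positions.

    last_hidden_state: array of shape (batch, seq_len, hidden)
    attention_mask:    array of shape (batch, seq_len), 1 for real tokens, 0 for padding
    """
    hidden = np.asarray(last_hidden_state, dtype=np.float32)
    mask = np.asarray(attention_mask, dtype=np.float32)[..., None]  # (batch, seq_len, 1)
    summed = (hidden * mask).sum(axis=1)             # sum of unmasked token vectors
    counts = np.clip(mask.sum(axis=1), 1e-9, None)   # guard against division by zero
    return summed / counts


# Toy input: one sequence of three tokens, the last of which is padding.
hidden = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])
pooled = mean_pool(hidden, mask)  # shape (1, 2); the padded token is ignored
```

With the PyTorch snippet, the inputs could be obtained as `output.last_hidden_state.detach().numpy()` and `encoded_input["attention_mask"].numpy()`; with the TensorFlow snippet, as `output.last_hidden_state.numpy()` and `encoded_input["attention_mask"].numpy()`.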