---
license: cc
language:
- ve
- nso
metrics:
- perplexity
library_name: transformers
tags:
- tshivenda
- sepedi
- sesotho sa leboa
- northern sotho
- south africa
- low-resource
- bantu
- xlm-roberta
widget:
- text: Rabulasi wa <mask> u khou bvelela nga u lima
- text: >-
    Vhana vhane vha kha ḓi bva u bebwa vha kha khombo ya u <mask> nga
    Listeriosis
---

# Zabantu - Tshivenda & Sepedi family

This is a variant of [Zabantu](https://huggingface.co/dsfsi/zabantu-bantu-250m) pre-trained on a multilingual dataset of Tshivenda (ven) and Sepedi (nso) sentences, using a transformer network with 170 million trainable parameters.

# Usage Example(s)

```python
from transformers import pipeline

# Initialize the pipeline for masked language modeling
unmasker = pipeline('fill-mask', model='dsfsi/zabantu-nso-ven-170m')

sample_sentences = ["Rabulasi wa <mask> u khou bvelela nga u lima",
                    "Vhana vhane vha kha ḓi bva u bebwa vha kha khombo ya u <mask> nga Listeriosis"]

# Perform the fill-mask task on each sentence in turn
for sentence in sample_sentences:
    results = unmasker(sentence)
    # Display the candidate fillers for this sentence
    for result in results:
        print(f"Predicted word: {result['token_str']} - Score: {result['score']}")
        print(f"Full sentence: {result['sequence']}\n")
    print("=" * 80)
```
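
For a single sentence with one `<mask>`, the `fill-mask` pipeline returns a list of candidate dictionaries with `token_str`, `score`, and `sequence` keys. The sketch below is one possible way to keep only the more confident candidates. The helper name `best_predictions`, the `min_score` and `top_k` defaults, and the mock data are all illustrative assumptions, not part of the model or of the Transformers library.

```python
from typing import Dict, List


def best_predictions(results: List[Dict], min_score: float = 0.05, top_k: int = 3) -> List[str]:
    """Return up to top_k predicted tokens whose score is at least min_score.

    Hypothetical helper: `results` is assumed to be the list of dicts that a
    fill-mask pipeline returns for one sentence containing a single mask.
    """
    # Sort from most to least probable so the cut-off below is order-independent.
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    kept = [r for r in ranked if r["score"] >= min_score]
    # token_str may carry leading whitespace, so strip it for display.
    return [r["token_str"].strip() for r in kept[:top_k]]


# Hand-written stand-in for pipeline output; placeholder tokens and scores,
# NOT real predictions from the model.
mock_results = [
    {"token_str": " a", "score": 0.10, "sequence": "x a y"},
    {"token_str": " b", "score": 0.60, "sequence": "x b y"},
    {"token_str": " c", "score": 0.01, "sequence": "x c y"},
]

print(best_predictions(mock_results))  # ['b', 'a']
```

With the pipeline from the usage example, this would be called as `best_predictions(unmasker(sentence))` for one sentence at a time.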