---
language:
- sw
license: apache-2.0
datasets:
- wikiann
pipeline_tag: token-classification
examples: null
widget:
- text: Serikali imetangaza hali ya janga katika wilaya 10 za kusini ambazo zimeathiriwa zaidi na dhoruba.
  example_title: Sentence_1
- text: Faida tano za kula samaki wenye mafuta.
  example_title: Sentence_2
- text: Tahadhari yatolewa kuhusu uwezekano wa mlipuko wa Volkano DR Congo.
  example_title: Sentence_3
metrics:
- accuracy
- f1
- precision
- recall
library_name: transformers
---


## Intended uses & limitations

#### How to use

You can use this model for NER with the Transformers *pipeline* API.

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("eolang/Swahili-NER-BertBase-Cased")
model = AutoModelForTokenClassification.from_pretrained("eolang/Swahili-NER-BertBase-Cased")

# Build a token-classification (NER) pipeline around them.
nlp = pipeline("ner", model=model, tokenizer=tokenizer)
example = "Kwa nini Kenya inageukia mazao ya GMO kukabiliana na ukame"

# Each result is a dict describing one tagged token.
ner_results = nlp(example)
print(ner_results)
```
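
Without aggregation, the pipeline returns one dict per tagged token (with keys such as `entity`, `word`, `index`, `start`, and `end`), so a multi-word name arrives as separate `B-` and `I-` pieces. The following is a minimal sketch of merging those pieces into entity spans. The `group_entities` helper and the sample predictions are hypothetical and written by hand for illustration; they are not actual output from this model. The pipeline's own `aggregation_strategy` argument is likely the simpler choice in practice.

```python
def group_entities(text, predictions):
    """Merge token-level IOB2 predictions into (entity text, label) pairs."""
    spans = []
    for pred in predictions:
        if pred["entity"] == "O":
            continue  # the pipeline normally drops "O" tokens already
        prefix, _, label = pred["entity"].partition("-")
        last = spans[-1] if spans else None
        # Extend the previous span only for an I- tag with the same label
        # on the token immediately following it.
        if (last is not None and prefix == "I" and last["label"] == label
                and pred["index"] == last["last_index"] + 1):
            last["end"] = pred["end"]
            last["last_index"] = pred["index"]
        else:
            spans.append({"label": label, "start": pred["start"],
                          "end": pred["end"], "last_index": pred["index"]})
    # Slice the original text so word pieces and spacing are preserved.
    return [(text[s["start"]:s["end"]], s["label"]) for s in spans]


text = "Tahadhari yatolewa kuhusu uwezekano wa mlipuko wa Volkano DR Congo."
dr, congo = text.index("DR"), text.index("Congo")

# Hypothetical predictions in the pipeline's output shape (scores omitted);
# the token indices 9 and 10 are illustrative.
sample_predictions = [
    {"entity": "B-LOC", "word": "DR", "index": 9, "start": dr, "end": dr + 2},
    {"entity": "I-LOC", "word": "Congo", "index": 10, "start": congo, "end": congo + 5},
]

print(group_entities(text, sample_predictions))
```

Slicing the original string by character offsets avoids having to reassemble `##` word pieces by hand.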

## Training data

This model was fine-tuned on the Swahili version of the WikiAnn dataset, which was built for cross-lingual name tagging and linking based on Wikipedia articles in 295 languages.
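
WikiAnn annotations use the IOB2 scheme with person, organization, and location tags, stored as integer ids per token. As a rough sketch of how such a record can be read, the snippet below maps ids back to tag names. The label order in `LABELS` is an assumption about the dataset's encoding and should be checked against the dataset's own feature definition; the `decode_tags` helper and the sample record are hypothetical.

```python
# Assumed id-to-label order (IOB2); verify against the dataset features.
LABELS = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]


def decode_tags(tokens, tag_ids):
    """Pair each token with the string label for its integer tag id."""
    if len(tokens) != len(tag_ids):
        raise ValueError("tokens and tag_ids must have the same length")
    return [(token, LABELS[tag_id]) for token, tag_id in zip(tokens, tag_ids)]


# Hypothetical record, hand-annotated for illustration only.
tokens = ["Volkano", "DR", "Congo"]
tag_ids = [0, 5, 6]

for token, label in decode_tags(tokens, tag_ids):
    print(f"{token}\t{label}")
```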


## Training procedure

This model was trained on a single NVIDIA A5000 GPU using the fine-tuning hyperparameters recommended in the [original BERT paper](https://arxiv.org/pdf/1810.04805), which trained and evaluated the model on the CoNLL-2003 NER task.
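
For reference, the BERT paper recommends a small search space for fine-tuning: batch sizes of 16 or 32, Adam learning rates of 5e-5, 3e-5, or 2e-5, and 2 to 4 epochs. This card does not state which combination was used for this model, so the sketch below only enumerates that recommended grid; the dictionary keys are illustrative names, not arguments of any particular training API.

```python
from itertools import product

# Fine-tuning ranges recommended in the BERT paper.
BATCH_SIZES = (16, 32)
LEARNING_RATES = (5e-5, 3e-5, 2e-5)
NUM_EPOCHS = (2, 3, 4)

# One candidate configuration per combination; key names are illustrative.
grid = [
    {"batch_size": batch_size, "learning_rate": learning_rate, "num_epochs": epochs}
    for batch_size, learning_rate, epochs in product(BATCH_SIZES, LEARNING_RATES, NUM_EPOCHS)
]

print(f"{len(grid)} candidate configurations")
for config in grid[:3]:
    print(config)
```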