---
license: cc-by-nc-4.0
datasets:
- Babelscape/wikineural
language:
- de
- fr
- it
- rm
- multilingual
inference: false
tags:
- named-entity-recognition
---
This is the [SwissBERT](https://huggingface.co/ZurichNLP/swissbert) model fine-tuned on the [WikiNEuRal](https://huggingface.co/datasets/Babelscape/wikineural) dataset for multilingual named entity recognition (NER).
It supports German, French and Italian as supervised languages and Romansh Grischun as a zero-shot language.
## Usage
```python
from transformers import pipeline

token_classifier = pipeline(
    model="ZurichNLP/swissbert-ner",
    aggregation_strategy="simple",
)
```
### German example
```python
token_classifier.model.set_default_language("de_CH")
token_classifier("Mein Name sei Gantenbein.")
```
Output:
```
[{'entity_group': 'PER',
  'score': 0.5002625,
  'word': 'Gantenbein',
  'start': 13,
  'end': 24}]
```
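Note that in the output above, `start` is 13, which is the index of the space before "Gantenbein" (the name itself begins at index 14). If exact character spans are needed, the offsets can be trimmed after the fact. The following is a minimal sketch; `trim_entity_span` is a hypothetical helper, not part of `transformers` or of this model.

```python
def trim_entity_span(text, entity):
    """Return a copy of `entity` whose start/end exclude surrounding whitespace.

    `trim_entity_span` is a hypothetical helper for illustration only.
    """
    start, end = entity["start"], entity["end"]
    # Move the start forward past any leading whitespace.
    while start < end and text[start].isspace():
        start += 1
    # Move the end backward past any trailing whitespace.
    while end > start and text[end - 1].isspace():
        end -= 1
    return {**entity, "start": start, "end": end}


text = "Mein Name sei Gantenbein."
# Entity copied from the German output shown above.
entity = {
    "entity_group": "PER",
    "score": 0.5002625,
    "word": "Gantenbein",
    "start": 13,
    "end": 24,
}
trimmed = trim_entity_span(text, entity)

print(repr(text[entity["start"]:entity["end"]]))    # ' Gantenbein'
print(repr(text[trimmed["start"]:trimmed["end"]]))  # 'Gantenbein'
```

The helper returns a new dictionary and leaves the pipeline output untouched, so both the raw and the trimmed offsets remain available.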
### French example
```python
token_classifier.model.set_default_language("fr_CH")
token_classifier("J'habite à Lausanne.")
```
Output:
```
[{'entity_group': 'LOC',
  'score': 0.99955386,
  'word': 'Lausanne',
  'start': 10,
  'end': 19}]
```
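Both examples above set the language on the model before calling the pipeline. When texts in several languages are processed in turn, the two steps can be wrapped in one function so that the language is never left over from a previous call. This is only a sketch: `tag_entities` and `ADAPTER_CODES` are hypothetical names, and the codes `it_CH` and `rm_CH` are assumptions that follow the `de_CH` / `fr_CH` pattern used above.

```python
# Hypothetical mapping from the language tags in the model card metadata
# to adapter codes. "de_CH" and "fr_CH" appear in the examples above;
# "it_CH" and "rm_CH" are assumed to follow the same pattern.
ADAPTER_CODES = {
    "de": "de_CH",
    "fr": "fr_CH",
    "it": "it_CH",
    "rm": "rm_CH",
}


def tag_entities(token_classifier, text, language):
    """Select the language for `text`, then run the pipeline on it.

    `token_classifier` is expected to be the pipeline object created in the
    Usage section; `language` is one of the keys of ADAPTER_CODES.
    """
    if language not in ADAPTER_CODES:
        raise ValueError(f"Unsupported language: {language!r}")
    # Set the language explicitly on every call.
    token_classifier.model.set_default_language(ADAPTER_CODES[language])
    return token_classifier(text)


# Usage, assuming `token_classifier` was created as shown in the Usage section:
# tag_entities(token_classifier, "J'habite à Lausanne.", "fr")
```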
## Citation
```bibtex
@article{vamvas-etal-2023-swissbert,
      title={Swiss{BERT}: The Multilingual Language Model for Switzerland},
      author={Jannis Vamvas and Johannes Gra\"en and Rico Sennrich},
      year={2023},
      eprint={2303.13310},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2303.13310}
}
```