---
license: cc-by-nc-4.0
datasets:
- Babelscape/wikineural
language:
- de
- fr
- it
- rm
- multilingual
inference: false
tags:
- named-entity-recognition
---

This is the [SwissBERT](https://huggingface.co/ZurichNLP/swissbert) model, fine-tuned on the [WikiNEuRal](https://huggingface.co/datasets/Babelscape/wikineural) dataset for multilingual named entity recognition (NER).

It supports German, French, and Italian as supervised languages and Romansh Grischun as a zero-shot language.

## Usage

```python
from transformers import pipeline

# Load the fine-tuned model; "simple" aggregation merges sub-word tokens into entity spans.
token_classifier = pipeline(
    model="ZurichNLP/swissbert-ner",
    aggregation_strategy="simple",
)
```
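
The examples in this card select a language with `set_default_language` before each prediction. A minimal sketch of a wrapper that does both steps in one call is given here; `tag_entities` and `LANGUAGE_CODES` are hypothetical names, and the `it_CH` and `rm_CH` codes are assumed to follow the same pattern as the `de_CH` and `fr_CH` codes used in this card.

```python
# Adapter codes; "it_CH" and "rm_CH" are assumed by analogy with the
# "de_CH" and "fr_CH" codes used in this card.
LANGUAGE_CODES = {
    "de": "de_CH",
    "fr": "fr_CH",
    "it": "it_CH",
    "rm": "rm_CH",
}


def tag_entities(token_classifier, text, language):
    """Select the language adapter for `language`, then run NER on `text`."""
    if language not in LANGUAGE_CODES:
        raise ValueError(f"Unsupported language: {language!r}")
    token_classifier.model.set_default_language(LANGUAGE_CODES[language])
    return token_classifier(text)
```

With the `token_classifier` pipeline created above, a call such as `tag_entities(token_classifier, "Abito a Lugano.", "it")` would run the Italian adapter on an Italian sentence. Passing `"rm"` would select Romansh Grischun, which the model only handles zero-shot.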

### German example
```python
token_classifier.model.set_default_language("de_CH")
token_classifier("Mein Name sei Gantenbein.")
```
Output:
```
[{'entity_group': 'PER',
  'score': 0.5002625,
  'word': 'Gantenbein',
  'start': 13,
  'end': 24}]
```

### French example
```python
token_classifier.model.set_default_language("fr_CH")
token_classifier("J'habite à Lausanne.")
```
Output:
```
[{'entity_group': 'LOC',
  'score': 0.99955386,
  'word': 'Lausanne',
  'start': 10,
  'end': 19}]
```
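
In both outputs shown in this card, `start` points at the space before the entity, and the two scores differ widely (about 0.50 versus about 1.00). The following sketch shows one possible way to post-process such predictions: it recovers the surface text from the offsets, strips surrounding whitespace, and optionally drops low-confidence spans. The helper `extract_entities` and its `min_score` parameter are hypothetical and not part of the model or of `transformers`; the sample predictions are copied from the outputs in this card.

```python
def extract_entities(text, predictions, min_score=0.0):
    """Return (label, surface text, score) for predictions scoring at least min_score."""
    entities = []
    for pred in predictions:
        if pred["score"] < min_score:
            continue
        # The character span may include a leading space, so strip the slice.
        surface = text[pred["start"]:pred["end"]].strip()
        entities.append((pred["entity_group"], surface, float(pred["score"])))
    return entities


# Predictions copied from the German and French examples in this card.
german = [{"entity_group": "PER", "score": 0.5002625,
           "word": "Gantenbein", "start": 13, "end": 24}]
french = [{"entity_group": "LOC", "score": 0.99955386,
           "word": "Lausanne", "start": 10, "end": 19}]

print(extract_entities("Mein Name sei Gantenbein.", german))
print(extract_entities("J'habite à Lausanne.", french, min_score=0.9))
# With a 0.9 threshold the German prediction is filtered out.
print(extract_entities("Mein Name sei Gantenbein.", german, min_score=0.9))
```

The threshold value here is only illustrative; a suitable cut-off depends on the application.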

## Citation
```bibtex
@article{vamvas-etal-2023-swissbert,
      title={Swiss{BERT}: The Multilingual Language Model for Switzerland},
      author={Jannis Vamvas and Johannes Gra\"en and Rico Sennrich},
      year={2023},
      eprint={2303.13310},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2303.13310}
}
```