---
language:
- multilingual
- en
- de
license: mit
widget:
- text: "ich glaub ich muss echt rewatchen like i so empty was soll ich denn jetzt machen"
  example_title: "Example 1"
- text: "Ich hab das selbst gedownloadet I have the receipts"
  example_title: "Example 2"
- text: "Ich dachte jz mit dem Date wäre der andere raus I know overthinken ist dein Problem"
  example_title: "Example 3"
---

# German-English Code-Switching Identification

This is the [Tongueswitcher BERT](https://huggingface.co/igorsterner/german-english-code-switching-bert) model fine-tuned for German-English code-switching identification. It was introduced in [this paper](https://openreview.net/forum?id=heYrTpKRny). The model is case-sensitive.
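
A minimal usage sketch follows, assuming the model is loaded through the `transformers` token-classification pipeline. The Hub ID in `MODEL_ID` is an assumption inferred from the base model's naming and should be checked against this repository; `to_spans` and `tag` are hypothetical helper names. Because the model is case-sensitive, the input is passed through unchanged.

```python
from typing import Dict, List, Tuple

# Assumption: Hub ID inferred from the base model's naming scheme; verify before use.
MODEL_ID = "igorsterner/german-english-code-switching-identification"


def to_spans(text: str, predictions: List[Dict]) -> List[Tuple[str, str]]:
    """Map pipeline predictions back to (original substring, label) pairs.

    Character offsets are used instead of the pipeline's "word" field so the
    original casing and spacing of the input are preserved.
    """
    return [(text[p["start"]:p["end"]], p["entity_group"]) for p in predictions]


def tag(text: str) -> List[Tuple[str, str]]:
    """Run the (assumed) model on text; downloads the weights on first call."""
    # Imported lazily so to_spans can be used without transformers installed.
    from transformers import pipeline

    tagger = pipeline(
        "token-classification",
        model=MODEL_ID,
        aggregation_strategy="simple",  # merge adjacent tokens with the same label
    )
    return to_spans(text, tagger(text))
```

Calling `tag("Ich hab das selbst gedownloadet I have the receipts")` would return one pair per labelled span. The label names come from the model's own configuration and are not assumed here.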

## Overview

- **Initialized language model:** german-english-code-switching-bert
- **Training data:** The Denglish Corpus
- **Infrastructure:** 1x Nvidia A100 GPU
- **Published:** 16 October 2023

## Hyperparameters

```
batch_size = 16
epochs = 3
n_steps = 789
max_seq_len = 512
learning_rate = 3e-5
weight_decay = 0.01
seed = 2021
```
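
For anyone trying to reproduce a similar fine-tuning run, the reported values can be sketched as keyword arguments for `transformers.TrainingArguments` and a tokenizer call. This is an illustration only: the card does not state which training loop was used, and the sketch assumes `n_steps` is the total number of optimizer steps across all epochs on a single device.

```python
# Values copied from the hyperparameter listing above.
reported = {
    "batch_size": 16,
    "epochs": 3,
    "n_steps": 789,
    "max_seq_len": 512,
    "learning_rate": 3e-5,
    "weight_decay": 0.01,
    "seed": 2021,
}

# Assumed mapping onto transformers.TrainingArguments keyword names.
training_kwargs = {
    "per_device_train_batch_size": reported["batch_size"],
    "num_train_epochs": reported["epochs"],
    "learning_rate": reported["learning_rate"],
    "weight_decay": reported["weight_decay"],
    "seed": reported["seed"],
}

# The sequence length is enforced at tokenization time, not by the trainer.
tokenizer_kwargs = {"truncation": True, "max_length": reported["max_seq_len"]}

# Sanity check under the stated assumption: total steps split evenly by epoch.
steps_per_epoch, remainder = divmod(reported["n_steps"], reported["epochs"])
print(steps_per_epoch, remainder)  # 263 0
```

Under that assumption, 263 steps per epoch at batch size 16 would correspond to at most 4,208 training sequences per epoch, provided no gradient accumulation was used.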

## Authors

- Igor Sterner: `is473 [at] cam.ac.uk`
- Simone Teufel: `sht25 [at] cam.ac.uk`

### BibTeX entry and citation info

```bibtex
@inproceedings{sterner2023tongueswitcher,
  author = {Igor Sterner and Simone Teufel},
  title = {TongueSwitcher: Fine-Grained Identification of German-English Code-Switching},
  booktitle = {Sixth Workshop on Computational Approaches to Linguistic Code-Switching},
  publisher = {Empirical Methods in Natural Language Processing},
  year = {2023},
}
```