metadata
language:
  - multilingual
  - en
  - de
license: mit
widget:
  - text: >-
      ich glaub ich muss echt rewatchen like i so empty was soll ich denn jetzt
      machen
    example_title: Example 1
  - text: Ich hab das selbst gedownloadet I have the receipts
    example_title: Example 2
  - text: >-
      Ich dachte jz mit dem Date wäre der andere raus I know overthinken ist
      dein Problem
    example_title: Example 3
German-English Code-Switching Identification
This is the TongueSwitcher BERT model, fine-tuned for German-English code-switching identification. It was introduced in the paper cited in the BibTeX entry below. The model is case-sensitive.
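A minimal usage sketch, assuming the model is loaded through the Hugging Face `transformers` token-classification pipeline. The repository ID is a placeholder and is not stated in this card, so substitute the real one. `load_tagger` and `label_spans` are illustrative helpers, not part of any library, and the character offsets they rely on are only populated when the tokenizer is a fast tokenizer.

```python
# Placeholder repository ID (an assumption): replace with the model's actual Hub ID.
MODEL_ID = "your-namespace/german-english-code-switching-identification"


def load_tagger(model_id=MODEL_ID):
    """Build a token-classification pipeline for the given model.

    Requires `transformers` plus a backend such as PyTorch. The import is
    done lazily so that `label_spans` can be used without either installed.
    """
    from transformers import pipeline

    # "simple" aggregation merges adjacent tokens that share a label.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def label_spans(text, predictions):
    """Turn pipeline output into (label, substring) pairs using character offsets."""
    return [
        (pred["entity_group"], text[pred["start"]:pred["end"]])
        for pred in predictions
    ]


# Typical use (downloads the model, so it is left commented out):
# tagger = load_tagger()
# text = "Ich hab das selbst gedownloadet I have the receipts"
# print(label_spans(text, tagger(text)))
```

The label names returned depend on the model's configuration, so inspect `tagger.model.config.id2label` before interpreting the output.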
Overview
- Initialized language model: german-english-code-switching-bert
- Training data: The Denglish Corpus
- Infrastructure: 1x Nvidia A100 GPU
- Published: 16 October 2023
Hyperparameters
batch_size = 16
epochs = 3
n_steps = 789
max_seq_len = 512
learning_rate = 3e-5
weight_decay = 0.01
seed = 2021
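As a rough sketch, the values above can be collected in a single configuration object. The card does not say which training loop was used; the keyword names in `trainer_kwargs` follow `transformers.TrainingArguments` purely as an assumption, and `FineTuneConfig` is a hypothetical name. `sequences_processed` further assumes that `n_steps` counts optimizer steps on one device without gradient accumulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as listed in this card (the class itself is illustrative)."""

    batch_size: int = 16
    epochs: int = 3
    n_steps: int = 789
    max_seq_len: int = 512
    learning_rate: float = 3e-5
    weight_decay: float = 0.01
    seed: int = 2021

    def trainer_kwargs(self) -> dict:
        """Map the values onto TrainingArguments-style keyword names (assumed mapping)."""
        return {
            "per_device_train_batch_size": self.batch_size,
            "num_train_epochs": self.epochs,
            "learning_rate": self.learning_rate,
            "weight_decay": self.weight_decay,
            "seed": self.seed,
        }

    def sequences_processed(self) -> int:
        """Sequences seen in total, assuming one batch per step and no accumulation."""
        return self.n_steps * self.batch_size


config = FineTuneConfig()
print(config.trainer_kwargs())
print(config.sequences_processed())
```

`max_seq_len` is left out of `trainer_kwargs` because sequence length is normally applied at tokenization time rather than passed to the training arguments.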
Authors
- Igor Sterner: is473 [at] cam.ac.uk
- Simone Teufel: sht25 [at] cam.ac.uk
BibTeX entry and citation info
@inproceedings{sterner2023tongueswitcher,
author = {Igor Sterner and Simone Teufel},
title = {TongueSwitcher: Fine-Grained Identification of German-English Code-Switching},
booktitle = {Sixth Workshop on Computational Approaches to Linguistic Code-Switching},
publisher = {Empirical Methods in Natural Language Processing},
year = {2023},
}