---
language: 
- multilingual
- en
- de
license: mit
widget:
- text: "ich glaub ich muss echt rewatchen like i so empty was soll ich denn jetzt machen"
  example_title: "Example 1"
- text: "Ich hab das selbst gedownloadet I have the receipts"
  example_title: "Example 2"
- text: "Ich dachte jz mit dem Date wäre der andere raus I know overthinken ist dein Problem"
  example_title: "Example 3"
---

# German-English Code-Switching Identification

The [TongueSwitcher BERT](https://huggingface.co/igorsterner/german-english-code-switching-bert) model, fine-tuned for German-English code-switching identification. It was introduced in [this paper](https://openreview.net/forum?id=heYrTpKRny). This model is case-sensitive.
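
A minimal usage sketch, assuming the model is served as a token-classification model through the `transformers` pipeline API. The repository id `igorsterner/german-english-code-switching-identification` is an assumption (it is not stated in this card), and `merge_wordpieces` is a hypothetical helper that joins BERT WordPiece fragments (`##...`) back into whole words, keeping the label predicted for the first fragment. The label names returned depend on the model's own configuration and are not listed here.

```python
from typing import Dict, List, Tuple

# Assumed repository id; replace with the id of the model you actually use.
MODEL_ID = "igorsterner/german-english-code-switching-identification"


def merge_wordpieces(predictions: List[Dict]) -> List[Tuple[str, str]]:
    """Join '##' WordPiece continuations onto the preceding word.

    Each prediction is a dict with at least the keys 'word' and 'entity',
    as returned by a token-classification pipeline without aggregation.
    The label of the first piece is kept for the whole word.
    """
    words: List[Tuple[str, str]] = []
    for pred in predictions:
        piece, label = pred["word"], pred["entity"]
        if piece.startswith("##") and words:
            previous_word, previous_label = words[-1]
            words[-1] = (previous_word + piece[2:], previous_label)
        else:
            words.append((piece, label))
    return words


def tag_text(text: str) -> List[Tuple[str, str]]:
    """Download the model (network access required) and label each word."""
    # Imported lazily so the helper above can be used without transformers.
    from transformers import pipeline

    tagger = pipeline("token-classification", model=MODEL_ID)
    return merge_wordpieces(tagger(text))
```

Calling `tag_text("Ich hab das selbst gedownloadet I have the receipts")` would return a list of `(word, label)` pairs. Because the model is case-sensitive, pass text with its original casing.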

## Overview
- **Initialized language model:** german-english-code-switching-bert
- **Training data:** The Denglish Corpus
- **Infrastructure:** 1x Nvidia A100 GPU
- **Published:** 16 October 2023

## Hyperparameters

```
batch_size = 16
epochs = 3
n_steps = 789
max_seq_len = 512
learning_rate = 3e-5
weight_decay = 0.01
seed = 2021
```
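
As a rough illustration of how these values relate, the sketch below stores them in a small config object and derives the number of steps per epoch. It assumes that `n_steps` counts the total optimizer steps over all epochs, that no gradient accumulation was used, and that the final partial batch of each epoch was kept. None of these assumptions is stated in this card, so the derived training-set size is an estimate only. `FinetuneConfig` and both helper functions are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FinetuneConfig:
    """Hyperparameters copied from the list above."""
    batch_size: int = 16
    epochs: int = 3
    n_steps: int = 789
    max_seq_len: int = 512
    learning_rate: float = 3e-5
    weight_decay: float = 0.01
    seed: int = 2021


def steps_per_epoch(cfg: FinetuneConfig) -> int:
    """Steps in one epoch, assuming n_steps is the total over all epochs."""
    if cfg.n_steps % cfg.epochs != 0:
        raise ValueError("n_steps is not a whole multiple of epochs")
    return cfg.n_steps // cfg.epochs


def implied_train_size_range(cfg: FinetuneConfig) -> Tuple[int, int]:
    """Smallest and largest example counts consistent with the step count.

    Assumes one optimizer step per batch and that the last, possibly
    smaller, batch of each epoch is not dropped.
    """
    per_epoch = steps_per_epoch(cfg)
    smallest = (per_epoch - 1) * cfg.batch_size + 1
    largest = per_epoch * cfg.batch_size
    return smallest, largest


cfg = FinetuneConfig()
print(steps_per_epoch(cfg))           # 263
print(implied_train_size_range(cfg))  # (4193, 4208)
```

Under those assumptions, 789 steps over 3 epochs is 263 steps per epoch, which at a batch size of 16 corresponds to roughly 4,200 training examples.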

## Authors
- Igor Sterner: `is473 [at] cam.ac.uk`
- Simone Teufel: `sht25 [at] cam.ac.uk`

### BibTeX entry and citation info

```bibtex
@inproceedings{sterner2023tongueswitcher,
  author    = {Igor Sterner and Simone Teufel},
  title     = {TongueSwitcher: Fine-Grained Identification of German-English Code-Switching},
  booktitle = {Sixth Workshop on Computational Approaches to Linguistic Code-Switching},
  publisher = {Empirical Methods in Natural Language Processing},
  year      = {2023},
}
```