igorsterner committed on
Commit
cec0752
1 Parent(s): 4050ee0

Upload README.md

Files changed (1)
  1. README.md +49 -0
README.md ADDED
@@ -0,0 +1,49 @@
+ ---
+ language:
+ - multilingual
+ - en
+ - de
+ license: mit
+ widget:
+ - text: "ich glaub ich muss echt rewatchen like i so empty was soll ich denn jetzt machen"
+   example_title: "Example 1"
+ ---
+
+ # German-English Code-Switching Identification
+
+ This is the [TongueSwitcher BERT](https://huggingface.co/igorsterner/german-english-code-switching-bert) model fine-tuned for German-English code-switching identification. It was introduced in [this paper](). The model is case-sensitive.
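
A minimal usage sketch, assuming the fine-tuned checkpoint is published under the repository id `igorsterner/german-english-code-switching-identification` (this README does not state it) and that it loads as a standard token-classification model in the `transformers` library. The helper names `to_pairs` and `identify` are illustrative only, and the label set the model returns is not documented here.

```python
from typing import Dict, List, Tuple

# Assumption: repository id of the fine-tuned model; not stated in this README.
MODEL_ID = "igorsterner/german-english-code-switching-identification"


def to_pairs(predictions: List[Dict]) -> List[Tuple[str, str]]:
    """Reduce aggregated pipeline output dicts to (text span, label) pairs."""
    return [(p["word"], p["entity_group"]) for p in predictions]


def identify(text: str, model_id: str = MODEL_ID) -> List[Tuple[str, str]]:
    """Tag a sentence with language labels (downloads the model on first use)."""
    # Imported lazily so to_pairs() works without transformers installed.
    from transformers import pipeline

    tagger = pipeline(
        "token-classification",
        model=model_id,
        # "simple" merges word pieces and adjacent same-label tokens into spans.
        aggregation_strategy="simple",
    )
    # Note: the pipeline drops spans labelled "O" by default (ignore_labels=["O"]).
    return to_pairs(tagger(text))
```

Calling `identify("ich glaub ich muss echt rewatchen like i so empty was soll ich denn jetzt machen")` would tag the widget sentence from the metadata block. Since the model is case-sensitive, pass text with its original casing rather than lowercasing it first.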
+
+ ## Overview
+ - **Initialized language model:** german-english-code-switching-bert
+ - **Training data:** The Denglish Corpus
+ - **Infrastructure:** 1x Nvidia A100 GPU
+ - **Published:** 16 October 2023
+
+ ## Hyperparameters
+
+ ```
+ batch_size = 16
+ epochs = 3
+ n_steps = 789
+ max_seq_len = 512
+ learning_rate = 3e-5
+ weight_decay = 0.01
+ Adam beta = (0.9, 0.999)
+ lr_schedule = LinearWarmup
+ seed = 2021
+ ```
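
The step count above can be related to the other settings with a little arithmetic. The sketch below assumes that `n_steps` counts optimizer steps over all three epochs with no gradient accumulation, and that `LinearWarmup` means linear warmup followed by linear decay to zero; the README states neither. `warmup_steps` is a hypothetical parameter, since no warmup length is given.

```python
BATCH_SIZE = 16
EPOCHS = 3
N_STEPS = 789
PEAK_LR = 3e-5

# Under the assumptions above, each epoch takes 789 / 3 = 263 steps.
steps_per_epoch = N_STEPS // EPOCHS

# ceil(n / 16) == 263 holds exactly when 262 * 16 < n <= 263 * 16,
# which bounds the implied number of training examples.
min_examples = (steps_per_epoch - 1) * BATCH_SIZE + 1
max_examples = steps_per_epoch * BATCH_SIZE


def linear_warmup_lr(step: int, warmup_steps: int,
                     total_steps: int = N_STEPS,
                     peak_lr: float = PEAK_LR) -> float:
    """Learning rate at `step`: linear ramp up to peak_lr, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


print(steps_per_epoch, min_examples, max_examples)
```

With these settings the implied training set size falls between `min_examples` and `max_examples`; a different accumulation or step-counting convention would change that range.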
+
+ ## Authors
+ - Igor Sterner: `is473 [at] cam.ac.uk`
+ - Simone Teufel: `sht25 [at] cam.ac.uk`
+
+ ### BibTeX entry and citation info
+
+ ```bibtex
+ @inproceedings{sterner2023tongueswitcher,
+   author = {Igor Sterner and Simone Teufel},
+   title = {TongueSwitcher: Fine-Grained Identification of German-English Code-Switching},
+   booktitle = {Sixth Workshop on Computational Approaches to Linguistic Code-Switching},
+   year = {2023},
+ }
+ ```