matejulcar committed
Commit: 3d7939b
Parent(s): 1548353

Create README.md

Files changed (1):
  1. README.md +27 -0
README.md ADDED
@@ -0,0 +1,27 @@
+ ---
+ language:
+ - hr
+ - sl
+ - en
+ - multilingual
+
+ license: cc-by-4.0
+ ---
+ # CroSloEngual BERT
+ CroSloEngual BERT is a trilingual model, based on the bert-base architecture, trained on Croatian, Slovenian, and English corpora. By focusing on only three languages, the model performs better than [multilingual BERT](https://huggingface.co/bert-base-multilingual-cased), while still offering an option for cross-lingual knowledge transfer, which a monolingual model would not.
+
+ Evaluation is presented in our article:
+ ```
+ @InProceedings{ulcar-robnik2020finest,
+   author = "Ulčar, M. and Robnik-Šikonja, M.",
+   year = 2020,
+   title = "{FinEst BERT} and {CroSloEngual BERT}: less is more in multilingual models",
+   editor = "Sojka, P. and Kopeček, I. and Pala, K. and Horák, A.",
+   booktitle = "Text, Speech, and Dialogue {TSD 2020}",
+   series = "Lecture Notes in Computer Science",
+   volume = 12284,
+   publisher = "Springer",
+   url = "https://doi.org/10.1007/978-3-030-58323-1_11",
+ }
+ ```
+ The preprint is available at [arxiv.org/abs/2006.07890](https://arxiv.org/abs/2006.07890).
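
For reference, a minimal usage sketch with the Hugging Face `transformers` library is shown below. The model identifier `EMBEDDIA/crosloengual-bert` is an assumption not stated in this commit; substitute the repository name the README is actually published under.

```
# Minimal sketch: load CroSloEngual BERT and predict a masked token.
# "EMBEDDIA/crosloengual-bert" is an assumed repository id; adjust if different.
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

model_id = "EMBEDDIA/crosloengual-bert"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Fill-mask works in any of the three training languages;
# the example sentence is Slovenian: "Ljubljana is the capital of [MASK]."
fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
for prediction in fill_mask(f"Ljubljana je glavno mesto {tokenizer.mask_token}."):
    print(prediction["token_str"], round(prediction["score"], 3))
```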