Update README.md
README.md
CHANGED
@@ -35,9 +35,10 @@ After de-duplicating the data, we were left with a total of 54.5 GB of Bulgarian
 
 # Benchmark performance
 
-We tested
 
-Scores are averages of three runs, except for COPA, for which we use 10 runs. We use the same hyperparameter settings for all models.
+We tested the performance of BERTovski on XPOS, UPOS and NER benchmarks. For Bulgarian, we used the data from the [Universal Dependencies](https://universaldependencies.org/) project. For Macedonian, we used the data sets created in the [babushka-bench](https://github.com/clarinsi/babushka-bench/) project. We also tested on a Google-translated (Bulgarian) and a human-translated (Macedonian) version of the COPA data set (for details, see our [GitHub repo](https://github.com/RikVN/COPA)). We compare performance to the strong multilingual models XLMR-base and XLMR-large. For details regarding the fine-tuning procedure, you can check out our [GitHub](https://github.com/macocu/LanguageModels).
+
+Scores are averages of three runs, except for COPA, for which we use 10 runs. We use the same hyperparameter settings for all models on UPOS/XPOS/NER; for COPA, we optimized the learning rate on the dev set.
 
 
 ## Bulgarian
 