roberta-large-wechsel-ukrainian
roberta-large transferred to Ukrainian using the method from the NAACL 2022 paper "WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models".
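As a rough usage sketch, the checkpoint can presumably be queried as a masked language model through the `transformers` fill-mask pipeline. The Hub id in `MODEL_ID` is an assumption (the card only gives the model name), and the helper names `build_fill_mask`, `top_predictions`, and `main` are hypothetical, introduced here for illustration.

```python
# Assumed Hub id; the card itself only states the model name.
MODEL_ID = "benjamin/roberta-large-wechsel-ukrainian"


def build_fill_mask(model_id: str = MODEL_ID):
    """Create a fill-mask pipeline (downloads the checkpoint on first use)."""
    # Imported lazily so the module can be imported without transformers installed.
    from transformers import pipeline

    return pipeline("fill-mask", model=model_id)


def top_predictions(fill_mask, text: str, k: int = 5):
    """Return (token, score) pairs for the single masked position in `text`."""
    return [(p["token_str"], p["score"]) for p in fill_mask(text, top_k=k)]


def main():
    fill_mask = build_fill_mask()
    # Use the tokenizer's own mask token instead of hard-coding it.
    sentence = f"Київ є столицею {fill_mask.tokenizer.mask_token}."
    for token, score in top_predictions(fill_mask, sentence):
        print(f"{token!r}: {score:.3f}")
```

Calling `main()` requires `transformers` with a backend such as PyTorch and network access to fetch the weights.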
Evaluation
Evaluation was done on lang-uk's ner-uk project, the Ukrainian portion of WikiANN and the Ukrainian IU corpus from the Universal Dependencies project. Evaluation results are the mean of 5 runs with different seeds.
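The aggregation over seeds can be sketched as follows. The card does not say what the values in parentheses in the tables are; reading them as a standard deviation across the 5 runs is an assumption, as is the choice of the sample (n-1) estimator. The per-seed scores below are invented purely for illustration and are not results from this model.

```python
from statistics import mean, stdev


def summarize_runs(scores):
    """Return (mean, sample standard deviation) of per-seed scores."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    return mean(scores), stdev(scores)


# Hypothetical per-seed Micro F1 values (illustration only, not real results).
example_scores = [90.0, 91.0, 92.0, 91.0, 91.0]
avg, spread = summarize_runs(example_scores)
print(f"{avg:.2f} ({spread:.2f})")  # same "mean (spread)" layout as the tables
```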
Validation Results
Model | lang-uk NER (Micro F1) | WikiANN (Micro F1) | UD Ukrainian IU POS (Accuracy) |
---|---|---|---|
roberta-base-wechsel-ukrainian | 88.06 (0.50) | 92.96 (0.08) | 98.70 (0.05) |
roberta-large-wechsel-ukrainian | 89.27 (0.53) | 93.22 (0.15) | 98.86 (0.03) |
roberta-base-scratch-ukrainian* | 85.49 (0.88) | 91.91 (0.08) | 98.49 (0.04) |
roberta-large-scratch-ukrainian* | 86.54 (0.70) | 92.39 (0.16) | 98.65 (0.09) |
dbmdz/electra-base-ukrainian-cased-discriminator | 87.49 (0.52) | 93.20 (0.16) | 98.60 (0.03) |
xlm-roberta-base | 86.68 (0.44) | 92.41 (0.13) | 98.53 (0.02) |
xlm-roberta-large | 86.64 (1.61) | 93.01 (0.13) | 98.71 (0.04) |
Test Results
Model | lang-uk NER (Micro F1) | WikiANN (Micro F1) | UD Ukrainian IU POS (Accuracy) |
---|---|---|---|
roberta-base-wechsel-ukrainian | 90.81 (1.51) | 92.98 (0.12) | 98.57 (0.03) |
roberta-large-wechsel-ukrainian | 91.24 (1.16) | 93.22 (0.17) | 98.74 (0.06) |
roberta-base-scratch-ukrainian* | 89.57 (1.01) | 92.05 (0.09) | 98.31 (0.08) |
roberta-large-scratch-ukrainian* | 89.96 (0.89) | 92.49 (0.15) | 98.52 (0.04) |
dbmdz/electra-base-ukrainian-cased-discriminator | 90.43 (1.29) | 92.99 (0.11) | 98.59 (0.06) |
xlm-roberta-base | 90.86 (0.81) | 92.27 (0.09) | 98.45 (0.07) |
xlm-roberta-large | 90.16 (2.98) | 92.92 (0.19) | 98.71 (0.04) |
*Trained with exactly the same setup as the wechsel-* models, but without parameter transfer from WECHSEL.
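Since the scratch models differ from the wechsel-* models only in the missing parameter transfer, the gap between their test means gives a rough view of what the transfer contributes. The snippet below is a small sketch that copies the test-set means from the table above; the names `TEST_MEANS`, `TASKS`, and `transfer_gain` are hypothetical, and the comparison looks at means only, ignoring run-to-run variation.

```python
# Test-set means copied from the table above, in column order.
TASKS = ("lang-uk NER", "WikiANN", "UD Ukrainian IU POS")
TEST_MEANS = {
    "roberta-base-wechsel-ukrainian": (90.81, 92.98, 98.57),
    "roberta-large-wechsel-ukrainian": (91.24, 93.22, 98.74),
    "roberta-base-scratch-ukrainian": (89.57, 92.05, 98.31),
    "roberta-large-scratch-ukrainian": (89.96, 92.49, 98.52),
}


def transfer_gain(size):
    """Difference in test means (wechsel minus scratch) for 'base' or 'large'."""
    wechsel = TEST_MEANS[f"roberta-{size}-wechsel-ukrainian"]
    scratch = TEST_MEANS[f"roberta-{size}-scratch-ukrainian"]
    return {task: round(w - s, 2) for task, w, s in zip(TASKS, wechsel, scratch)}


for size in ("base", "large"):
    print(size, transfer_gain(size))
```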
License
MIT