UCSYNLP/MyanBERTa

 ---
+language: my
+tags:
+- MyanBERTa
+license: apache-2.0
+datasets:
+- MyCorpus
+- blogs and websites
 ---
+## Model description
+This model is a BERT-based pre-trained language model for Myanmar.
+MyanBERTa was pre-trained for 528K steps on a word-segmented Myanmar dataset of 5,992,299 sentences (136M words).
+The tokenizer is a byte-level BPE tokenizer with a vocabulary of 30,522 subword units, learned after word segmentation was applied.
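
A minimal sketch of loading the model with the Hugging Face `transformers` library follows. It makes several assumptions that the card does not confirm: that the checkpoint is published on the Hub under the repository name shown above, that it loads through the `Auto*` classes with a masked-language-modeling head, and that PyTorch is installed. The helper names `load_myanberta` and `encode` are hypothetical and used only for illustration. Because the tokenizer was learned after word segmentation, the input is presumably expected to be word-segmented in the same way.

```python
MODEL_ID = "UCSYNLP/MyanBERTa"  # repository name as shown on this card


def load_myanberta(model_id: str = MODEL_ID):
    """Return (tokenizer, model) for the given Hub repository.

    Assumption: the checkpoint is compatible with the Auto* classes and
    ships a masked-LM head; the card itself does not state this.
    """
    # Imported lazily so this module can be imported without transformers.
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForMaskedLM.from_pretrained(model_id)
    return tokenizer, model


def encode(tokenizer, segmented_sentence: str):
    """Tokenize one sentence that has already been word-segmented.

    Assumption: words are separated by spaces, mirroring the
    word-segmented pre-training data described above.
    """
    return tokenizer(segmented_sentence, return_tensors="pt")
```

Calling `tokenizer, model = load_myanberta()` downloads the files on first use, after which `model(**encode(tokenizer, sentence))` runs a forward pass on a segmented sentence. If the assumptions hold, `len(tokenizer)` should be consistent with the vocabulary size stated above.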