UCSYNLP committed on
Commit f3fa478 · 1 Parent(s): c6fa656

Update README.md

Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -1,3 +1,16 @@
  ---
- license: mit
+ language: my
+ tags:
+ - MyanBERTa
+ license: apache-2.0
+ datasets:
+ - MyCorpus
+ - blogs and websites
  ---
+
+ ## Model description
+
+ This model is a BERT-based pre-trained language model for Myanmar.
+ MyanBERTa has been pre-trained for 528K steps on a word-segmented Myanmar dataset consisting of 5,992,299 sentences (136M words).
+ The tokenizer is a byte-level BPE tokenizer with 30,522 subword units, learned after word segmentation is applied.
+
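
The byte-level BPE tokenizer mentioned in the model description can be made more concrete with a small sketch. It shows only the byte-level starting point that byte-level BPE generally builds on, not MyanBERTa's actual tokenizer or its learned merges. The sample string and all variable names are illustrative assumptions; the only figure taken from the README is the vocabulary size of 30,522.

```python
# Illustrative only: the byte-level starting point of byte-level BPE,
# not MyanBERTa's actual tokenizer or its learned merges.

# Hypothetical sample: the word "Myanmar" in Myanmar script, written with
# escapes so the six code points are explicit.
sample = "\u1019\u103c\u1014\u103a\u1019\u102c"

# Byte-level BPE first maps text to UTF-8 bytes; merges are learned on top.
byte_ids = list(sample.encode("utf-8"))

num_code_points = len(sample)
num_bytes = len(byte_ids)

# Any input decomposes into the 256 possible byte values, so the base
# alphabet can represent every string without an unknown symbol.
BASE_ALPHABET_SIZE = 256

# Vocabulary size stated in the README. Assumption: as in typical
# byte-level BPE vocabularies, the remaining entries hold learned merges
# and special tokens.
VOCAB_SIZE = 30_522
non_base_entries = VOCAB_SIZE - BASE_ALPHABET_SIZE

print(num_code_points, num_bytes, non_base_entries)
```

Each Myanmar code point in this sample takes three bytes in UTF-8, so most of the vocabulary beyond the 256 base bytes is presumably spent on merges that recombine those bytes into characters, syllables, and word pieces.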