Create README.md
README.md
This model has not been trained on any Cantonese material.

It is simply a base model whose tokenizer vocabulary and embeddings were patched to include Cantonese characters.

I used this repo to identify the missing Cantonese characters:
https://github.com/ayaka14732/bert-tokenizer-cantonese

My forked and modified version: https://github.com/jedcheng/bert-tokenizer-cantonese
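
The patching idea described above can be illustrated with a minimal, library-free sketch. It is not the code from either repository. The helper names `find_missing_chars` and `patch_vocab_and_embeddings`, the toy vocabulary, and the choice to initialize each new embedding row as the mean of the existing rows are all assumptions made for illustration; this README does not state how the new embeddings were initialized.

```python
from statistics import fmean


def find_missing_chars(vocab, chars):
    """Return the characters in `chars` that are not vocab entries, in first-seen order."""
    seen = set()
    missing = []
    for ch in chars:
        if ch not in vocab and ch not in seen:
            seen.add(ch)
            missing.append(ch)
    return missing


def patch_vocab_and_embeddings(vocab, embeddings, chars):
    """Append missing characters to the vocab and add one embedding row per new token.

    Assumptions (hypothetical, not taken from the README):
    - token ids are contiguous, so the next free id is len(vocab);
    - each new row is the column-wise mean of the existing rows.
    """
    missing = find_missing_chars(vocab, chars)
    mean_row = [fmean(col) for col in zip(*embeddings)]
    new_vocab = dict(vocab)  # copy so the input vocab is left untouched
    new_embeddings = [list(row) for row in embeddings]
    for ch in missing:
        new_vocab[ch] = len(new_vocab)
        new_embeddings.append(list(mean_row))
    return new_vocab, new_embeddings


# Toy data: a three-token vocabulary with 2-dimensional embeddings.
vocab = {"[UNK]": 0, "我": 1, "你": 2}
embeddings = [[0.0, 0.0], [1.0, 3.0], [2.0, 3.0]]

patched_vocab, patched_embeddings = patch_vocab_and_embeddings(vocab, embeddings, "我哋嘅")
print(patched_vocab)
print(len(patched_embeddings))
```

With the Hugging Face `transformers` library, the corresponding steps would typically be `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`. Whether this model was patched that way is not stated here.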