Update language list.
Committed by ibraheemmoosa
Commit 3c45a9a · 1 Parent(s): 33426be
README.md CHANGED
@@ -9,6 +9,11 @@ language:
 - or
 - pa
 - si
+- sa
+- bpy
+- mai
+- bh
+- gom
 license: apache-2.0
 datasets:
 - oscar
@@ -23,10 +28,11 @@ tags:
 
 # XLMIndic Base Uniscript
 
-Pretrained ALBERT model on the OSCAR corpus on the languages Assamese, Bengali,
-Nepali, Oriya, Panjabi and Sinhala.
-
-
+Pretrained ALBERT model on the OSCAR corpus on the languages Assamese, Bengali, Bihari, Bishnupriya Manipuri,
+Goan Konkani, Gujarati, Hindi, Maithili, Marathi, Nepali, Oriya, Panjabi, Sanskrit and Sinhala.
+Like ALBERT it was pretrained using a masked language modeling (MLM) and a sentence order prediction (SOP)
+objective. This model was pretrained after transliterating the text to ISO-15919 format using the Aksharamukha
+library. A demo of Aksharamukha library is hosted [here](https://aksharamukha.appspot.com/converter)
 where you can transliterate your text and use it on our model on the inference widget.
 
 
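Because the updated README says the model was pretrained on text transliterated to ISO-15919, input text presumably needs the same preprocessing before it reaches the model. The following is a minimal sketch of that step, not the author's code. It assumes the `aksharamukha` Python package is installed and that its `transliterate.process(source, target, text)` call accepts `"ISO"` as the target name for ISO 15919. The helper name `to_iso15919` and its `transliterate_fn` parameter are hypothetical, introduced only for illustration.

```python
from typing import Callable, Optional


def to_iso15919(
    text: str,
    source_script: str,
    transliterate_fn: Optional[Callable[[str, str, str], str]] = None,
) -> str:
    """Transliterate `text` from `source_script` to ISO 15919.

    `to_iso15919` is a hypothetical helper. `transliterate_fn` lets a caller
    swap in another transliterator that takes (source, target, text).
    """
    if transliterate_fn is None:
        # Assumption: the third-party `aksharamukha` package is installed.
        # It is imported lazily so this module loads without it.
        from aksharamukha import transliterate

        transliterate_fn = transliterate.process
    # Assumption: "ISO" is Aksharamukha's target name for ISO 15919.
    return transliterate_fn(source_script, "ISO", text)
```

A call such as `to_iso15919(bengali_text, "Bengali")` would then yield the string to pass to the model's tokenizer, assuming `"Bengali"` is the source script name Aksharamukha expects for that text.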