---
language:
- as
- bn
- gu
- hi
- mr
- ne
- or
- pa
- si
- sa
- bpy
- mai
- bh
- gom
license: apache-2.0
datasets:
- oscar
tags:
- multilingual
- albert
- masked-language-modeling
- sentence-order-prediction
- fill-mask
- nlp
---

# XLMIndic Base Uniscript

An ALBERT model pretrained on the OSCAR corpus for the languages Assamese, Bengali, Bihari, Bishnupriya Manipuri, Goan Konkani, Gujarati, Hindi, Maithili, Marathi, Nepali, Oriya, Panjabi, Sanskrit and Sinhala. Like ALBERT, it was pretrained with masked language modeling (MLM) and sentence order prediction (SOP) objectives. Before pretraining, the text was transliterated to the ISO-15919 format using the Aksharamukha library. A demo of the Aksharamukha library is hosted [here](https://aksharamukha.appspot.com/converter), where you can transliterate your text and try it with this model in the inference widget.
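
Below is a minimal sketch of how input could be prepared for this model: transliterate native-script text to ISO-15919 and then run a fill-mask pipeline on it. It assumes the `aksharamukha` Python package's `transliterate.process` API and the Hugging Face `transformers` fill-mask pipeline; the model id `ibraheemmoosa/xlmindic-base-uniscript` is an assumed placeholder for this model's repository name.

```python
# pip install aksharamukha transformers
from aksharamukha import transliterate
from transformers import pipeline

# Step 1: transliterate native-script text (Bengali here) to ISO-15919,
# the script this model was pretrained on.
bengali_text = "আমি ভাত খাই।"  # "I eat rice."
iso_text = transliterate.process('Bengali', 'ISO', bengali_text)
print(iso_text)

# Step 2: run fill-mask on ISO-15919 text. The model id below is an assumed
# placeholder; replace it with this model's actual repository name.
unmasker = pipeline('fill-mask', model='ibraheemmoosa/xlmindic-base-uniscript')

# A manually masked ISO-15919 sentence; ALBERT tokenizers use [MASK] as the mask token.
print(unmasker("āmi [MASK] khāi."))
```

The same two-step flow (transliterate, then predict) applies at inference for any of the supported languages, since the model only ever sees ISO-15919 text.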