---
language:
- en
- hi
- multilingual
tags:
- generated_from_trainer
license: cc-by-sa-4.0
---



# muril-en-hi-codemixed

muril-en-hi-codemixed is a masked language model based on the [MuRIL](https://huggingface.co/google/muril-base-cased) multilingual model.

muril-en-hi-codemixed replaces the tokenizer, vocabulary, and embedding layer of the MuRIL model.
The tokenizer and vocabulary used are the same as in the [roberta-en-hi-codemixed](https://huggingface.co/cjvt/roberta-en-hi-codemixed) model.
The new embedding weights were initialized from the MuRIL embeddings.
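
This card does not say how the new embeddings were derived from the MuRIL ones. The sketch below shows one common strategy, which is an assumption and not necessarily what was done here: tokens present in both vocabularies keep their old vector, and all other tokens start from the mean of the old vectors. The function name `init_embeddings` and the toy vocabularies are hypothetical.

```python
def init_embeddings(new_vocab, old_vocab, old_embeddings):
    """Build an embedding table for new_vocab from an existing table.

    new_vocab, old_vocab: dicts mapping token string -> row index.
    old_embeddings: list of equal-length float lists, indexed by old_vocab ids.
    This is an illustrative strategy only, not the documented procedure.
    """
    dim = len(old_embeddings[0])
    count = len(old_embeddings)
    # Mean of all old vectors, used as a fallback for unseen tokens.
    mean = [sum(row[j] for row in old_embeddings) / count for j in range(dim)]

    new_embeddings = [None] * len(new_vocab)
    for token, new_id in new_vocab.items():
        old_id = old_vocab.get(token)
        if old_id is not None:
            # Shared token: copy its vector from the old table.
            new_embeddings[new_id] = list(old_embeddings[old_id])
        else:
            # New token: start from the mean vector.
            new_embeddings[new_id] = list(mean)
    return new_embeddings


# Toy example with a 2-dimensional embedding space.
old_vocab = {"a": 0, "b": 1}
old_table = [[1.0, 2.0], [3.0, 4.0]]
new_vocab = {"b": 0, "c": 1}
new_table = init_embeddings(new_vocab, old_vocab, old_table)
```

In the toy example, `"b"` keeps its old vector and `"c"`, which has no counterpart in the old vocabulary, receives the mean of the old vectors.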

The resulting model was then further pre-trained for two epochs on the same code-mixed English and Hindi corpora as the [roberta-en-hi-codemixed](https://huggingface.co/cjvt/roberta-en-hi-codemixed) model.
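
Since this is a masked language model, it can presumably be queried through the `fill-mask` pipeline of the `transformers` library. The following is a minimal sketch under two assumptions: that the model is published under the repository id `cjvt/muril-en-hi-codemixed` (inferred from the related roberta-en-hi-codemixed link, not stated in this card), and that `transformers` with a PyTorch backend is installed. The helper name `top_predictions` and the `{mask}` placeholder are hypothetical conveniences.

```python
# Assumed repository id; adjust if the model is hosted under another name.
MODEL_ID = "cjvt/muril-en-hi-codemixed"


def top_predictions(template, model_id=MODEL_ID, k=5):
    """Return the k most likely fillers as (token, score) pairs.

    template: a sentence containing the literal placeholder "{mask}",
    which is replaced by the tokenizer's own mask token.
    """
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=model_id)
    # Read the mask token from the tokenizer instead of hard-coding it,
    # because this model does not use MuRIL's original tokenizer.
    text = template.format(mask=fill_mask.tokenizer.mask_token)
    results = fill_mask(text, top_k=k)
    return [(r["token_str"], r["score"]) for r in results]
```

A call such as `top_predictions("yeh movie bahut {mask} thi")` would download the model on first use and return candidate tokens for the masked position in a code-mixed sentence.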