Usage

Load the model with the transformers library:

from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("EMBEDDIA/est-roberta")
model = AutoModelForMaskedLM.from_pretrained("EMBEDDIA/est-roberta")
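
Since the model is loaded as a masked language model, one way to try it out is to predict a hidden word. The following is a minimal sketch, not part of the original model card: the helper name `fill_mask`, the `[MASK]` placeholder, and the example sentence are illustrative assumptions, and it assumes the input contains exactly one placeholder. It relies on the `fill-mask` pipeline from transformers.

```python
def fill_mask(text, model_name="EMBEDDIA/est-roberta", top_k=5):
    """Return the top_k (token, score) candidates for the masked word in text.

    `text` is assumed to contain exactly one "[MASK]" placeholder
    (a convention of this sketch, not of the model).
    """
    # Imported here so the heavy dependency is loaded only when needed.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model=model_name, top_k=top_k)
    # Swap the placeholder for the tokenizer's own mask token
    # instead of hard-coding it.
    masked = text.replace("[MASK]", unmasker.tokenizer.mask_token)
    return [(p["token_str"], p["score"]) for p in unmasker(masked)]


# Example call (downloads the model on first use):
# for token, score in fill_mask("Tallinn on Eesti [MASK]."):
#     print(f"{token}\t{score:.3f}")
```

The returned scores are the model's probabilities for each candidate token, so they can be used to rank the suggestions.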

Est-RoBERTa

The Est-RoBERTa model is a monolingual Estonian BERT-like model. It is closely related to the French CamemBERT model (https://camembert-model.fr/). The Estonian corpora used to train the model contain 2.51 billion tokens in total. The subword vocabulary contains 40,000 tokens.

Est-RoBERTa was trained for 40 epochs.
