---
language: ja
license: apache-2.0
---

# AcademicRoBERTa

## Model description

AcademicRoBERTa is a Japanese masked language model for the academic domain. We pretrained this RoBERTa-based model on paper abstracts from CiNii Articles, a Japanese academic database.
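
For reference, below is a minimal fill-mask sketch using the Hugging Face `transformers` library. The repository ID `EhimeNLP/AcademicRoBERTa` is assumed from this model card's location, the example sentence is illustrative, and the sketch assumes the model and tokenizer load through the standard pipeline.

```python
# Minimal fill-mask sketch; the repository ID is assumed from this model card.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="EhimeNLP/AcademicRoBERTa")

# Predict the masked token in an illustrative academic-style Japanese sentence.
text = f"深層学習を用いた{fill_mask.tokenizer.mask_token}の研究"
for prediction in fill_mask(text, top_k=5):
    print(prediction["token_str"], round(prediction["score"], 3))
```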

## Vocabulary

The vocabulary consists of 32,000 tokens, including subwords induced by the unigram language model of SentencePiece.
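
The exact training command is not given in this card; the sketch below shows how such a vocabulary could be induced with the `sentencepiece` Python package. The input file and model prefix are hypothetical placeholders, and only the options stated above (unigram model, 32,000 tokens) are taken from the card.

```python
# Hedged sketch of inducing a 32,000-token unigram vocabulary with SentencePiece.
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="abstracts.txt",       # hypothetical corpus file: one abstract per line
    model_prefix="academic_sp",  # hypothetical output prefix
    vocab_size=32000,            # vocabulary size stated above
    model_type="unigram",        # unigram language model, as stated above
)

# Load the trained model and tokenize a sample sentence into subwords.
sp = spm.SentencePieceProcessor(model_file="academic_sp.model")
print(sp.encode("深層学習による自然言語処理", out_type=str))
```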

