---
language: es
license: cc-by-4.0
tags:
- spanish
- roberta
pipeline_tag: fill-mask
widget:
- text: Fui a la librería a comprar un <mask>.
---
This is a RoBERTa-base model trained from scratch in Spanish.
The training dataset is mC4, randomly subsampled to a total of about 50 million documents. This checkpoint starts from the model trained with sequence length 128 and continues training for 25,000 steps at sequence length 512.
Please see our main card for more information.
This is part of the Flax/JAX Community Week, organised by Hugging Face, with TPU usage sponsored by Google.
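Since the card declares a `fill-mask` pipeline and a widget sentence, a minimal usage sketch with the `transformers` library might look like the following. The repo id passed to `pipeline` below is a placeholder, not this model's actual Hub id:

```python
def widget_example(mask_token: str = "<mask>") -> str:
    """Return the card's widget sentence with a RoBERTa-style mask token."""
    return f"Fui a la librería a comprar un {mask_token}."

if __name__ == "__main__":
    from transformers import pipeline

    # Placeholder repo id — substitute this model's actual Hub id.
    fill = pipeline("fill-mask", model="org/spanish-roberta-base")

    # Each prediction carries the filled-in token and its score.
    for pred in fill(widget_example()):
        print(pred["token_str"], round(pred["score"], 3))
```

The `fill-mask` pipeline resolves the model's own mask token internally, so the `<mask>` placeholder in the input must match the tokenizer's mask token (as it does for RoBERTa-style tokenizers).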