
This is a smaller version of the google/mt5-base model, with only the Russian and some English embeddings retained.

  • The original model has 582M parameters, 384M of which are input and output embeddings.
  • After shrinking the sentencepiece vocabulary from 250K to 30K tokens (the top 10K English and top 20K Russian tokens), the parameter count dropped to 244M, and the model size shrank from 2.2GB to 0.9GB (42% of the original).
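The parameter counts in the bullets above can be checked with simple arithmetic: the non-embedding parameters are untouched by vocabulary shrinking, while the embedding parameters scale linearly with vocabulary size. A minimal sketch:

```python
# Figures taken from the bullet list above (in millions of parameters).
total_params = 582        # original mt5-base parameter count
embedding_params = 384    # input + output embedding parameters
old_vocab, new_vocab = 250_000, 30_000

# The transformer body is unchanged by shrinking the vocabulary.
body_params = total_params - embedding_params            # 198M

# Embedding parameters shrink in proportion to the vocabulary.
new_embedding_params = embedding_params * new_vocab / old_vocab  # ~46M

new_total = body_params + new_embedding_params
print(round(new_total))  # ~244, matching the stated 244M
```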

The creation of this model is described, along with the source code, in the post "How to adapt a multilingual T5 model for a single language".
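At its core, the shrinking procedure amounts to ranking tokens by corpus frequency, keeping the most frequent ids, and slicing the embedding matrices to those rows (the kept ids must also be used to rebuild the sentencepiece vocabulary). A minimal numpy sketch with toy sizes; the matrix, counts, and the cutoff here are invented for illustration, not taken from the actual model:

```python
import numpy as np

# Toy stand-ins: a "full" embedding matrix and per-token usage counts.
# In the real model these come from mt5-base and a Russian/English corpus.
vocab_size, dim = 1000, 8
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(vocab_size, dim))
token_counts = rng.integers(0, 10_000, size=vocab_size)

# Keep the most frequent tokens (the model card keeps the top 20K Russian
# and top 10K English tokens; here we keep 300 of 1000 for illustration).
keep = 300
kept_ids = np.sort(np.argsort(token_counts)[::-1][:keep])

# Slicing the embedding rows yields the shrunk matrix; token id i in the
# new model corresponds to kept_ids[i] in the original vocabulary.
small_embeddings = embeddings[kept_ids]
assert small_embeddings.shape == (keep, dim)
```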

