MaximumEntropy committed
Commit f87bf66
1 Parent(s): 4727dc7
Update README.md

README.md CHANGED
@@ -53,6 +53,8 @@ NeMo Megatron-mT5 3B is a *multilingual* transformer-based masked language model
 
 This model was trained with [NeMo Megatron](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/nemo_megatron/intro.html).
 
+**NOTE**: Weights are distributed in bfloat16.
+
 ## List of Languages
 
 We pre-trained our mT5 model on the following languages from the [mC4](https://github.com/allenai/allennlp/discussions/5265) dataset.
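Since the commit notes that the weights are distributed in bfloat16, downstream users on hardware or code paths without bfloat16 support may need to upcast after loading. A minimal sketch with PyTorch, assuming an illustrative state-dict layout (the key name and shape below are placeholders, not taken from the actual checkpoint):

```python
import torch

# Stand-in for `state = torch.load(checkpoint_path)`; the key name is hypothetical.
state = {"encoder.embedding.weight": torch.zeros(4, 8, dtype=torch.bfloat16)}

# Upcast every bfloat16 tensor to float32 before use on hardware
# that lacks native bfloat16 support.
fp32_state = {k: v.float() for k, v in state.items()}
```

This keeps the distributed checkpoint small (bfloat16 is half the size of float32) while letting each consumer choose the working precision.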