---
license: apache-2.0
language:
- 'no'
- en
widget:
- text: >-
    Mr and Mrs Dursley, of number <extra_id_0>, Privet Drive, were <extra_id_1> to say that they were perfectly normal,
    thank you <extra_id_2> much. They were the last people you’d expect to be involved in anything <extra_id_3>,
    because they just didn’t <extra_id_4>.
- text: >-
    <extra_id_0> hver uke samles Regjeringens medlemmer til Statsråd på
    <extra_id_1>. Dette organet er øverste <extra_id_2> i Norge. For at møtet
    skal være <extra_id_3>, må over halvparten av regjeringens <extra_id_4>
    være til stede.
---

This is a pruned version of the `google/mt5-large` model. The input and output embeddings are pruned to support a greatly reduced vocabulary.
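
As a rough illustration of what pruning an embedding matrix to a smaller vocabulary involves, here is a minimal, dependency-free sketch. It is not the procedure used to build this model: the helper `prune_embeddings`, the toy matrix, and the kept token ids are all hypothetical and chosen only to show the idea of keeping selected rows and remapping token ids.

```python
def prune_embeddings(embedding, kept_ids):
    """Keep only the rows of `embedding` whose token ids are in `kept_ids`.

    Returns the smaller matrix and a mapping from old token id to new token id.
    Hypothetical helper for illustration; not the method used for this model.
    """
    kept = sorted(set(kept_ids))
    pruned = [list(embedding[i]) for i in kept]
    old_to_new = {old: new for new, old in enumerate(kept)}
    return pruned, old_to_new


# Toy embedding matrix: 6 tokens, 2 dimensions each (made-up values).
full = [[float(i), float(i) * 10] for i in range(6)]

# Pretend only tokens 0, 1 and 4 occur in the target languages.
kept_ids = [0, 1, 4]

small, old_to_new = prune_embeddings(full, kept_ids)
print(len(full), "->", len(small), "rows")
print(old_to_new)
```

In a real model the same row selection would have to be applied to both the input embedding and the output projection, and the tokenizer would have to be rebuilt so that it emits the new ids.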

The chosen vocabulary has 30K Norwegian, English and special tokens, ~12% of the original size. This reduces the model size by roughly 37%.
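
The two percentages can be sanity-checked with a back-of-the-envelope calculation. The sketch below rests on assumptions that this card does not state: an original mT5 vocabulary of 250,112 tokens, a hidden size of 1024 for the large variant, separate (untied) input and output embedding matrices, and roughly 1.23 billion parameters in total.

```python
# Assumed figures for google/mt5-large (not stated in this card).
OLD_VOCAB = 250_112    # assumed original vocabulary size
NEW_VOCAB = 30_000     # pruned vocabulary size, from the text above
D_MODEL = 1024         # assumed hidden size
TOTAL_PARAMS = 1.23e9  # assumed approximate total parameter count


def embedding_params(vocab_size, d_model=D_MODEL):
    # Input embedding plus a separate output projection (assumed untied).
    return 2 * vocab_size * d_model


saved = embedding_params(OLD_VOCAB) - embedding_params(NEW_VOCAB)
vocab_ratio = NEW_VOCAB / OLD_VOCAB
size_reduction = saved / TOTAL_PARAMS

print(f"vocabulary kept: {vocab_ratio:.1%}")
print(f"parameters removed: {size_reduction:.1%}")
```

Under these assumptions the embeddings account for a large share of the model, which is why cutting the vocabulary removes over a third of the parameters.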

The model is still usable for similar languages such as German and Danish, but very different languages such as Arabic are no longer a good fit.

This model is intended as a starting point for fine-tuning mT5 for Norwegian applications.