---
license: mit
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: iva_mt_wslot-m2m100_418M-en-sv
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: iva_mt_wslot
          type: iva_mt_wslot
          config: en-sv
          split: validation
          args: en-sv
        metrics:
          - name: Bleu
            type: bleu
            value: 71.0808
datasets:
  - cartesinus/iva_mt_wslot
language:
  - en
  - sv
---

# iva_mt_wslot-m2m100_418M-en-sv

This model is a fine-tuned version of [facebook/m2m100_418M](https://huggingface.co/facebook/m2m100_418M) on the [iva_mt_wslot](https://huggingface.co/datasets/cartesinus/iva_mt_wslot) dataset. It achieves the following results on the evaluation set:

- Loss: 0.0107
- Bleu: 71.0808
- Gen Len: 19.7647
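
A minimal usage sketch, assuming the standard M2M100 API from 🤗 Transformers; the input sentence is a hypothetical example, not taken from the dataset:

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model_id = "cartesinus/iva_mt_wslot-m2m100_418M-en-sv"
tokenizer = M2M100Tokenizer.from_pretrained(model_id)
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

# English source; Swedish target matches the model's en-sv fine-tuning.
tokenizer.src_lang = "en"
encoded = tokenizer("Set the alarm for nine am", return_tensors="pt")  # hypothetical input
generated = model.generate(
    **encoded,
    forced_bos_token_id=tokenizer.get_lang_id("sv"),  # force Swedish output
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```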

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 7
- mixed_precision_training: Native AMP
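
As a rough guide, these settings map onto `Seq2SeqTrainingArguments` as sketched below; this is not the exact training script, and `output_dir` plus the evaluation strategy are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="iva_mt_wslot-m2m100_418M-en-sv",  # assumed output path
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=7,
    fp16=True,                    # "Native AMP" mixed-precision training
    evaluation_strategy="epoch",  # assumption: matches the per-epoch results below
    predict_with_generate=True,   # needed for BLEU / generation-length metrics
)
# The Adam betas and epsilon above are the Trainer defaults, so no override is needed.
```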

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 0.0151        | 1.0   | 1885  | 0.0120          | 67.2332 | 19.3956 |
| 0.0095        | 2.0   | 3770  | 0.0105          | 69.8147 | 19.675  |
| 0.0065        | 3.0   | 5655  | 0.0104          | 70.239  | 19.8404 |
| 0.0049        | 4.0   | 7540  | 0.0104          | 70.3673 | 19.7154 |
| 0.0038        | 5.0   | 9425  | 0.0105          | 70.1632 | 19.7743 |
| 0.0026        | 6.0   | 11310 | 0.0105          | 70.7959 | 19.7809 |
| 0.0021        | 7.0   | 13195 | 0.0107          | 71.0808 | 19.7647 |
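
BLEU scores like those above are typically computed with sacreBLEU during generation-based evaluation; the exact metric setup used here isn't documented, so the following is only a sketch with made-up predictions and references:

```python
import evaluate

bleu = evaluate.load("sacrebleu")
# Hypothetical prediction/reference pair for illustration only.
predictions = ["Ställ alarmet på nio på morgonen"]
references = [["Ställ alarmet på nio på morgonen"]]
print(bleu.compute(predictions=predictions, references=references)["score"])
```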

### Framework versions

- Transformers 4.28.1
- Pytorch 2.0.0+cu118
- Datasets 2.11.0
- Tokenizers 0.13.3