metadata

language:
  - ko
  - en
base_model: facebook/mbart-large-50-many-to-many-mmt
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: ko-en_mbartLarge_mid3
    results: []

ko-en_mbartLarge_mid3

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt. The model name and language tags indicate Korean-to-English translation; the training dataset was not recorded. It achieves the following results on the evaluation set:

Loss: 1.3246
Bleu: 22.9623
Gen Len: 18.7197
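
A minimal inference sketch, assuming the checkpoint loads with the standard mBART-50 classes from Transformers and that the tokenizer was saved alongside the fine-tuned weights. The repository id `your-namespace/ko-en_mbartLarge_mid3` is a placeholder, since the card does not state the owning namespace. The language codes `ko_KR` and `en_XX` are the mBART-50 codes for Korean and English, and the Korean-to-English direction is inferred from the model name.

```python
# Placeholder repository id: replace "your-namespace" with the real owner.
MODEL_ID = "your-namespace/ko-en_mbartLarge_mid3"
SRC_LANG = "ko_KR"  # mBART-50 code for Korean
TGT_LANG = "en_XX"  # mBART-50 code for English


def translate(sentences, model_id=MODEL_ID, max_new_tokens=64):
    """Translate a list of Korean sentences to English (sketch, not benchmarked)."""
    # Imported lazily so the module can be inspected without loading Transformers.
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(model_id, src_lang=SRC_LANG)
    model = MBartForConditionalGeneration.from_pretrained(model_id)

    batch = tokenizer(sentences, return_tensors="pt", padding=True)
    generated = model.generate(
        **batch,
        # mBART-50 selects the output language via the first generated token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate(["안녕하세요."])` downloads the weights on first use and returns a list with one English string. The `max_new_tokens=64` cap is an arbitrary choice for this sketch; the card only reports an average generated length of about 18.7 tokens.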

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 2
total_train_batch_size: 32
total_eval_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine_with_restarts
lr_scheduler_warmup_steps: 1000
num_epochs: 40
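
The batch-size totals above follow from the per-device values, and the learning rate follows a warmup-then-cosine curve. The sketch below reproduces that arithmetic in plain Python. It assumes the scheduler has the usual form of linear warmup followed by a hard-restart cosine with a single cycle; `ASSUMED_TOTAL_STEPS` is a hypothetical value, because the card does not report the planned number of optimizer steps.

```python
import math

# Values copied from the hyperparameter list above.
PER_DEVICE_TRAIN_BATCH = 4
PER_DEVICE_EVAL_BATCH = 4
NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 2
PEAK_LR = 5e-5
WARMUP_STEPS = 1000

# Hypothetical: the card does not state the total number of training steps.
ASSUMED_TOTAL_STEPS = 100_000


def total_train_batch_size():
    # Samples contributing to one optimizer update across all GPUs.
    return PER_DEVICE_TRAIN_BATCH * NUM_DEVICES * GRAD_ACCUM_STEPS


def total_eval_batch_size():
    # Gradient accumulation does not apply during evaluation.
    return PER_DEVICE_EVAL_BATCH * NUM_DEVICES


def lr_at(step, total_steps=ASSUMED_TOTAL_STEPS, num_cycles=1):
    """Learning rate at an optimizer step under warmup + hard-restart cosine."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    if progress >= 1.0:
        return 0.0
    cosine = 0.5 * (1.0 + math.cos(math.pi * ((num_cycles * progress) % 1.0)))
    return PEAK_LR * max(0.0, cosine)


if __name__ == "__main__":
    print(total_train_batch_size(), total_eval_batch_size())
    print(lr_at(500), lr_at(1000))
```

With these values the training total is 4 × 4 × 2 = 32 and the evaluation total is 4 × 4 = 16, matching the list. The learning rate reaches its 5e-05 peak at step 1000 regardless of the assumed total step count.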

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
1.5377	0.23	2000	1.6122	17.2009	18.7106
1.3891	0.46	4000	1.5059	19.3345	18.7688
1.2812	0.7	6000	1.4348	20.6032	18.9022
1.2374	0.93	8000	1.4035	21.2391	18.8434
1.1734	1.16	10000	1.4039	21.304	18.9964
1.1531	1.39	12000	1.3694	21.9087	18.8573
1.1158	1.62	14000	1.3574	22.004	18.5485
1.0941	1.86	16000	1.3457	21.9785	18.7119
0.9809	2.09	18000	1.3495	22.7983	18.8011
0.9834	2.32	20000	1.3429	22.5654	18.9416
0.9981	2.55	22000	1.3246	22.9493	18.7364
1.0074	2.78	24000	1.3539	22.3874	18.4428
0.9752	3.02	26000	1.3587	22.1907	18.8139
0.8858	3.25	28000	1.3457	22.82	18.8021
0.8895	3.48	30000	1.3603	22.1575	18.5638
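
To see which logged checkpoint performed best, the table can be scanned programmatically. The sketch below copies the Step, Validation Loss, and Bleu columns from the table above; the `EvalRow` type and helper names are illustrative and not part of the training code.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    validation_loss: float
    bleu: float


# Step, Validation Loss, and Bleu columns copied from the table above.
ROWS = [
    EvalRow(2000, 1.6122, 17.2009),
    EvalRow(4000, 1.5059, 19.3345),
    EvalRow(6000, 1.4348, 20.6032),
    EvalRow(8000, 1.4035, 21.2391),
    EvalRow(10000, 1.4039, 21.304),
    EvalRow(12000, 1.3694, 21.9087),
    EvalRow(14000, 1.3574, 22.004),
    EvalRow(16000, 1.3457, 21.9785),
    EvalRow(18000, 1.3495, 22.7983),
    EvalRow(20000, 1.3429, 22.5654),
    EvalRow(22000, 1.3246, 22.9493),
    EvalRow(24000, 1.3539, 22.3874),
    EvalRow(26000, 1.3587, 22.1907),
    EvalRow(28000, 1.3457, 22.82),
    EvalRow(30000, 1.3603, 22.1575),
]


def best_by_loss(rows=ROWS):
    # Lowest validation loss wins.
    return min(rows, key=lambda r: r.validation_loss)


def best_by_bleu(rows=ROWS):
    # Highest BLEU wins.
    return max(rows, key=lambda r: r.bleu)


if __name__ == "__main__":
    print(best_by_loss())
    print(best_by_bleu())
```

Both criteria select step 22000, whose validation loss of 1.3246 matches the loss reported at the top of the card. The BLEU logged at that step (22.9493) is slightly lower than the headline 22.9623, and the card does not say which evaluation run produced the headline figure.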

Framework versions

Transformers 4.34.0
PyTorch 2.1.0+cu121
Datasets 2.14.5
Tokenizers 0.14.1
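
When reproducing results, it can help to confirm that the local environment matches these versions. The sketch below compares installed package versions against the list above using only the standard library. The distribution names (`torch` for PyTorch, lowercase names for the others) are the usual PyPI names and are an assumption here, as is the helper name `find_mismatches`.

```python
from importlib import metadata

# Versions copied from the list above; keys are assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.34.0",
    "torch": "2.1.0+cu121",
    "datasets": "2.14.5",
    "tokenizers": "0.14.1",
}


def find_mismatches(expected=EXPECTED, get_version=metadata.version):
    """Return {package: (wanted, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_mismatches().items():
        print(f"{name}: expected {wanted}, found {installed}")
```

The comparison is an exact string match, so a CPU-only or different-CUDA build of PyTorch (for example `2.1.0+cu118`) is reported as a mismatch even though the base version agrees.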