metadata

language:
  - ko
  - en
base_model: facebook/mbart-large-50-many-to-many-mmt
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: ko-en_mbartLarge_exp10p
    results: []

ko-en_mbartLarge_exp10p

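This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt; the fine-tuning dataset is not documented. The model name and language tags (ko, en) suggest Korean-to-English translation. The evaluation-set results below match the step 20000 checkpoint (epoch 4.64) in the training results table: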

Loss: 1.1770
Bleu: 27.7431
Gen Len: 18.6157

Model description

More information needed

Intended uses & limitations

More information needed
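
Until this section is filled in, the following is a minimal sketch of how an mBART-50 checkpoint is usually loaded for translation with Transformers. It makes several assumptions that this card does not confirm: that the direction is Korean to English (inferred from the model name), that the fine-tuned checkpoint keeps the base model's tokenizer and language codes (`ko_KR`, `en_XX`), and that `max_new_tokens=64` is a reasonable cap. The function name `translate_ko_to_en` is hypothetical. Because the card does not state where the fine-tuned weights are published, `model_id` defaults to the base model; pass your own checkpoint path or repository ID instead.

```python
# Sketch only: language codes and defaults are assumptions, not taken from this card.
BASE_MODEL = "facebook/mbart-large-50-many-to-many-mmt"  # base model named in this card
SRC_LANG = "ko_KR"  # mBART-50 language code for Korean
TGT_LANG = "en_XX"  # mBART-50 language code for English


def translate_ko_to_en(texts, model_id=BASE_MODEL, max_new_tokens=64):
    """Translate a list of Korean strings to English with an mBART-50 checkpoint."""
    # Imported lazily so the constants above can be inspected without transformers.
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(model_id)
    model = MBartForConditionalGeneration.from_pretrained(model_id)

    # Tell the tokenizer which language the input is in.
    tokenizer.src_lang = SRC_LANG
    batch = tokenizer(texts, return_tensors="pt", padding=True)

    # Force the decoder to start with the English language token.
    generated = model.generate(
        **batch,
        forced_bos_token_id=tokenizer.lang_code_to_id[TGT_LANG],
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate_ko_to_en(["안녕하세요."], model_id="path/to/your/checkpoint")` downloads or loads the weights on first use and returns a list of English strings; the path shown is a placeholder.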

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 2
total_train_batch_size: 32
total_eval_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine_with_restarts
lr_scheduler_warmup_steps: 1000
num_epochs: 40
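
The relationship between these values can be checked with a short sketch. The batch-size arithmetic follows directly from the list above. The learning-rate function is written to follow the warmup-plus-cosine formula that Transformers uses for `cosine_with_restarts`, under two assumptions not stated in this card: a single cycle (`num_cycles=1`, in which case the schedule is a plain cosine decay), and a total step count estimated from the training results table (step 28000 at epoch 6.5, so roughly 4,308 optimizer steps per epoch). The names `lr_at`, `STEPS_PER_EPOCH`, and `TOTAL_STEPS` are illustrative.

```python
import math

# Values copied from the hyperparameter list above.
PER_DEVICE_TRAIN_BATCH = 4
PER_DEVICE_EVAL_BATCH = 4
NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 2
PEAK_LR = 5e-5
WARMUP_STEPS = 1000
NUM_EPOCHS = 40

# Effective batch sizes: per-device size x devices (x accumulation for training).
total_train_batch = PER_DEVICE_TRAIN_BATCH * NUM_DEVICES * GRAD_ACCUM_STEPS
total_eval_batch = PER_DEVICE_EVAL_BATCH * NUM_DEVICES
print(total_train_batch, total_eval_batch)  # 32 16

# ASSUMPTION: steps per epoch estimated from the results table (epochs are rounded there).
STEPS_PER_EPOCH = round(28000 / 6.5)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def lr_at(step, num_cycles=1):
    """Learning rate at an optimizer step: linear warmup, then cosine with hard restarts."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    if progress >= 1.0:
        return 0.0
    # ASSUMPTION: num_cycles=1; the card does not say how many restarts were used.
    cosine = 0.5 * (1.0 + math.cos(math.pi * ((num_cycles * progress) % 1.0)))
    return PEAK_LR * max(0.0, cosine)
```

Under these assumptions the learning rate peaks at 5e-05 at step 1000 and has decayed only slightly by step 28000, the last step in the training log.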

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
1.5087	0.46	2000	1.4383	21.689	18.6869
1.3739	0.93	4000	1.3328	23.8363	18.7463
1.2585	1.39	6000	1.2720	24.7319	18.4624
1.2355	1.86	8000	1.2356	26.1612	18.484
1.0973	2.32	10000	1.2074	26.6567	18.554
1.1157	2.78	12000	1.2069	26.4733	18.8044
0.9631	3.25	14000	1.1901	27.1062	18.6803
1.0223	3.71	16000	1.2280	26.3038	18.7993
0.8621	4.18	18000	1.2185	26.8035	18.6679
0.866	4.64	20000	1.1770	27.7431	18.6157
0.7063	5.11	22000	1.2176	27.7268	18.6026
0.7504	5.57	24000	1.2268	27.053	18.5299
0.6986	6.03	26000	1.2739	27.5119	18.7806
0.6193	6.5	28000	1.2745	27.3877	18.5109
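
The log stops at step 28000 (epoch 6.5) although num_epochs is 40; the card does not say why. To see which logged checkpoint the headline results correspond to, the rows above can be scanned for the lowest validation loss and the highest BLEU. The sketch below copies the table values verbatim; the `EvalRow` type and variable names are illustrative.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    bleu: float
    gen_len: float


# Rows copied from the training results table above.
ROWS = [
    EvalRow(1.5087, 0.46, 2000, 1.4383, 21.689, 18.6869),
    EvalRow(1.3739, 0.93, 4000, 1.3328, 23.8363, 18.7463),
    EvalRow(1.2585, 1.39, 6000, 1.2720, 24.7319, 18.4624),
    EvalRow(1.2355, 1.86, 8000, 1.2356, 26.1612, 18.484),
    EvalRow(1.0973, 2.32, 10000, 1.2074, 26.6567, 18.554),
    EvalRow(1.1157, 2.78, 12000, 1.2069, 26.4733, 18.8044),
    EvalRow(0.9631, 3.25, 14000, 1.1901, 27.1062, 18.6803),
    EvalRow(1.0223, 3.71, 16000, 1.2280, 26.3038, 18.7993),
    EvalRow(0.8621, 4.18, 18000, 1.2185, 26.8035, 18.6679),
    EvalRow(0.866, 4.64, 20000, 1.1770, 27.7431, 18.6157),
    EvalRow(0.7063, 5.11, 22000, 1.2176, 27.7268, 18.6026),
    EvalRow(0.7504, 5.57, 24000, 1.2268, 27.053, 18.5299),
    EvalRow(0.6986, 6.03, 26000, 1.2739, 27.5119, 18.7806),
    EvalRow(0.6193, 6.5, 28000, 1.2745, 27.3877, 18.5109),
]

# Pick the checkpoint with the lowest validation loss and the one with the highest BLEU.
best_by_loss = min(ROWS, key=lambda r: r.val_loss)
best_by_bleu = max(ROWS, key=lambda r: r.bleu)
print(best_by_loss.step, best_by_bleu.step)  # 20000 20000
```

Both criteria select step 20000, which matches the loss, BLEU, and generation length reported at the top of this card. After that step the validation loss rises while the training loss keeps falling, which may indicate overfitting.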

Framework versions

Transformers 4.34.0
PyTorch 2.1.0+cu121
Datasets 2.14.5
Tokenizers 0.14.1