metadata

language:
  - ko
  - en
base_model: facebook/mbart-large-50-many-to-many-mmt
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: ko-en_mbartLarge_exp20p_linear_decay
    results: []

ko-en_mbartLarge_exp20p_linear_decay

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt. The training dataset is not specified; the model name and language tags suggest Korean-to-English translation. It achieves the following results on the evaluation set:

Loss: 1.2291
Bleu: 26.9332
Gen Len: 18.743
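
The card does not include a usage snippet, so the following is a minimal sketch of how an mBART-50 checkpoint is typically called for Korean-to-English translation with the Transformers library. It assumes the fine-tuned weights keep the base model's tokenizer and language codes (`ko_KR`, `en_XX`). `MODEL_ID` is a placeholder that points at the base model, because the card does not state the full repository path of the fine-tuned checkpoint. The `translate` helper is a hypothetical name introduced for illustration.

```python
# Placeholder: the card does not give the fine-tuned repository path,
# so this points at the base model. Swap in the real checkpoint path.
MODEL_ID = "facebook/mbart-large-50-many-to-many-mmt"
SRC_LANG = "ko_KR"  # mBART-50 language code for Korean
TGT_LANG = "en_XX"  # mBART-50 language code for English


def translate(text: str, model_id: str = MODEL_ID, max_new_tokens: int = 64) -> str:
    """Translate one Korean sentence to English (hypothetical helper)."""
    # Imported inside the function so the heavy dependency is only
    # loaded when a translation is actually requested.
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(model_id, src_lang=SRC_LANG)
    model = MBartForConditionalGeneration.from_pretrained(model_id)

    inputs = tokenizer(text, return_tensors="pt")
    # mBART-50 picks the output language from the first generated token,
    # so the English language code is forced as the beginning-of-sequence token.
    generated = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

Calling `translate("안녕하세요, 만나서 반갑습니다.")` downloads the weights on first use and returns an English string. The exact output depends on the checkpoint, so none is shown here.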

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 2
total_train_batch_size: 32
total_eval_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 10
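
The relationship between the per-device and total batch sizes, and the shape of a linear schedule with warmup, can be sketched in a few lines of plain Python. This is an illustrative reimplementation, not the trainer's own code. The helper names are hypothetical, and `TOTAL_STEPS` is an assumption: the card does not state the total number of optimizer steps, so the value below is a rough estimate based on the training results table (about 13000 steps per 3.02 epochs, scaled to 10 epochs).

```python
def total_batch_size(per_device: int, num_devices: int, grad_accum: int = 1) -> int:
    """Effective batch size per optimizer step across all devices."""
    return per_device * num_devices * grad_accum


def linear_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Learning rate at `step`: linear warmup from 0, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Values taken from the hyperparameter list above.
train_total = total_batch_size(per_device=4, num_devices=4, grad_accum=2)  # 32
eval_total = total_batch_size(per_device=4, num_devices=4)                 # 16

# ASSUMPTION: rough estimate, not a value reported in the card.
TOTAL_STEPS = 43_000

lr_mid_warmup = linear_lr(250, 5e-05, 500, TOTAL_STEPS)   # halfway through warmup
lr_peak = linear_lr(500, 5e-05, 500, TOTAL_STEPS)         # end of warmup
lr_end = linear_lr(TOTAL_STEPS, 5e-05, 500, TOTAL_STEPS)  # end of schedule
```

Under these assumptions the rate climbs to the configured 5e-05 at step 500 and then decreases linearly to zero at the final step. Evaluation uses no gradient accumulation, which is why the total evaluation batch size is 16 rather than 32.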

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| 1.4396 | 0.23 | 1000 | 1.3815 | 21.7052 | 18.6047 |
| 1.338 | 0.46 | 2000 | 1.3044 | 23.7087 | 18.9939 |
| 1.2938 | 0.7 | 3000 | 1.2556 | 24.6339 | 18.8866 |
| 1.251 | 0.93 | 4000 | 1.2229 | 25.2975 | 19.0918 |
| 0.9843 | 1.16 | 5000 | 1.2309 | 25.609 | 18.7589 |
| 0.9874 | 1.39 | 6000 | 1.2101 | 26.1792 | 18.8287 |
| 0.9838 | 1.62 | 7000 | 1.2053 | 26.024 | 18.4025 |
| 0.9927 | 1.86 | 8000 | 1.1907 | 26.3148 | 19.09 |
| 0.7835 | 2.09 | 9000 | 1.2300 | 26.5613 | 18.7196 |
| 0.7437 | 2.32 | 10000 | 1.2358 | 26.8232 | 18.6513 |
| 0.7585 | 2.55 | 11000 | 1.2291 | 26.9203 | 18.7513 |
| 0.7631 | 2.78 | 12000 | 1.2170 | 26.8668 | 18.5441 |
| 0.7428 | 3.02 | 13000 | 1.3272 | 26.2506 | 18.6959 |
| 0.5502 | 3.25 | 14000 | 1.3392 | 26.419 | 18.6722 |
| 0.5577 | 3.48 | 15000 | 1.3204 | 26.1621 | 18.7036 |
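
To make the trend in the training results easier to read, the short sketch below finds the logged checkpoint with the highest BLEU and the one with the lowest validation loss. The numbers are copied from the results above. The variable names are illustrative, and the card does not say which checkpoint the released weights correspond to.

```python
# (step, validation_loss, bleu) copied from the training results.
ROWS = [
    (1000, 1.3815, 21.7052),
    (2000, 1.3044, 23.7087),
    (3000, 1.2556, 24.6339),
    (4000, 1.2229, 25.2975),
    (5000, 1.2309, 25.609),
    (6000, 1.2101, 26.1792),
    (7000, 1.2053, 26.024),
    (8000, 1.1907, 26.3148),
    (9000, 1.2300, 26.5613),
    (10000, 1.2358, 26.8232),
    (11000, 1.2291, 26.9203),
    (12000, 1.2170, 26.8668),
    (13000, 1.3272, 26.2506),
    (14000, 1.3392, 26.419),
    (15000, 1.3204, 26.1621),
]

# Checkpoint with the highest BLEU score.
best_bleu_step, _, best_bleu = max(ROWS, key=lambda row: row[2])

# Checkpoint with the lowest validation loss.
best_loss_step, best_loss, _ = min(ROWS, key=lambda row: row[1])

print(f"highest BLEU: {best_bleu} at step {best_bleu_step}")
print(f"lowest validation loss: {best_loss} at step {best_loss_step}")
```

In these logs the lowest validation loss (step 8000) and the highest BLEU (step 11000) fall on different checkpoints. From step 13000 onward the validation loss rises while the training loss keeps falling, which is a pattern commonly associated with overfitting.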

Framework versions

Transformers 4.34.0
PyTorch 2.1.0+cu121
Datasets 2.14.5
Tokenizers 0.14.1