metadata
language:
  - ko
  - en
base_model: facebook/mbart-large-50-many-to-many-mmt
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: ko-en_mbartLarge_exp20p_linear_alpha
    results: []

ko-en_mbartLarge_exp20p_linear_alpha

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt; the training dataset is not specified. On the evaluation set it achieves the following results, which match the step-60000 checkpoint (epoch 6.96) in the training results table:

  • Loss: 1.1682
  • Bleu: 29.1144
  • Gen Len: 18.5459

Model description

More information needed

Intended uses & limitations

More information needed
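
Because the card names facebook/mbart-large-50-many-to-many-mmt as the base model and lists ko and en as languages, inference can presumably follow the usual mBART-50 pattern. The following is a minimal sketch under stated assumptions, not a documented usage: the repository id is guessed from the uploader and model name, and the Korean-to-English direction is inferred from the "ko-en" prefix. `translate` is a hypothetical helper name.

```python
# Assumption: repository id inferred from the uploader name and model name.
MODEL_ID = "yesj1234/ko-en_mbartLarge_exp20p_linear_alpha"
# Assumption: Korean source, English target, using mBART-50 language codes.
SRC_LANG = "ko_KR"
TGT_LANG = "en_XX"


def translate(text, model_id=MODEL_ID, max_new_tokens=64):
    """Translate one sentence with an mBART-50 style checkpoint."""
    # Imported lazily so the constants above can be inspected without
    # loading transformers or downloading weights.
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(model_id, src_lang=SRC_LANG)
    model = MBartForConditionalGeneration.from_pretrained(model_id)

    inputs = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **inputs,
        # mBART-50 selects the output language via the first generated token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]


if __name__ == "__main__":
    print(translate("안녕하세요, 만나서 반갑습니다."))
```

The average generated length reported on the evaluation set is about 18.5 tokens, so the `max_new_tokens=64` default here is only a loose illustrative cap.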

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40 (configured; the logged training results end at epoch 8.82)

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|
| 1.404         | 0.46  | 4000  | 1.3738          | 22.5375 | 18.6852 |
| 1.2629        | 0.93  | 8000  | 1.2458          | 25.3741 | 18.7797 |
| 1.1951        | 1.39  | 12000 | 1.2067          | 26.1281 | 18.6597 |
| 1.1317        | 1.86  | 16000 | 1.1768          | 26.5384 | 19.2055 |
| 0.9906        | 2.32  | 20000 | 1.1363          | 28.2459 | 18.7269 |
| 0.9894        | 2.78  | 24000 | 1.1239          | 28.5124 | 18.6882 |
| 0.8965        | 3.25  | 28000 | 1.1278          | 28.5335 | 18.4917 |
| 0.9138        | 3.71  | 32000 | 1.1216          | 28.8189 | 18.7873 |
| 0.8272        | 4.18  | 36000 | 1.1468          | 28.332  | 18.6516 |
| 0.8753        | 4.64  | 40000 | 1.1345          | 28.2695 | 18.4919 |
| 0.6855        | 5.11  | 44000 | 1.1542          | 28.7913 | 18.7596 |
| 0.7088        | 5.57  | 48000 | 1.1531          | 29.0865 | 18.6626 |
| 0.6738        | 6.03  | 52000 | 1.1906          | 28.0235 | 18.4243 |
| 0.6763        | 6.5   | 56000 | 1.1941          | 28.1501 | 18.6932 |
| 0.6594        | 6.96  | 60000 | 1.1682          | 29.1144 | 18.5459 |
| 0.5971        | 7.43  | 64000 | 1.2449          | 27.9464 | 18.4482 |
| 0.5935        | 7.89  | 68000 | 1.2156          | 28.6034 | 18.5967 |
| 0.5383        | 8.35  | 72000 | 1.2927          | 27.891  | 18.6539 |
| 0.6022        | 8.82  | 76000 | 1.2831          | 27.7624 | 18.5558 |
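
The logged checkpoints can be ranked by either metric, and the two criteria disagree. The sketch below copies the step, validation loss, and Bleu columns from the table above and picks the best row under each; the variable names are illustrative only.

```python
# (step, validation_loss, bleu) copied from the training results table.
rows = [
    (4000, 1.3738, 22.5375), (8000, 1.2458, 25.3741),
    (12000, 1.2067, 26.1281), (16000, 1.1768, 26.5384),
    (20000, 1.1363, 28.2459), (24000, 1.1239, 28.5124),
    (28000, 1.1278, 28.5335), (32000, 1.1216, 28.8189),
    (36000, 1.1468, 28.3320), (40000, 1.1345, 28.2695),
    (44000, 1.1542, 28.7913), (48000, 1.1531, 29.0865),
    (52000, 1.1906, 28.0235), (56000, 1.1941, 28.1501),
    (60000, 1.1682, 29.1144), (64000, 1.2449, 27.9464),
    (68000, 1.2156, 28.6034), (72000, 1.2927, 27.8910),
    (76000, 1.2831, 27.7624),
]

best_by_bleu = max(rows, key=lambda r: r[2])  # highest Bleu
best_by_loss = min(rows, key=lambda r: r[1])  # lowest validation loss

print("best Bleu:", best_by_bleu)  # (60000, 1.1682, 29.1144)
print("best loss:", best_by_loss)  # (32000, 1.1216, 28.8189)
```

The headline results reported at the top of the card correspond to the highest-Bleu row (step 60000), while validation loss bottoms out earlier, at step 32000, and rises from there as training loss keeps falling.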

Framework versions

  • Transformers 4.34.0
  • PyTorch 2.1.0+cu121
  • Datasets 2.14.5
  • Tokenizers 0.14.1