---
language:
  - ko
  - en
base_model: facebook/mbart-large-50-many-to-many-mmt
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: ko-en_mbartLarge_exp10p
    results: []
---

# ko-en_mbartLarge_exp10p

This model is a fine-tuned version of [facebook/mbart-large-50-many-to-many-mmt](https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt) on an unknown dataset.
It achieves the following results on the evaluation set:

- Loss: 1.1770
- Bleu: 27.7431
- Gen Len: 18.6157

## Model description

More information needed

## Intended uses & limitations

More information needed
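
A minimal usage sketch for Korean-to-English translation, assuming the checkpoint is hosted under the hub id `yesj1234/ko-en_mbartLarge_exp10p` (built from the uploader handle and the model name above) and that the standard mBART-50 language codes apply:

```python
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

# Assumed hub id; adjust if the repository lives elsewhere.
model_id = "yesj1234/ko-en_mbartLarge_exp10p"

tokenizer = MBart50TokenizerFast.from_pretrained(model_id)
model = MBartForConditionalGeneration.from_pretrained(model_id)

# mBART-50 needs explicit language codes: Korean in, English out.
tokenizer.src_lang = "ko_KR"
inputs = tokenizer("안녕하세요, 만나서 반갑습니다.", return_tensors="pt")  # "Hello, nice to meet you."

generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"],
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```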

## Training and evaluation data

More information needed
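
The corpus itself is not documented, but a typical preprocessing step for fine-tuning mBART-50 on Korean–English pairs looks like the sketch below; the `ko`/`en` column names and the 128-token cap are assumptions, not details from this run:

```python
from transformers import MBart50TokenizerFast

tokenizer = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50-many-to-many-mmt",
    src_lang="ko_KR",
    tgt_lang="en_XX",
)

def preprocess(batch):
    # "ko" and "en" are hypothetical column names for the parallel corpus.
    model_inputs = tokenizer(batch["ko"], max_length=128, truncation=True)
    labels = tokenizer(text_target=batch["en"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

# tokenized = raw_dataset.map(preprocess, batched=True)
```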

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them to `Seq2SeqTrainingArguments` follows the list):

- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine_with_restarts
- lr_scheduler_warmup_steps: 1000
- num_epochs: 40
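
As a hedged reconstruction (not the actual training script), these settings correspond to roughly the following `Seq2SeqTrainingArguments`; the output directory and generation flag are assumptions, and the listed Adam betas and epsilon are the optimizer defaults, so they need no explicit arguments:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ko-en_mbartLarge_exp10p",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=4,   # x 4 GPUs x 2 accumulation = 32 total
    per_device_eval_batch_size=4,    # x 4 GPUs = 16 total
    gradient_accumulation_steps=2,
    seed=42,
    lr_scheduler_type="cosine_with_restarts",
    warmup_steps=1000,
    num_train_epochs=40,
    predict_with_generate=True,      # assumed; needed to report Bleu/Gen Len
)
```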

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 1.5087        | 0.46  | 2000  | 1.4383          | 21.689  | 18.6869 |
| 1.3739        | 0.93  | 4000  | 1.3328          | 23.8363 | 18.7463 |
| 1.2585        | 1.39  | 6000  | 1.2720          | 24.7319 | 18.4624 |
| 1.2355        | 1.86  | 8000  | 1.2356          | 26.1612 | 18.484  |
| 1.0973        | 2.32  | 10000 | 1.2074          | 26.6567 | 18.554  |
| 1.1157        | 2.78  | 12000 | 1.2069          | 26.4733 | 18.8044 |
| 0.9631        | 3.25  | 14000 | 1.1901          | 27.1062 | 18.6803 |
| 1.0223        | 3.71  | 16000 | 1.2280          | 26.3038 | 18.7993 |
| 0.8621        | 4.18  | 18000 | 1.2185          | 26.8035 | 18.6679 |
| 0.866         | 4.64  | 20000 | 1.1770          | 27.7431 | 18.6157 |
| 0.7063        | 5.11  | 22000 | 1.2176          | 27.7268 | 18.6026 |
| 0.7504        | 5.57  | 24000 | 1.2268          | 27.053  | 18.5299 |
| 0.6986        | 6.03  | 26000 | 1.2739          | 27.5119 | 18.7806 |
| 0.6193        | 6.5   | 28000 | 1.2745          | 27.3877 | 18.5109 |
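
For context, the Bleu and Gen Len columns are what a typical translation `compute_metrics` reports: corpus BLEU over the decoded generations and the mean generated length. A minimal sketch using the `evaluate` library's `sacrebleu` metric (assumed, not confirmed to be the exact script used):

```python
import numpy as np
import evaluate
from transformers import MBart50TokenizerFast

bleu = evaluate.load("sacrebleu")
tokenizer = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    # Replace label padding (-100) before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Gen Len: mean number of non-padding tokens in the generations.
    gen_len = np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    return {"bleu": result["score"], "gen_len": gen_len}
```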

### Framework versions

- Transformers 4.34.0
- Pytorch 2.1.0+cu121
- Datasets 2.14.5
- Tokenizers 0.14.1