metadata
language:
  - ko
  - en
base_model: facebook/mbart-large-50-many-to-many-mmt
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: koen_mbartLarge_64p_run1
    results: []

koen_mbartLarge_64p_run1

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt. The training dataset is not recorded in this card. It achieves the following results on the evaluation set:

  • Loss: 1.0989
  • Bleu: 33.8958
  • Gen Len: 18.5033
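The card does not show how to run the model, so the following is a minimal inference sketch rather than the author's documented usage. It assumes the checkpoint keeps the tokenizer and language codes of its mBART-50 base model, and that "koen" in the model name means Korean to English. The function name `translate` and the `model_id` parameter are hypothetical; the repository id of this checkpoint is not stated in the card, so it is left for the caller to supply.

```python
# Sketch only: assumes the checkpoint keeps the mBART-50 tokenizer and language
# codes of its base model, and that "koen" means Korean -> English.
SRC_LANG = "ko_KR"  # mBART-50 language code for Korean
TGT_LANG = "en_XX"  # mBART-50 language code for English


def translate(text, model_id, max_new_tokens=64):
    """Translate one Korean sentence to English with an mBART-50 checkpoint.

    `model_id` is a local path or Hub repository id supplied by the caller.
    """
    # Imported lazily so this module loads even without transformers installed.
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(model_id)
    model = MBartForConditionalGeneration.from_pretrained(model_id)

    # Tell the tokenizer which language the input is written in.
    tokenizer.src_lang = SRC_LANG
    inputs = tokenizer(text, return_tensors="pt")

    # mBART-50 selects the output language through the first generated token.
    output_ids = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.lang_code_to_id[TGT_LANG],
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

Calling `translate("안녕하세요", model_id)` with the path to a downloaded copy of this checkpoint should return an English string. Passing the base model id, facebook/mbart-large-50-many-to-many-mmt, exercises the same code path without the fine-tuning.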

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 10
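The two totals in the list follow from the per-device values, and the linear scheduler can be written out in a few lines. The sketch below is a plain-Python illustration, not the Trainer's own code: the constants are copied from the list, the function names are hypothetical, and `total_steps` is an assumption because the card does not report the planned number of optimizer steps.

```python
# Values copied from the hyperparameter list.
PER_DEVICE_TRAIN_BATCH = 4
PER_DEVICE_EVAL_BATCH = 4
NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 2
PEAK_LR = 5e-05
WARMUP_STEPS = 500


def total_train_batch_size():
    # Examples consumed per optimizer step across all GPUs.
    return PER_DEVICE_TRAIN_BATCH * NUM_DEVICES * GRAD_ACCUM_STEPS


def total_eval_batch_size():
    # Gradient accumulation does not apply during evaluation.
    return PER_DEVICE_EVAL_BATCH * NUM_DEVICES


def linear_lr(step, total_steps, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS):
    """Linear warmup to peak_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    ASSUMED_TOTAL_STEPS = 190_000  # hypothetical; not stated in the card
    print(total_train_batch_size(), total_eval_batch_size())
    for step in (0, 250, 500, 95_000, ASSUMED_TOTAL_STEPS):
        print(step, linear_lr(step, ASSUMED_TOTAL_STEPS))
```

With these values the training total is 4 × 4 × 2 = 32 and the evaluation total is 4 × 4 = 16, matching the list.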

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|
| 1.5543        | 0.13  | 2500  | 1.5068          | 25.2368 | 18.5097 |
| 1.4399        | 0.26  | 5000  | 1.3972          | 27.0554 | 18.5539 |
| 1.3448        | 0.39  | 7500  | 1.3132          | 28.8579 | 18.6315 |
| 1.3205        | 0.52  | 10000 | 1.2873          | 29.5611 | 18.7781 |
| 1.2786        | 0.65  | 12500 | 1.2399          | 30.3042 | 18.5644 |
| 1.2561        | 0.78  | 15000 | 1.2173          | 30.5801 | 19.0186 |
| 1.2479        | 0.91  | 17500 | 1.2125          | 30.8896 | 18.7636 |
| 1.1891        | 1.04  | 20000 | 1.1776          | 31.9834 | 18.7002 |
| 1.1943        | 1.17  | 22500 | 1.1651          | 32.0205 | 18.7054 |
| 1.1375        | 1.3   | 25000 | 1.1492          | 32.3658 | 18.6287 |
| 1.1351        | 1.43  | 27500 | 1.1460          | 32.339  | 18.7655 |
| 1.0859        | 1.56  | 30000 | 1.1623          | 31.5418 | 19.016  |
| 1.0373        | 1.69  | 32500 | 1.1383          | 32.672  | 18.7224 |
| 1.0824        | 1.82  | 35000 | 1.1232          | 33.2231 | 18.6697 |
| 1.0242        | 1.95  | 37500 | 1.1313          | 32.813  | 18.2553 |
| 1.0649        | 2.08  | 40000 | 1.1182          | 33.2021 | 18.7216 |
| 1.054         | 2.21  | 42500 | 1.1329          | 33.0588 | 18.4992 |
| 1.0143        | 2.34  | 45000 | 1.1187          | 33.2176 | 18.7156 |
| 1.0037        | 2.47  | 47500 | 1.1162          | 33.3754 | 18.6443 |
| 0.9928        | 2.61  | 50000 | 1.1306          | 33.0727 | 18.6361 |
| 0.9497        | 2.74  | 52500 | 1.1170          | 33.227  | 18.7638 |
| 1.0157        | 2.87  | 55000 | 1.1072          | 33.685  | 18.5847 |
| 0.9876        | 3.0   | 57500 | 1.1035          | 33.6971 | 18.6873 |
| 0.9665        | 3.13  | 60000 | 1.0989          | 33.8919 | 18.5258 |
| 0.9197        | 3.26  | 62500 | 1.1060          | 33.7036 | 18.5407 |
| 0.9427        | 3.39  | 65000 | 1.0995          | 33.7642 | 18.7    |
| 0.8993        | 3.52  | 67500 | 1.1364          | 33.1757 | 18.646  |
| 0.8957        | 3.65  | 70000 | 1.1251          | 33.0954 | 18.3129 |
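To make the training results easier to read, the sketch below picks the best logged checkpoint by validation loss and by BLEU, and estimates the number of optimizer steps per epoch from the last row. The rows are copied from the training results (step, epoch, validation loss, BLEU); the variable names are hypothetical, and the steps-per-epoch figure is only approximate because the logged epoch values are rounded to two decimals.

```python
# (step, epoch, validation_loss, bleu), copied from the training results.
ROWS = [
    (2500, 0.13, 1.5068, 25.2368), (5000, 0.26, 1.3972, 27.0554),
    (7500, 0.39, 1.3132, 28.8579), (10000, 0.52, 1.2873, 29.5611),
    (12500, 0.65, 1.2399, 30.3042), (15000, 0.78, 1.2173, 30.5801),
    (17500, 0.91, 1.2125, 30.8896), (20000, 1.04, 1.1776, 31.9834),
    (22500, 1.17, 1.1651, 32.0205), (25000, 1.3, 1.1492, 32.3658),
    (27500, 1.43, 1.1460, 32.339), (30000, 1.56, 1.1623, 31.5418),
    (32500, 1.69, 1.1383, 32.672), (35000, 1.82, 1.1232, 33.2231),
    (37500, 1.95, 1.1313, 32.813), (40000, 2.08, 1.1182, 33.2021),
    (42500, 2.21, 1.1329, 33.0588), (45000, 2.34, 1.1187, 33.2176),
    (47500, 2.47, 1.1162, 33.3754), (50000, 2.61, 1.1306, 33.0727),
    (52500, 2.74, 1.1170, 33.227), (55000, 2.87, 1.1072, 33.685),
    (57500, 3.0, 1.1035, 33.6971), (60000, 3.13, 1.0989, 33.8919),
    (62500, 3.26, 1.1060, 33.7036), (65000, 3.39, 1.0995, 33.7642),
    (67500, 3.52, 1.1364, 33.1757), (70000, 3.65, 1.1251, 33.0954),
]

# Lowest validation loss and highest BLEU among the logged evaluations.
best_by_loss = min(ROWS, key=lambda row: row[2])
best_by_bleu = max(ROWS, key=lambda row: row[3])

# Rough optimizer steps per epoch, using the last logged row.
last_step, last_epoch, _, _ = ROWS[-1]
steps_per_epoch = last_step / last_epoch

if __name__ == "__main__":
    print("best by loss:", best_by_loss)
    print("best by BLEU:", best_by_bleu)
    print("approx. steps per epoch:", round(steps_per_epoch))
```

Both criteria select step 60000 (validation loss 1.0989, BLEU 33.8919). The log ends at epoch 3.65 although num_epochs is set to 10; the card does not say why.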

Framework versions

  • Transformers 4.34.1
  • Pytorch 2.1.0+cu121
  • Datasets 2.14.6
  • Tokenizers 0.14.1
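
When reproducing the run, it can help to compare the local environment against the versions listed above. The following is a small standard-library sketch; the helper names are hypothetical, and it assumes the PyPI distribution name for PyTorch is `torch`.

```python
from importlib import metadata

# Versions from the list above; "torch" is the PyPI distribution name for PyTorch.
EXPECTED = {
    "transformers": "4.34.1",
    "torch": "2.1.0+cu121",
    "datasets": "2.14.6",
    "tokenizers": "0.14.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected=EXPECTED, lookup=installed_version):
    """Map each package whose version differs to a (wanted, found) pair."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in version_mismatches().items():
        print(f"{package}: card lists {wanted}, found {found}")
```

A mismatch is not necessarily a problem; the check only flags where the environment differs from the one recorded here.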