metadata
language:
  - ko
  - ja
base_model: facebook/mbart-large-50-many-to-many-mmt
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: koja_mbartLarge_55p_run2
    results: []

koja_mbartLarge_55p_run2

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt. The training dataset is not identified in this card. It achieves the following results on the evaluation set:

  • Loss: 0.9303
  • Bleu: 57.3778
  • Gen Len: 16.682
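
Because the model is fine-tuned from facebook/mbart-large-50-many-to-many-mmt, it can presumably be loaded with the standard mBART-50 classes from Transformers. The following is a minimal sketch and not a confirmed usage recipe. The repository id `yesj1234/koja_mbartLarge_55p_run2`, the Korean-to-Japanese direction (inferred from the `koja` prefix and the `ko`/`ja` language tags), and the helper name `translate` are all assumptions that this card does not state.

```python
# Assumptions (not stated in the card): the repo id, the ko -> ja direction,
# and that the tokenizer files were uploaded alongside the weights.
MODEL_ID = "yesj1234/koja_mbartLarge_55p_run2"  # hypothetical repo id
SRC_LANG = "ko_KR"  # mBART-50 language code for Korean
TGT_LANG = "ja_XX"  # mBART-50 language code for Japanese


def translate(sentences, model_id=MODEL_ID, max_new_tokens=64):
    """Translate a list of Korean sentences into Japanese (hypothetical helper)."""
    # Imported lazily so the constants above can be inspected without Transformers installed.
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(
        model_id, src_lang=SRC_LANG, tgt_lang=TGT_LANG
    )
    model = MBartForConditionalGeneration.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(
        **inputs,
        # mBART-50 selects the output language by forcing its code as the first token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


# Example call (downloads the weights on first use):
# translate(["안녕하세요, 만나서 반갑습니다."])
```

If the model was actually trained in the opposite direction, swapping `SRC_LANG` and `TGT_LANG` would be the corresponding change.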

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
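
The total batch sizes listed above follow from the per-device sizes, the device count, and gradient accumulation. The short sketch below reproduces that arithmetic; the variable names are illustrative and are not taken from the training script.

```python
# Values copied from the hyperparameter list; variable names are illustrative.
train_batch_size = 4             # per device
eval_batch_size = 4              # per device
num_devices = 4
gradient_accumulation_steps = 2

# Each optimizer step sees every device's batch, accumulated over several forward passes.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps

# Evaluation does not accumulate gradients, so only the device count multiplies.
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)
```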

Training results

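| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|
| 1.0633        | 0.48  | 8000  | 1.0419          | 52.4575 | 17.4003 |
| 0.9731        | 0.97  | 16000 | 0.9550          | 55.7136 | 16.9686 |
| 0.7608        | 1.45  | 24000 | 0.9372          | 56.8788 | 16.7537 |
| 0.7213        | 1.93  | 32000 | 0.9303          | 57.4421 | 16.6742 |
| 0.5702        | 2.42  | 40000 | 0.9622          | 56.774  | 16.4703 |
| 0.5416        | 2.9   | 48000 | 0.9697          | 57.4192 | 16.6763 |
| 0.4226        | 3.38  | 56000 | 1.0399          | 56.5425 | 16.4626 |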
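
The logged results can be inspected programmatically, for example to find the checkpoint with the lowest validation loss and to estimate the number of optimizer steps per epoch. The sketch below uses only the numbers in the table above. The steps-per-epoch figure is an estimate, because the Epoch column is rounded to two decimals, and the derived example count assumes the total train batch size of 32 applied to every step.

```python
# (training_loss, epoch, step, validation_loss, bleu, gen_len), copied from the table.
rows = [
    (1.0633, 0.48, 8000, 1.0419, 52.4575, 17.4003),
    (0.9731, 0.97, 16000, 0.9550, 55.7136, 16.9686),
    (0.7608, 1.45, 24000, 0.9372, 56.8788, 16.7537),
    (0.7213, 1.93, 32000, 0.9303, 57.4421, 16.6742),
    (0.5702, 2.42, 40000, 0.9622, 56.774, 16.4703),
    (0.5416, 2.9, 48000, 0.9697, 57.4192, 16.6763),
    (0.4226, 3.38, 56000, 1.0399, 56.5425, 16.4626),
]

# Checkpoint with the lowest validation loss, and the one with the highest BLEU.
best_by_loss = min(rows, key=lambda r: r[3])
best_by_bleu = max(rows, key=lambda r: r[4])

# Estimate steps per epoch from the last row, where rounding of the epoch matters least.
_, last_epoch, last_step, *_ = rows[-1]
steps_per_epoch = last_step / last_epoch

# Rough training-set size, assuming 32 examples per optimizer step.
approx_train_examples = steps_per_epoch * 32

print("best step by validation loss:", best_by_loss[2])
print("best step by BLEU:", best_by_bleu[2])
print("estimated steps per epoch:", round(steps_per_epoch))
print("estimated training examples:", round(approx_train_examples))
```

Both criteria point to step 32000, whose validation loss of 0.9303 matches the loss reported at the top of the card. Training loss keeps falling after that step while validation loss rises, which is the usual sign of overfitting.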

Framework versions

  • Transformers 4.34.1
  • PyTorch 2.1.0+cu121
  • Datasets 2.14.6
  • Tokenizers 0.14.1