metadata
license: cc-by-nc-4.0
base_model: facebook/nllb-200-distilled-600M
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: nllb-200-distilled-600M-finetuned-py2cpp
    results: []

nllb-200-distilled-600M-finetuned-py2cpp

This model is a fine-tuned version of facebook/nllb-200-distilled-600M. The training dataset is not specified in this card. After the final training epoch, it achieves the following results on the evaluation set:

  • Loss: 0.7738
  • Bleu: 67.4647
  • Gen Len: 75.9455

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
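
To make the `lr_scheduler_type: linear` setting concrete, here is a minimal sketch of a linear schedule that decays from the peak learning rate to zero. It assumes no warmup steps (the card lists none) and 670 total optimizer steps, which is the final step count in the training results. `linear_lr` is an illustrative helper written for this card, not a Transformers function.

```python
def linear_lr(step: int, total_steps: int = 670, peak_lr: float = 1e-4,
              warmup_steps: int = 0) -> float:
    """Learning rate at an optimizer step under linear warmup plus linear decay.

    Assumptions (not stated in the card): warmup_steps=0 and total_steps=670.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr during warmup.
        return peak_lr * (step / max(1, warmup_steps))
    # Linear decay from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * (remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    for step in (0, 335, 670):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 0.0001, is halved by the middle of epoch 5 (step 335), and reaches zero at step 670.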

Training results

Training Loss  Epoch  Step  Validation Loss  Bleu     Gen Len
No log         1.0    67    2.6896           29.0389  96.5455
No log         2.0    134   1.6534           30.4693  96.6727
No log         3.0    201   1.2046           55.0467  76.7455
No log         4.0    268   1.0048           59.5519  76.9091
No log         5.0    335   0.9176           64.2229  75.5455
No log         6.0    402   0.8610           65.8311  73.6909
No log         7.0    469   0.8160           65.5771  76.4727
1.5731         8.0    536   0.7968           67.9558  74.7636
1.5731         9.0    603   0.7794           67.5994  75.8
1.5731         10.0   670   0.7738           67.4647  75.9455
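
The per-epoch numbers above can be compared programmatically. The following sketch copies the rows as (epoch, step, validation loss, BLEU) tuples and picks the best epoch by each metric. The tuple layout and variable names are illustrative, and the training-set size estimate assumes a single device with no gradient accumulation, which the card does not confirm.

```python
# (epoch, step, validation_loss, bleu), copied from the training results table.
ROWS = [
    (1, 67, 2.6896, 29.0389),
    (2, 134, 1.6534, 30.4693),
    (3, 201, 1.2046, 55.0467),
    (4, 268, 1.0048, 59.5519),
    (5, 335, 0.9176, 64.2229),
    (6, 402, 0.8610, 65.8311),
    (7, 469, 0.8160, 65.5771),
    (8, 536, 0.7968, 67.9558),
    (9, 603, 0.7794, 67.5994),
    (10, 670, 0.7738, 67.4647),
]

TRAIN_BATCH_SIZE = 16  # train_batch_size from the hyperparameter list

best_bleu = max(ROWS, key=lambda row: row[3])
lowest_loss = min(ROWS, key=lambda row: row[2])

# Steps after the first epoch equal the number of optimizer steps per epoch.
steps_per_epoch = ROWS[0][1]
# Upper bound on training examples, assuming one device and no
# gradient accumulation (an assumption, not stated in the card).
max_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print("best BLEU:", best_bleu)
    print("lowest validation loss:", lowest_loss)
    print("steps per epoch:", steps_per_epoch)
    print("max training examples:", max_train_examples)
```

With these rows, BLEU peaks at epoch 8 (67.9558), while validation loss is lowest at epoch 10 (0.7738), the epoch whose metrics are reported at the top of the card.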

Framework versions

  • Transformers 4.33.1
  • PyTorch 2.4.0
  • Datasets 3.0.1
  • Tokenizers 0.13.3
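
When trying to reproduce these results, it can help to compare the local environment against the versions listed above. This is a minimal sketch using the standard-library `importlib.metadata`. The distribution names (in particular `torch` for PyTorch) and the helpers `installed_version` and `compare_versions` are assumptions made for illustration.

```python
from importlib import metadata

# Versions listed in this card, keyed by PyPI distribution name
# (PyTorch is assumed to be installed as "torch").
CARD_VERSIONS = {
    "transformers": "4.33.1",
    "torch": "2.4.0",
    "datasets": "3.0.1",
    "tokenizers": "0.13.3",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(expected, lookup=installed_version):
    """Map each package to (card version, installed version, versions match)."""
    report = {}
    for package, wanted in expected.items():
        found = lookup(package)
        # Ignore local build suffixes such as "+cu121" when comparing.
        base = found.split("+")[0] if found else None
        report[package] = (wanted, found, base == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, found, ok) in compare_versions(CARD_VERSIONS).items():
        status = "ok" if ok else "differs"
        print(f"{package}: card={wanted} installed={found} [{status}]")
```

A mismatch does not necessarily prevent the model from loading; the check only reports where the environment differs from the one recorded in the card.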