metadata
license: cc-by-nc-4.0
base_model: facebook/nllb-200-distilled-600M
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: nllb-200-distilled-600M-finetuned-ar-to-en
    results: []
pipeline_tag: translation

nllb-200-distilled-600M-finetuned-ar-to-en

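This model is a fine-tuned version of facebook/nllb-200-distilled-600M for Arabic-to-English translation, as indicated by the model name and pipeline tag. The training dataset is not specified. It achieves the following results on the evaluation set, which match the final epoch of the training results: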

  • Loss: 0.7281
  • Bleu: 63.3172
  • Gen Len: 65.7
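A minimal inference sketch for Arabic-to-English translation with a checkpoint like this one might look as follows. The repository ID is a placeholder, and the language codes `arb_Arab` (Modern Standard Arabic) and `eng_Latn` (English) are assumptions taken from the NLLB-200 code list and the model name; the card does not say which Arabic variety was used for fine-tuning.

```python
# Placeholder repository ID: replace with the actual Hub path of this model.
MODEL_ID = "your-namespace/nllb-200-distilled-600M-finetuned-ar-to-en"
SRC_LANG = "arb_Arab"  # assumed source code (Modern Standard Arabic)
TGT_LANG = "eng_Latn"  # assumed target code (English)


def translate(text, model_id=MODEL_ID, max_new_tokens=128):
    """Translate one Arabic string to English (downloads the model on first use)."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang=SRC_LANG)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(text, return_tensors="pt")
    # NLLB selects the output language through the first generated token.
    output_ids = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

Calling `translate("مرحبا بالعالم")` with a valid repository ID should return an English string. For repeated calls, load the tokenizer and model once outside the function instead of on every call.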

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
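To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists none) and takes the 695 steps per epoch from the training results; the function name `linear_lr` is illustrative and not part of any library.

```python
LEARNING_RATE = 2e-05   # from the hyperparameter list
NUM_EPOCHS = 10         # from the hyperparameter list
STEPS_PER_EPOCH = 695   # from the training results (step 695 at epoch 1.0)
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH
WARMUP_STEPS = 0        # assumption: no warmup is listed in the card


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at an optimizer step under linear warmup, then linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, the midpoint, and the end of training.
for epoch in (0, 5, 10):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the rate starts at the configured 2e-05, is roughly halved by the end of epoch 5, and reaches zero at the last step.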

Training results

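| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len |
|---------------|-------|------|-----------------|---------|---------|
| 1.4803        | 1.0   | 695  | 0.9925          | 48.0036 | 68.092  |
| 1.0588        | 2.0   | 1390 | 0.8618          | 53.6714 | 67.794  |
| 0.8397        | 3.0   | 2085 | 0.8034          | 56.8749 | 67.316  |
| 0.7816        | 4.0   | 2780 | 0.7718          | 59.7588 | 65.822  |
| 0.7349        | 5.0   | 3475 | 0.7509          | 60.9155 | 66.205  |
| 0.6737        | 6.0   | 4170 | 0.7422          | 61.9048 | 65.348  |
| 0.6373        | 7.0   | 4865 | 0.7338          | 62.8549 | 65.607  |
| 0.617         | 8.0   | 5560 | 0.7308          | 63.6105 | 65.335  |
| 0.6068        | 9.0   | 6255 | 0.7276          | 63.452  | 65.594  |
| 0.5913        | 10.0  | 6950 | 0.7281          | 63.3172 | 65.7    |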
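The per-epoch numbers can be compared programmatically to see which epoch scores best under each criterion. The sketch below copies the values from the results above and picks the epoch with the highest BLEU and the one with the lowest validation loss. It says nothing about which checkpoint was actually published; the card only reports final-epoch metrics.

```python
# (epoch, training loss, validation loss, BLEU) copied from the training results.
RESULTS = [
    (1, 1.4803, 0.9925, 48.0036),
    (2, 1.0588, 0.8618, 53.6714),
    (3, 0.8397, 0.8034, 56.8749),
    (4, 0.7816, 0.7718, 59.7588),
    (5, 0.7349, 0.7509, 60.9155),
    (6, 0.6737, 0.7422, 61.9048),
    (7, 0.6373, 0.7338, 62.8549),
    (8, 0.617, 0.7308, 63.6105),
    (9, 0.6068, 0.7276, 63.452),
    (10, 0.5913, 0.7281, 63.3172),
]

best_by_bleu = max(RESULTS, key=lambda row: row[3])
best_by_loss = min(RESULTS, key=lambda row: row[2])
final = RESULTS[-1]

print(f"highest BLEU: epoch {best_by_bleu[0]} ({best_by_bleu[3]})")
print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[2]})")
print(f"final epoch {final[0]}: BLEU {final[3]}, validation loss {final[2]}")
```

BLEU peaks at epoch 8 and validation loss bottoms out at epoch 9, so the final epoch is slightly behind on both. The gaps are small (about 0.29 BLEU and 0.0005 loss), which suggests training had largely plateaued by then.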

Framework versions

  • Transformers 4.31.0
  • PyTorch 1.13.1
  • Datasets 2.14.4
  • Tokenizers 0.13.3
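When trying to reproduce the results, it can help to check the local environment against the versions listed above. The following is a small sketch using only the standard library; the distribution names (`torch` for PyTorch, and so on) are the usual PyPI names and are assumed here, and `version_mismatches` is an illustrative helper, not part of any library.

```python
from importlib import metadata

# Versions listed in the card; keys are the assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.31.0",
    "torch": "1.13.1",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def version_mismatches(expected=EXPECTED):
    """Return {package: (expected, installed)} for every package that differs."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            # Drop local build suffixes such as "+cu117" before comparing.
            installed = metadata.version(name).split("+")[0]
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in version_mismatches().items():
    print(f"{name}: card lists {wanted}, found {installed or 'not installed'}")
```

A mismatch does not necessarily mean the model will fail to load; it only flags differences from the training environment.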