nllb-200-distilled-600M-finetuned_ramayana_sns_prose_lexrank

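This model is a fine-tuned version of facebook/nllb-200-distilled-600M. The training dataset is not specified (the auto-generated card records it as None). It achieves the following results on the evaluation set, which match the last logged row (epoch 19.0) of the training results: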

  • Loss: 3.8356
  • Rouge1: 15.0234
  • Rouge2: 1.2752
  • Rougel: 12.4341
  • Rougelsum: 13.036
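
The ROUGE values above appear to be reported on a 0 to 100 scale. As a rough illustration of what Rouge1 measures, the sketch below computes a unigram-overlap F1 score in plain Python. It is a simplified stand-in, not the scorer that produced the numbers above: the whitespace tokenization, the function name rouge1_f1, and the example sentences are all assumptions made for illustration, and a library implementation would apply its own tokenization and optional stemming.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1: unigram-overlap F1 on a 0-100 scale.

    Assumes lowercase whitespace tokenization (an illustrative choice).
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: each token counts at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 100 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical sentences: 3 of 5 unigrams overlap in each direction.
    score = rouge1_f1("rama went to the forest", "rama left for the forest")
    print(round(score, 2))  # 60.0
```

Read this way, a Rouge1 of about 15 means that only a small share of unigrams is shared between generated and reference text.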

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-06
  • train_batch_size: 5
  • eval_batch_size: 5
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
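
The list above can be mapped onto keyword arguments for the Transformers Seq2SeqTrainingArguments class, roughly as sketched below. This is a reconstruction under assumptions, not the author's training script: the output_dir value is hypothetical, predict_with_generate is assumed because ROUGE needs generated text, and the batch sizes are treated as per-device values, which only matches the card if training ran on a single device.

```python
# Hyperparameters copied from the list above, keyed by the
# Seq2SeqTrainingArguments parameter names they most plausibly map to.
training_kwargs = {
    "output_dir": "nllb-ramayana-finetune",  # hypothetical path, not from the card
    "learning_rate": 5.6e-06,
    "per_device_train_batch_size": 5,  # assumes a single device
    "per_device_eval_batch_size": 5,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20,
    "predict_with_generate": True,  # assumption: needed to score ROUGE on generations
}


def build_training_args():
    """Build the arguments object; transformers is imported lazily
    so the dictionary above can be inspected without it installed."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**training_kwargs)
```

Calling build_training_args() requires the transformers package; the result would then be passed to a Seq2SeqTrainer together with the model, tokenizer, and datasets, none of which the card specifies.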

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum |
|---------------|-------|------|-----------------|---------|--------|---------|-----------|
| 4.9309        | 1.0   | 86   | 4.4877          | 17.9643 | 1.8368 | 13.627  | 16.743    |
| 4.5531        | 2.0   | 172  | 4.2961          | 7.7741  | 0.7819 | 6.5664  | 7.0319    |
| 4.4082        | 3.0   | 258  | 4.1867          | 15.3171 | 1.2616 | 12.5029 | 13.4603   |
| 4.3142        | 4.0   | 344  | 4.1128          | 14.7478 | 1.299  | 12.4545 | 12.8832   |
| 4.22          | 5.0   | 430  | 4.0577          | 14.6397 | 1.1974 | 12.3644 | 12.5395   |
| 4.1743        | 6.0   | 516  | 4.0173          | 14.7595 | 1.3556 | 12.4877 | 12.6788   |
| 4.1279        | 7.0   | 602  | 3.9858          | 14.3561 | 1.3361 | 12.071  | 12.5251   |
| 4.0927        | 8.0   | 688  | 3.9564          | 15.0213 | 1.3697 | 12.7109 | 13.1084   |
| 4.0625        | 9.0   | 774  | 3.9320          | 15.2813 | 1.3317 | 12.635  | 13.4154   |
| 4.0361        | 10.0  | 860  | 3.9113          | 15.0786 | 1.2544 | 12.6139 | 13.0141   |
| 3.9913        | 11.0  | 946  | 3.8951          | 15.0242 | 1.2899 | 12.7049 | 13.1678   |
| 3.9949        | 12.0  | 1032 | 3.8822          | 15.1567 | 1.3332 | 12.7349 | 13.1691   |
| 3.9643        | 13.0  | 1118 | 3.8724          | 15.0434 | 1.2552 | 12.6509 | 13.1845   |
| 3.96          | 14.0  | 1204 | 3.8608          | 14.5834 | 1.2768 | 12.1898 | 12.5734   |
| 3.9524        | 15.0  | 1290 | 3.8533          | 14.6872 | 1.2161 | 12.2549 | 12.7557   |
| 3.9345        | 16.0  | 1376 | 3.8443          | 15.0962 | 1.3235 | 12.5689 | 13.1217   |
| 3.9359        | 17.0  | 1462 | 3.8407          | 14.7724 | 1.2323 | 12.4059 | 12.6775   |
| 3.9213        | 18.0  | 1548 | 3.8367          | 14.6599 | 1.285  | 12.2237 | 12.7231   |
| 3.9213        | 19.0  | 1634 | 3.8356          | 15.0234 | 1.2752 | 12.4341 | 13.036    |
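
The Step column advances by 86 per epoch, which combined with the stated hyperparameters allows some back-of-the-envelope arithmetic. The sketch below estimates the training-set size and the learning rate at a given step. It assumes a single device, no gradient accumulation, and a linear decay to zero with no warmup, none of which the card states; the name linear_lr is illustrative.

```python
STEPS_PER_EPOCH = 86   # step count logged at epoch 1.0
NUM_EPOCHS = 20        # num_epochs from the hyperparameter list
TRAIN_BATCH_SIZE = 5   # train_batch_size from the hyperparameter list
BASE_LR = 5.6e-06      # learning_rate from the hyperparameter list

# Planned number of optimizer updates if all 20 epochs run.
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS  # 1720

# With a possibly partial final batch, 86 steps of batch size 5
# cover between 426 and 430 training examples per epoch.
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1  # 426
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE            # 430


def linear_lr(step: int) -> float:
    """Learning rate after `step` updates, assuming linear decay
    from BASE_LR to zero over total_steps with no warmup."""
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / total_steps


if __name__ == "__main__":
    print(total_steps, min_examples, max_examples)
    print(linear_lr(1634))  # learning rate at the last logged step
```

Under these assumptions the learning rate at step 1634, the last logged row, is about 2.8e-07, or 5% of its starting value.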

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.0.1+cu117
  • Datasets 2.19.2
  • Tokenizers 0.19.1

Model size: 615M params (Safetensors, tensor type F32)

Model tree: manojbalaji1/nllb-200-distilled-600M-finetuned_ramayana_sns_prose_lexrank is fine-tuned from facebook/nllb-200-distilled-600M.