nllb-200-distilled-600M-finetuned_ramayana_sns_lexr

This model is a fine-tuned version of facebook/nllb-200-distilled-600M. The training dataset is not specified on this card. After 20 epochs of training, it achieves the following results on the evaluation set:

  • Loss: 3.2899
  • Rouge1: 18.7327
  • Rouge2: 2.1067
  • Rougel: 14.4307
  • Rougelsum: 16.856
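
To make the ROUGE figures above easier to interpret, here is a minimal sketch of a simplified ROUGE-1 F1 score (clipped unigram overlap). It is an illustration only: it assumes whitespace tokenization and lowercasing, with no stemming, and it is not the implementation that produced the numbers on this card. The scaling by 100 is an assumption based on the magnitude of the reported values.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1: F1 over clipped unigram overlap."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Counter intersection clips each token's matches to its count in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Assumption: the card reports F1 scaled to the 0-100 range.
score = 100 * rouge1_f1("the cat", "the cat sat")
```

Under this reading, a Rouge1 of about 18.7 would mean that generated and reference texts share relatively few unigrams on average.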

Model description

More information needed

Intended uses & limitations

More information needed
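
As a starting point for experimentation, the following is a minimal loading and generation sketch using the Transformers seq2seq API. The repository id is the one shown on the model page. The card does not state the source and target languages, so `src_lang` and `tgt_lang` are left as parameters; the `san_Deva` and `eng_Latn` codes in the usage comment are a guess, not something the card confirms. The `generate` helper and its `max_new_tokens` default are hypothetical.

```python
MODEL_ID = "TheDarkLord69696969/nllb-200-distilled-600M-finetuned_ramayana_sns_lexr"


def generate(text: str, src_lang: str, tgt_lang: str, max_new_tokens: int = 128) -> str:
    """Run one input through the fine-tuned model (hypothetical helper)."""
    # Imported lazily so this file can be read without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, src_lang=src_lang)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(
        **inputs,
        # NLLB selects the output language via a forced first decoder token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


# Usage (language pair is an assumption, not documented on the card):
# print(generate("...", src_lang="san_Deva", tgt_lang="eng_Latn"))
```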

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-06
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
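
The hyperparameters above could be expressed roughly as follows. This is a sketch, not the author's training script: `output_dir` is a hypothetical name, the mapping of `train_batch_size` to a per-device setting assumes single-device training, and the schedule helper assumes zero warmup steps because the card lists none. The total of 8540 steps is taken from the final row of the training results.

```python
def training_kwargs(output_dir: str = "nllb-finetuned") -> dict:
    """Keyword arguments mirroring the hyperparameters listed on the card."""
    return {
        "output_dir": output_dir,  # hypothetical; not given on the card
        "learning_rate": 5.6e-06,
        "per_device_train_batch_size": 1,
        "per_device_eval_batch_size": 1,
        "seed": 42,
        "adam_beta1": 0.9,
        "adam_beta2": 0.999,
        "adam_epsilon": 1e-08,
        "lr_scheduler_type": "linear",
        "num_train_epochs": 20,
    }


def linear_lr(step: int, total_steps: int = 8540, base_lr: float = 5.6e-06,
              warmup_steps: int = 0) -> float:
    """Learning rate under a linear schedule: optional warmup, then decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * remaining


def build_args():
    """Construct Seq2SeqTrainingArguments; requires transformers to be installed."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**training_kwargs())
```

With these assumptions the learning rate starts at 5.6e-06 and reaches zero at step 8540, passing through half its initial value at the end of epoch 10.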

Training results

Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum
--------------|-------|------|-----------------|---------|--------|---------|----------
4.5091        | 1.0   | 427  | 4.0603          | 16.2779 | 1.6057 | 13.7589 | 14.9914
4.1329        | 2.0   | 854  | 3.8728          | 16.1259 | 1.3608 | 13.0307 | 14.5339
3.9846        | 3.0   | 1281 | 3.7534          | 17.0824 | 1.7524 | 13.8469 | 15.5679
3.8863        | 4.0   | 1708 | 3.6693          | 16.9334 | 1.7307 | 13.8506 | 15.5077
3.8144        | 5.0   | 2135 | 3.5996          | 18.0132 | 1.9515 | 14.3725 | 16.3556
3.7542        | 6.0   | 2562 | 3.5460          | 17.4338 | 1.8704 | 14.1021 | 15.9508
3.7068        | 7.0   | 2989 | 3.5052          | 17.5684 | 2.0413 | 14.0332 | 15.9409
3.665         | 8.0   | 3416 | 3.4695          | 18.2728 | 2.0033 | 14.2529 | 16.6322
3.6288        | 9.0   | 3843 | 3.4347          | 18.0755 | 2.2161 | 14.0724 | 16.4611
3.5969        | 10.0  | 4270 | 3.4081          | 18.3883 | 2.1437 | 14.3847 | 16.7879
3.5718        | 11.0  | 4697 | 3.3840          | 18.9654 | 2.2593 | 14.6253 | 17.3421
3.5464        | 12.0  | 5124 | 3.3619          | 18.9897 | 2.352  | 14.679  | 17.4166
3.5302        | 13.0  | 5551 | 3.3465          | 18.9671 | 2.246  | 14.4441 | 17.16
3.5102        | 14.0  | 5978 | 3.3309          | 18.5565 | 2.1854 | 14.1515 | 16.6572
3.4992        | 15.0  | 6405 | 3.3171          | 19.0665 | 2.1941 | 14.7519 | 17.2236
3.4851        | 16.0  | 6832 | 3.3075          | 18.5714 | 2.1059 | 14.3258 | 16.7627
3.4765        | 17.0  | 7259 | 3.2998          | 18.5252 | 2.083  | 14.1246 | 16.724
3.464         | 18.0  | 7686 | 3.2944          | 18.9694 | 2.1503 | 14.6684 | 17.0653
3.4647        | 19.0  | 8113 | 3.2911          | 18.6916 | 2.1447 | 14.4372 | 16.8423
3.4585        | 20.0  | 8540 | 3.2899          | 18.7327 | 2.1067 | 14.4307 | 16.856
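
The table can also be read programmatically. The sketch below copies the epoch, validation loss, and Rouge1 columns from the table and picks the best epoch under each criterion; it also derives the steps per epoch. The inference that 427 steps per epoch corresponds to 427 training examples assumes single-device training with no gradient accumulation, which the card does not confirm.

```python
# (epoch, validation_loss, rouge1), copied from the training results table.
RESULTS = [
    (1, 4.0603, 16.2779), (2, 3.8728, 16.1259), (3, 3.7534, 17.0824),
    (4, 3.6693, 16.9334), (5, 3.5996, 18.0132), (6, 3.5460, 17.4338),
    (7, 3.5052, 17.5684), (8, 3.4695, 18.2728), (9, 3.4347, 18.0755),
    (10, 3.4081, 18.3883), (11, 3.3840, 18.9654), (12, 3.3619, 18.9897),
    (13, 3.3465, 18.9671), (14, 3.3309, 18.5565), (15, 3.3171, 19.0665),
    (16, 3.3075, 18.5714), (17, 3.2998, 18.5252), (18, 3.2944, 18.9694),
    (19, 3.2911, 18.6916), (20, 3.2899, 18.7327),
]

best_by_loss = min(RESULTS, key=lambda row: row[1])    # lowest validation loss
best_by_rouge1 = max(RESULTS, key=lambda row: row[2])  # highest Rouge1

# 8540 total steps over 20 epochs.
steps_per_epoch = 8540 // 20

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"highest Rouge1: epoch {best_by_rouge1[0]} ({best_by_rouge1[2]})")
print(f"steps per epoch: {steps_per_epoch}")
```

Validation loss is lowest at epoch 20, while Rouge1 peaks at epoch 15 (19.0665), so the final checkpoint is not the best one by every metric.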

Framework versions

  • Transformers 4.37.2
  • Pytorch 2.1.2+cu121
  • Datasets 2.14.4
  • Tokenizers 0.15.2
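
When reproducing results, it can help to compare the local environment against the versions listed above. The following is a small sketch using only the standard library; the PyPI distribution names (`torch` for Pytorch, and so on) are the usual ones, and the helper names are hypothetical. The `+cu121` build suffix is ignored in the comparison.

```python
from importlib import metadata

# Versions from the card; the CUDA build suffix (+cu121) is dropped for comparison.
EXPECTED = {
    "transformers": "4.37.2",
    "torch": "2.1.2",
    "datasets": "2.14.4",
    "tokenizers": "0.15.2",
}


def installed_versions() -> dict:
    """Map each package to its installed version, or None if it is missing."""
    found = {}
    for name in EXPECTED:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def mismatches() -> dict:
    """Return {package: (expected, installed)} for packages that differ."""
    result = {}
    for name, installed in installed_versions().items():
        base = installed.split("+")[0] if installed else None
        if base != EXPECTED[name]:
            result[name] = (EXPECTED[name], installed)
    return result


if __name__ == "__main__":
    for name, (expected, installed) in mismatches().items():
        print(f"{name}: card lists {expected}, found {installed}")
```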

Model size

  • Parameters: 615M
  • Tensor type: F32
  • Format: Safetensors

Model repository: TheDarkLord69696969/nllb-200-distilled-600M-finetuned_ramayana_sns_lexr (fine-tuned from facebook/nllb-200-distilled-600M)