nllb-200-distilled-600M-finetuned_ramayana_sns_prose

This model is a fine-tuned version of facebook/nllb-200-distilled-600M. The training dataset is not specified in this card (it is recorded as "None"). The model achieves the following results on the evaluation set:

  • Loss: 3.5584
  • Rouge1: 19.8304
  • Rouge2: 2.4248
  • Rougel: 13.9446
  • Rougelsum: 18.3552
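
As a rough illustration of what the Rouge1 figure measures, here is a minimal sketch of a unigram-overlap F1 score on the same 0-100 scale. It assumes lowercasing and whitespace tokenization with no stemming, so it is a simplification and will not reproduce the numbers above exactly. The function name `rouge1_f1` is hypothetical and is not taken from any library.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1 on a 0-100 scale.

    Assumption: tokens are lowercased and split on whitespace, with no stemming.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: a unigram counts only as often as it occurs in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 100 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative strings only; they are not drawn from the evaluation set.
    print(rouge1_f1("rama went forest", "rama went to the forest"))
```

A Rouge1 score near 20 therefore suggests that only a modest share of unigrams is shared between the generated text and the references.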

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-06
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
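
To make the linear schedule concrete, the following sketch computes the learning rate at a given optimizer step. It assumes no warmup steps and one optimizer step per batch, neither of which the card states. The figure of 427 steps per epoch is taken from the training results table, and `linear_lr` is a hypothetical helper name.

```python
BASE_LR = 5.6e-06
STEPS_PER_EPOCH = 427  # from the training results table (step 427 at epoch 1.0)
NUM_EPOCHS = 20        # configured num_epochs
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 8540 planned optimizer steps


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Linearly decay the learning rate from base_lr to 0 (assumes zero warmup)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


if __name__ == "__main__":
    for epoch in (0, 10, 18, 20):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.3e}")
```

Under these assumptions, the learning rate at step 7686 (epoch 18, the last epoch listed in the results table) has decayed to one tenth of its initial value.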

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum |
|---------------|-------|------|-----------------|---------|--------|---------|-----------|
| 4.595         | 1.0   | 427  | 4.1428          | 15.776  | 1.4778 | 12.3256 | 14.3476   |
| 4.1875        | 2.0   | 854  | 3.9597          | 16.6403 | 1.7896 | 12.7134 | 15.0805   |
| 4.0304        | 3.0   | 1281 | 3.8653          | 16.6812 | 1.8152 | 12.5282 | 15.1217   |
| 3.9243        | 4.0   | 1708 | 3.7984          | 17.1533 | 1.788  | 13.1478 | 15.5023   |
| 3.8485        | 5.0   | 2135 | 3.7505          | 17.3886 | 1.8682 | 13.1805 | 15.711    |
| 3.786         | 6.0   | 2562 | 3.7141          | 17.7897 | 1.9953 | 13.2044 | 15.9314   |
| 3.732         | 7.0   | 2989 | 3.6815          | 18.3797 | 2.0735 | 13.8603 | 16.7243   |
| 3.6865        | 8.0   | 3416 | 3.6559          | 18.2702 | 2.0286 | 13.3494 | 16.5957   |
| 3.6515        | 9.0   | 3843 | 3.6354          | 18.0194 | 1.9282 | 12.9295 | 16.4714   |
| 3.6177        | 10.0  | 4270 | 3.6193          | 18.7825 | 2.0085 | 13.2207 | 17.1223   |
| 3.5877        | 11.0  | 4697 | 3.6030          | 19.1192 | 2.1276 | 13.9442 | 17.609    |
| 3.5665        | 12.0  | 5124 | 3.5943          | 19.5031 | 2.3146 | 13.7631 | 17.9879   |
| 3.5454        | 13.0  | 5551 | 3.5828          | 19.7688 | 2.2574 | 13.9943 | 18.2914   |
| 3.5247        | 14.0  | 5978 | 3.5763          | 19.4478 | 2.3024 | 13.8854 | 17.9616   |
| 3.509         | 15.0  | 6405 | 3.5704          | 19.3998 | 2.2633 | 13.707  | 17.9534   |
| 3.4983        | 16.0  | 6832 | 3.5646          | 19.6401 | 2.3265 | 13.9141 | 18.2001   |
| 3.4865        | 17.0  | 7259 | 3.5604          | 19.1833 | 2.398  | 13.6566 | 17.7596   |
| 3.4802        | 18.0  | 7686 | 3.5584          | 19.8304 | 2.4248 | 13.9446 | 18.3552   |
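
One way to read the results above is to ask which epoch is best for each metric. The sketch below copies the evaluation columns from the table into a list and looks up the best epoch per metric. The names `ROWS`, `FIELDS`, and `best_epoch` are hypothetical and exist only for this illustration.

```python
# (epoch, validation loss, Rouge1, Rouge2, Rougel, Rougelsum), copied from the table.
ROWS = [
    (1, 4.1428, 15.776, 1.4778, 12.3256, 14.3476),
    (2, 3.9597, 16.6403, 1.7896, 12.7134, 15.0805),
    (3, 3.8653, 16.6812, 1.8152, 12.5282, 15.1217),
    (4, 3.7984, 17.1533, 1.788, 13.1478, 15.5023),
    (5, 3.7505, 17.3886, 1.8682, 13.1805, 15.711),
    (6, 3.7141, 17.7897, 1.9953, 13.2044, 15.9314),
    (7, 3.6815, 18.3797, 2.0735, 13.8603, 16.7243),
    (8, 3.6559, 18.2702, 2.0286, 13.3494, 16.5957),
    (9, 3.6354, 18.0194, 1.9282, 12.9295, 16.4714),
    (10, 3.6193, 18.7825, 2.0085, 13.2207, 17.1223),
    (11, 3.6030, 19.1192, 2.1276, 13.9442, 17.609),
    (12, 3.5943, 19.5031, 2.3146, 13.7631, 17.9879),
    (13, 3.5828, 19.7688, 2.2574, 13.9943, 18.2914),
    (14, 3.5763, 19.4478, 2.3024, 13.8854, 17.9616),
    (15, 3.5704, 19.3998, 2.2633, 13.707, 17.9534),
    (16, 3.5646, 19.6401, 2.3265, 13.9141, 18.2001),
    (17, 3.5604, 19.1833, 2.398, 13.6566, 17.7596),
    (18, 3.5584, 19.8304, 2.4248, 13.9446, 18.3552),
]
FIELDS = ("epoch", "val_loss", "rouge1", "rouge2", "rougel", "rougelsum")


def best_epoch(metric: str, lower_is_better: bool = False) -> int:
    """Return the epoch with the best value for the given metric column."""
    idx = FIELDS.index(metric)
    pick = min if lower_is_better else max
    return pick(ROWS, key=lambda row: row[idx])[0]


if __name__ == "__main__":
    print("val_loss:", best_epoch("val_loss", lower_is_better=True))
    for name in FIELDS[2:]:
        print(f"{name}:", best_epoch(name))
```

Validation loss falls at every epoch, and epoch 18 is best on every metric except Rougel, which peaks at epoch 13 (13.9943 versus 13.9446).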

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.0.1+cu117
  • Datasets 2.19.2
  • Tokenizers 0.19.1

Model size

  • Parameters: 615M
  • Tensor type: F32
  • Weights format: Safetensors

Model tree for manojbalaji1/nllb-200-distilled-600M-finetuned_ramayana_sns_prose

  • Base model: facebook/nllb-200-distilled-600M (this model is a fine-tune of it)