nllb-200-distilled-600M-finetuned_ramayana_sns_prose_lexrank_new

This model is a fine-tuned version of facebook/nllb-200-distilled-600M. The training dataset is not specified in this card. It achieves the following results on the evaluation set (the values logged at epoch 14, step 1204):

  • Loss: 3.5955
  • Rouge1: 17.0715
  • Rouge2: 1.7786
  • Rougel: 13.4279
  • Rougelsum: 15.116
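To make the ROUGE figures above easier to interpret, here is a minimal pure-Python sketch of ROUGE-N and ROUGE-L F-measures on a 0-100 scale. It is an illustration only: the card does not say which ROUGE implementation, tokenizer, or stemming settings produced the reported scores, so whitespace tokenization and lowercasing are assumptions, and the function names (`rouge_n`, `rouge_l`) and the example sentences are hypothetical.

```python
from collections import Counter


def _ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def _f1(overlap, pred_total, ref_total):
    """F-measure from an overlap count and the prediction/reference totals."""
    if overlap == 0 or pred_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / pred_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(prediction, reference, n=1):
    """ROUGE-N F-measure on whitespace tokens (an assumption), scaled to 0-100."""
    pred = _ngrams(prediction.lower().split(), n)
    ref = _ngrams(reference.lower().split(), n)
    overlap = sum((pred & ref).values())  # clipped n-gram matches
    return 100 * _f1(overlap, sum(pred.values()), sum(ref.values()))


def rouge_l(prediction, reference):
    """ROUGE-L F-measure based on the longest common subsequence (LCS)."""
    p, r = prediction.lower().split(), reference.lower().split()
    # Classic dynamic-programming LCS length table.
    dp = [[0] * (len(r) + 1) for _ in range(len(p) + 1)]
    for i, pt in enumerate(p, 1):
        for j, rt in enumerate(r, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if pt == rt else max(dp[i - 1][j], dp[i][j - 1])
    return 100 * _f1(dp[-1][-1], len(p), len(r))


# Hypothetical sentences, not taken from the model's evaluation data.
prediction = "rama went to the forest"
reference = "rama went into the forest with sita"
print(round(rouge_n(prediction, reference, 1), 2))  # 66.67
print(round(rouge_n(prediction, reference, 2), 2))  # 40.0
print(round(rouge_l(prediction, reference), 2))     # 66.67
```

On this scale, a Rouge2 of about 1.8 means that very few bigrams in the generated text also occur in the references.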

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 5
  • eval_batch_size: 5
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
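As a rough illustration of what `lr_scheduler_type: linear` implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists none) and takes 86 steps per epoch from the Step column of the training results. `linear_lr` is a hypothetical helper, not a library function.

```python
BASE_LR = 1e-05        # learning_rate from the list above
NUM_EPOCHS = 20        # num_epochs from the list above
STEPS_PER_EPOCH = 86   # from the Step column of the training results
WARMUP_STEPS = 0       # assumption: the card does not list any warmup

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 1720 planned optimizer steps


def linear_lr(step):
    """Learning rate after `step` updates: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 860, 1204, 1720):
    print(step, linear_lr(step))
```

Under these assumptions the rate would have decayed to about 3e-06 by step 1204, the last step logged in the results table. Likewise, 86 steps per epoch at a batch size of 5 suggests a training set of roughly 430 examples, assuming a single device and no gradient accumulation.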

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum |
|---------------|-------|------|-----------------|---------|--------|---------|-----------|
| 3.8967        | 1.0   | 86   | 3.7945          | 15.1996 | 1.2821 | 12.4518 | 13.4409   |
| 3.8364        | 2.0   | 172  | 3.7584          | 15.4522 | 1.4203 | 12.6976 | 13.5883   |
| 3.8006        | 3.0   | 258  | 3.7351          | 15.6107 | 1.5487 | 12.7653 | 13.6495   |
| 3.7663        | 4.0   | 344  | 3.7081          | 15.7318 | 1.4526 | 12.9915 | 13.8208   |
| 3.7108        | 5.0   | 430  | 3.6849          | 14.9819 | 1.335  | 12.3487 | 12.9351   |
| 3.6932        | 6.0   | 516  | 3.6721          | 15.7441 | 1.3281 | 12.943  | 13.6367   |
| 3.6635        | 7.0   | 602  | 3.6599          | 15.7133 | 1.4432 | 12.6204 | 13.7309   |
| 3.6417        | 8.0   | 688  | 3.6425          | 16.0359 | 1.5975 | 13.0271 | 14.1899   |
| 3.6241        | 9.0   | 774  | 3.6298          | 16.6481 | 1.7167 | 13.266  | 14.5474   |
| 3.603         | 10.0  | 860  | 3.6209          | 16.5086 | 1.7139 | 13.059  | 14.5272   |
| 3.5692        | 11.0  | 946  | 3.6120          | 16.7846 | 1.5967 | 13.171  | 14.6977   |
| 3.5757        | 12.0  | 1032 | 3.6078          | 16.7106 | 1.7489 | 13.277  | 14.8431   |
| 3.553         | 13.0  | 1118 | 3.6010          | 17.297  | 1.7352 | 13.4176 | 15.4798   |
| 3.547         | 14.0  | 1204 | 3.5955          | 17.0715 | 1.7786 | 13.4279 | 15.116    |
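The table logs 14 epochs although `num_epochs` is 20, and the headline metrics correspond to the last logged row. The short sketch below, with values copied from the table, checks which epoch is best by validation loss and which by Rouge1. The variable names are illustrative, and the card does not state which criterion, if any, was used to select the published checkpoint.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
RESULTS = [
    (1, 3.7945, 15.1996),
    (2, 3.7584, 15.4522),
    (3, 3.7351, 15.6107),
    (4, 3.7081, 15.7318),
    (5, 3.6849, 14.9819),
    (6, 3.6721, 15.7441),
    (7, 3.6599, 15.7133),
    (8, 3.6425, 16.0359),
    (9, 3.6298, 16.6481),
    (10, 3.6209, 16.5086),
    (11, 3.6120, 16.7846),
    (12, 3.6078, 16.7106),
    (13, 3.6010, 17.2970),
    (14, 3.5955, 17.0715),
]

# Lowest validation loss and highest Rouge1 need not fall on the same epoch.
best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_rouge1 = max(RESULTS, key=lambda row: row[2])

print("best validation loss:", best_by_loss)   # epoch 14
print("best Rouge1:", best_by_rouge1)          # epoch 13
```

Validation loss is lowest at epoch 14, while Rouge1 peaks one epoch earlier at 17.297, so the two criteria would pick different checkpoints.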

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.0.1+cu117
  • Datasets 2.19.2
  • Tokenizers 0.19.1

Model size: 615M params, stored as Safetensors with tensor type F32.

Model repository: manojbalaji1/nllb-200-distilled-600M-finetuned_ramayana_sns_prose_lexrank_new (fine-tuned from facebook/nllb-200-distilled-600M)