metadata
license: apache-2.0
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: t5_recommendation_sports_equipment_english2
    results: []

t5_recommendation_sports_equipment_english2

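This model is a fine-tuned version of t5-large. The training dataset is not specified (the auto-generated card records it as "None"). It achieves the following results on the evaluation set, which match the final row (epoch 80) of the training results table: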

  • Loss: 0.5359
  • Rouge1: 74.1270
  • Rouge2: 66.6667
  • Rougel: 74.1270
  • Rougelsum: 73.8095
  • Gen Len: 4.0476
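
For readers unfamiliar with the metric, ROUGE-1 measures unigram overlap between a generated text and a reference. The following is a minimal sketch using lowercased whitespace tokens; the scores in this card come from the training run's own ROUGE implementation, which may tokenize and stem differently and reports values on a 0 to 100 scale. The function name and the item names are hypothetical.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap on lowercased whitespace tokens."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: a token counts at most as often as it appears in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical item names, for illustration only.
print(rouge1_f1("tennis racket", "tennis racket"))  # exact match
print(rouge1_f1("tennis ball", "tennis racket"))    # one of two unigrams shared
```

With generated lengths of about four tokens, as reported above, a single wrong word changes the per-example score substantially, so the averages move in large jumps between epochs.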

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 80
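
The batch-size and scheduler entries above are related by simple arithmetic. The sketch below assumes a single device and no warmup (the card lists neither), and takes the total of 80 optimizer steps from the training results table, which shows one step per epoch. The function name is illustrative.

```python
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 32
GRADIENT_ACCUMULATION_STEPS = 4
TOTAL_STEPS = 80  # one optimizer step per epoch over 80 epochs, per the results table

# Effective batch size per optimizer step (assumes a single device).
total_train_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS


def linear_lr(step: int, base_lr: float = LEARNING_RATE, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate after `step` optimizer steps under linear decay to zero (no warmup assumed)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


print(total_train_batch_size)  # 128
for step in (0, 40, 80):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate has fallen to half its initial value by epoch 40 and reaches zero at epoch 80.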

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 1 | 9.9716 | 12.4868 | 0.0 | 12.5845 | 12.5051 | 19.0 |
| No log | 2.0 | 2 | 10.1466 | 9.9134 | 0.0 | 9.9471 | 9.8413 | 19.0 |
| No log | 3.0 | 3 | 8.3378 | 10.5739 | 0.0 | 10.6349 | 10.5291 | 19.0 |
| No log | 4.0 | 4 | 7.3021 | 10.5739 | 0.0 | 10.6349 | 10.5291 | 19.0 |
| No log | 5.0 | 5 | 6.3242 | 10.4605 | 0.0 | 10.5471 | 10.4567 | 19.0 |
| No log | 6.0 | 6 | 5.4331 | 10.2886 | 0.7937 | 10.2319 | 10.3793 | 19.0 |
| No log | 7.0 | 7 | 4.7152 | 10.8989 | 0.7937 | 10.8388 | 10.9525 | 18.9524 |
| No log | 8.0 | 8 | 3.9937 | 13.9421 | 3.7009 | 14.0590 | 13.9456 | 15.0952 |
| No log | 9.0 | 9 | 3.1163 | 16.0431 | 1.0025 | 15.7736 | 15.9707 | 6.4762 |
| No log | 10.0 | 10 | 2.3306 | 23.1746 | 7.1429 | 22.8571 | 23.6508 | 4.1429 |
| No log | 11.0 | 11 | 1.9695 | 21.2698 | 7.1429 | 20.9524 | 21.4286 | 4.0476 |
| No log | 12.0 | 12 | 1.5552 | 23.8095 | 7.1429 | 23.3333 | 23.8095 | 3.9048 |
| No log | 13.0 | 13 | 0.8986 | 9.0476 | 0.0 | 9.0476 | 9.0476 | 3.7619 |
| No log | 14.0 | 14 | 0.7398 | 17.4603 | 2.3810 | 18.2540 | 17.4603 | 4.1905 |
| No log | 15.0 | 15 | 0.6966 | 12.6984 | 0.0 | 12.6984 | 12.6984 | 3.6667 |
| No log | 16.0 | 16 | 0.6352 | 32.5397 | 14.2857 | 32.5397 | 32.5397 | 3.7619 |
| No log | 17.0 | 17 | 0.5722 | 43.6508 | 23.8095 | 43.6508 | 42.8571 | 4.0952 |
| No log | 18.0 | 18 | 0.5628 | 43.6508 | 23.8095 | 43.6508 | 42.8571 | 3.8571 |
| No log | 19.0 | 19 | 0.5526 | 43.1746 | 23.8095 | 43.1746 | 42.8571 | 3.8571 |
| No log | 20.0 | 20 | 0.5522 | 48.4127 | 38.0952 | 48.4127 | 48.4127 | 3.7619 |
| No log | 21.0 | 21 | 0.5201 | 42.8571 | 28.5714 | 42.8571 | 42.3810 | 4.2381 |
| No log | 22.0 | 22 | 0.5262 | 37.1429 | 19.0476 | 36.9841 | 36.9841 | 4.2857 |
| No log | 23.0 | 23 | 0.5093 | 37.6190 | 23.8095 | 37.6190 | 37.6190 | 4.1429 |
| No log | 24.0 | 24 | 0.4818 | 45.3175 | 33.3333 | 45.2381 | 45.2381 | 4.1429 |
| No log | 25.0 | 25 | 0.4547 | 50.7937 | 38.0952 | 50.7937 | 50.7937 | 4.1429 |
| No log | 26.0 | 26 | 0.4455 | 50.7937 | 38.0952 | 50.7937 | 50.7937 | 4.1429 |
| No log | 27.0 | 27 | 0.4660 | 53.1746 | 42.8571 | 53.1746 | 53.1746 | 4.0476 |
| No log | 28.0 | 28 | 0.4825 | 53.1746 | 42.8571 | 53.1746 | 53.1746 | 4.0 |
| No log | 29.0 | 29 | 0.4928 | 53.1746 | 42.8571 | 53.1746 | 53.1746 | 4.0476 |
| No log | 30.0 | 30 | 0.4838 | 57.7778 | 42.8571 | 57.2222 | 57.5397 | 4.0476 |
| No log | 31.0 | 31 | 0.4955 | 60.3175 | 47.6190 | 60.3175 | 60.3175 | 4.0476 |
| No log | 32.0 | 32 | 0.5066 | 62.6984 | 52.3810 | 62.6984 | 62.6984 | 4.1429 |
| No log | 33.0 | 33 | 0.5189 | 62.6984 | 52.3810 | 62.6984 | 62.6984 | 4.1905 |
| No log | 34.0 | 34 | 0.5234 | 62.6984 | 52.3810 | 62.6984 | 62.6984 | 4.1905 |
| No log | 35.0 | 35 | 0.5225 | 62.6984 | 52.3810 | 62.6984 | 62.6984 | 4.1905 |
| No log | 36.0 | 36 | 0.5225 | 62.6984 | 52.3810 | 62.6984 | 62.6984 | 4.1905 |
| No log | 37.0 | 37 | 0.5058 | 62.8571 | 52.3810 | 62.2222 | 62.6984 | 4.1429 |
| No log | 38.0 | 38 | 0.4861 | 69.8413 | 61.9048 | 69.8413 | 69.8413 | 4.1905 |
| No log | 39.0 | 39 | 0.4625 | 69.8413 | 61.9048 | 69.8413 | 69.8413 | 4.1905 |
| No log | 40.0 | 40 | 0.4438 | 72.2222 | 66.6667 | 72.2222 | 72.2222 | 4.0952 |
| No log | 41.0 | 41 | 0.4231 | 72.2222 | 66.6667 | 72.2222 | 72.2222 | 4.0952 |
| No log | 42.0 | 42 | 0.4073 | 72.2222 | 66.6667 | 72.2222 | 72.2222 | 4.0952 |
| No log | 43.0 | 43 | 0.3938 | 72.2222 | 66.6667 | 72.2222 | 72.2222 | 4.0952 |
| No log | 44.0 | 44 | 0.3912 | 72.2222 | 66.6667 | 72.2222 | 72.2222 | 4.0952 |
| No log | 45.0 | 45 | 0.3980 | 72.2222 | 66.6667 | 72.2222 | 72.2222 | 4.1429 |
| No log | 46.0 | 46 | 0.4062 | 72.2222 | 66.6667 | 72.2222 | 72.2222 | 4.1905 |
| No log | 47.0 | 47 | 0.4121 | 76.9841 | 71.4286 | 76.9841 | 76.9841 | 4.2857 |
| No log | 48.0 | 48 | 0.4150 | 76.9841 | 71.4286 | 76.9841 | 76.9841 | 4.1905 |
| No log | 49.0 | 49 | 0.4183 | 76.9841 | 71.4286 | 76.9841 | 76.9841 | 4.1429 |
| No log | 50.0 | 50 | 0.4205 | 76.9841 | 71.4286 | 76.9841 | 76.9841 | 4.1905 |
| No log | 51.0 | 51 | 0.4306 | 79.3651 | 76.1905 | 79.3651 | 79.3651 | 4.0952 |
| No log | 52.0 | 52 | 0.4411 | 76.5079 | 71.4286 | 76.5079 | 76.1905 | 4.0 |
| No log | 53.0 | 53 | 0.4526 | 76.5079 | 71.4286 | 76.5079 | 76.1905 | 4.0476 |
| No log | 54.0 | 54 | 0.4667 | 76.5079 | 71.4286 | 76.5079 | 76.1905 | 4.0 |
| No log | 55.0 | 55 | 0.4871 | 76.5079 | 71.4286 | 76.5079 | 76.1905 | 4.0 |
| No log | 56.0 | 56 | 0.5063 | 76.5079 | 71.4286 | 76.5079 | 76.1905 | 4.0 |
| No log | 57.0 | 57 | 0.5196 | 76.5079 | 71.4286 | 76.5079 | 76.1905 | 4.0 |
| No log | 58.0 | 58 | 0.5265 | 76.5079 | 71.4286 | 76.5079 | 76.1905 | 3.9524 |
| No log | 59.0 | 59 | 0.5308 | 76.5079 | 71.4286 | 76.5079 | 76.1905 | 3.9524 |
| No log | 60.0 | 60 | 0.5333 | 76.5079 | 71.4286 | 76.5079 | 76.1905 | 3.9524 |
| No log | 61.0 | 61 | 0.5344 | 76.5079 | 71.4286 | 76.5079 | 76.1905 | 3.9524 |
| No log | 62.0 | 62 | 0.5348 | 76.5079 | 71.4286 | 76.5079 | 76.1905 | 3.9524 |
| No log | 63.0 | 63 | 0.5354 | 76.5079 | 71.4286 | 76.5079 | 76.1905 | 3.9524 |
| No log | 64.0 | 64 | 0.5359 | 76.5079 | 71.4286 | 76.5079 | 76.1905 | 3.9524 |
| No log | 65.0 | 65 | 0.5359 | 74.1270 | 66.6667 | 74.1270 | 73.8095 | 4.0476 |
| No log | 66.0 | 66 | 0.5359 | 74.1270 | 66.6667 | 74.1270 | 73.8095 | 4.0476 |
| No log | 67.0 | 67 | 0.5359 | 74.1270 | 66.6667 | 74.1270 | 73.8095 | 4.0476 |
| No log | 68.0 | 68 | 0.5359 | 74.1270 | 66.6667 | 74.1270 | 73.8095 | 4.0476 |
| No log | 69.0 | 69 | 0.5359 | 74.1270 | 66.6667 | 74.1270 | 73.8095 | 4.0476 |
| No log | 70.0 | 70 | 0.5359 | 74.1270 | 66.6667 | 74.1270 | 73.8095 | 4.0476 |
| No log | 71.0 | 71 | 0.5359 | 74.1270 | 66.6667 | 74.1270 | 73.8095 | 4.0476 |
| No log | 72.0 | 72 | 0.5359 | 74.1270 | 66.6667 | 74.1270 | 73.8095 | 4.0476 |
| No log | 73.0 | 73 | 0.5359 | 74.1270 | 66.6667 | 74.1270 | 73.8095 | 4.0476 |
| No log | 74.0 | 74 | 0.5359 | 74.1270 | 66.6667 | 74.1270 | 73.8095 | 4.0476 |
| No log | 75.0 | 75 | 0.5359 | 74.1270 | 66.6667 | 74.1270 | 73.8095 | 4.0476 |
| No log | 76.0 | 76 | 0.5359 | 74.1270 | 66.6667 | 74.1270 | 73.8095 | 4.0476 |
| No log | 77.0 | 77 | 0.5359 | 74.1270 | 66.6667 | 74.1270 | 73.8095 | 4.0476 |
| No log | 78.0 | 78 | 0.5359 | 74.1270 | 66.6667 | 74.1270 | 73.8095 | 4.0476 |
| No log | 79.0 | 79 | 0.5359 | 74.1270 | 66.6667 | 74.1270 | 73.8095 | 4.0476 |
| No log | 80.0 | 80 | 0.5359 | 74.1270 | 66.6667 | 74.1270 | 73.8095 | 4.0476 |
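
In the table above, validation loss and Rouge1 peak at different epochs, and the final row is the best on neither. A minimal sketch of comparing checkpoints, using a handful of rows copied from the table (the variable names are illustrative):

```python
# (epoch, validation_loss, rouge1) for selected rows of the training results table.
rows = [
    (26, 0.4455, 50.7937),
    (44, 0.3912, 72.2222),
    (51, 0.4306, 79.3651),
    (64, 0.5359, 76.5079),
    (80, 0.5359, 74.1270),
]

best_by_loss = min(rows, key=lambda r: r[1])    # lowest validation loss
best_by_rouge1 = max(rows, key=lambda r: r[2])  # highest Rouge1
final = rows[-1]                                # last epoch, as reported in the summary

print("best by validation loss:", best_by_loss)
print("best by Rouge1:", best_by_rouge1)
print("final:", final)
```

With the Hugging Face `Trainer`, options such as `load_best_model_at_end` and `metric_for_best_model` can keep the best checkpoint rather than the last; whether they were used for this run is not stated.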

Framework versions

  • Transformers 4.26.0
  • PyTorch 2.3.0+cu121
  • Datasets 2.8.0
  • Tokenizers 0.13.3