---
license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
- generated_from_trainer
model-index:
- name: long_t5_test
  results: []
---
# long_t5_test
This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 1.3506
- Rouge1: 0.4697
- Rouge2: 0.1989
- RougeL: 0.274
- RougeLsum: 0.2736
- Gen Len: 388.0152
## Model description
More information needed
## Intended uses & limitations
More information needed
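Since no usage notes were provided, the following is only a minimal inference sketch, not a documented workflow for this checkpoint. The repository id, the input document, and the generation settings (`max_new_tokens`, `num_beams`) are assumptions; the token budget is chosen to roughly match the average generation length reported below.

```python
# Minimal inference sketch (assumed usage; repo id and generation
# settings are illustrative, not taken from this card).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "long_t5_test"  # hypothetical repo id; replace with the actual one
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

document = "..."  # a long input document to summarize
inputs = tokenizer(document, return_tensors="pt", truncation=True)

# The reported Gen Len is ~388 tokens, so allow a comparable output budget.
summary_ids = model.generate(**inputs, max_new_tokens=512, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```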
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a sketch of an equivalent `Seq2SeqTrainingArguments` configuration follows the list):
- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
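A hedged sketch of a `Seq2SeqTrainingArguments` configuration mirroring the hyperparameters above; the output directory, evaluation strategy, and `predict_with_generate` flag are assumptions rather than values stated in this card.

```python
# Sketch of training arguments matching the listed hyperparameters.
# Paths and evaluation settings below are assumptions, not from the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long_t5_test",       # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    num_train_epochs=20,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",     # assumed; the card reports per-epoch eval
    predict_with_generate=True,      # assumed, needed for ROUGE during eval
)
```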
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:--------:|
| No log        | 1.0   | 394  | 1.9389          | 0.0284 | 0.0089 | 0.0167 | 0.0165    | 30.2273  |
| 3.6937        | 2.0   | 788  | 1.4702          | 0.4261 | 0.1598 | 0.254  | 0.2539    | 399.0    |
| 1.8772        | 3.0   | 1182 | 1.4362          | 0.4397 | 0.1699 | 0.2592 | 0.2591    | 398.5152 |
| 1.7418        | 4.0   | 1576 | 1.4204          | 0.4434 | 0.1779 | 0.2627 | 0.2628    | 397.7374 |
| 1.7418        | 5.0   | 1970 | 1.4108          | 0.4474 | 0.181  | 0.2631 | 0.263     | 394.798  |
| 1.6623        | 6.0   | 2364 | 1.3932          | 0.4546 | 0.1873 | 0.2675 | 0.2673    | 391.8586 |
| 1.6449        | 7.0   | 2758 | 1.3872          | 0.4559 | 0.1882 | 0.2665 | 0.2664    | 393.4848 |
| 1.5757        | 8.0   | 3152 | 1.3814          | 0.458  | 0.1906 | 0.2692 | 0.2692    | 397.1061 |
| 1.5527        | 9.0   | 3546 | 1.3718          | 0.4607 | 0.1912 | 0.2705 | 0.2706    | 391.7222 |
| 1.5527        | 10.0  | 3940 | 1.3703          | 0.4649 | 0.194  | 0.2717 | 0.2719    | 393.8788 |
| 1.5302        | 11.0  | 4334 | 1.3621          | 0.4664 | 0.197  | 0.2726 | 0.2724    | 386.2071 |
| 1.5142        | 12.0  | 4728 | 1.3537          | 0.4694 | 0.1977 | 0.2731 | 0.2731    | 388.9798 |
| 1.4721        | 13.0  | 5122 | 1.3528          | 0.4652 | 0.1961 | 0.2716 | 0.2714    | 390.2828 |
| 1.4745        | 14.0  | 5516 | 1.3550          | 0.4708 | 0.2009 | 0.2742 | 0.2739    | 393.8131 |
| 1.4745        | 15.0  | 5910 | 1.3500          | 0.471  | 0.199  | 0.2742 | 0.2741    | 385.4192 |
| 1.4799        | 16.0  | 6304 | 1.3505          | 0.4725 | 0.2008 | 0.2764 | 0.2761    | 387.6364 |
| 1.4558        | 17.0  | 6698 | 1.3535          | 0.4743 | 0.2032 | 0.2765 | 0.2764    | 389.4192 |
| 1.4426        | 18.0  | 7092 | 1.3494          | 0.4743 | 0.2042 | 0.278  | 0.2776    | 386.4394 |
| 1.4426        | 19.0  | 7486 | 1.3513          | 0.4719 | 0.2019 | 0.2753 | 0.2752    | 388.6515 |
| 1.4411        | 20.0  | 7880 | 1.3506          | 0.4697 | 0.1989 | 0.274  | 0.2736    | 388.0152 |
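The ROUGE columns above could be reproduced along these lines; this is only a sketch using the `evaluate` library with placeholder texts, not the exact evaluation script behind this card.

```python
# Sketch of ROUGE scoring with the `evaluate` library (placeholder texts;
# the actual evaluation pipeline for this card is not provided).
import evaluate

rouge = evaluate.load("rouge")

predictions = ["model-generated summary ..."]  # hypothetical model outputs
references = ["reference summary ..."]         # hypothetical gold summaries

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```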
### Framework versions
- Transformers 4.37.2
- Pytorch 2.1.1+cu121
- Datasets 3.0.1
- Tokenizers 0.15.1