---
license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
- generated_from_trainer
model-index:
- name: long_t5_test
  results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# long_t5_test
This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unspecified dataset.
It achieves the following results on the evaluation set at the end of training (epoch 20):
- Loss: 1.3506
- ROUGE-1: 0.4697
- ROUGE-2: 0.1989
- ROUGE-L: 0.274
- ROUGE-Lsum: 0.2736
- Gen Len: 388.0152
## Model description
More information needed
## Intended uses & limitations
More information needed
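Pending a documented use case, the following is a minimal sketch of loading the checkpoint for text generation. The ROUGE metrics suggest a summarization-style task, but that is an assumption. The repository id `zera09/long_t5_test`, the 4096-token input limit, and the 400-token generation budget (chosen because the reported generation length stays just under 400) are also assumptions, not values confirmed by this card.

```python
# Assumed repository id; replace with the actual Hub path or a local directory.
MODEL_ID = "zera09/long_t5_test"


def summarize(text, model_id=MODEL_ID, max_new_tokens=400):
    """Generate output text for a single input document (illustrative helper)."""
    # Imported lazily so this module can be read without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # max_length=4096 is an assumed input limit, not a value from this card.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize("...long input text...")` downloads the weights on first use and returns the decoded output string.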
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
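
As a rough sketch, the hyperparameters above could map onto `Seq2SeqTrainingArguments` as follows. Mapping the batch sizes to the per-device arguments assumes a single device with no gradient accumulation. The `output_dir` value and `predict_with_generate=True` are also assumptions, since the original training script is not part of this card.

```python
# Values copied from the hyperparameter list above.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 2,  # assumes a single device
    "per_device_eval_batch_size": 2,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20,
}


def build_training_args(output_dir="long_t5_test"):
    """Build training arguments from the listed values (illustrative helper)."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,        # assumed output location
        predict_with_generate=True,   # assumed; needed to score generated text
        **HYPERPARAMS,
    )
```

The per-epoch evaluation shown in the training results would additionally need an evaluation schedule argument, which is omitted here because its name differs between Transformers releases.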
### Training results
| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------------:|:------------:|:------------:|:---------------:|:--------:|
| No log | 1.0 | 394 | 1.9389 | 0.0284 | 0.0089 | 0.0167 | 0.0165 | 30.2273 |
| 3.6937 | 2.0 | 788 | 1.4702 | 0.4261 | 0.1598 | 0.254 | 0.2539 | 399.0 |
| 1.8772 | 3.0 | 1182 | 1.4362 | 0.4397 | 0.1699 | 0.2592 | 0.2591 | 398.5152 |
| 1.7418 | 4.0 | 1576 | 1.4204 | 0.4434 | 0.1779 | 0.2627 | 0.2628 | 397.7374 |
| 1.7418 | 5.0 | 1970 | 1.4108 | 0.4474 | 0.181 | 0.2631 | 0.263 | 394.798 |
| 1.6623 | 6.0 | 2364 | 1.3932 | 0.4546 | 0.1873 | 0.2675 | 0.2673 | 391.8586 |
| 1.6449 | 7.0 | 2758 | 1.3872 | 0.4559 | 0.1882 | 0.2665 | 0.2664 | 393.4848 |
| 1.5757 | 8.0 | 3152 | 1.3814 | 0.458 | 0.1906 | 0.2692 | 0.2692 | 397.1061 |
| 1.5527 | 9.0 | 3546 | 1.3718 | 0.4607 | 0.1912 | 0.2705 | 0.2706 | 391.7222 |
| 1.5527 | 10.0 | 3940 | 1.3703 | 0.4649 | 0.194 | 0.2717 | 0.2719 | 393.8788 |
| 1.5302 | 11.0 | 4334 | 1.3621 | 0.4664 | 0.197 | 0.2726 | 0.2724 | 386.2071 |
| 1.5142 | 12.0 | 4728 | 1.3537 | 0.4694 | 0.1977 | 0.2731 | 0.2731 | 388.9798 |
| 1.4721 | 13.0 | 5122 | 1.3528 | 0.4652 | 0.1961 | 0.2716 | 0.2714 | 390.2828 |
| 1.4745 | 14.0 | 5516 | 1.3550 | 0.4708 | 0.2009 | 0.2742 | 0.2739 | 393.8131 |
| 1.4745 | 15.0 | 5910 | 1.3500 | 0.471 | 0.199 | 0.2742 | 0.2741 | 385.4192 |
| 1.4799 | 16.0 | 6304 | 1.3505 | 0.4725 | 0.2008 | 0.2764 | 0.2761 | 387.6364 |
| 1.4558 | 17.0 | 6698 | 1.3535 | 0.4743 | 0.2032 | 0.2765 | 0.2764 | 389.4192 |
| 1.4426 | 18.0 | 7092 | 1.3494 | 0.4743 | 0.2042 | 0.278 | 0.2776 | 386.4394 |
| 1.4426 | 19.0 | 7486 | 1.3513 | 0.4719 | 0.2019 | 0.2753 | 0.2752 | 388.6515 |
| 1.4411 | 20.0 | 7880 | 1.3506 | 0.4697 | 0.1989 | 0.274 | 0.2736 | 388.0152 |
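
The headline metrics come from the final epoch, which is not the best row in the table. The short sketch below copies the epoch, validation loss, and ROUGE-1 columns from the table and picks the best epochs. The variable names are illustrative only.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
RESULTS = [
    (1, 1.9389, 0.0284), (2, 1.4702, 0.4261), (3, 1.4362, 0.4397),
    (4, 1.4204, 0.4434), (5, 1.4108, 0.4474), (6, 1.3932, 0.4546),
    (7, 1.3872, 0.4559), (8, 1.3814, 0.4580), (9, 1.3718, 0.4607),
    (10, 1.3703, 0.4649), (11, 1.3621, 0.4664), (12, 1.3537, 0.4694),
    (13, 1.3528, 0.4652), (14, 1.3550, 0.4708), (15, 1.3500, 0.4710),
    (16, 1.3505, 0.4725), (17, 1.3535, 0.4743), (18, 1.3494, 0.4743),
    (19, 1.3513, 0.4719), (20, 1.3506, 0.4697),
]

# Epoch with the lowest validation loss.
best_by_loss = min(RESULTS, key=lambda row: row[1])

# All epochs that reach the highest ROUGE-1 score (ties are kept).
best_rouge1 = max(row[2] for row in RESULTS)
epochs_best_rouge1 = [row[0] for row in RESULTS if row[2] == best_rouge1]

# Optimizer steps per epoch, from the final step count and epoch count.
steps_per_epoch = 7880 // 20

print("lowest validation loss:", best_by_loss)
print("best ROUGE-1 epochs:", epochs_best_rouge1)
print("steps per epoch:", steps_per_epoch)
```

By both criteria epoch 18 is at least as good as the final checkpoint, so an earlier checkpoint may be preferable if one was saved. Whether intermediate checkpoints were kept is not stated in this card.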
### Framework versions
- Transformers 4.37.2
- Pytorch 2.1.1+cu121
- Datasets 3.0.1
- Tokenizers 0.15.1