Word-selector
This model is a fine-tuned version of google/long-t5-tglobal-base; the training dataset is not specified. After the final epoch (epoch 30, step 12000), it achieves the following results on the evaluation set:
- Loss: 4.5118
- Rouge1: 0.3547
- Rouge2: 0.0761
- Rougel: 0.2663
- Rougelsum: 0.2667
- Gen Len: 25.195
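As a rough guide to reading the Rouge1 figure above, the sketch below computes a simplified ROUGE-1 F1 score (clipped unigram overlap) for one prediction and reference pair. It assumes lowercasing and whitespace tokenization and does no stemming, so it will not reproduce the reported numbers exactly. The function name `rouge1_f1` and the example sentences are illustrative and not part of this model's evaluation code.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap, whitespace tokens."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Counter intersection clips each token's count to the smaller of the two.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences, chosen only to illustrate the calculation.
score = rouge1_f1("the cat sat on the mat", "the cat lay on a mat")
print(round(score, 4))
```

Four of the six tokens on each side match here, so precision and recall are both 4/6. A corpus-level score such as the one reported above is typically an aggregate of per-example scores like this one.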
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 12
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
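To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step. It takes the 12000 total steps from the training results table and assumes zero warmup steps, since the card lists none. The function `linear_lr` is a hypothetical helper, not the training code. The dataset-size estimate further assumes a single device and no gradient accumulation.

```python
def linear_lr(step: int, base_lr: float = 2e-4,
              total_steps: int = 12_000, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# 30 epochs over 12000 steps gives 400 steps per epoch.
steps_per_epoch = 12_000 // 30
# Upper bound on training examples per epoch (the last batch may be partial);
# assumes train_batch_size=16 on one device without gradient accumulation.
max_train_examples = steps_per_epoch * 16

for step in (0, 6_000, 12_000):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.0002, is halved by the midpoint of training, and reaches zero at the last step.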
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
No log | 1.0 | 400 | 3.9221 | 0.2104 | 0.0295 | 0.1684 | 0.1684 | 34.3531 |
4.5051 | 2.0 | 800 | 3.7571 | 0.285 | 0.0449 | 0.2195 | 0.2197 | 20.2419 |
3.9507 | 3.0 | 1200 | 3.6847 | 0.2976 | 0.0513 | 0.2309 | 0.2311 | 22.7119 |
3.6575 | 4.0 | 1600 | 3.6350 | 0.3137 | 0.0595 | 0.2411 | 0.2411 | 25.9231 |
3.4177 | 5.0 | 2000 | 3.6229 | 0.3311 | 0.0636 | 0.2527 | 0.2527 | 22.3788 |
3.4177 | 6.0 | 2400 | 3.6223 | 0.3359 | 0.0658 | 0.254 | 0.2543 | 21.2994 |
3.1741 | 7.0 | 2800 | 3.6313 | 0.3453 | 0.0674 | 0.2617 | 0.2618 | 21.9181 |
3.013 | 8.0 | 3200 | 3.6278 | 0.3453 | 0.0689 | 0.2649 | 0.2651 | 22.93 |
2.8253 | 9.0 | 3600 | 3.6755 | 0.3511 | 0.0705 | 0.2658 | 0.2662 | 23.1806 |
2.6705 | 10.0 | 4000 | 3.7081 | 0.3509 | 0.0742 | 0.2663 | 0.2664 | 22.5356 |
2.6705 | 11.0 | 4400 | 3.7424 | 0.3528 | 0.0716 | 0.264 | 0.2643 | 23.3775 |
2.5081 | 12.0 | 4800 | 3.8135 | 0.3553 | 0.0753 | 0.2686 | 0.2686 | 22.985 |
2.3745 | 13.0 | 5200 | 3.8369 | 0.3548 | 0.0753 | 0.2671 | 0.2675 | 23.7719 |
2.2399 | 14.0 | 5600 | 3.8816 | 0.3591 | 0.0762 | 0.2708 | 0.2709 | 23.1612 |
2.1414 | 15.0 | 6000 | 3.9132 | 0.361 | 0.0781 | 0.2719 | 0.2721 | 24.4581 |
2.1414 | 16.0 | 6400 | 3.9946 | 0.3579 | 0.077 | 0.2715 | 0.2714 | 23.2131 |
2.0099 | 17.0 | 6800 | 4.0376 | 0.3595 | 0.0766 | 0.2701 | 0.2703 | 23.6681 |
1.9252 | 18.0 | 7200 | 4.0829 | 0.3576 | 0.0774 | 0.2691 | 0.2694 | 23.79 |
1.8406 | 19.0 | 7600 | 4.1218 | 0.3613 | 0.0776 | 0.2718 | 0.272 | 23.9888 |
1.7602 | 20.0 | 8000 | 4.1754 | 0.3588 | 0.0787 | 0.2702 | 0.2704 | 24.5425 |
1.7602 | 21.0 | 8400 | 4.2440 | 0.3602 | 0.0769 | 0.2716 | 0.2717 | 24.9531 |
1.6725 | 22.0 | 8800 | 4.2860 | 0.3581 | 0.0775 | 0.2688 | 0.2691 | 24.6638 |
1.6036 | 23.0 | 9200 | 4.3163 | 0.3582 | 0.0764 | 0.2697 | 0.27 | 24.5994 |
1.5572 | 24.0 | 9600 | 4.3655 | 0.3545 | 0.0749 | 0.2655 | 0.2658 | 25.145 |
1.5034 | 25.0 | 10000 | 4.3811 | 0.3583 | 0.0781 | 0.2695 | 0.2698 | 25.6856 |
1.5034 | 26.0 | 10400 | 4.4350 | 0.3593 | 0.0788 | 0.2691 | 0.2692 | 25.2394 |
1.4617 | 27.0 | 10800 | 4.4539 | 0.357 | 0.078 | 0.2686 | 0.269 | 25.2906 |
1.4175 | 28.0 | 11200 | 4.4785 | 0.3549 | 0.0757 | 0.2657 | 0.2661 | 25.62 |
1.3971 | 29.0 | 11600 | 4.5061 | 0.3567 | 0.0767 | 0.2661 | 0.2665 | 25.1988 |
1.3828 | 30.0 | 12000 | 4.5118 | 0.3547 | 0.0761 | 0.2663 | 0.2667 | 25.195 |
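In the table above, validation loss bottoms out early and then climbs while training loss keeps falling, which is the pattern usually read as overfitting. ROUGE scores plateau much later. The sketch below copies the Validation Loss and Rouge1 columns from the table and finds the best epoch under each criterion. The card does not say which checkpoint was kept, so this is only an illustration of how the choice of criterion changes the answer.

```python
# Validation Loss and Rouge1 columns from the table above, epochs 1 to 30.
val_loss = [
    3.9221, 3.7571, 3.6847, 3.6350, 3.6229, 3.6223, 3.6313, 3.6278, 3.6755, 3.7081,
    3.7424, 3.8135, 3.8369, 3.8816, 3.9132, 3.9946, 4.0376, 4.0829, 4.1218, 4.1754,
    4.2440, 4.2860, 4.3163, 4.3655, 4.3811, 4.4350, 4.4539, 4.4785, 4.5061, 4.5118,
]
rouge1 = [
    0.2104, 0.2850, 0.2976, 0.3137, 0.3311, 0.3359, 0.3453, 0.3453, 0.3511, 0.3509,
    0.3528, 0.3553, 0.3548, 0.3591, 0.3610, 0.3579, 0.3595, 0.3576, 0.3613, 0.3588,
    0.3602, 0.3581, 0.3582, 0.3545, 0.3583, 0.3593, 0.3570, 0.3549, 0.3567, 0.3547,
]

# Epochs are 1-indexed, so add 1 to the list index.
best_loss_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
best_rouge1_epoch = max(range(len(rouge1)), key=rouge1.__getitem__) + 1

print("lowest validation loss:", val_loss[best_loss_epoch - 1], "at epoch", best_loss_epoch)
print("highest Rouge1:", rouge1[best_rouge1_epoch - 1], "at epoch", best_rouge1_epoch)
```

The two criteria disagree: validation loss is lowest at epoch 6 (3.6223), while Rouge1 peaks at epoch 19 (0.3613). The headline results at the top of the card correspond to epoch 30.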
Framework versions
- Transformers 4.37.2
- Pytorch 2.1.1+cu121
- Datasets 3.0.1
- Tokenizers 0.15.1
Model tree for zera09/Word-selector: base model google/long-t5-tglobal-base