---
license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: Word-selector
  results: []
---

# Word-selector

This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 4.5118
- Rouge1: 0.3547
- Rouge2: 0.0761
- Rougel: 0.2663
- Rougelsum: 0.2667
- Gen Len: 25.195

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 12
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| No log        | 1.0   | 400   | 3.9221          | 0.2104 | 0.0295 | 0.1684 | 0.1684    | 34.3531 |
| 4.5051        | 2.0   | 800   | 3.7571          | 0.285  | 0.0449 | 0.2195 | 0.2197    | 20.2419 |
| 3.9507        | 3.0   | 1200  | 3.6847          | 0.2976 | 0.0513 | 0.2309 | 0.2311    | 22.7119 |
| 3.6575        | 4.0   | 1600  | 3.6350          | 0.3137 | 0.0595 | 0.2411 | 0.2411    | 25.9231 |
| 3.4177        | 5.0   | 2000  | 3.6229          | 0.3311 | 0.0636 | 0.2527 | 0.2527    | 22.3788 |
| 3.4177        | 6.0   | 2400  | 3.6223          | 0.3359 | 0.0658 | 0.254  | 0.2543    | 21.2994 |
| 3.1741        | 7.0   | 2800  | 3.6313          | 0.3453 | 0.0674 | 0.2617 | 0.2618    | 21.9181 |
| 3.013         | 8.0   | 3200  | 3.6278          | 0.3453 | 0.0689 | 0.2649 | 0.2651    | 22.93   |
| 2.8253        | 9.0   | 3600  | 3.6755          | 0.3511 | 0.0705 | 0.2658 | 0.2662    | 23.1806 |
| 2.6705        | 10.0  | 4000  | 3.7081          | 0.3509 | 0.0742 | 0.2663 | 0.2664    | 22.5356 |
| 2.6705        | 11.0  | 4400  | 3.7424          | 0.3528 | 0.0716 | 0.264  | 0.2643    | 23.3775 |
| 2.5081        | 12.0  | 4800  | 3.8135          | 0.3553 | 0.0753 | 0.2686 | 0.2686    | 22.985  |
| 2.3745        | 13.0  | 5200  | 3.8369          | 0.3548 | 0.0753 | 0.2671 | 0.2675    | 23.7719 |
| 2.2399        | 14.0  | 5600  | 3.8816          | 0.3591 | 0.0762 | 0.2708 | 0.2709    | 23.1612 |
| 2.1414        | 15.0  | 6000  | 3.9132          | 0.361  | 0.0781 | 0.2719 | 0.2721    | 24.4581 |
| 2.1414        | 16.0  | 6400  | 3.9946          | 0.3579 | 0.077  | 0.2715 | 0.2714    | 23.2131 |
| 2.0099        | 17.0  | 6800  | 4.0376          | 0.3595 | 0.0766 | 0.2701 | 0.2703    | 23.6681 |
| 1.9252        | 18.0  | 7200  | 4.0829          | 0.3576 | 0.0774 | 0.2691 | 0.2694    | 23.79   |
| 1.8406        | 19.0  | 7600  | 4.1218          | 0.3613 | 0.0776 | 0.2718 | 0.272     | 23.9888 |
| 1.7602        | 20.0  | 8000  | 4.1754          | 0.3588 | 0.0787 | 0.2702 | 0.2704    | 24.5425 |
| 1.7602        | 21.0  | 8400  | 4.2440          | 0.3602 | 0.0769 | 0.2716 | 0.2717    | 24.9531 |
| 1.6725        | 22.0  | 8800  | 4.2860          | 0.3581 | 0.0775 | 0.2688 | 0.2691    | 24.6638 |
| 1.6036        | 23.0  | 9200  | 4.3163          | 0.3582 | 0.0764 | 0.2697 | 0.27      | 24.5994 |
| 1.5572        | 24.0  | 9600  | 4.3655          | 0.3545 | 0.0749 | 0.2655 | 0.2658    | 25.145  |
| 1.5034        | 25.0  | 10000 | 4.3811          | 0.3583 | 0.0781 | 0.2695 | 0.2698    | 25.6856 |
| 1.5034        | 26.0  | 10400 | 4.4350          | 0.3593 | 0.0788 | 0.2691 | 0.2692    | 25.2394 |
| 1.4617        | 27.0  | 10800 | 4.4539          | 0.357  | 0.078  | 0.2686 | 0.269     | 25.2906 |
| 1.4175        | 28.0  | 11200 | 4.4785          | 0.3549 | 0.0757 | 0.2657 | 0.2661    | 25.62   |
| 1.3971        | 29.0  | 11600 | 4.5061          | 0.3567 | 0.0767 | 0.2661 | 0.2665    | 25.1988 |
| 1.3828        | 30.0  | 12000 | 4.5118          | 0.3547 | 0.0761 | 0.2663 | 0.2667    | 25.195  |

### Framework versions

- Transformers 4.37.2
- Pytorch 2.1.1+cu121
- Datasets 3.0.1
- Tokenizers 0.15.1
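
As a reading aid for the training results reported in this card: the evaluation figures quoted at the top match the epoch-30 row, while validation loss bottoms out much earlier and Rouge1 plateaus around 0.36. The sketch below copies the epoch, validation loss, and Rouge1 columns from the table and picks the best row by each criterion. The names `RESULTS` and `best_row` are hypothetical helpers introduced for illustration; they are not part of the original training script, and which criterion is appropriate for choosing a checkpoint is an assumption left to the reader.

```python
# (epoch, validation_loss, rouge1), copied from the training results table.
RESULTS = [
    (1, 3.9221, 0.2104), (2, 3.7571, 0.2850), (3, 3.6847, 0.2976),
    (4, 3.6350, 0.3137), (5, 3.6229, 0.3311), (6, 3.6223, 0.3359),
    (7, 3.6313, 0.3453), (8, 3.6278, 0.3453), (9, 3.6755, 0.3511),
    (10, 3.7081, 0.3509), (11, 3.7424, 0.3528), (12, 3.8135, 0.3553),
    (13, 3.8369, 0.3548), (14, 3.8816, 0.3591), (15, 3.9132, 0.3610),
    (16, 3.9946, 0.3579), (17, 4.0376, 0.3595), (18, 4.0829, 0.3576),
    (19, 4.1218, 0.3613), (20, 4.1754, 0.3588), (21, 4.2440, 0.3602),
    (22, 4.2860, 0.3581), (23, 4.3163, 0.3582), (24, 4.3655, 0.3545),
    (25, 4.3811, 0.3583), (26, 4.4350, 0.3593), (27, 4.4539, 0.3570),
    (28, 4.4785, 0.3549), (29, 4.5061, 0.3567), (30, 4.5118, 0.3547),
]


def best_row(rows, column, maximize):
    """Return the row with the highest (or lowest) value in the given column."""
    pick = max if maximize else min
    return pick(rows, key=lambda row: row[column])


best_by_loss = best_row(RESULTS, column=1, maximize=False)
best_by_rouge1 = best_row(RESULTS, column=2, maximize=True)
final = RESULTS[-1]

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"highest Rouge1:         epoch {best_by_rouge1[0]} ({best_by_rouge1[2]})")
print(f"final epoch:            epoch {final[0]} (loss {final[1]}, Rouge1 {final[2]})")
```

Validation loss and Rouge1 disagree on the best epoch here, so the choice depends on which metric matters for the downstream use. The steadily rising validation loss after epoch 6, alongside a falling training loss, is consistent with overfitting, although ROUGE stays roughly flat over the same range.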