---
license: apache-2.0
base_model: google/flan-t5-large
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: deductor-flant5-large
  results: []
---

# deductor-flant5-large

This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2461
- Rouge1: 92.1213
- Rouge2: 86.4281
- Rougel: 90.5846
- Rougelsum: 90.5294
- Gen Len: 11.2014

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10.0

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.306         | 0.19  | 50   | 0.2959          | 89.3028 | 82.5127 | 87.4173 | 87.3544   | 11.2211 |
| 0.2774        | 0.38  | 100  | 0.2717          | 90.8414 | 84.2378 | 88.9385 | 88.9058   | 11.2571 |
| 0.2366        | 0.57  | 150  | 0.2613          | 91.0152 | 84.6687 | 89.2107 | 89.1735   | 11.2081 |
| 0.2166        | 0.77  | 200  | 0.2585          | 91.5215 | 85.4308 | 89.7742 | 89.7422   | 11.2802 |
| 0.22          | 0.96  | 250  | 0.2517          | 91.5587 | 85.6107 | 89.8835 | 89.8621   | 11.2655 |
| 0.1564        | 1.15  | 300  | 0.2630          | 91.999  | 86.0835 | 90.3611 | 90.3168   | 11.2039 |
| 0.1803        | 1.34  | 350  | 0.2546          | 91.5183 | 85.6214 | 89.9752 | 89.9323   | 11.2462 |
| 0.1737        | 1.53  | 400  | 0.2483          | 91.8342 | 86.0171 | 90.3042 | 90.2641   | 11.1943 |
| 0.157         | 1.72  | 450  | 0.2493          | 91.6585 | 85.4651 | 90.0181 | 89.9991   | 10.9376 |
| 0.1561        | 1.92  | 500  | 0.2461          | 92.1213 | 86.4281 | 90.5846 | 90.5294   | 11.2014 |
| 0.1191        | 2.11  | 550  | 0.2585          | 92.4493 | 86.6961 | 90.9293 | 90.8761   | 11.2416 |
| 0.1134        | 2.3   | 600  | 0.2633          | 92.4707 | 86.833  | 90.9516 | 90.9195   | 11.1675 |
| 0.1227        | 2.49  | 650  | 0.2592          | 92.2738 | 86.5064 | 90.7556 | 90.6998   | 11.2642 |
| 0.1175        | 2.68  | 700  | 0.2657          | 92.0861 | 86.2203 | 90.6168 | 90.5657   | 11.1700 |
| 0.1132        | 2.87  | 750  | 0.2644          | 92.3834 | 86.7237 | 90.8761 | 90.8389   | 11.2123 |
| 0.1097        | 3.07  | 800  | 0.2692          | 92.3356 | 86.7021 | 90.8717 | 90.8185   | 11.1822 |
| 0.0949        | 3.26  | 850  | 0.2690          | 92.5746 | 87.001  | 91.1734 | 91.1222   | 11.2785 |
| 0.0813        | 3.45  | 900  | 0.2875          | 92.5641 | 86.9813 | 91.0881 | 91.0411   | 11.2257 |
| 0.0861        | 3.64  | 950  | 0.2800          | 92.4738 | 86.9379 | 91.0384 | 90.9995   | 11.2136 |
| 0.0879        | 3.83  | 1000 | 0.2770          | 92.6025 | 87.105  | 91.1632 | 91.1292   | 11.2303 |

### Framework versions

- Transformers 4.36.2
- Pytorch 2.0.1
- Datasets 2.18.0
- Tokenizers 0.15.2
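
Since the card does not include a usage snippet, here is a minimal inference sketch. It assumes the checkpoint is loadable as a standard seq2seq model (it is a flan-t5-large fine-tune); the `model_id` path and the plain instruction-style prompt format are assumptions, as the card does not document the actual hub path or input template.

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Hypothetical path: replace with the real hub repo id or a local checkpoint dir.
model_id = "deductor-flant5-large"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The expected prompt format is undocumented; a plain text input is assumed.
inputs = tokenizer("your input text here", return_tensors="pt")
with torch.no_grad():
    # Eval Gen Len is ~11 tokens, so a small generation budget suffices.
    output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```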
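For reproducibility, this sketch maps the hyperparameters listed above onto `Seq2SeqTrainingArguments`. The training script, dataset, and preprocessing are not documented in this card, so only the arguments are reconstructed; the `output_dir` name is an assumption.

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="deductor-flant5-large",  # assumed; matches the model name
    learning_rate=5e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=4,       # 16 x 4 = 64 total train batch size
    lr_scheduler_type="linear",
    num_train_epochs=10.0,
    evaluation_strategy="steps",         # the results table evaluates every 50 steps
    eval_steps=50,
    predict_with_generate=True,          # required to compute ROUGE and Gen Len
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Transformers default
# optimizer configuration, so no explicit optimizer setting is needed.
```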
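The Rouge1/Rouge2/Rougel/Rougelsum columns are ROUGE F-scores reported as percentages. A sketch of the corresponding computation with the `evaluate` library follows; `use_stemmer=True` is assumed, as in the standard Transformers summarization examples, since the exact metric settings are not documented here.

```python
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["a decoded model output"],   # placeholder strings
    references=["the reference text"],
    use_stemmer=True,                         # assumed setting
)
# evaluate returns scores in [0, 1]; the table reports them scaled by 100.
print({k: round(v * 100, 4) for k, v in scores.items()})
# keys: rouge1, rouge2, rougeL, rougeLsum (cf. the column names above)
```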