flan-t5-rouge-squad-qg-test

This model is a fine-tuned version of google/flan-t5-small on an unspecified dataset (the model name suggests a SQuAD-based question-generation task). It achieves the following results on the evaluation set:

  • Loss: 0.4416
  • ROUGE-1: 0.3489
  • ROUGE-2: 0.1081
  • ROUGE-L: 0.3225
  • ROUGE-Lsum: 0.3335
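For reference, ROUGE-1 is the F-measure of unigram overlap between a generated text and the reference. The scores above were presumably produced with standard ROUGE tooling; the function below is only a minimal pure-Python sketch of the same idea (whitespace tokenization, no stemming), so its values will not exactly match a library implementation.

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """ROUGE-1 F1: F-measure over unigram overlap between prediction and reference."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: each reference token counts at most as often as it appears.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(round(rouge1_f1("what year was the treaty signed",
                      "in what year was the treaty signed"), 4))  # → 0.9231
```

ROUGE-2 replaces unigrams with bigrams, and ROUGE-L/Lsum score the longest common subsequence instead of n-gram overlap.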

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 80
  • eval_batch_size: 80
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 320
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 160
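A few of the hyperparameters above can be cross-checked with simple arithmetic. The sketch below assumes a linear schedule with no warmup and a 320-step cap (warmup and max_steps settings are not reported in this card); it reproduces the effective batch size and the final epoch seen in the results table.

```python
# Cross-check the reported hyperparameters against the results table.
train_batch_size = 80
gradient_accumulation_steps = 4

# Effective (total) train batch size = per-device batch size x accumulation steps.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # → 320

# The table logs 3 optimizer steps per epoch (steps 3, 6, 9, ...) and stops at
# step 320, i.e. after roughly 320 / 3 ≈ 106.7 epochs rather than the configured
# 160, consistent with a 320-step cap (an assumption; max_steps is not shown).
steps_per_epoch = 3
last_step = 320
print(round(last_step / steps_per_epoch, 1))  # → 106.7

# With a linear scheduler and no warmup (assumed), the learning rate decays
# from 5e-4 toward 0 over the run; halfway through it would be:
learning_rate = 5e-4
lr_at = lambda step: learning_rate * (1 - step / last_step)
print(lr_at(160))  # → 0.00025
```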

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
|--------------:|------:|-----:|----------------:|--------:|--------:|--------:|-----------:|
| 41.7986 | 1.0 | 3 | 14.9730 | 0.0645 | 0.0187 | 0.0619 | 0.0626 |
| 18.0367 | 2.0 | 6 | 6.4506 | 0.0696 | 0.0369 | 0.0673 | 0.0690 |
| 11.6807 | 3.0 | 9 | 4.8843 | 0.1134 | 0.0385 | 0.0991 | 0.1016 |
| 9.5977 | 4.0 | 12 | 4.1902 | 0.0615 | 0.0227 | 0.0529 | 0.0562 |
| 8.382 | 5.0 | 15 | 3.7084 | 0.0108 | 0.0017 | 0.0108 | 0.0107 |
| 7.3099 | 6.0 | 18 | 3.1393 | 0.0334 | 0.0139 | 0.0321 | 0.0319 |
| 6.2255 | 7.0 | 21 | 2.6959 | 0.0484 | 0.0206 | 0.0471 | 0.0478 |
| 5.3866 | 8.0 | 24 | 2.2886 | 0.0942 | 0.0388 | 0.0902 | 0.0928 |
| 4.4362 | 9.0 | 27 | 1.6919 | 0.1476 | 0.0517 | 0.1244 | 0.1336 |
| 3.5819 | 10.0 | 30 | 1.2444 | 0.2204 | 0.0785 | 0.1939 | 0.2064 |
| 2.7713 | 11.0 | 33 | 0.9173 | 0.3423 | 0.1261 | 0.3144 | 0.3277 |
| 2.1415 | 12.0 | 36 | 0.6726 | 0.3756 | 0.1188 | 0.3440 | 0.3607 |
| 1.6248 | 13.0 | 39 | 0.4801 | 0.3757 | 0.1220 | 0.3387 | 0.3573 |
| 1.3172 | 14.0 | 42 | 0.3892 | 0.3855 | 0.1316 | 0.3478 | 0.3655 |
| 1.0707 | 15.0 | 45 | 0.3425 | 0.3863 | 0.1358 | 0.3514 | 0.3691 |
| 0.8661 | 16.0 | 48 | 0.3104 | 0.3820 | 0.1376 | 0.3447 | 0.3624 |
| 0.7925 | 17.0 | 51 | 0.2946 | 0.3937 | 0.1408 | 0.3620 | 0.3738 |
| 0.6878 | 18.0 | 54 | 0.2863 | 0.3893 | 0.1375 | 0.3568 | 0.3672 |
| 0.6841 | 19.0 | 57 | 0.2810 | 0.3959 | 0.1427 | 0.3635 | 0.3731 |
| 0.6014 | 20.0 | 60 | 0.2782 | 0.3991 | 0.1447 | 0.3663 | 0.3794 |
| 0.5921 | 21.0 | 63 | 0.2786 | 0.4018 | 0.1467 | 0.3696 | 0.3841 |
| 0.5582 | 22.0 | 66 | 0.2776 | 0.3967 | 0.1414 | 0.3618 | 0.3767 |
| 0.5268 | 23.0 | 69 | 0.2785 | 0.3984 | 0.1479 | 0.3669 | 0.3822 |
| 0.4784 | 24.0 | 72 | 0.2796 | 0.4031 | 0.1519 | 0.3709 | 0.3851 |
| 0.4378 | 25.0 | 75 | 0.2831 | 0.4001 | 0.1501 | 0.3667 | 0.3805 |
| 0.4395 | 26.0 | 78 | 0.2876 | 0.4015 | 0.1522 | 0.3691 | 0.3811 |
| 0.4269 | 27.0 | 81 | 0.2897 | 0.4055 | 0.1455 | 0.3747 | 0.3845 |
| 0.3955 | 28.0 | 84 | 0.2925 | 0.3912 | 0.1330 | 0.3595 | 0.3694 |
| 0.3876 | 29.0 | 87 | 0.2976 | 0.3881 | 0.1354 | 0.3592 | 0.3677 |
| 0.3593 | 30.0 | 90 | 0.3008 | 0.3875 | 0.1374 | 0.3580 | 0.3686 |
| 0.3477 | 31.0 | 93 | 0.3038 | 0.3792 | 0.1303 | 0.3507 | 0.3609 |
| 0.3368 | 32.0 | 96 | 0.3079 | 0.3854 | 0.1331 | 0.3601 | 0.3677 |
| 0.3019 | 33.0 | 99 | 0.3134 | 0.3820 | 0.1272 | 0.3524 | 0.3633 |
| 0.3141 | 34.0 | 102 | 0.3202 | 0.3733 | 0.1229 | 0.3431 | 0.3541 |
| 0.2914 | 35.0 | 105 | 0.3233 | 0.3814 | 0.1257 | 0.3514 | 0.3638 |
| 0.2817 | 36.0 | 108 | 0.3250 | 0.3822 | 0.1316 | 0.3563 | 0.3636 |
| 0.2875 | 37.0 | 111 | 0.3280 | 0.3898 | 0.1405 | 0.3650 | 0.3737 |
| 0.267 | 38.0 | 114 | 0.3343 | 0.3878 | 0.1353 | 0.3616 | 0.3708 |
| 0.264 | 39.0 | 117 | 0.3375 | 0.3761 | 0.1182 | 0.3484 | 0.3589 |
| 0.2519 | 40.0 | 120 | 0.3372 | 0.3781 | 0.1228 | 0.3504 | 0.3606 |
| 0.2508 | 41.0 | 123 | 0.3382 | 0.3810 | 0.1244 | 0.3538 | 0.3635 |
| 0.2373 | 42.0 | 126 | 0.3460 | 0.3805 | 0.1230 | 0.3533 | 0.3632 |
| 0.2316 | 43.0 | 129 | 0.3533 | 0.3692 | 0.1125 | 0.3396 | 0.3514 |
| 0.2271 | 44.0 | 132 | 0.3552 | 0.3576 | 0.1133 | 0.3313 | 0.3394 |
| 0.2133 | 45.0 | 135 | 0.3565 | 0.3643 | 0.1244 | 0.3401 | 0.3481 |
| 0.2167 | 46.0 | 138 | 0.3602 | 0.3683 | 0.1245 | 0.3408 | 0.3490 |
| 0.2119 | 47.0 | 141 | 0.3647 | 0.3694 | 0.1278 | 0.3399 | 0.3493 |
| 0.1976 | 48.0 | 144 | 0.3677 | 0.3590 | 0.1194 | 0.3322 | 0.3414 |
| 0.2133 | 49.0 | 147 | 0.3720 | 0.3531 | 0.1115 | 0.3275 | 0.3351 |
| 0.1923 | 50.0 | 150 | 0.3746 | 0.3621 | 0.1189 | 0.3339 | 0.3413 |
| 0.1854 | 51.0 | 153 | 0.3760 | 0.3707 | 0.1280 | 0.3438 | 0.3528 |
| 0.1872 | 52.0 | 156 | 0.3767 | 0.3635 | 0.1219 | 0.3358 | 0.3463 |
| 0.1827 | 53.0 | 159 | 0.3790 | 0.3657 | 0.1196 | 0.3384 | 0.3494 |
| 0.1801 | 54.0 | 162 | 0.3833 | 0.3611 | 0.1195 | 0.3276 | 0.3426 |
| 0.1787 | 55.0 | 165 | 0.3903 | 0.3595 | 0.1202 | 0.3285 | 0.3411 |
| 0.1713 | 56.0 | 168 | 0.3923 | 0.3566 | 0.1179 | 0.3258 | 0.3379 |
| 0.1626 | 57.0 | 171 | 0.3941 | 0.3497 | 0.1152 | 0.3185 | 0.3325 |
| 0.1599 | 58.0 | 174 | 0.3922 | 0.3605 | 0.1216 | 0.3305 | 0.3448 |
| 0.1603 | 59.0 | 177 | 0.3929 | 0.3478 | 0.1079 | 0.3188 | 0.3329 |
| 0.1794 | 60.0 | 180 | 0.3958 | 0.3455 | 0.1057 | 0.3179 | 0.3319 |
| 0.1626 | 61.0 | 183 | 0.3997 | 0.3481 | 0.1078 | 0.3203 | 0.3320 |
| 0.1433 | 62.0 | 186 | 0.4019 | 0.3529 | 0.1129 | 0.3278 | 0.3386 |
| 0.1489 | 63.0 | 189 | 0.4008 | 0.3446 | 0.1137 | 0.3220 | 0.3291 |
| 0.1595 | 64.0 | 192 | 0.4009 | 0.3579 | 0.1159 | 0.3345 | 0.3421 |
| 0.1557 | 65.0 | 195 | 0.4044 | 0.3506 | 0.1165 | 0.3269 | 0.3342 |
| 0.1435 | 66.0 | 198 | 0.4094 | 0.3404 | 0.1082 | 0.3159 | 0.3257 |
| 0.1427 | 67.0 | 201 | 0.4140 | 0.3450 | 0.1103 | 0.3193 | 0.3301 |
| 0.1494 | 68.0 | 204 | 0.4163 | 0.3421 | 0.1090 | 0.3198 | 0.3276 |
| 0.1493 | 69.0 | 207 | 0.4137 | 0.3481 | 0.1101 | 0.3230 | 0.3318 |
| 0.14 | 70.0 | 210 | 0.4107 | 0.3438 | 0.1083 | 0.3193 | 0.3277 |
| 0.1338 | 71.0 | 213 | 0.4107 | 0.3432 | 0.1068 | 0.3199 | 0.3270 |
| 0.1302 | 72.0 | 216 | 0.4134 | 0.3573 | 0.1097 | 0.3317 | 0.3428 |
| 0.1354 | 73.0 | 219 | 0.4162 | 0.3525 | 0.1092 | 0.3270 | 0.3376 |
| 0.1379 | 74.0 | 222 | 0.4193 | 0.3402 | 0.1069 | 0.3177 | 0.3249 |
| 0.1272 | 75.0 | 225 | 0.4233 | 0.3397 | 0.1059 | 0.3173 | 0.3244 |
| 0.1331 | 76.0 | 228 | 0.4248 | 0.3364 | 0.1021 | 0.3149 | 0.3223 |
| 0.1211 | 77.0 | 231 | 0.4258 | 0.3459 | 0.1076 | 0.3235 | 0.3312 |
| 0.1324 | 78.0 | 234 | 0.4267 | 0.3488 | 0.1066 | 0.3257 | 0.3335 |
| 0.1275 | 79.0 | 237 | 0.4272 | 0.3458 | 0.1165 | 0.3201 | 0.3301 |
| 0.1265 | 80.0 | 240 | 0.4279 | 0.3519 | 0.1188 | 0.3288 | 0.3366 |
| 0.1227 | 81.0 | 243 | 0.4293 | 0.3458 | 0.1093 | 0.3261 | 0.3317 |
| 0.1213 | 82.0 | 246 | 0.4323 | 0.3437 | 0.1051 | 0.3189 | 0.3288 |
| 0.1275 | 83.0 | 249 | 0.4347 | 0.3457 | 0.1065 | 0.3212 | 0.3318 |
| 0.1233 | 84.0 | 252 | 0.4346 | 0.3491 | 0.1048 | 0.3235 | 0.3337 |
| 0.1168 | 85.0 | 255 | 0.4349 | 0.3450 | 0.1035 | 0.3208 | 0.3314 |
| 0.1184 | 86.0 | 258 | 0.4347 | 0.3480 | 0.1050 | 0.3255 | 0.3336 |
| 0.1246 | 87.0 | 261 | 0.4336 | 0.3483 | 0.1058 | 0.3272 | 0.3347 |
| 0.1167 | 88.0 | 264 | 0.4333 | 0.3470 | 0.1065 | 0.3269 | 0.3343 |
| 0.1203 | 89.0 | 267 | 0.4334 | 0.3494 | 0.1112 | 0.3278 | 0.3351 |
| 0.1139 | 90.0 | 270 | 0.4339 | 0.3460 | 0.1114 | 0.3253 | 0.3314 |
| 0.1202 | 91.0 | 273 | 0.4341 | 0.3497 | 0.1103 | 0.3252 | 0.3352 |
| 0.1174 | 92.0 | 276 | 0.4344 | 0.3497 | 0.1103 | 0.3252 | 0.3352 |
| 0.1164 | 93.0 | 279 | 0.4350 | 0.3504 | 0.1099 | 0.3249 | 0.3365 |
| 0.1114 | 94.0 | 282 | 0.4357 | 0.3445 | 0.1073 | 0.3188 | 0.3299 |
| 0.1094 | 95.0 | 285 | 0.4368 | 0.3455 | 0.1076 | 0.3197 | 0.3308 |
| 0.114 | 96.0 | 288 | 0.4376 | 0.3483 | 0.1105 | 0.3236 | 0.3336 |
| 0.1147 | 97.0 | 291 | 0.4381 | 0.3458 | 0.1099 | 0.3207 | 0.3303 |
| 0.116 | 98.0 | 294 | 0.4386 | 0.3458 | 0.1099 | 0.3207 | 0.3303 |
| 0.1187 | 99.0 | 297 | 0.4393 | 0.3499 | 0.1100 | 0.3234 | 0.3341 |
| 0.1112 | 100.0 | 300 | 0.4399 | 0.3519 | 0.1146 | 0.3260 | 0.3368 |
| 0.1124 | 101.0 | 303 | 0.4404 | 0.3519 | 0.1146 | 0.3260 | 0.3368 |
| 0.117 | 102.0 | 306 | 0.4408 | 0.3489 | 0.1081 | 0.3225 | 0.3335 |
| 0.1101 | 103.0 | 309 | 0.4412 | 0.3489 | 0.1081 | 0.3225 | 0.3335 |
| 0.1135 | 104.0 | 312 | 0.4415 | 0.3472 | 0.1075 | 0.3208 | 0.3311 |
| 0.1141 | 105.0 | 315 | 0.4416 | 0.3489 | 0.1081 | 0.3225 | 0.3335 |
| 0.1201 | 106.0 | 318 | 0.4416 | 0.3489 | 0.1081 | 0.3225 | 0.3335 |
| 0.2258 | 106.8 | 320 | 0.4416 | 0.3489 | 0.1081 | 0.3225 | 0.3335 |

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0
Model size: 77M parameters (F32 tensors, Safetensors format)

Model tree for devagonal/flan-t5-rouge-squad-qg-test

  • Fine-tuned from google/flan-t5-small