t5-base-squad-qag

This model is a fine-tuned version of google-t5/t5-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1945

Model description

More information needed

Intended uses & limitations

More information needed
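Although the intended use is not documented, the repository name suggests question–answer generation (QAG) over SQuAD-style passages. The following is a minimal, hedged loading sketch: the repo id devagonal/t5-base-squad-qag comes from this card, while the input format is an assumption, since the prompt format used during fine-tuning is not documented.

```python
# Minimal sketch: load the checkpoint with Hugging Face Transformers.
# The repo id comes from this card; the plain-context input below is an
# assumption, as the fine-tuning prompt format is not documented.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "devagonal/t5-base-squad-qag"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

context = "The Eiffel Tower was completed in 1889 and is located in Paris."
inputs = tokenizer(context, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```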

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged reconstruction as TrainingArguments follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
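
As a sketch only, these settings map onto Transformers TrainingArguments roughly as follows. The output_dir is a placeholder, and anything not listed above is assumed to be left at its default.

```python
# Hedged reconstruction of the configuration above as TrainingArguments.
# output_dir is a placeholder; only the listed values come from the card,
# and the Adam betas/epsilon shown match the adamw_torch defaults.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="t5-base-squad-qag",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
)
```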

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 7 | 12.2471 |
| No log | 2.0 | 14 | 7.2702 |
| No log | 3.0 | 21 | 5.6811 |
| No log | 4.0 | 28 | 4.6100 |
| No log | 5.0 | 35 | 0.6711 |
| No log | 6.0 | 42 | 0.4312 |
| No log | 7.0 | 49 | 0.4167 |
| No log | 8.0 | 56 | 0.4011 |
| No log | 9.0 | 63 | 0.3785 |
| No log | 10.0 | 70 | 0.3256 |
| No log | 11.0 | 77 | 0.2868 |
| No log | 12.0 | 84 | 0.2607 |
| No log | 13.0 | 91 | 0.2423 |
| No log | 14.0 | 98 | 0.2277 |
| No log | 15.0 | 105 | 0.2053 |
| No log | 16.0 | 112 | 0.1962 |
| No log | 17.0 | 119 | 0.1866 |
| No log | 18.0 | 126 | 0.1822 |
| No log | 19.0 | 133 | 0.1796 |
| No log | 20.0 | 140 | 0.1789 |
| No log | 21.0 | 147 | 0.1782 |
| No log | 22.0 | 154 | 0.1774 |
| No log | 23.0 | 161 | 0.1760 |
| No log | 24.0 | 168 | 0.1754 |
| No log | 25.0 | 175 | 0.1754 |
| No log | 26.0 | 182 | 0.1748 |
| No log | 27.0 | 189 | 0.1739 |
| No log | 28.0 | 196 | 0.1730 |
| No log | 29.0 | 203 | 0.1728 |
| No log | 30.0 | 210 | 0.1728 |
| No log | 31.0 | 217 | 0.1734 |
| No log | 32.0 | 224 | 0.1736 |
| No log | 33.0 | 231 | 0.1733 |
| No log | 34.0 | 238 | 0.1731 |
| No log | 35.0 | 245 | 0.1738 |
| No log | 36.0 | 252 | 0.1744 |
| No log | 37.0 | 259 | 0.1747 |
| No log | 38.0 | 266 | 0.1745 |
| No log | 39.0 | 273 | 0.1739 |
| No log | 40.0 | 280 | 0.1747 |
| No log | 41.0 | 287 | 0.1752 |
| No log | 42.0 | 294 | 0.1757 |
| No log | 43.0 | 301 | 0.1768 |
| No log | 44.0 | 308 | 0.1776 |
| No log | 45.0 | 315 | 0.1787 |
| No log | 46.0 | 322 | 0.1800 |
| No log | 47.0 | 329 | 0.1799 |
| No log | 48.0 | 336 | 0.1801 |
| No log | 49.0 | 343 | 0.1801 |
| No log | 50.0 | 350 | 0.1808 |
| No log | 51.0 | 357 | 0.1827 |
| No log | 52.0 | 364 | 0.1842 |
| No log | 53.0 | 371 | 0.1839 |
| No log | 54.0 | 378 | 0.1841 |
| No log | 55.0 | 385 | 0.1844 |
| No log | 56.0 | 392 | 0.1835 |
| No log | 57.0 | 399 | 0.1835 |
| No log | 58.0 | 406 | 0.1839 |
| No log | 59.0 | 413 | 0.1837 |
| No log | 60.0 | 420 | 0.1838 |
| No log | 61.0 | 427 | 0.1841 |
| No log | 62.0 | 434 | 0.1846 |
| No log | 63.0 | 441 | 0.1849 |
| No log | 64.0 | 448 | 0.1857 |
| No log | 65.0 | 455 | 0.1865 |
| No log | 66.0 | 462 | 0.1877 |
| No log | 67.0 | 469 | 0.1887 |
| No log | 68.0 | 476 | 0.1893 |
| No log | 69.0 | 483 | 0.1893 |
| No log | 70.0 | 490 | 0.1896 |
| No log | 71.0 | 497 | 0.1898 |
| 0.6248 | 72.0 | 504 | 0.1906 |
| 0.6248 | 73.0 | 511 | 0.1910 |
| 0.6248 | 74.0 | 518 | 0.1915 |
| 0.6248 | 75.0 | 525 | 0.1920 |
| 0.6248 | 76.0 | 532 | 0.1924 |
| 0.6248 | 77.0 | 539 | 0.1926 |
| 0.6248 | 78.0 | 546 | 0.1923 |
| 0.6248 | 79.0 | 553 | 0.1924 |
| 0.6248 | 80.0 | 560 | 0.1926 |
| 0.6248 | 81.0 | 567 | 0.1927 |
| 0.6248 | 82.0 | 574 | 0.1928 |
| 0.6248 | 83.0 | 581 | 0.1930 |
| 0.6248 | 84.0 | 588 | 0.1930 |
| 0.6248 | 85.0 | 595 | 0.1929 |
| 0.6248 | 86.0 | 602 | 0.1930 |
| 0.6248 | 87.0 | 609 | 0.1930 |
| 0.6248 | 88.0 | 616 | 0.1933 |
| 0.6248 | 89.0 | 623 | 0.1936 |
| 0.6248 | 90.0 | 630 | 0.1938 |
| 0.6248 | 91.0 | 637 | 0.1940 |
| 0.6248 | 92.0 | 644 | 0.1943 |
| 0.6248 | 93.0 | 651 | 0.1945 |
| 0.6248 | 94.0 | 658 | 0.1945 |
| 0.6248 | 95.0 | 665 | 0.1945 |
| 0.6248 | 96.0 | 672 | 0.1946 |
| 0.6248 | 97.0 | 679 | 0.1945 |
| 0.6248 | 98.0 | 686 | 0.1945 |
| 0.6248 | 99.0 | 693 | 0.1945 |
| 0.6248 | 100.0 | 700 | 0.1945 |
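
Validation loss bottoms out at 0.1728 around epochs 29–30 and then drifts slowly upward, so the final epoch-100 checkpoint (0.1945) is not the best one observed. If retraining, standard Trainer flags can retain the lowest-loss checkpoint instead; the sketch below uses only stock TrainingArguments options and is an assumption about how one might rerun this setup, not how it was actually run.

```python
# Sketch: keep the lowest-validation-loss checkpoint rather than the last.
# These are stock TrainingArguments flags; per-epoch evaluation and saving
# are assumptions about rerunning this training.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="t5-base-squad-qag",   # placeholder
    eval_strategy="epoch",            # evaluate once per epoch
    save_strategy="epoch",            # checkpoint once per epoch
    load_best_model_at_end=True,      # restore the best checkpoint at the end
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    num_train_epochs=100,
)
```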

Framework versions

  • Transformers 4.48.3
  • PyTorch 2.5.1+cu124
  • Datasets 3.3.0
  • Tokenizers 0.21.0

Model details

  • Format: Safetensors
  • Model size: 223M params
  • Tensor type: F32