vxbrandon/t5-base_sst2_dense

+---
+license: apache-2.0
+base_model: t5-base
+tags:
+- generated_from_trainer
+datasets:
+- glue
+metrics:
+- accuracy
+model-index:
+- name: t5-base_sst2_dense
+  results:
+  - task:
+      name: Text Classification
+      type: text-classification
+    dataset:
+      name: glue
+      type: glue
+      config: sst2
+      split: validation
+      args: sst2
+    metrics:
+    - name: Accuracy
+      type: accuracy
+      value: 0.9243119266055045
+---
+# t5-base_sst2_dense
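+This model is a fine-tuned version of [t5-base](https://huggingface.co/t5-base) on the SST-2 task (`sst2` config) of the GLUE dataset.
+It achieves the following results on the evaluation set (the `validation` split), which match the last logged row of the training results (step 620):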
+- Loss: 0.3118
+- Accuracy: 0.9243
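The accuracy above can be cross-checked against a raw count of correct predictions. The following is a minimal sketch, assuming the GLUE SST-2 validation split contains 872 examples; that size is not stated in this card, and `correct_predictions` is a hypothetical helper introduced only for illustration.

```python
from fractions import Fraction

# Assumption: size of the GLUE SST-2 validation split (not stated in this card).
VALIDATION_SIZE = 872

# Accuracy value copied from the model-index metadata.
REPORTED_ACCURACY = 0.9243119266055045


def correct_predictions(accuracy: float, n_examples: int) -> int:
    """Recover the number of correct predictions implied by an accuracy."""
    return round(accuracy * n_examples)


n_correct = correct_predictions(REPORTED_ACCURACY, VALIDATION_SIZE)
exact = Fraction(n_correct, VALIDATION_SIZE)

# Print the implied count and the accuracy it reproduces.
print(n_correct, float(exact))
```

If the assumed split size is right, the long decimal in the metadata is simply an integer count divided by the number of validation examples, and the rounded 0.9243 in the list is the same figure.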
+## Model description
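+`t5-base` fine-tuned as a text classifier on SST-2, the binary sentiment classification task of GLUE. More information needed.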
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
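+The model was fine-tuned on GLUE with the `sst2` config; the reported metrics are computed on its `validation` split. More information needed.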
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 32
+- eval_batch_size: 64
+- seed: 42
+- distributed_type: multi-GPU
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 64
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 200
+- num_epochs: 5
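The hyperparameters above can be written as keyword arguments, which also makes the batch-size arithmetic and the learning-rate schedule explicit. This is a sketch under stated assumptions, not the author's training script: the keys are the `transformers.TrainingArguments` field names the list most plausibly maps to (assuming one GPU per process, so that `train_batch_size` equals the per-device batch size), while `num_devices` and `total_steps` are hypothetical parameters that this card does not report.

```python
# Values copied from the hyperparameter list; keys are assumed to match
# the corresponding transformers.TrainingArguments fields.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 64,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 200,
    "num_train_epochs": 5,
}


def effective_batch_size(kwargs: dict, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step.

    num_devices=1 is an assumption: it is the value for which
    32 * 2 * num_devices equals the reported total_train_batch_size of 64.
    """
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step: int, total_steps: int, kwargs: dict = TRAINING_KWARGS) -> float:
    """Linear warmup to the peak rate, then linear decay to zero.

    total_steps is hypothetical; the card does not state it.
    """
    warmup = kwargs["warmup_steps"]
    peak = kwargs["learning_rate"]
    if step < warmup:
        return peak * step / max(1, warmup)
    return peak * max(0.0, (total_steps - step) / max(1, total_steps - warmup))


print(effective_batch_size(TRAINING_KWARGS))
```

Passing `**TRAINING_KWARGS` together with an `output_dir` to `transformers.TrainingArguments` should give a comparable configuration, though evaluation and logging intervals would still have to be set separately.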
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 0.7121        | 0.01  | 10   | 0.6973          | 0.4989   |
+| 0.6719        | 0.02  | 20   | 0.6858          | 0.5092   |
+| 0.6727        | 0.03  | 30   | 0.6851          | 0.5092   |
+| 0.6621        | 0.04  | 40   | 0.6685          | 0.5092   |
+| 0.6359        | 0.05  | 50   | 0.6438          | 0.5975   |
+| 0.6219        | 0.06  | 60   | 0.6044          | 0.8280   |
+| 0.5648        | 0.07  | 70   | 0.5312          | 0.8452   |
+| 0.4609        | 0.08  | 80   | 0.4129          | 0.8899   |
+| 0.3486        | 0.09  | 90   | 0.3354          | 0.8842   |
+| 0.291         | 0.1   | 100  | 0.2685          | 0.9106   |
+| 0.28          | 0.1   | 110  | 0.2745          | 0.9014   |
+| 0.2078        | 0.11  | 120  | 0.2994          | 0.9025   |
+| 0.229         | 0.12  | 130  | 0.3541          | 0.8899   |
+| 0.3003        | 0.13  | 140  | 0.2503          | 0.9106   |
+| 0.1828        | 0.14  | 150  | 0.2430          | 0.9140   |
+| 0.1957        | 0.15  | 160  | 0.2335          | 0.9140   |
+| 0.2385        | 0.16  | 170  | 0.2552          | 0.9094   |
+| 0.1792        | 0.17  | 180  | 0.2527          | 0.9174   |
+| 0.2147        | 0.18  | 190  | 0.2657          | 0.9128   |
+| 0.23          | 0.19  | 200  | 0.2290          | 0.9151   |
+| 0.2376        | 0.2   | 210  | 0.2495          | 0.9209   |
+| 0.2331        | 0.21  | 220  | 0.2370          | 0.9243   |
+| 0.215         | 0.22  | 230  | 0.2258          | 0.9209   |
+| 0.1833        | 0.23  | 240  | 0.2225          | 0.9209   |
+| 0.2277        | 0.24  | 250  | 0.2202          | 0.9232   |
+| 0.1969        | 0.25  | 260  | 0.2164          | 0.9209   |
+| 0.2038        | 0.26  | 270  | 0.2147          | 0.9220   |
+| 0.1421        | 0.27  | 280  | 0.2172          | 0.9186   |
+| 0.1604        | 0.28  | 290  | 0.2408          | 0.9209   |
+| 0.1864        | 0.29  | 300  | 0.2336          | 0.9220   |
+| 0.1629        | 0.29  | 310  | 0.2293          | 0.9255   |
+| 0.2334        | 0.3   | 320  | 0.2201          | 0.9243   |
+| 0.1676        | 0.31  | 330  | 0.2108          | 0.9255   |
+| 0.1672        | 0.32  | 340  | 0.2233          | 0.9209   |
+| 0.1886        | 0.33  | 350  | 0.2229          | 0.9220   |
+| 0.2081        | 0.34  | 360  | 0.2227          | 0.9209   |
+| 0.2145        | 0.35  | 370  | 0.2185          | 0.9243   |
+| 0.1322        | 0.36  | 380  | 0.2286          | 0.9209   |
+| 0.2552        | 0.37  | 390  | 0.2193          | 0.9232   |
+| 0.1542        | 0.38  | 400  | 0.2234          | 0.9232   |
+| 0.2285        | 0.39  | 410  | 0.2190          | 0.9232   |
+| 0.1633        | 0.4   | 420  | 0.2256          | 0.9255   |
+| 0.1592        | 0.41  | 430  | 0.2386          | 0.9220   |
+| 0.1525        | 0.42  | 440  | 0.2369          | 0.9255   |
+| 0.2523        | 0.43  | 450  | 0.3649          | 0.9220   |
+| 0.1938        | 0.44  | 460  | 0.2203          | 0.9255   |
+| 0.1894        | 0.45  | 470  | 0.2067          | 0.9278   |
+| 0.143         | 0.46  | 480  | 0.2143          | 0.9266   |
+| 0.179         | 0.47  | 490  | 0.2090          | 0.9300   |
+| 0.1589        | 0.48  | 500  | 0.2288          | 0.9255   |
+| 0.1267        | 0.48  | 510  | 0.2129          | 0.9255   |
+| 0.1822        | 0.49  | 520  | 0.2193          | 0.9255   |
+| 0.172         | 0.5   | 530  | 0.3245          | 0.9220   |
+| 0.1268        | 0.51  | 540  | 0.3119          | 0.9300   |
+| 0.1243        | 0.52  | 550  | 0.3271          | 0.9255   |
+| 0.141         | 0.53  | 560  | 0.3441          | 0.9220   |
+| 0.1907        | 0.54  | 570  | 0.3205          | 0.9278   |
+| 0.1688        | 0.55  | 580  | 0.3240          | 0.9243   |
+| 0.1602        | 0.56  | 590  | 0.3146          | 0.9243   |
+| 0.1292        | 0.57  | 600  | 0.3043          | 0.9289   |
+| 0.1588        | 0.58  | 610  | 0.3345          | 0.9209   |
+| 0.1381        | 0.59  | 620  | 0.3118          | 0.9243   |
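The Epoch column in the table can be related to the Step column with a short calculation. The sketch below rests on two assumptions not stated in this card: that the GLUE SST-2 train split has 67,349 examples, and that the number of optimizer steps per epoch is the dataloader length floor-divided by the gradient accumulation steps. Both helper functions are hypothetical names used for illustration.

```python
import math

# Assumption: size of the GLUE SST-2 train split (not stated in this card).
TRAIN_SIZE = 67_349

# From the hyperparameter list.
TRAIN_BATCH_SIZE = 32
GRADIENT_ACCUMULATION_STEPS = 2


def steps_per_epoch(n_examples: int, batch_size: int, accumulation: int) -> int:
    """Optimizer steps in one pass: batches per epoch // accumulation steps."""
    batches = math.ceil(n_examples / batch_size)
    return batches // accumulation


def epoch_at(step: int, steps_in_epoch: int) -> float:
    """Fractional epoch reached at a given optimizer step, to two decimals."""
    return round(step / steps_in_epoch, 2)


spe = steps_per_epoch(TRAIN_SIZE, TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS)

# Compare a few steps with the Epoch column of the table.
for step in (10, 100, 310, 620):
    print(step, epoch_at(step, spe))
```

Under these assumptions the computed values agree with the Epoch column, and five full epochs would take about 5,260 optimizer steps, whereas the logged table stops at step 620.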
+### Framework versions
+- Transformers 4.33.3
+- PyTorch 2.0.1+cu118
+- Datasets 2.14.5
+- Tokenizers 0.13.3