vxbrandon/t5-base_cola_dense

+---
+license: apache-2.0
+base_model: t5-base
+tags:
+- generated_from_trainer
+datasets:
+- glue
+metrics:
+- accuracy
+model-index:
+- name: t5-base_cola_dense
+  results:
+  - task:
+      name: Text Classification
+      type: text-classification
+    dataset:
+      name: glue
+      type: glue
+      config: cola
+      split: validation
+      args: cola
+    metrics:
+    - name: Accuracy
+      type: accuracy
+      value: 0.8370086289549377
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# t5-base_cola_dense
+This model is a fine-tuned version of [t5-base](https://huggingface.co/t5-base) on the CoLA configuration of the GLUE dataset.
+It achieves the following results on the validation split:
+- Loss: 0.4482
+- Accuracy: 0.8370
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
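+The model was fine-tuned on the `cola` configuration of the `glue` dataset. The reported metrics are computed on its `validation` split.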
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 32
+- eval_batch_size: 64
+- seed: 42
+- distributed_type: multi-GPU
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 64
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 200
+- num_epochs: 5
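
For orientation, the batch-size and learning-rate settings listed above can be sketched in plain Python. This is an illustrative approximation of a linear schedule with warmup, not the Trainer's own code. The total of 670 optimizer steps is an assumption read from the last row of the training results table, and the name `linear_lr` is hypothetical.

```python
# Values copied from the hyperparameter list above.
BASE_LR = 5e-05
WARMUP_STEPS = 200
PER_DEVICE_BATCH = 32
GRAD_ACCUM_STEPS = 2

# Assumption: 670 total optimizer steps, read from the final row of the results table.
TOTAL_STEPS = 670


def linear_lr(step: int) -> float:
    """Linear warmup from 0 to BASE_LR, then linear decay to 0 at TOTAL_STEPS."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


# Per-device batch times accumulation steps. With several processes this would
# also be multiplied by the process count, which the card does not state.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS

if __name__ == "__main__":
    print("effective batch:", effective_batch)
    for step in (0, 100, 200, 435, 670):
        print(step, linear_lr(step))
```

Under these assumptions the rate peaks at step 200 and falls back to half its peak around step 435. The product of the per-device batch size and the accumulation steps equals the reported `total_train_batch_size` of 64.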
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 0.6484        | 0.07  | 10   | 0.6254          | 0.6913   |
+| 0.6188        | 0.15  | 20   | 0.6181          | 0.6913   |
+| 0.625         | 0.22  | 30   | 0.6137          | 0.6913   |
+| 0.5744        | 0.3   | 40   | 0.6063          | 0.6913   |
+| 0.6055        | 0.37  | 50   | 0.5963          | 0.6913   |
+| 0.5723        | 0.45  | 60   | 0.5788          | 0.6913   |
+| 0.5777        | 0.52  | 70   | 0.5527          | 0.6913   |
+| 0.5332        | 0.6   | 80   | 0.5117          | 0.7354   |
+| 0.4662        | 0.67  | 90   | 0.5060          | 0.7843   |
+| 0.4936        | 0.75  | 100  | 0.4717          | 0.7929   |
+| 0.4898        | 0.82  | 110  | 0.5304          | 0.8015   |
+| 0.4844        | 0.9   | 120  | 0.4771          | 0.8006   |
+| 0.4297        | 0.97  | 130  | 0.4673          | 0.7987   |
+| 0.4658        | 1.04  | 140  | 0.4927          | 0.8063   |
+| 0.3992        | 1.12  | 150  | 0.4884          | 0.8121   |
+| 0.4752        | 1.19  | 160  | 0.4838          | 0.8102   |
+| 0.3934        | 1.27  | 170  | 0.4714          | 0.8092   |
+| 0.4662        | 1.34  | 180  | 0.5192          | 0.7929   |
+| 0.4404        | 1.42  | 190  | 0.4719          | 0.8111   |
+| 0.3746        | 1.49  | 200  | 0.5077          | 0.8015   |
+| 0.4465        | 1.57  | 210  | 0.4425          | 0.8073   |
+| 0.3829        | 1.64  | 220  | 0.4844          | 0.8130   |
+| 0.4021        | 1.72  | 230  | 0.4659          | 0.8169   |
+| 0.4225        | 1.79  | 240  | 0.4277          | 0.8130   |
+| 0.4297        | 1.87  | 250  | 0.4677          | 0.8150   |
+| 0.3476        | 1.94  | 260  | 0.4455          | 0.8207   |
+| 0.4159        | 2.01  | 270  | 0.5063          | 0.8188   |
+| 0.3371        | 2.09  | 280  | 0.4648          | 0.8265   |
+| 0.3383        | 2.16  | 290  | 0.5451          | 0.8178   |
+| 0.3175        | 2.24  | 300  | 0.4551          | 0.8303   |
+| 0.3553        | 2.31  | 310  | 0.4899          | 0.8303   |
+| 0.3138        | 2.39  | 320  | 0.4887          | 0.8265   |
+| 0.3196        | 2.46  | 330  | 0.4632          | 0.8265   |
+| 0.3132        | 2.54  | 340  | 0.5126          | 0.8207   |
+| 0.3167        | 2.61  | 350  | 0.4661          | 0.8245   |
+| 0.3757        | 2.69  | 360  | 0.4596          | 0.8245   |
+| 0.3346        | 2.76  | 370  | 0.4650          | 0.8265   |
+| 0.3018        | 2.84  | 380  | 0.4672          | 0.8284   |
+| 0.3338        | 2.91  | 390  | 0.4822          | 0.8293   |
+| 0.3496        | 2.99  | 400  | 0.4677          | 0.8322   |
+| 0.248         | 3.06  | 410  | 0.4349          | 0.8332   |
+| 0.2804        | 3.13  | 420  | 0.5308          | 0.8322   |
+| 0.292         | 3.21  | 430  | 0.4757          | 0.8284   |
+| 0.249         | 3.28  | 440  | 0.5145          | 0.8284   |
+| 0.315         | 3.36  | 450  | 0.6137          | 0.8322   |
+| 0.2996        | 3.43  | 460  | 0.5499          | 0.8341   |
+| 0.2986        | 3.51  | 470  | 0.4774          | 0.8332   |
+| 0.3124        | 3.58  | 480  | 0.5733          | 0.8284   |
+| 0.2809        | 3.66  | 490  | 0.4938          | 0.8341   |
+| 0.213         | 3.73  | 500  | 0.5208          | 0.8332   |
+| 0.3106        | 3.81  | 510  | 0.4609          | 0.8322   |
+| 0.2226        | 3.88  | 520  | 0.5320          | 0.8274   |
+| 0.3108        | 3.96  | 530  | 0.5457          | 0.8255   |
+| 0.2456        | 4.03  | 540  | 0.4865          | 0.8322   |
+| 0.223         | 4.1   | 550  | 0.5540          | 0.8313   |
+| 0.1884        | 4.18  | 560  | 0.5363          | 0.8341   |
+| 0.1934        | 4.25  | 570  | 0.5706          | 0.8332   |
+| 0.1793        | 4.33  | 580  | 0.5814          | 0.8322   |
+| 0.2952        | 4.4   | 590  | 0.5305          | 0.8360   |
+| 0.2915        | 4.48  | 600  | 0.5104          | 0.8332   |
+| 0.259         | 4.55  | 610  | 0.5076          | 0.8428   |
+| 0.2453        | 4.63  | 620  | 0.5188          | 0.8351   |
+| 0.1903        | 4.7   | 630  | 0.5396          | 0.8399   |
+| 0.2573        | 4.78  | 640  | 0.5584          | 0.8332   |
+| 0.2787        | 4.85  | 650  | 0.5340          | 0.8360   |
+| 0.2256        | 4.93  | 660  | 0.5175          | 0.8351   |
+| 0.257         | 5.0   | 670  | 0.4482          | 0.8370   |
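
The headline metrics at the top of this card match the final row (step 670), which is not the best row on either metric. As a minimal sketch of how one might pick a checkpoint from such a log, the snippet below compares a hand-copied excerpt of five rows from the table. The `EvalRow` type and the variable names are hypothetical, and the card does not state which checkpoint was actually published.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    validation_loss: float
    accuracy: float


# Excerpt of rows copied from the training results table (not the full log).
rows = [
    EvalRow(240, 0.4277, 0.8130),
    EvalRow(410, 0.4349, 0.8332),
    EvalRow(610, 0.5076, 0.8428),
    EvalRow(630, 0.5396, 0.8399),
    EvalRow(670, 0.4482, 0.8370),
]

# Highest validation accuracy and lowest validation loss can point to different steps.
best_by_accuracy = max(rows, key=lambda r: r.accuracy)
best_by_loss = min(rows, key=lambda r: r.validation_loss)

if __name__ == "__main__":
    print("best accuracy:", best_by_accuracy)
    print("lowest loss:", best_by_loss)
```

In the full table, the highest accuracy (0.8428) occurs at step 610 and the lowest validation loss (0.4277) at step 240, so the two selection criteria disagree for this run.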
+### Framework versions
+- Transformers 4.33.3
+- PyTorch 2.0.1+cu118
+- Datasets 2.14.5
+- Tokenizers 0.13.3