# T5-base-finetuned-stsb

This model is T5 fine-tuned on the GLUE STS-B dataset. It achieves the following results on the validation set:

- Pearson Correlation Coefficient: 0.8937

## Model Details

T5 is an encoder-decoder model pre-trained on a multi-task mixture of unsupervised and supervised tasks, in which each task is converted into a text-to-text format.

## Training procedure

### Tokenization

Since T5 is a text-to-text model, the inputs and labels of the dataset are converted to text as follows: for each example, a sentence is formed as "stsb sentence1: " + stsb_sent1 + "sentence2: " + stsb_sent2 and fed to the tokenizer to get the input_ids and attention_mask. Unlike other GLUE tasks, STS-B is a regression task where the goal is to predict a similarity score between 1 and 5. I have used the same strategy as described in the T5 paper for fine-tuning. In the paper, each score is rounded to the nearest increment of 0.2 and converted to its string representation, which effectively recasts the STS-B regression problem as a 21-class classification problem.
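The preprocessing described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the training code used for this model: the function names `build_input`, `encode_label`, and `decode_prediction` are hypothetical, the space before `sentence2:` is an assumption, and the rounding to the nearest 0.2 follows the T5 paper's description.

```python
def build_input(sentence1: str, sentence2: str) -> str:
    """Format a sentence pair as a single text-to-text prompt."""
    # Assumption: a space separates the first sentence from "sentence2:".
    return f"stsb sentence1: {sentence1} sentence2: {sentence2}"


def encode_label(score: float) -> str:
    """Round a similarity score to the nearest 0.2 and render it as text."""
    # Work in integer fifths so the result prints cleanly (e.g. "2.6").
    fifths = round(score * 5)
    return f"{fifths / 5:.1f}"


def decode_prediction(text: str):
    """Parse generated text back into a float; return None if it is not a number."""
    try:
        return float(text.strip())
    except ValueError:
        return None


source_text = build_input("A man is playing a guitar.", "A person plays music.")
target_text = encode_label(2.57)  # "2.6"
```

The resulting `source_text` is the string that would be passed to the tokenizer, and `target_text` is the string the model learns to generate. At evaluation time, a generated string that does not parse as a number can be treated as an incorrect prediction.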


### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-4
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: epsilon=1e-08
- num_epochs: 3.0

### Training results


| Epoch | Training Loss | Validation Pearson Correlation Coefficient |
|:-----:|:-------------:|:------------------------------------------:|
|   1   |    0.8623     |                   0.8200                   |
|   2   |    0.7782     |                   0.8675                   |
|   3   |    0.7040     |                 **0.8937**                 |
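
For reference, the Pearson correlation coefficient in the table measures the linear agreement between predicted and gold similarity scores. The following is a minimal plain-Python sketch of that metric; the function name `pearson` and the sample values are illustrative only and are not taken from the actual evaluation run.

```python
import math


def pearson(xs, ys):
    """Compute the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical gold scores and decoded model predictions.
gold = [1.0, 2.6, 3.8, 5.0]
predicted = [1.2, 2.4, 4.0, 4.8]
score = pearson(gold, predicted)
```

In practice, a library routine such as `scipy.stats.pearsonr` would normally be used instead of a hand-written function.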
