eladven's picture
Evaluation results for SetFit/deberta-v3-base__sst2__all-train model as a base model for other tasks
09f02c8
|
raw
history blame
4.22 kB
---
license: mit
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: deberta-v3-base__sst2__all-train
results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# deberta-v3-base__sst2__all-train
This model is a fine-tuned version of [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.6964
- Accuracy: 0.49
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
- mixed_precision_training: Native AMP
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 7 | 0.6964 | 0.49 |
| No log | 2.0 | 14 | 0.7010 | 0.49 |
| No log | 3.0 | 21 | 0.7031 | 0.49 |
| No log | 4.0 | 28 | 0.7054 | 0.49 |
### Framework versions
- Transformers 4.15.0
- Pytorch 1.10.2+cu102
- Datasets 1.18.2
- Tokenizers 0.10.3
## Model Recycling
[Evaluation on 36 datasets](https://ibm.github.io/model-recycling/model_gain_chart?avg=0.10&mnli_lp=nan&20_newsgroup=0.06&ag_news=0.36&amazon_reviews_multi=0.08&anli=0.63&boolq=1.45&cb=3.57&cola=0.39&copa=-1.40&dbpedia=0.57&esnli=-0.53&financial_phrasebank=1.52&imdb=-0.04&isear=-0.22&mnli=-0.19&mrpc=0.99&multirc=2.00&poem_sentiment=0.77&qnli=-0.19&qqp=0.21&rotten_tomatoes=-0.18&rte=-0.76&sst2=-0.34&sst_5bins=-0.60&stsb=-0.32&trec_coarse=0.24&trec_fine=-0.22&tweet_ev_emoji=0.82&tweet_ev_emotion=0.50&tweet_ev_hate=-3.92&tweet_ev_irony=-0.99&tweet_ev_offensive=-0.17&tweet_ev_sentiment=-0.96&wic=1.20&wnli=-2.61&wsc=2.26&yahoo_answers=-0.27&model_name=SetFit%2Fdeberta-v3-base__sst2__all-train&base_name=microsoft%2Fdeberta-v3-base) using SetFit/deberta-v3-base__sst2__all-train as a base model yields average score of 79.14 in comparison to 79.04 by microsoft/deberta-v3-base.
The model is ranked 3rd among all tested models for the microsoft/deberta-v3-base architecture as of 09/01/2023
Results:
| 20_newsgroup | ag_news | amazon_reviews_multi | anli | boolq | cb | cola | copa | dbpedia | esnli | financial_phrasebank | imdb | isear | mnli | mrpc | multirc | poem_sentiment | qnli | qqp | rotten_tomatoes | rte | sst2 | sst_5bins | stsb | trec_coarse | trec_fine | tweet_ev_emoji | tweet_ev_emotion | tweet_ev_hate | tweet_ev_irony | tweet_ev_offensive | tweet_ev_sentiment | wic | wnli | wsc | yahoo_answers |
|---------------:|----------:|-----------------------:|--------:|--------:|--------:|--------:|-------:|----------:|--------:|-----------------------:|-------:|--------:|--------:|--------:|----------:|-----------------:|--------:|--------:|------------------:|--------:|--------:|------------:|-------:|--------------:|------------:|-----------------:|-------------------:|----------------:|-----------------:|---------------------:|---------------------:|--------:|--------:|--------:|----------------:|
| 86.4711 | 90.8 | 66.94 | 59.4063 | 84.4343 | 78.5714 | 86.9607 | 57 | 80 | 91.3986 | 86 | 94.452 | 71.6428 | 89.5952 | 90.1961 | 64.2533 | 87.5 | 93.3187 | 91.9936 | 90.2439 | 81.5884 | 94.7248 | 56.3801 | 89.96 | 98 | 90.8 | 47.014 | 84.4476 | 52.2896 | 78.8265 | 84.8837 | 70.8401 | 72.4138 | 67.6056 | 66.3462 | 71.7667 |
For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)