---
license: mit
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: deberta-v3-base__sst2__all-train
  results: []
---
# deberta-v3-base__sst2__all-train

This model is a fine-tuned version of [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.6964
- Accuracy: 0.49
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
- mixed_precision_training: Native AMP
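The linear schedule above can be checked by hand: the learning rate decays from 2e-05 toward 0 over the total number of optimizer steps (7 steps per epoch, per the training-results table, times 50 epochs = 350 steps). A minimal pure-Python sketch, assuming zero warmup steps (the card does not report a warmup value):

```python
# Sketch of the linear LR schedule implied by the hyperparameters above.
# Assumption: zero warmup steps (not stated in the card).

LEARNING_RATE = 2e-05
STEPS_PER_EPOCH = 7      # from the training-results table
NUM_EPOCHS = 50
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 350

def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear decay to 0."""
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / TOTAL_STEPS

print(linear_lr(0))    # 2e-05 at the start of training
print(linear_lr(175))  # 1e-05 halfway through
print(linear_lr(350))  # 0.0 at the end
```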
### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 7    | 0.6964          | 0.49     |
| No log        | 2.0   | 14   | 0.7010          | 0.49     |
| No log        | 3.0   | 21   | 0.7031          | 0.49     |
| No log        | 4.0   | 28   | 0.7054          | 0.49     |
### Framework versions
- Transformers 4.15.0
- Pytorch 1.10.2+cu102
- Datasets 1.18.2
- Tokenizers 0.10.3
## Model Recycling

Evaluation on 36 datasets using SetFit/deberta-v3-base__sst2__all-train as a base model yields an average score of 79.14, compared to 79.04 for microsoft/deberta-v3-base.

The model is ranked 3rd among all tested models for the microsoft/deberta-v3-base architecture as of 09/01/2023. Results:
| 20_newsgroup | ag_news | amazon_reviews_multi | anli | boolq | cb | cola | copa | dbpedia | esnli | financial_phrasebank | imdb | isear | mnli | mrpc | multirc | poem_sentiment | qnli | qqp | rotten_tomatoes | rte | sst2 | sst_5bins | stsb | trec_coarse | trec_fine | tweet_ev_emoji | tweet_ev_emotion | tweet_ev_hate | tweet_ev_irony | tweet_ev_offensive | tweet_ev_sentiment | wic | wnli | wsc | yahoo_answers |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 86.4711 | 90.8 | 66.94 | 59.4063 | 84.4343 | 78.5714 | 86.9607 | 57 | 80 | 91.3986 | 86 | 94.452 | 71.6428 | 89.5952 | 90.1961 | 64.2533 | 87.5 | 93.3187 | 91.9936 | 90.2439 | 81.5884 | 94.7248 | 56.3801 | 89.96 | 98 | 90.8 | 47.014 | 84.4476 | 52.2896 | 78.8265 | 84.8837 | 70.8401 | 72.4138 | 67.6056 | 66.3462 | 71.7667 |
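The reported average can be verified directly from the row above; a quick sketch with the scores copied from the table:

```python
# Recompute the average model-recycling score from the per-dataset results above.
scores = [
    86.4711, 90.8, 66.94, 59.4063, 84.4343, 78.5714, 86.9607, 57, 80,
    91.3986, 86, 94.452, 71.6428, 89.5952, 90.1961, 64.2533, 87.5,
    93.3187, 91.9936, 90.2439, 81.5884, 94.7248, 56.3801, 89.96, 98,
    90.8, 47.014, 84.4476, 52.2896, 78.8265, 84.8837, 70.8401, 72.4138,
    67.6056, 66.3462, 71.7667,
]
assert len(scores) == 36  # one score per evaluated dataset

average = sum(scores) / len(scores)
print(round(average, 2))  # 79.14, matching the reported average
```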
For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)