SetFit with microsoft/deberta-v3-base

This is a SetFit model for text classification, trained on the bhujith10/multi_class_classification_dataset dataset. It uses microsoft/deberta-v3-base as the Sentence Transformer embedding model and a SetFitHead instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
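
Step 1 relies on turning a handful of labeled texts into sentence pairs. The following is a minimal sketch of that idea, not SetFit's own sampler: it enumerates every pair and assigns a target of 1.0 when both texts share a label and 0.0 otherwise, which is the kind of target a cosine-similarity loss is trained against. The function name `make_contrastive_pairs`, the toy texts, and the label names are hypothetical, and the sketch does not reproduce the balancing that the oversampling strategy performs.

```python
import itertools
import random


def make_contrastive_pairs(texts, labels, seed=42):
    """Build (text_a, text_b, target) triples from labeled texts.

    target is 1.0 when both texts share a label, otherwise 0.0.
    """
    pairs = []
    examples = list(zip(texts, labels))
    for (text_a, label_a), (text_b, label_b) in itertools.combinations(examples, 2):
        pairs.append((text_a, text_b, 1.0 if label_a == label_b else 0.0))
    # Shuffle deterministically so batches mix positive and negative pairs.
    random.Random(seed).shuffle(pairs)
    return pairs


# Hypothetical toy data; the real dataset's labels are not shown in this card.
texts = [
    "Spin-orbit coupling in iron-based superconductors",
    "Angle-resolved photoemission of Fe-chalcogenides",
    "A faster algorithm for sparse matrix multiplication",
    "Approximation bounds for graph partitioning",
]
labels = ["physics", "physics", "cs", "cs"]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
```

With four texts in two classes this yields six pairs, two positive and four negative. The imbalance grows with the number of classes, which is one reason a sampling strategy is applied on top of plain enumeration.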

Model Details

Model Description

Model Sources

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download the model from the 🤗 Hub
model = SetFitModel.from_pretrained("bhujith10/deberta-v3-base-setfit_finetuned")

# Run inference; the input spans several lines, so it needs a triple-quoted string
preds = model("""Title: Influence of Spin Orbit Coupling in the Iron-Based Superconductors,
Abstract: We report on the influence of spin-orbit coupling (SOC) in the Fe-based
superconductors (FeSCs) via application of circularly-polarized spin and
angle-resolved photoemission spectroscopy. We combine this technique in
representative members of both the Fe-pnictides and Fe-chalcogenides with ab
initio density functional theory and tight-binding calculations to establish an
ubiquitous modification of the electronic structure in these materials imbued
by SOC. The influence of SOC is found to be concentrated on the hole pockets
where the superconducting gap is generally found to be largest. This result
contests descriptions of superconductivity in these materials in terms of pure
spin-singlet eigenstates, raising questions regarding the possible pairing
mechanisms and role of SOC therein.""")
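
To classify several papers at once, the inputs can be formatted the same way as the example above and passed to the model as a list. This is a sketch under assumptions: the "Title: ..., Abstract: ..." layout is inferred from the single example in this card and may not match every training sample, and the helper names `format_paper` and `classify_papers` are hypothetical. `SetFitModel.predict` is the SetFit method that accepts a list of strings.

```python
def format_paper(title: str, abstract: str) -> str:
    """Join a title and abstract in the layout used by the example input."""
    return f"Title: {title.strip()},\nAbstract: {abstract.strip()}"


def classify_papers(papers, model_id="bhujith10/deberta-v3-base-setfit_finetuned"):
    """Return one prediction per (title, abstract) pair.

    The import is done lazily because it requires `pip install setfit`,
    and the first call downloads the model from the Hub.
    """
    from setfit import SetFitModel

    model = SetFitModel.from_pretrained(model_id)
    texts = [format_paper(title, abstract) for title, abstract in papers]
    return model.predict(texts)


# Formatting alone needs no model download.
example_text = format_paper(
    "  Influence of Spin Orbit Coupling in the Iron-Based Superconductors ",
    "We report on the influence of spin-orbit coupling (SOC).",
)
```

Loading the model once and passing a list avoids repeated downloads and lets the library batch the inputs.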

Training Details

Training Set Metrics

Training set   Min   Median   Max
Word count     23    148.1    303
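
Statistics like these can be reproduced from the raw texts with a short script. The sketch below assumes whitespace tokenization, which may differ from how the card's numbers were produced, and `word_count_stats` is a hypothetical helper name. It reports both median and mean because a median of integer word counts is always a whole number or ends in .5, so the 148.1 listed under "Median" is more likely an arithmetic mean; that reading is an assumption, not something this card states.

```python
import statistics


def word_count_stats(texts):
    """Return min, median, mean, and max whitespace-delimited word counts."""
    counts = [len(text.split()) for text in texts]
    return {
        "min": min(counts),
        "median": statistics.median(counts),
        "mean": round(statistics.mean(counts), 1),
        "max": max(counts),
    }


# Toy inputs with 3, 5, and 8 words.
sample = ["a b c", "a b c d e", "a b c d e f g h"]
stats = word_count_stats(sample)
```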

Training Hyperparameters

  • batch_size: (4, 4)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
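
Several of these values are tuples. In SetFit's training arguments a tuple gives the setting for the embedding fine-tuning phase first and the classifier training phase second. The sketch below only unpacks that convention into one dictionary per phase; `split_phases` is a hypothetical helper, not part of SetFit, and only a subset of the hyperparameters is shown.

```python
def split_phases(hparams):
    """Separate (embedding, classifier) tuples from settings shared by both phases."""
    phases = {"embedding": {}, "classifier": {}, "shared": {}}
    for name, value in hparams.items():
        if isinstance(value, tuple):
            # First element: embedding phase. Second element: classifier phase.
            phases["embedding"][name], phases["classifier"][name] = value
        else:
            phases["shared"][name] = value
    return phases


# A subset of the hyperparameters listed above.
hparams = {
    "batch_size": (4, 4),
    "num_epochs": (1, 1),
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "end_to_end": False,
    "seed": 42,
}
phases = split_phases(hparams)
```

Here the embedding phase uses a body learning rate of 2e-05. Since end_to_end is False, the second body learning rate (1e-05) probably has no effect, because the body appears to be updated during classifier training only in end-to-end mode.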

Training Results

Epoch Step Training Loss Validation Loss
0.0002 1 0.4731 -
0.0078 50 0.4561 -
0.0155 100 0.4156 -
0.0233 150 0.2469 -
0.0311 200 0.2396 -
0.0388 250 0.2376 -
0.0466 300 0.2519 -
0.0543 350 0.1987 -
0.0621 400 0.1908 -
0.0699 450 0.161 -
0.0776 500 0.1532 -
0.0854 550 0.17 -
0.0932 600 0.139 -
0.1009 650 0.1406 -
0.1087 700 0.1239 -
0.1165 750 0.1332 -
0.1242 800 0.1566 -
0.1320 850 0.0932 -
0.1398 900 0.1101 -
0.1475 950 0.1153 -
0.1553 1000 0.0979 -
0.1630 1050 0.0741 -
0.1708 1100 0.0603 -
0.1786 1150 0.1027 -
0.1863 1200 0.0948 -
0.1941 1250 0.0968 -
0.2019 1300 0.085 -
0.2096 1350 0.0883 -
0.2174 1400 0.0792 -
0.2252 1450 0.1054 -
0.2329 1500 0.0556 -
0.2407 1550 0.0777 -
0.2484 1600 0.0922 -
0.2562 1650 0.076 -
0.2640 1700 0.0693 -
0.2717 1750 0.0857 -
0.2795 1800 0.0907 -
0.2873 1850 0.0621 -
0.2950 1900 0.0792 -
0.3028 1950 0.0608 -
0.3106 2000 0.052 -
0.3183 2050 0.056 -
0.3261 2100 0.0501 -
0.3339 2150 0.0559 -
0.3416 2200 0.0526 -
0.3494 2250 0.0546 -
0.3571 2300 0.0398 -
0.3649 2350 0.0527 -
0.3727 2400 0.0522 -
0.3804 2450 0.0468 -
0.3882 2500 0.0465 -
0.3960 2550 0.0393 -
0.4037 2600 0.0583 -
0.4115 2650 0.0278 -
0.4193 2700 0.0502 -
0.4270 2750 0.0413 -
0.4348 2800 0.0538 -
0.4425 2850 0.0361 -
0.4503 2900 0.0648 -
0.4581 2950 0.0459 -
0.4658 3000 0.0521 -
0.4736 3050 0.0288 -
0.4814 3100 0.0323 -
0.4891 3150 0.0335 -
0.4969 3200 0.0472 -
0.5047 3250 0.0553 -
0.5124 3300 0.0426 -
0.5202 3350 0.0276 -
0.5280 3400 0.0395 -
0.5357 3450 0.042 -
0.5435 3500 0.0343 -
0.5512 3550 0.0314 -
0.5590 3600 0.0266 -
0.5668 3650 0.0314 -
0.5745 3700 0.0379 -
0.5823 3750 0.0485 -
0.5901 3800 0.0311 -
0.5978 3850 0.0415 -
0.6056 3900 0.0266 -
0.6134 3950 0.0384 -
0.6211 4000 0.0348 -
0.6289 4050 0.0298 -
0.6366 4100 0.032 -
0.6444 4150 0.031 -
0.6522 4200 0.0367 -
0.6599 4250 0.0289 -
0.6677 4300 0.0333 -
0.6755 4350 0.0281 -
0.6832 4400 0.0307 -
0.6910 4450 0.0312 -
0.6988 4500 0.0488 -
0.7065 4550 0.03 -
0.7143 4600 0.0309 -
0.7220 4650 0.031 -
0.7298 4700 0.0268 -
0.7376 4750 0.0324 -
0.7453 4800 0.041 -
0.7531 4850 0.0349 -
0.7609 4900 0.0349 -
0.7686 4950 0.0291 -
0.7764 5000 0.025 -
0.7842 5050 0.0249 -
0.7919 5100 0.0272 -
0.7997 5150 0.0302 -
0.8075 5200 0.0414 -
0.8152 5250 0.0295 -
0.8230 5300 0.033 -
0.8307 5350 0.0203 -
0.8385 5400 0.0275 -
0.8463 5450 0.0354 -
0.8540 5500 0.0254 -
0.8618 5550 0.0313 -
0.8696 5600 0.0296 -
0.8773 5650 0.0248 -
0.8851 5700 0.036 -
0.8929 5750 0.025 -
0.9006 5800 0.0234 -
0.9084 5850 0.0221 -
0.9161 5900 0.0314 -
0.9239 5950 0.0273 -
0.9317 6000 0.0299 -
0.9394 6050 0.0262 -
0.9472 6100 0.0285 -
0.9550 6150 0.021 -
0.9627 6200 0.0215 -
0.9705 6250 0.0312 -
0.9783 6300 0.0259 -
0.9860 6350 0.0234 -
0.9938 6400 0.0222 -
1.0 6440 - 0.1609
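
The Epoch column is the step number divided by the total number of steps, rounded to four decimals. The sketch below checks that against a few rows and estimates how many training pairs one epoch covered. It assumes the table logs the embedding fine-tuning phase and that the batch size of 4 from the hyperparameters applies to it; the constant and function names are hypothetical.

```python
TOTAL_STEPS = 6440  # final step in the table, logged at epoch 1.0
BATCH_SIZE = 4      # embedding-phase batch size from the hyperparameters


def epoch_fraction(step, total_steps=TOTAL_STEPS):
    """Fraction of the epoch completed at a given step, as printed in the table."""
    return round(step / total_steps, 4)


# Upper bound on sentence pairs seen in one epoch (the last batch may be partial).
approx_pairs = TOTAL_STEPS * BATCH_SIZE

checks = {step: epoch_fraction(step) for step in (1, 50, 3200, 6400)}
```

Under these assumptions one epoch covers at most 25,760 sentence pairs. The only validation loss reported (0.1609 at step 6440) is well above the final training losses of roughly 0.02 to 0.03, which may indicate that the embedding body fits the training pairs more closely than held-out pairs.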

Framework Versions

  • Python: 3.10.14
  • SetFit: 1.1.0
  • Sentence Transformers: 3.3.1
  • Transformers: 4.45.2
  • PyTorch: 2.4.0
  • Datasets: 3.0.1
  • Tokenizers: 0.20.0
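
To compare a local environment against these versions, the installed distributions can be queried with the standard library. This is a minimal sketch: the distribution names (for example `torch` for PyTorch and `sentence-transformers`) are the usual PyPI names, `version_mismatches` is a hypothetical helper, and local build suffixes such as `+cu121` are ignored on the assumption that they do not matter for compatibility.

```python
from importlib import metadata

# Versions listed in this card, keyed by PyPI distribution name.
PINNED = {
    "setfit": "1.1.0",
    "sentence-transformers": "3.3.1",
    "transformers": "4.45.2",
    "torch": "2.4.0",
    "datasets": "3.0.1",
    "tokenizers": "0.20.0",
}


def version_mismatches(pinned=PINNED, lookup=metadata.version):
    """Return {package: (expected, installed)} for packages that differ or are missing."""
    mismatches = {}
    for package, expected in pinned.items():
        try:
            # Drop local build tags such as "+cu121" before comparing.
            installed = lookup(package).split("+")[0]
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[package] = (expected, installed)
    return mismatches
```

Calling `version_mismatches()` with no arguments inspects the current environment; an empty dictionary means every listed package matches.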

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 184M params (Safetensors, tensor type F32)