SetFit with sentence-transformers/all-mpnet-base-v2

This is a SetFit model that can be used for Text Classification. It uses sentence-transformers/all-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
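
In setfit both phases run back-to-back inside a single Trainer.train() call. A minimal sketch, assuming a toy dataset with "text" and "label" columns (the strings below are illustrative placeholders, not this model's training data):

from datasets import Dataset
from setfit import SetFitModel, Trainer

# Toy few-shot data; the real model was trained on 488 labeled examples
# (see Training Details below).
train_dataset = Dataset.from_dict({
    "text": ["... a dialogue ending in a follow-up question ...",
             "... a dialogue ending in a standalone question ..."],
    "label": [1, 0],
})

# Start from the same Sentence Transformer body this model uses.
model = SetFitModel.from_pretrained("sentence-transformers/all-mpnet-base-v2")

trainer = Trainer(model=model, train_dataset=train_dataset)
trainer.train()  # step 1: contrastive fine-tuning of the embedding body;
                 # step 2: fitting the LogisticRegression head on its embeddings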

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: sentence-transformers/all-mpnet-base-v2
  • Classification head: a LogisticRegression instance
  • Number of Classes: 2
  • Model size: 109M parameters (F32, Safetensors)

Model Sources

  • Repository: https://github.com/huggingface/setfit
  • Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)

Model Labels

Label 1:
  • ' Who was the Germany national team captain during the 2006 World cup? Michael Ballack was the Germany national team captrain during the 2006 world cup How old was he? '
  • ' Who was the Germany national team captain during the 2006 World cup? Michael Ballack was the Germany national team captrain during the 2006 world cup Who won it back then? '
  • ' How old was Ronaldo when he moved to Real Madrid? Ronaldo moved to Real Madrid after leaving Inter when he was 25 years old. What year did he leave? '

Label 0:
  • ' Which ocean surrounds Antarctica? The ocean that surrounds Antarctica is the Southern Ocean. What challenges do scientists face when conducting research in Antarctica? '
  • ' Name a country in Oceania. A country in Oceania is Australia. What are some of the popular tourist attractions in Oceania? '
  • " What's the significance of the Suez Canal? The Suez Canal holds great importance as a crucial Egyptian waterway that links the Mediterranean Sea to the Red Sea. It plays a pivotal role in enhancing maritime trade and transportation between Europe and Asia, providing ships with a shorter and safer route compared to the arduous journey around the southern tip of Africa. How has the Suez Canal impacted global trade? "

Evaluation

Metrics

Label   Accuracy
all     0.9348
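
The same accuracy can be recomputed with the model's predict method on any labeled set; a minimal sketch (the two examples below are placeholders, not the card's evaluation split):

from setfit import SetFitModel

model = SetFitModel.from_pretrained("EuriskoMobility/trained_SetFit_follow-up_model_taline")

# Placeholder eval data; the 0.9348 above was measured on the card's own split.
texts = ["... dialogue ending in a follow-up question ...",
         "... dialogue ending in a standalone question ..."]
labels = [1, 0]

preds = model.predict(texts)  # one predicted label per input text
accuracy = sum(int(p) == y for p, y in zip(preds, labels)) / len(labels)
print(f"label accuracy: {accuracy:.4f}")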

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference:

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("EuriskoMobility/trained_SetFit_follow-up_model_taline")
# Run inference
preds = model("<Question> What is the highest grossing movie at the box office? </Question> <Answer> The highest-grossing movie at the box office is Avatar. </Answer> <Question> How much money did the movie make? </Question>")
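
The model also accepts a batch of texts, and predict_proba exposes the LogisticRegression head's class probabilities; a short sketch with placeholder inputs:

# Batch inference: a list of texts in, one predicted label per text out
preds = model([
    "<Question> ... </Question> <Answer> ... </Answer> <Question> ... </Question>",
    "<Question> ... </Question> <Answer> ... </Answer> <Question> ... </Question>",
])

# Class probabilities from the head: one row per input,
# columns following the head's class order (here labels 0 and 1)
probs = model.predict_proba([
    "<Question> ... </Question> <Answer> ... </Answer> <Question> ... </Question>",
])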

Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     14    44.4406   221

Label   Training Sample Count
0       240
1       248

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (3, 3)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
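
These values map directly onto setfit's TrainingArguments; a hedged sketch of reconstructing the configuration (anything not passed below stays at a library default that already matches the listed value):

from sentence_transformers.losses import CosineSimilarityLoss
from setfit import TrainingArguments

args = TrainingArguments(
    batch_size=(16, 16),                # (embedding phase, classifier phase)
    num_epochs=(3, 3),
    num_iterations=20,                  # contrastive pairs generated per example
    body_learning_rate=(2e-05, 2e-05),
    head_learning_rate=2e-05,
    loss=CosineSimilarityLoss,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
)
# max_steps=-1, sampling_strategy="oversampling", distance_metric=cosine_distance,
# margin=0.25, end_to_end=False, use_amp=False, eval_max_steps=-1, and
# load_best_model_at_end=False are setfit's defaults, matching the list above.

Pass args to the Trainer sketched earlier via Trainer(model=model, args=args, ...).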

Training Results

Epoch Step Training Loss Validation Loss
0.0008 1 0.5762 -
0.0410 50 0.2742 -
0.0820 100 0.2188 -
0.1230 150 0.0586 -
0.1639 200 0.0194 -
0.2049 250 0.0028 -
0.2459 300 0.0004 -
0.2869 350 0.0003 -
0.3279 400 0.0002 -
0.3689 450 0.0001 -
0.4098 500 0.0001 -
0.4508 550 0.0001 -
0.4918 600 0.0001 -
0.5328 650 0.0006 -
0.5738 700 0.0001 -
0.6148 750 0.0001 -
0.6557 800 0.0001 -
0.6967 850 0.0001 -
0.7377 900 0.0001 -
0.7787 950 0.0001 -
0.8197 1000 0.0001 -
0.8607 1050 0.0001 -
0.9016 1100 0.0001 -
0.9426 1150 0.0001 -
0.9836 1200 0.0 -
0.0008 1 0.0 -
0.0410 50 0.0 -
0.0820 100 0.0003 -
0.1230 150 0.0005 -
0.1639 200 0.0013 -
0.2049 250 0.0008 -
0.2459 300 0.0 -
0.2869 350 0.0 -
0.3279 400 0.0 -
0.3689 450 0.0 -
0.4098 500 0.0 -
0.4508 550 0.0 -
0.4918 600 0.0 -
0.5328 650 0.0 -
0.5738 700 0.0 -
0.6148 750 0.0 -
0.6557 800 0.008 -
0.6967 850 0.0285 -
0.7377 900 0.012 -
0.7787 950 0.0073 -
0.8197 1000 0.0013 -
0.8607 1050 0.0 -
0.9016 1100 0.0 -
0.9426 1150 0.0 -
0.9836 1200 0.0013 -
1.0246 1250 0.0013 -
1.0656 1300 0.0 -
1.1066 1350 0.0 -
1.1475 1400 0.0 -
1.1885 1450 0.0 -
1.2295 1500 0.0 -
1.2705 1550 0.0 -
1.3115 1600 0.0 -
1.3525 1650 0.0022 -
1.3934 1700 0.0 -
1.4344 1750 0.0 -
1.4754 1800 0.0 -
1.5164 1850 0.0013 -
1.5574 1900 0.0 -
1.5984 1950 0.0 -
1.6393 2000 0.0 -
1.6803 2050 0.0 -
1.7213 2100 0.0 -
1.7623 2150 0.0 -
1.8033 2200 0.0 -
1.8443 2250 0.0048 -
1.8852 2300 0.0023 -
1.9262 2350 0.0049 -
1.9672 2400 0.0012 -
2.0082 2450 0.0 -
2.0492 2500 0.0 -
2.0902 2550 0.0 -
2.1311 2600 0.0 -
2.1721 2650 0.0 -
2.2131 2700 0.0 -
2.2541 2750 0.0 -
2.2951 2800 0.0 -
2.3361 2850 0.0 -
2.3770 2900 0.0 -
2.4180 2950 0.0 -
2.4590 3000 0.0 -
2.5000 3050 0.0 -
2.5410 3100 0.0 -
2.5820 3150 0.0 -
2.6230 3200 0.0 -
2.6639 3250 0.0 -
2.7049 3300 0.0 -
2.7459 3350 0.0 -
2.7869 3400 0.0 -
2.8279 3450 0.0 -
2.8689 3500 0.0 -
2.9098 3550 0.0007 -
2.9508 3600 0.0 -
2.9918 3650 0.0 -

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.1.0
  • Sentence Transformers: 3.2.1
  • Transformers: 4.42.2
  • PyTorch: 2.5.1+cu121
  • Datasets: 3.1.0
  • Tokenizers: 0.19.1
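
To approximate this environment, the listed versions can be pinned at install time (a sketch; the +cu121 PyTorch build comes from the CUDA 12.1 wheel index):

pip install setfit==1.1.0 sentence-transformers==3.2.1 transformers==4.42.2 datasets==3.1.0 tokenizers==0.19.1
pip install torch==2.5.1 --index-url https://download.pytorch.org/whl/cu121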

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}