
SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model that can be used for Text Classification. This SetFit model uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
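
The two stages are reflected in the loaded model itself: the Sentence Transformer body produces an embedding, and the LogisticRegression head classifies it. A rough sketch of that anatomy, assuming direct access to the model_body and model_head attributes exposed by recent setfit releases:

from setfit import SetFitModel

# Load the full pipeline: Sentence Transformer body + LogisticRegression head.
model = SetFitModel.from_pretrained("victomoe/setfit-intent-classifier")

# Stage 1: the fine-tuned Sentence Transformer maps text to dense embeddings.
embeddings = model.model_body.encode(["Can you take me to floor 5?"])

# Stage 2: the classification head maps embeddings to intent labels.
# Depending on how the head was fit, this may return label strings or indices.
print(model.model_head.predict(embeddings))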

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: sentence-transformers/paraphrase-mpnet-base-v2
  • Classification head: a LogisticRegression instance
  • Number of Classes: 8
  • Model size: ~109M parameters (F32)

Model Sources

  • Repository: https://github.com/huggingface/setfit
  • Paper: https://arxiv.org/abs/2209.11055

Model Labels

Label Examples
RequestMoveToFloor
  • 'Please go to the 3rd floor.'
  • 'Can you take me to floor 5?'
  • 'I need to go to the 8th floor.'
Confirm
  • "Yes, that's right."
  • 'Sure.'
  • 'Exactly.'
RequestEmployeeLocation
  • 'Where is Erik Velldal’s office?'
  • 'Which floor is Andreas Austeng on?'
  • 'Can you tell me where Birthe Soppe’s office is?'
Feedback
  • 'Okay, going to the 3rd floor.'
  • 'Sure, heading to floor 5.'
  • 'Understood, taking you to the 8th floor.'
Repeat
  • 'Can you repeat that?'
  • 'Sorry, I didn’t get that. Can you say it again?'
  • 'What was that?'
CurrentFloor
  • 'Which floor are we on?'
  • 'What floor is this?'
  • 'Are we on the 5th floor?'
Stop
  • 'Stop the elevator.'
  • "Wait, don't go to that floor."
  • 'No, not that floor.'
OutOfCoverage
  • "What's the capital of France?"
  • 'How many floors does this building have?'
  • 'Can you make a phone call for me?'
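
To make the label set concrete, here is a minimal sketch of how an application might route predicted intents; the ACTIONS table and its action descriptions are purely illustrative and assume the model returns the string labels listed above:

from setfit import SetFitModel

model = SetFitModel.from_pretrained("victomoe/setfit-intent-classifier")

# Purely illustrative mapping from predicted intent to an application action;
# the action strings are hypothetical and not part of the model.
ACTIONS = {
    "RequestMoveToFloor": "parse the floor number and move the elevator",
    "RequestEmployeeLocation": "look up the employee's office floor",
    "CurrentFloor": "report the current floor",
    "Stop": "cancel the current movement",
    "Repeat": "repeat the last system prompt",
    "Confirm": "proceed with the pending action",
    "Feedback": "acknowledge the system's own status message",
    "OutOfCoverage": "fall back to a default response",
}

intent = model.predict(["Can you take me to floor 5?"])[0]
print(intent, "->", ACTIONS.get(intent, "fall back to a default response"))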

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("victomoe/setfit-intent-classifier")
# Run inference
preds = model("Yes, please.")
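
Inference also works on batches, and because the head is a LogisticRegression, class probabilities are available as well. A small sketch; note that the column ordering of the probabilities follows the head's internal class order, which is not guaranteed to match the label listing above:

from setfit import SetFitModel

model = SetFitModel.from_pretrained("victomoe/setfit-intent-classifier")

utterances = [
    "Please go to the 3rd floor.",
    "Where is Erik Velldal's office?",
    "What's the capital of France?",
]

# One predicted label per input.
print(model.predict(utterances))

# Per-class probabilities from the LogisticRegression head (one row per input).
print(model.predict_proba(utterances))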

Training Details

Training Set Metrics

Training set   Min   Median   Max
Word count     1     5.2267   10

Label                      Training Sample Count
Confirm                    22
CurrentFloor               21
Feedback                   22
OutOfCoverage              22
Repeat                     20
RequestEmployeeLocation    22
RequestMoveToFloor         23
Stop                       20

Training Hyperparameters

  • batch_size: (32, 32)
  • num_epochs: (10, 10)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
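
These settings correspond one-to-one to setfit's TrainingArguments. A minimal training sketch under that assumption; the two-example dataset below is a placeholder, since the original training data is not included in this card:

from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Placeholder few-shot dataset with "text" and "label" columns.
train_dataset = Dataset.from_dict({
    "text": ["Please go to the 3rd floor.", "Can you repeat that?"],
    "label": ["RequestMoveToFloor", "Repeat"],
})

# Initialize from the Sentence Transformer body; a LogisticRegression head is
# created by default.
model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

args = TrainingArguments(
    batch_size=(32, 32),            # (embedding phase, head phase)
    num_epochs=(10, 10),
    sampling_strategy="oversampling",
    body_learning_rate=(2e-5, 1e-5),
    head_learning_rate=0.01,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
)

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()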

Training Results

Epoch Step Training Loss Validation Loss
0.0012 1 0.0001 -
0.0618 50 0.0001 -
0.1236 100 0.0001 -
0.1854 150 0.0001 -
0.2472 200 0.0001 -
0.3090 250 0.0001 -
0.3708 300 0.0001 -
0.4326 350 0.0001 -
0.4944 400 0.0001 -
0.5562 450 0.0001 -
0.6180 500 0.0001 -
0.6799 550 0.0001 -
0.7417 600 0.0012 -
0.8035 650 0.0001 -
0.8653 700 0.0001 -
0.9271 750 0.0012 -
0.9889 800 0.0001 -
1.0507 850 0.0001 -
1.1125 900 0.0001 -
1.1743 950 0.0001 -
1.2361 1000 0.0001 -
1.2979 1050 0.0001 -
1.3597 1100 0.0001 -
1.4215 1150 0.0001 -
1.4833 1200 0.0001 -
1.5451 1250 0.0001 -
1.6069 1300 0.0001 -
1.6687 1350 0.0001 -
1.7305 1400 0.0001 -
1.7923 1450 0.0001 -
1.8541 1500 0.0023 -
1.9159 1550 0.0018 -
1.9778 1600 0.0007 -
2.0396 1650 0.0001 -
2.1014 1700 0.0001 -
2.1632 1750 0.0001 -
2.2250 1800 0.0001 -
2.2868 1850 0.0001 -
2.3486 1900 0.0001 -
2.4104 1950 0.0001 -
2.4722 2000 0.0001 -
2.5340 2050 0.0001 -
2.5958 2100 0.0001 -
2.6576 2150 0.0001 -
2.7194 2200 0.0001 -
2.7812 2250 0.0001 -
2.8430 2300 0.0001 -
2.9048 2350 0.0001 -
2.9666 2400 0.0001 -
3.0284 2450 0.0001 -
3.0902 2500 0.0001 -
3.1520 2550 0.0001 -
3.2138 2600 0.0001 -
3.2756 2650 0.0001 -
3.3375 2700 0.0001 -
3.3993 2750 0.0001 -
3.4611 2800 0.0001 -
3.5229 2850 0.0001 -
3.5847 2900 0.0001 -
3.6465 2950 0.0001 -
3.7083 3000 0.0001 -
3.7701 3050 0.0001 -
3.8319 3100 0.0 -
3.8937 3150 0.0 -
3.9555 3200 0.0001 -
4.0173 3250 0.0001 -
4.0791 3300 0.0 -
4.1409 3350 0.0001 -
4.2027 3400 0.0001 -
4.2645 3450 0.0001 -
4.3263 3500 0.0 -
4.3881 3550 0.0001 -
4.4499 3600 0.0001 -
4.5117 3650 0.0 -
4.5735 3700 0.0 -
4.6354 3750 0.0 -
4.6972 3800 0.0001 -
4.7590 3850 0.0 -
4.8208 3900 0.0 -
4.8826 3950 0.0 -
4.9444 4000 0.0 -
5.0062 4050 0.0 -
5.0680 4100 0.0 -
5.1298 4150 0.0001 -
5.1916 4200 0.0148 -
5.2534 4250 0.0258 -
5.3152 4300 0.0147 -
5.3770 4350 0.0015 -
5.4388 4400 0.0001 -
5.5006 4450 0.0001 -
5.5624 4500 0.0001 -
5.6242 4550 0.0001 -
5.6860 4600 0.0001 -
5.7478 4650 0.0001 -
5.8096 4700 0.0001 -
5.8714 4750 0.0001 -
5.9333 4800 0.0001 -
5.9951 4850 0.0001 -
6.0569 4900 0.0001 -
6.1187 4950 0.0001 -
6.1805 5000 0.0001 -
6.2423 5050 0.0001 -
6.3041 5100 0.0001 -
6.3659 5150 0.0001 -
6.4277 5200 0.0001 -
6.4895 5250 0.0001 -
6.5513 5300 0.0001 -
6.6131 5350 0.0001 -
6.6749 5400 0.0001 -
6.7367 5450 0.0001 -
6.7985 5500 0.0001 -
6.8603 5550 0.0001 -
6.9221 5600 0.0001 -
6.9839 5650 0.0001 -
7.0457 5700 0.0001 -
7.1075 5750 0.0001 -
7.1693 5800 0.0001 -
7.2311 5850 0.0001 -
7.2930 5900 0.0001 -
7.3548 5950 0.0001 -
7.4166 6000 0.0001 -
7.4784 6050 0.0001 -
7.5402 6100 0.0001 -
7.6020 6150 0.0001 -
7.6638 6200 0.0001 -
7.7256 6250 0.0001 -
7.7874 6300 0.0001 -
7.8492 6350 0.0001 -
7.9110 6400 0.0001 -
7.9728 6450 0.0001 -
8.0346 6500 0.0001 -
8.0964 6550 0.0001 -
8.1582 6600 0.0001 -
8.2200 6650 0.0001 -
8.2818 6700 0.0001 -
8.3436 6750 0.0001 -
8.4054 6800 0.0001 -
8.4672 6850 0.0 -
8.5290 6900 0.0001 -
8.5909 6950 0.0 -
8.6527 7000 0.0 -
8.7145 7050 0.0 -
8.7763 7100 0.0001 -
8.8381 7150 0.0001 -
8.8999 7200 0.0001 -
8.9617 7250 0.0 -
9.0235 7300 0.0 -
9.0853 7350 0.0 -
9.1471 7400 0.0001 -
9.2089 7450 0.0 -
9.2707 7500 0.0 -
9.3325 7550 0.0 -
9.3943 7600 0.0001 -
9.4561 7650 0.0001 -
9.5179 7700 0.0 -
9.5797 7750 0.0 -
9.6415 7800 0.0 -
9.7033 7850 0.0 -
9.7651 7900 0.0001 -
9.8269 7950 0.0 -
9.8888 8000 0.0001 -
9.9506 8050 0.0 -

Framework Versions

  • Python: 3.10.8
  • SetFit: 1.1.0
  • Sentence Transformers: 3.1.1
  • Transformers: 4.38.2
  • PyTorch: 2.1.2
  • Datasets: 2.17.1
  • Tokenizers: 0.15.0
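
To approximate this environment, the versions above can be pinned at install time (a sketch; package names as published on PyPI):

pip install "setfit==1.1.0" "sentence-transformers==3.1.1" "transformers==4.38.2" "torch==2.1.2" "datasets==2.17.1" "tokenizers==0.15.0"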

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}