
SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model that can be used for Text Classification. This SetFit model uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
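
The two stages are reflected in the loaded model itself: the Sentence Transformer body produces an embedding, and the LogisticRegression head classifies it. A rough sketch of that anatomy, assuming direct access to the model_body and model_head attributes exposed by recent setfit releases:

from setfit import SetFitModel

# Load the full pipeline: Sentence Transformer body + LogisticRegression head.
model = SetFitModel.from_pretrained("victomoe/setfit-intent-classifier")

# Stage 1: the fine-tuned Sentence Transformer maps text to dense embeddings.
embeddings = model.model_body.encode(["Can you take me to floor 5?"])

# Stage 2: the classification head maps embeddings to intent labels.
# Depending on how the head was fit, this may return label strings or indices.
print(model.model_head.predict(embeddings))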

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: sentence-transformers/paraphrase-mpnet-base-v2
  • Classification head: a LogisticRegression instance
  • Number of Classes: 8
  • Model size: ~109M parameters (F32)

Model Sources

  • Repository: https://github.com/huggingface/setfit
  • Paper: https://arxiv.org/abs/2209.11055

Model Labels

Label Examples
RequestMoveToFloor
  • 'Please go to the 3rd floor.'
  • 'Can you take me to floor 5?'
  • 'I need to go to the 8th floor.'
Confirm
  • "Yes, that's right."
  • 'Sure.'
  • 'Exactly.'
RequestEmployeeLocation
  • 'Where is Erik Velldal’s office?'
  • 'Which floor is Andreas Austeng on?'
  • 'Can you tell me where Birthe Soppe’s office is?'
Feedback
  • 'Okay, going to the 3rd floor.'
  • 'Sure, heading to floor 5.'
  • 'Understood, taking you to the 8th floor.'
Repeat
  • 'Can you repeat that?'
  • 'Sorry, I didn’t get that. Can you say it again?'
  • 'What was that?'
CurrentFloor
  • 'Which floor are we on?'
  • 'What floor is this?'
  • 'Are we on the 5th floor?'
Stop
  • 'Stop the elevator.'
  • "Wait, don't go to that floor."
  • 'No, not that floor.'
OutOfCoverage
  • "What's the capital of France?"
  • 'How many floors does this building have?'
  • 'Can you make a phone call for me?'
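
To make the label set concrete, here is a minimal sketch of how an application might route predicted intents; the ACTIONS table and its action descriptions are purely illustrative and assume the model returns the string labels listed above:

from setfit import SetFitModel

model = SetFitModel.from_pretrained("victomoe/setfit-intent-classifier")

# Purely illustrative mapping from predicted intent to an application action;
# the action strings are hypothetical and not part of the model.
ACTIONS = {
    "RequestMoveToFloor": "parse the floor number and move the elevator",
    "RequestEmployeeLocation": "look up the employee's office floor",
    "CurrentFloor": "report the current floor",
    "Stop": "cancel the current movement",
    "Repeat": "repeat the last system prompt",
    "Confirm": "proceed with the pending action",
    "Feedback": "acknowledge the system's own status message",
    "OutOfCoverage": "fall back to a default response",
}

intent = model.predict(["Can you take me to floor 5?"])[0]
print(intent, "->", ACTIONS.get(intent, "fall back to a default response"))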

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("victomoe/setfit-intent-classifier")
# Run inference
preds = model("Yes, please.")
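
Inference also works on batches, and because the head is a LogisticRegression, class probabilities are available as well. A small sketch; note that the column ordering of the probabilities follows the head's internal class order, which is not guaranteed to match the label listing above:

from setfit import SetFitModel

model = SetFitModel.from_pretrained("victomoe/setfit-intent-classifier")

utterances = [
    "Please go to the 3rd floor.",
    "Where is Erik Velldal's office?",
    "What's the capital of France?",
]

# One predicted label per input.
print(model.predict(utterances))

# Per-class probabilities from the LogisticRegression head (one row per input).
print(model.predict_proba(utterances))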

Training Details

Training Set Metrics

Training set   Min   Median   Max
Word count     1     5.2267   10

Label                      Training Sample Count
Confirm                    22
CurrentFloor               21
Feedback                   22
OutOfCoverage              22
Repeat                     20
RequestEmployeeLocation    22
RequestMoveToFloor         23
Stop                       20

Training Hyperparameters

  • batch_size: (32, 32)
  • num_epochs: (10, 10)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
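
These settings correspond one-to-one to setfit's TrainingArguments. A minimal training sketch under that assumption; the two-example dataset below is a placeholder, since the original training data is not included in this card:

from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Placeholder few-shot dataset with "text" and "label" columns.
train_dataset = Dataset.from_dict({
    "text": ["Please go to the 3rd floor.", "Can you repeat that?"],
    "label": ["RequestMoveToFloor", "Repeat"],
})

# Initialize from the Sentence Transformer body; a LogisticRegression head is
# created by default.
model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

args = TrainingArguments(
    batch_size=(32, 32),            # (embedding phase, head phase)
    num_epochs=(10, 10),
    sampling_strategy="oversampling",
    body_learning_rate=(2e-5, 1e-5),
    head_learning_rate=0.01,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
)

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()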

Training Results

Epoch Step Training Loss Validation Loss
0.0012 1 0.0001 -
0.0618 50 0.0001 -
0.1236 100 0.0001 -
0.1854 150 0.0001 -
0.2472 200 0.0001 -
0.3090 250 0.0001 -
0.3708 300 0.0001 -
0.4326 350 0.0001 -
0.4944 400 0.0001 -
0.5562 450 0.0001 -
0.6180 500 0.0001 -
0.6799 550 0.0001 -
0.7417 600 0.0012 -
0.8035 650 0.0001 -
0.8653 700 0.0001 -
0.9271 750 0.0012 -
0.9889 800 0.0001 -
1.0507 850 0.0001 -
1.1125 900 0.0001 -
1.1743 950 0.0001 -
1.2361 1000 0.0001 -
1.2979 1050 0.0001 -
1.3597 1100 0.0001 -
1.4215 1150 0.0001 -
1.4833 1200 0.0001 -
1.5451 1250 0.0001 -
1.6069 1300 0.0001 -
1.6687 1350 0.0001 -
1.7305 1400 0.0001 -
1.7923 1450 0.0001 -
1.8541 1500 0.0023 -
1.9159 1550 0.0018 -
1.9778 1600 0.0007 -
2.0396 1650 0.0001 -
2.1014 1700 0.0001 -
2.1632 1750 0.0001 -
2.2250 1800 0.0001 -
2.2868 1850 0.0001 -
2.3486 1900 0.0001 -
2.4104 1950 0.0001 -
2.4722 2000 0.0001 -
2.5340 2050 0.0001 -
2.5958 2100 0.0001 -
2.6576 2150 0.0001 -
2.7194 2200 0.0001 -
2.7812 2250 0.0001 -
2.8430 2300 0.0001 -
2.9048 2350 0.0001 -
2.9666 2400 0.0001 -
3.0284 2450 0.0001 -
3.0902 2500 0.0001 -
3.1520 2550 0.0001 -
3.2138 2600 0.0001 -
3.2756 2650 0.0001 -
3.3375 2700 0.0001 -
3.3993 2750 0.0001 -
3.4611 2800 0.0001 -
3.5229 2850 0.0001 -
3.5847 2900 0.0001 -
3.6465 2950 0.0001 -
3.7083 3000 0.0001 -
3.7701 3050 0.0001 -
3.8319 3100 0.0 -
3.8937 3150 0.0 -
3.9555 3200 0.0001 -
4.0173 3250 0.0001 -
4.0791 3300 0.0 -
4.1409 3350 0.0001 -
4.2027 3400 0.0001 -
4.2645 3450 0.0001 -
4.3263 3500 0.0 -
4.3881 3550 0.0001 -
4.4499 3600 0.0001 -
4.5117 3650 0.0 -
4.5735 3700 0.0 -
4.6354 3750 0.0 -
4.6972 3800 0.0001 -
4.7590 3850 0.0 -
4.8208 3900 0.0 -
4.8826 3950 0.0 -
4.9444 4000 0.0 -
5.0062 4050 0.0 -
5.0680 4100 0.0 -
5.1298 4150 0.0001 -
5.1916 4200 0.0148 -
5.2534 4250 0.0258 -
5.3152 4300 0.0147 -
5.3770 4350 0.0015 -
5.4388 4400 0.0001 -
5.5006 4450 0.0001 -
5.5624 4500 0.0001 -
5.6242 4550 0.0001 -
5.6860 4600 0.0001 -
5.7478 4650 0.0001 -
5.8096 4700 0.0001 -
5.8714 4750 0.0001 -
5.9333 4800 0.0001 -
5.9951 4850 0.0001 -
6.0569 4900 0.0001 -
6.1187 4950 0.0001 -
6.1805 5000 0.0001 -
6.2423 5050 0.0001 -
6.3041 5100 0.0001 -
6.3659 5150 0.0001 -
6.4277 5200 0.0001 -
6.4895 5250 0.0001 -
6.5513 5300 0.0001 -
6.6131 5350 0.0001 -
6.6749 5400 0.0001 -
6.7367 5450 0.0001 -
6.7985 5500 0.0001 -
6.8603 5550 0.0001 -
6.9221 5600 0.0001 -
6.9839 5650 0.0001 -
7.0457 5700 0.0001 -
7.1075 5750 0.0001 -
7.1693 5800 0.0001 -
7.2311 5850 0.0001 -
7.2930 5900 0.0001 -
7.3548 5950 0.0001 -
7.4166 6000 0.0001 -
7.4784 6050 0.0001 -
7.5402 6100 0.0001 -
7.6020 6150 0.0001 -
7.6638 6200 0.0001 -
7.7256 6250 0.0001 -
7.7874 6300 0.0001 -
7.8492 6350 0.0001 -
7.9110 6400 0.0001 -
7.9728 6450 0.0001 -
8.0346 6500 0.0001 -
8.0964 6550 0.0001 -
8.1582 6600 0.0001 -
8.2200 6650 0.0001 -
8.2818 6700 0.0001 -
8.3436 6750 0.0001 -
8.4054 6800 0.0001 -
8.4672 6850 0.0 -
8.5290 6900 0.0001 -
8.5909 6950 0.0 -
8.6527 7000 0.0 -
8.7145 7050 0.0 -
8.7763 7100 0.0001 -
8.8381 7150 0.0001 -
8.8999 7200 0.0001 -
8.9617 7250 0.0 -
9.0235 7300 0.0 -
9.0853 7350 0.0 -
9.1471 7400 0.0001 -
9.2089 7450 0.0 -
9.2707 7500 0.0 -
9.3325 7550 0.0 -
9.3943 7600 0.0001 -
9.4561 7650 0.0001 -
9.5179 7700 0.0 -
9.5797 7750 0.0 -
9.6415 7800 0.0 -
9.7033 7850 0.0 -
9.7651 7900 0.0001 -
9.8269 7950 0.0 -
9.8888 8000 0.0001 -
9.9506 8050 0.0 -

Framework Versions

  • Python: 3.10.8
  • SetFit: 1.1.0
  • Sentence Transformers: 3.1.1
  • Transformers: 4.38.2
  • PyTorch: 2.1.2
  • Datasets: 2.17.1
  • Tokenizers: 0.15.0
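
To approximate this environment, the versions above can be pinned at install time (a sketch; package names as published on PyPI):

pip install "setfit==1.1.0" "sentence-transformers==3.1.1" "transformers==4.38.2" "torch==2.1.2" "datasets==2.17.1" "tokenizers==0.15.0"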

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}