SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
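
The contrastive step relies on turning a handful of labeled examples into many labeled sentence pairs. The following is a minimal sketch of that idea, not SetFit's actual implementation: `make_contrastive_pairs` is a hypothetical helper, and the texts are taken from the Model Labels table. Pairs from the same class get a target of 1.0 and pairs from different classes get 0.0; the library additionally samples and balances these pairs before training.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Label every unordered pair of examples 1.0 (same class) or 0.0 (different class)."""
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


texts = [
    "Is the product in stock?",
    "Does this product help with dark spots?",
    "Are there any products for dry skin?",
    "cookie boxes with dividers",
]
labels = [
    "product faq",
    "product faq",
    "product discoverability",
    "product discoverability",
]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

Once the embedding model has been fine-tuned on such pairs, the second step fits the classification head on the embeddings of the original labeled examples.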

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-mpnet-base-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 6

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
product policy	'Do you offer a gift wrapping service for sneakers?' 'What are the consequences if my account is suspended or terminated for any reason?' 'Do you share my personal information with third parties?'
general faq	'Can you explain why Mashru silk is considered more comfortable to wear compared to pure silk sarees?' 'What are some tips for maximizing the antioxidant content when brewing green tea?' 'Can you recommend K-beauty products for hot and humid climates?'
product discoverability	'Are there any sarees with Kadwa Weave technique?' 'cookie boxes with dividers' 'Are there any products for dry skin?'
Out of Scope	'Is this website secure?' 'How do you handle intellectual property disputes?' 'Do you know how to play the piano?'
order tracking	'I want to deliver candle supplies to Jaipur, how many days will it take to deliver?' 'I want to deliver bags to Pune, how many days will it take to deliver?' 'I need to change the delivery address for my recent order, how can I do that?'
product faq	'Does this product help with dark spots?' '3. Is this product currently in stock?' 'Is the product in stock?'

Evaluation

Metrics

Label	Accuracy
all	0.8711
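
For reference, accuracy is the fraction of examples whose predicted label matches the expected label. The sketch below uses made-up predictions purely to show the computation; the evaluation set behind the 0.8711 figure is not described in this card, and `accuracy` is a hypothetical helper.

```python
def accuracy(predicted, expected):
    """Return the fraction of positions where predicted and expected labels agree."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Illustrative labels only; these are not the model's real predictions.
expected = ["product faq", "order tracking", "Out of Scope", "general faq"]
predicted = ["product faq", "order tracking", "general faq", "general faq"]

score = accuracy(predicted, expected)
print(score)
```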

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub ("setfit_model_id" is a placeholder for this model's repository ID)
model = SetFitModel.from_pretrained("setfit_model_id")
# Run inference on a single sentence
preds = model("I like to listen to classical music")
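
For more than one query at a time, a list can be passed to `predict`. The sketch below groups a batch of queries by predicted label, which is one way an application might route them. It assumes `predict` returns the label names shown in the Model Labels table; `load_model` and `group_by_label` are hypothetical helpers, and `"setfit_model_id"` remains a placeholder.

```python
from collections import defaultdict


def load_model(model_id):
    """Load a SetFit model from the Hub; setfit is imported lazily."""
    from setfit import SetFitModel

    return SetFitModel.from_pretrained(model_id)


def group_by_label(model, queries):
    """Classify a batch of queries and group them by predicted label."""
    predictions = model.predict(queries)
    grouped = defaultdict(list)
    for query, label in zip(queries, predictions):
        # Assumption: each prediction is a label name such as "product faq".
        grouped[str(label)].append(query)
    return dict(grouped)


# Usage (downloads the model, so it is left commented out here):
# model = load_model("setfit_model_id")
# print(group_by_label(model, ["Is the product in stock?",
#                              "Do you know how to play the piano?"]))
```

Queries grouped under "Out of Scope" could then be answered with a fallback message instead of being passed to downstream handlers.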

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	4	10.66	28

Label	Training Sample Count
Out of Scope	50
general faq	50
order tracking	50
product discoverability	50
product faq	50
product policy	50

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (2, 2)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True
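
To make the `loss: CosineSimilarityLoss` setting concrete: this loss compares the cosine similarity of two sentence embeddings against a target (1.0 for a same-class pair, 0.0 otherwise) using mean squared error. The following is a plain-Python sketch of that computation on toy two-dimensional vectors, not the library's batched implementation; the function names are illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(embedding_pairs, targets):
    """Mean squared error between each pair's cosine similarity and its target."""
    errors = [
        (cosine_similarity(u, v) - target) ** 2
        for (u, v), target in zip(embedding_pairs, targets)
    ]
    return sum(errors) / len(errors)


embedding_pairs = [
    ([1.0, 0.0], [1.0, 0.0]),  # same direction, similarity 1.0
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal, similarity 0.0
]

well_separated = cosine_similarity_loss(embedding_pairs, [1.0, 0.0])
mismatched = cosine_similarity_loss(embedding_pairs, [0.0, 1.0])
print(well_separated, mismatched)
```

The loss is zero when same-class pairs point in the same direction and different-class pairs are orthogonal, which is the geometry the fine-tuning step pushes the embeddings toward.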

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0002	1	0.2592	-
0.0107	50	0.2424	-
0.0213	100	0.1506	-
0.0320	150	0.222	-
0.0427	200	0.1227	-
0.0533	250	0.1801	-
0.0640	300	0.1111	-
0.0747	350	0.0346	-
0.0853	400	0.0313	-
0.0960	450	0.0048	-
0.1067	500	0.0023	-
0.1173	550	0.0018	-
0.1280	600	0.0133	-
0.1387	650	0.0008	-
0.1493	700	0.0006	-
0.1600	750	0.0005	-
0.1706	800	0.0008	-
0.1813	850	0.0007	-
0.1920	900	0.0006	-
0.2026	950	0.0006	-
0.2133	1000	0.0003	-
0.2240	1050	0.0026	-
0.2346	1100	0.0004	-
0.2453	1150	0.0004	-
0.2560	1200	0.0004	-
0.2666	1250	0.0005	-
0.2773	1300	0.0005	-
0.2880	1350	0.0003	-
0.2986	1400	0.0001	-
0.3093	1450	0.0001	-
0.3200	1500	0.0002	-
0.3306	1550	0.0002	-
0.3413	1600	0.0002	-
0.3520	1650	0.0001	-
0.3626	1700	0.0004	-
0.3733	1750	0.0002	-
0.3840	1800	0.0005	-
0.3946	1850	0.0002	-
0.4053	1900	0.0002	-
0.4160	1950	0.0001	-
0.4266	2000	0.0001	-
0.4373	2050	0.0001	-
0.4480	2100	0.0001	-
0.4586	2150	0.0001	-
0.4693	2200	0.0002	-
0.4799	2250	0.0048	-
0.4906	2300	0.0001	-
0.5013	2350	0.001	-
0.5119	2400	0.0002	-
0.5226	2450	0.0002	-
0.5333	2500	0.0001	-
0.5439	2550	0.0001	-
0.5546	2600	0.0001	-
0.5653	2650	0.0001	-
0.5759	2700	0.0001	-
0.5866	2750	0.0001	-
0.5973	2800	0.0001	-
0.6079	2850	0.0001	-
0.6186	2900	0.0001	-
0.6293	2950	0.0001	-
0.6399	3000	0.0001	-
0.6506	3050	0.0001	-
0.6613	3100	0.0001	-
0.6719	3150	0.0001	-
0.6826	3200	0.0001	-
0.6933	3250	0.0001	-
0.7039	3300	0.0001	-
0.7146	3350	0.0001	-
0.7253	3400	0.0001	-
0.7359	3450	0.0001	-
0.7466	3500	0.0001	-
0.7573	3550	0.0001	-
0.7679	3600	0.0001	-
0.7786	3650	0.0001	-
0.7892	3700	0.0001	-
0.7999	3750	0.0001	-
0.8106	3800	0.0001	-
0.8212	3850	0.0	-
0.8319	3900	0.0001	-
0.8426	3950	0.0001	-
0.8532	4000	0.0001	-
0.8639	4050	0.0001	-
0.8746	4100	0.0001	-
0.8852	4150	0.0	-
0.8959	4200	0.0001	-
0.9066	4250	0.0001	-
0.9172	4300	0.0001	-
0.9279	4350	0.0001	-
0.9386	4400	0.0001	-
0.9492	4450	0.0001	-
0.9599	4500	0.0001	-
0.9706	4550	0.0001	-
0.9812	4600	0.0	-
0.9919	4650	0.0001	-
1.0026	4700	0.0	-
1.0132	4750	0.0001	-
1.0239	4800	0.0001	-
1.0346	4850	0.0001	-
1.0452	4900	0.0001	-
1.0559	4950	0.0001	-
1.0666	5000	0.0	-
1.0772	5050	0.0	-
1.0879	5100	0.0001	-
1.0985	5150	0.0	-
1.1092	5200	0.0	-
1.1199	5250	0.0	-
1.1305	5300	0.0001	-
1.1412	5350	0.0001	-
1.1519	5400	0.0	-
1.1625	5450	0.0001	-
1.1732	5500	0.0001	-
1.1839	5550	0.0002	-
1.1945	5600	0.0	-
1.2052	5650	0.0	-
1.2159	5700	0.0	-
1.2265	5750	0.0	-
1.2372	5800	0.0001	-
1.2479	5850	0.0001	-
1.2585	5900	0.0001	-
1.2692	5950	0.0	-
1.2799	6000	0.0	-
1.2905	6050	0.0	-
1.3012	6100	0.0001	-
1.3119	6150	0.0	-
1.3225	6200	0.0	-
1.3332	6250	0.0	-
1.3439	6300	0.0	-
1.3545	6350	0.0	-
1.3652	6400	0.0	-
1.3759	6450	0.0	-
1.3865	6500	0.0	-
1.3972	6550	0.0	-
1.4078	6600	0.0	-
1.4185	6650	0.0	-
1.4292	6700	0.0	-
1.4398	6750	0.0	-
1.4505	6800	0.0	-
1.4612	6850	0.0	-
1.4718	6900	0.0001	-
1.4825	6950	0.0001	-
1.4932	7000	0.0	-
1.5038	7050	0.0	-
1.5145	7100	0.0001	-
1.5252	7150	0.0001	-
1.5358	7200	0.0001	-
1.5465	7250	0.0001	-
1.5572	7300	0.0	-
1.5678	7350	0.0	-
1.5785	7400	0.0	-
1.5892	7450	0.0001	-
1.5998	7500	0.0	-
1.6105	7550	0.0	-
1.6212	7600	0.0	-
1.6318	7650	0.0	-
1.6425	7700	0.0	-
1.6532	7750	0.0	-
1.6638	7800	0.0	-
1.6745	7850	0.0	-
1.6852	7900	0.0	-
1.6958	7950	0.0	-
1.7065	8000	0.0	-
1.7172	8050	0.0	-
1.7278	8100	0.0	-
1.7385	8150	0.0001	-
1.7491	8200	0.0	-
1.7598	8250	0.0	-
1.7705	8300	0.0	-
1.7811	8350	0.0001	-
1.7918	8400	0.0	-
1.8025	8450	0.0	-
1.8131	8500	0.0	-
1.8238	8550	0.0	-
1.8345	8600	0.0001	-
1.8451	8650	0.0	-
1.8558	8700	0.0	-
1.8665	8750	0.0001	-
1.8771	8800	0.0	-
1.8878	8850	0.0	-
1.8985	8900	0.0	-
1.9091	8950	0.0001	-
1.9198	9000	0.0	-
1.9305	9050	0.0	-
1.9411	9100	0.0	-
1.9518	9150	0.0	-
1.9625	9200	0.0	-
1.9731	9250	0.0	-
1.9838	9300	0.0	-
1.9945	9350	0.0	-
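
The Epoch and Step columns above can be cross-checked against the training set size and hyperparameters. The sketch below assumes that pairs are formed from every unordered combination of the 300 training examples, and that `oversampling` repeats the smaller group (same-class pairs) until it matches the larger one (different-class pairs). Under those assumptions the arithmetic reproduces the epoch fractions in the table.

```python
import math

samples_per_class = 50
num_classes = 6
batch_size = 16

total_samples = samples_per_class * num_classes                  # 300
all_pairs = math.comb(total_samples, 2)                          # 44850
positive_pairs = num_classes * math.comb(samples_per_class, 2)   # 7350
negative_pairs = all_pairs - positive_pairs                      # 37500

# Assumed oversampling: both groups are brought up to the size of the larger one.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)        # 75000
steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)        # 4688

# Epoch fraction logged at a given step, rounded as in the table.
epoch_at_step_50 = round(50 / steps_per_epoch, 4)
epoch_at_step_4700 = round(4700 / steps_per_epoch, 4)
epoch_at_step_9350 = round(9350 / steps_per_epoch, 4)
print(steps_per_epoch, epoch_at_step_50, epoch_at_step_4700, epoch_at_step_9350)
```

With `num_epochs` set to 2 for the embedding phase, this gives roughly 9,376 optimization steps in total, consistent with the last logged step of 9350.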

Framework Versions

Python: 3.10.16
SetFit: 1.0.3
Sentence Transformers: 2.7.0
Transformers: 4.40.2
PyTorch: 2.2.2
Datasets: 2.19.1
Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Shankhdhar/classifier_woog_base_oos_combined