SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a OneVsRestClassifier instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
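
The contrastive step above can be sketched in plain Python. This is a minimal illustration, not SetFit's implementation: `make_pairs` and `cosine_similarity_loss` are hypothetical helper names, the toy texts and labels are invented, and single-label data is assumed for simplicity (a one-vs-rest head suggests multi-label data, where pairing is more involved). It shows how labeled examples become similar/dissimilar pairs and how a cosine-similarity loss scores one pair of embeddings.

```python
import itertools
import math

def make_pairs(texts, labels):
    """Build (text_a, text_b, target) triples: 1.0 for same label, 0.0 otherwise."""
    pairs = []
    for (t1, y1), (t2, y2) in itertools.combinations(zip(texts, labels), 2):
        pairs.append((t1, t2, 1.0 if y1 == y2 else 0.0))
    return pairs

def cosine_similarity_loss(u, v, target):
    """Squared error between cos(u, v) and the pair target."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return (dot / norm - target) ** 2

# Invented toy data: two examples of class 1, one of class 0
texts = ["great film", "loved it", "terrible plot"]
labels = [1, 1, 0]
pairs = make_pairs(texts, labels)
```

Fine-tuning pushes embeddings of same-label pairs toward cosine similarity 1 and different-label pairs toward 0; the classification head is then fit on the resulting embeddings.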

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-mpnet-base-v2
Classification head: a OneVsRestClassifier instance
Maximum Sequence Length: 512 tokens

Model Sources

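Repository: SetFit on GitHub (https://github.com/huggingface/setfit)
Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts (https://huggingface.co/blog/setfit)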

Evaluation

Metrics

Label	Accuracy
all	0.6021
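
Because the head is a OneVsRestClassifier, each prediction is presumably a multi-hot vector rather than a single class. The card does not say how accuracy was computed; one common convention for multi-label outputs, assumed here, is exact-match (subset) accuracy, where a prediction counts only if every label flag matches. `subset_accuracy` is a hypothetical helper and the vectors are invented.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label vector matches exactly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matches = sum(list(t) == list(p) for t, p in zip(y_true, y_pred))
    return matches / len(y_true)

# Invented multi-hot vectors for four examples and three labels
y_true = [[1, 0, 0], [0, 1, 1], [0, 0, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 0, 0], [1, 1, 0]]
score = subset_accuracy(y_true, y_pred)
```

Under this convention a single wrong flag makes the whole example count as a miss, which tends to give lower scores than per-label accuracy.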

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("anismahmahi/G2_replace_Whata_repetition_with_noPropaganda_SetFit")
# Run inference
preds = model("Columbus police are investigating the shootings.")
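
With a one-vs-rest head, `preds` is expected to be a vector of 0/1 flags, one per label, rather than a single class id. The sketch below shows one way to turn such a vector into readable names. `decode_labels` and the label names are hypothetical; this card does not list the model's labels.

```python
def decode_labels(pred, label_names):
    """Return the names whose flag is set in a multi-hot prediction vector."""
    if len(pred) != len(label_names):
        raise ValueError("prediction length must match number of labels")
    return [name for flag, name in zip(pred, label_names) if int(flag) == 1]

label_names = ["label_a", "label_b", "label_c"]  # hypothetical names
example_pred = [0, 1, 1]  # stands in for the output of model(...)
active = decode_labels(example_pred, label_names)
```

An empty result would mean that no label was assigned to the input.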

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	1	23.1093	129

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (2, 2)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 10
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True
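
These settings determine how many optimizer steps one epoch of embedding fine-tuning takes. As a rough sketch, assume each epoch draws `num_iterations` positive and `num_iterations` negative pairs per training example (2 * `num_iterations` * n pairs in total) and that a final partial batch still counts as a step. `steps_per_epoch` is a hypothetical helper, and the training-set size of 3,294 is an assumption chosen for illustration, not a figure stated in this card.

```python
import math

def steps_per_epoch(n_examples, num_iterations, batch_size):
    """Estimate contrastive fine-tuning steps in one epoch."""
    n_pairs = 2 * num_iterations * n_examples  # positive + negative pairs
    return math.ceil(n_pairs / batch_size)

# n_examples is an assumed value; num_iterations and batch_size come from the list above
steps = steps_per_epoch(n_examples=3294, num_iterations=10, batch_size=16)
total_steps = steps * 2  # num_epochs for the embedding phase is 2
```

Under these assumptions the estimate is 4118 steps per epoch and 8236 in total, which agrees with the step numbers logged at epochs 1.0 and 2.0 in the training results.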

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0002	1	0.3592	-
0.0121	50	0.2852	-
0.0243	100	0.2694	-
0.0364	150	0.2182	-
0.0486	200	0.2224	-
0.0607	250	0.2634	-
0.0729	300	0.2431	-
0.0850	350	0.2286	-
0.0971	400	0.197	-
0.1093	450	0.2466	-
0.1214	500	0.2374	-
0.1336	550	0.2134	-
0.1457	600	0.2092	-
0.1578	650	0.1987	-
0.1700	700	0.2288	-
0.1821	750	0.1562	-
0.1943	800	0.27	-
0.2064	850	0.1314	-
0.2186	900	0.2144	-
0.2307	950	0.184	-
0.2428	1000	0.2069	-
0.2550	1050	0.1773	-
0.2671	1100	0.0704	-
0.2793	1150	0.1139	-
0.2914	1200	0.2398	-
0.3035	1250	0.0672	-
0.3157	1300	0.1321	-
0.3278	1350	0.0803	-
0.3400	1400	0.0589	-
0.3521	1450	0.0428	-
0.3643	1500	0.0886	-
0.3764	1550	0.0839	-
0.3885	1600	0.1843	-
0.4007	1650	0.0375	-
0.4128	1700	0.114	-
0.4250	1750	0.1264	-
0.4371	1800	0.0585	-
0.4492	1850	0.0586	-
0.4614	1900	0.0805	-
0.4735	1950	0.0686	-
0.4857	2000	0.0684	-
0.4978	2050	0.0803	-
0.5100	2100	0.076	-
0.5221	2150	0.0888	-
0.5342	2200	0.1091	-
0.5464	2250	0.038	-
0.5585	2300	0.0674	-
0.5707	2350	0.0562	-
0.5828	2400	0.0603	-
0.5949	2450	0.0669	-
0.6071	2500	0.0829	-
0.6192	2550	0.1442	-
0.6314	2600	0.0914	-
0.6435	2650	0.0357	-
0.6557	2700	0.0546	-
0.6678	2750	0.0748	-
0.6799	2800	0.0149	-
0.6921	2850	0.1067	-
0.7042	2900	0.0054	-
0.7164	2950	0.0878	-
0.7285	3000	0.0385	-
0.7407	3050	0.036	-
0.7528	3100	0.0902	-
0.7649	3150	0.0734	-
0.7771	3200	0.0369	-
0.7892	3250	0.0031	-
0.8014	3300	0.0113	-
0.8135	3350	0.0862	-
0.8256	3400	0.0549	-
0.8378	3450	0.0104	-
0.8499	3500	0.0072	-
0.8621	3550	0.0546	-
0.8742	3600	0.0579	-
0.8864	3650	0.0789	-
0.8985	3700	0.0711	-
0.9106	3750	0.0361	-
0.9228	3800	0.0292	-
0.9349	3850	0.0121	-
0.9471	3900	0.0066	-
0.9592	3950	0.0091	-
0.9713	4000	0.0027	-
0.9835	4050	0.0891	-
0.9956	4100	0.0186	-
1.0	4118	-	0.2746
1.0078	4150	0.0246	-
1.0199	4200	0.0154	-
1.0321	4250	0.0056	-
1.0442	4300	0.0343	-
1.0563	4350	0.0375	-
1.0685	4400	0.0106	-
1.0806	4450	0.0025	-
1.0928	4500	0.0425	-
1.1049	4550	0.0019	-
1.1170	4600	0.0014	-
1.1292	4650	0.0883	-
1.1413	4700	0.0176	-
1.1535	4750	0.0204	-
1.1656	4800	0.0011	-
1.1778	4850	0.005	-
1.1899	4900	0.0238	-
1.2020	4950	0.0362	-
1.2142	5000	0.0219	-
1.2263	5050	0.0487	-
1.2385	5100	0.0609	-
1.2506	5150	0.0464	-
1.2627	5200	0.0033	-
1.2749	5250	0.0087	-
1.2870	5300	0.0101	-
1.2992	5350	0.0529	-
1.3113	5400	0.0243	-
1.3235	5450	0.001	-
1.3356	5500	0.0102	-
1.3477	5550	0.0047	-
1.3599	5600	0.0034	-
1.3720	5650	0.0118	-
1.3842	5700	0.0742	-
1.3963	5750	0.0538	-
1.4085	5800	0.0162	-
1.4206	5850	0.0079	-
1.4327	5900	0.0027	-
1.4449	5950	0.0035	-
1.4570	6000	0.0581	-
1.4692	6050	0.0813	-
1.4813	6100	0.0339	-
1.4934	6150	0.0312	-
1.5056	6200	0.0323	-
1.5177	6250	0.0521	-
1.5299	6300	0.0016	-
1.5420	6350	0.0009	-
1.5542	6400	0.0967	-
1.5663	6450	0.0009	-
1.5784	6500	0.031	-
1.5906	6550	0.0114	-
1.6027	6600	0.0599	-
1.6149	6650	0.0416	-
1.6270	6700	0.0047	-
1.6391	6750	0.0234	-
1.6513	6800	0.0609	-
1.6634	6850	0.022	-
1.6756	6900	0.0042	-
1.6877	6950	0.0336	-
1.6999	7000	0.0592	-
1.7120	7050	0.0536	-
1.7241	7100	0.1198	-
1.7363	7150	0.1035	-
1.7484	7200	0.0549	-
1.7606	7250	0.027	-
1.7727	7300	0.0251	-
1.7848	7350	0.0225	-
1.7970	7400	0.0027	-
1.8091	7450	0.0309	-
1.8213	7500	0.024	-
1.8334	7550	0.0355	-
1.8456	7600	0.0239	-
1.8577	7650	0.0377	-
1.8698	7700	0.012	-
1.8820	7750	0.0233	-
1.8941	7800	0.0184	-
1.9063	7850	0.0022	-
1.9184	7900	0.0043	-
1.9305	7950	0.014	-
1.9427	8000	0.0083	-
1.9548	8050	0.0084	-
1.9670	8100	0.0009	-
1.9791	8150	0.002	-
1.9913	8200	0.0002	-
2.0	8236	-	0.2768

The bold row denotes the saved checkpoint.
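
With `load_best_model_at_end: True`, the checkpoint kept at the end of training is the one that scored best on the evaluation metric. A minimal sketch, assuming that metric is the validation loss and that lower is better, using the two evaluation rows from the table above (`best_checkpoint` is a hypothetical helper):

```python
def best_checkpoint(eval_rows):
    """Return the (step, validation_loss) row with the lowest validation loss."""
    return min(eval_rows, key=lambda row: row[1])

# (step, validation loss) taken from the two evaluation rows of the table
eval_rows = [(4118, 0.2746), (8236, 0.2768)]
step, loss = best_checkpoint(eval_rows)
```

Under that assumption the selected checkpoint is step 4118 (end of epoch 1), whose validation loss of 0.2746 is slightly lower than the 0.2768 at step 8236.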

Framework Versions

Python: 3.10.12
SetFit: 1.0.1
Sentence Transformers: 2.2.2
Transformers: 4.35.2
PyTorch: 2.1.0+cu121
Datasets: 2.16.1
Tokenizers: 0.15.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
