SetFit with sentence-transformers/all-MiniLM-L6-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-MiniLM-L6-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
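
The contrastive step above trains on sentence pairs rather than single sentences. A minimal sketch of how such pairs could be built from labelled examples follows; it is a simplified illustration and not SetFit's actual code. The function name build_contrastive_pairs is hypothetical, and the sentences are taken from the label examples in this card.

```python
from itertools import combinations
from typing import List, Tuple


def build_contrastive_pairs(
    texts: List[str], labels: List[str]
) -> List[Tuple[str, str, float]]:
    """Return every unordered sentence pair with a similarity target.

    Target is 1.0 when both sentences share a label and 0.0 otherwise.
    (Hypothetical helper, for illustration only.)
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


texts = [
    "2013 year is a key one.",
    "3,6% are people have age 60+ years.",
    "A building's style may say a lot about its history.",
]
labels = ["Copying expression", "Copying expression", "Word form transmission"]

pairs = build_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(target, "|", text_a, "|", text_b)
```

The Sentence Transformer body is then fine-tuned so that same-label pairs move closer together in embedding space and different-label pairs move apart; the classification head is trained afterwards on the resulting embeddings.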

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/all-MiniLM-L6-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 256 tokens
Number of Classes: 5

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
Word form transmission	"Mother should take care of her own child at first, by this quote we simply can see that problems of government's own country should be placed on the first position." "A building's style may say a lot about its history." 'A lot of artists and entertainment organisations have financional costs because of free using of their contents in the Internet.'
Tense semantics	'Samsung, "Blackberry" and "HTC" in 2015 have almost the same percentage share.' '(5,9%) Overall, almost all unemployment rates have remained on the same level between 2014 and 2015, except EU, Latin America and Middle East.' '15% consist of things which are transported by rail in Eastern Europe in 2008.'
Synonyms	'(the destination between Moscow and Saint Petersburg, for instance, can be easily overcame by "Lastochka" train for 5 hours).' '(the destination between Moscow and Saint Petersburg, for instance, can be easily overcame by "Lastochka" train for 5 hours).' 'There is an extremely clear difference: there are too many men on a tech subjects.'
Copying expression	'15-59 years people in Yemen are increasing, while in Italy this number decreases.' '2013 year is a key one.' '3,6% are people have age 60+ years.'
Transliteration	'A closer look at graphic revails that goods transported by rail had good products, which massive 11%.' "According to first diagramm, half of Yemen's population in 2000 was children 0-14 years old." 'According to my opinion different fabrics make much more harm for our nature.'

Evaluation

Metrics

Label	Accuracy
all	0.6197

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Zlovoblachko/L1-classifier")
# Run inference
preds = model("After 1980 part old people in USA rose slight and in Sweden this point stay unchanged.")
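
To classify several sentences at once, the same model can be given a list. The sketch below assumes that SetFitModel.predict accepts a list of strings and returns one prediction per input; the helper names label_sentences and run_with_hub_model are hypothetical, and the exact type of each prediction (label name or class index) depends on what is stored with the model.

```python
from typing import Callable, Iterable, List, Tuple


def label_sentences(
    predict: Callable[[List[str]], Iterable], sentences: List[str]
) -> List[Tuple[str, str]]:
    """Pair each input sentence with its prediction, converted to a string."""
    predictions = predict(sentences)
    return [
        (sentence, str(prediction))
        for sentence, prediction in zip(sentences, predictions)
    ]


def run_with_hub_model() -> None:
    """Download the model from the Hub and print one prediction per sentence."""
    # Imported here so the helper above can be used without setfit installed.
    from setfit import SetFitModel

    model = SetFitModel.from_pretrained("Zlovoblachko/L1-classifier")
    sentences = [
        "After 1980 part old people in USA rose slight and in Sweden this point stay unchanged.",
        "2013 year is a key one.",
    ]
    for sentence, label in label_sentences(model.predict, sentences):
        print(f"{label}\t{sentence}")
```

Calling run_with_hub_model() requires network access to download the model on first use.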

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	2	21.005	47

Label	Training Sample Count
Synonyms	99
Copying expression	26
Tense semantics	27
Word form transmission	40
Transliteration	8
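
The training set is imbalanced, which matters when reading the accuracy figure. The short calculation below uses only the sample counts listed above to show each label's share of the training data. The evaluation set's label distribution is not stated in this card, so the majority share is a rough reference point, not a baseline measured on the evaluation data.

```python
# Training sample counts as listed in the table above.
counts = {
    "Synonyms": 99,
    "Copying expression": 26,
    "Tense semantics": 27,
    "Word form transmission": 40,
    "Transliteration": 8,
}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

# The most frequent label and its share of the training set.
majority_label = max(counts, key=counts.get)
majority_share = shares[majority_label]

print(f"total training samples: {total}")
for label, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{label}: {share:.3f}")
print(f"majority label: {majority_label} ({majority_share:.3f})")
```

On these counts, always predicting "Synonyms" would be right for 49.5% of the training samples, against a reported accuracy of 0.6197.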

Training Hyperparameters

batch_size: (32, 32)
num_epochs: (10, 10)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
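
The loss listed above, CosineSimilarityLoss, compares the cosine similarity of two sentence embeddings with a target value, by default using mean squared error. The sketch below reproduces that calculation in plain Python on made-up two-dimensional vectors; real embeddings from all-MiniLM-L6-v2 have 384 dimensions, and the function names here are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(
    pairs: List[Tuple[Sequence[float], Sequence[float], float]]
) -> float:
    """Mean squared error between cosine similarity and the pair's target."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


# Toy embeddings (illustrative values, not model output).
toy_pairs = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # same label, identical direction: error 0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # different labels, orthogonal: error 0
    ([1.0, 0.0], [1.0, 1.0], 0.0),  # different labels, 45 degrees apart: error 0.5
]

loss = cosine_similarity_loss(toy_pairs)
print(f"loss = {loss:.4f}")
```

Because the targets are 0 or 1 and cosine similarity lies in [-1, 1], a loss near zero means same-label pairs are almost parallel and different-label pairs are close to orthogonal.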

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0012	1	0.3375	-
0.0590	50	0.3628	-
0.1179	100	0.3312	-
0.1769	150	0.2342	-
0.2358	200	0.2665	-
0.2948	250	0.1857	-
0.3538	300	0.2134	-
0.4127	350	0.1786	-
0.4717	400	0.092	-
0.5307	450	0.2031	-
0.5896	500	0.1449	-
0.6486	550	0.1234	-
0.7075	600	0.0552	-
0.7665	650	0.0693	-
0.8255	700	0.097	-
0.8844	750	0.0448	-
0.9434	800	0.041	-
1.0024	850	0.0431	-
1.0613	900	0.0227	-
1.1203	950	0.061	-
1.1792	1000	0.0209	-
1.2382	1050	0.0071	-
1.2972	1100	0.0285	-
1.3561	1150	0.0039	-
1.4151	1200	0.0029	-
1.4741	1250	0.0097	-
1.5330	1300	0.0076	-
1.5920	1350	0.0021	-
1.6509	1400	0.015	-
1.7099	1450	0.0027	-
1.7689	1500	0.0204	-
1.8278	1550	0.013	-
1.8868	1600	0.0222	-
1.9458	1650	0.0427	-
2.0047	1700	0.0181	-
2.0637	1750	0.0232	-
2.1226	1800	0.0053	-
2.1816	1850	0.0169	-
2.2406	1900	0.006	-
2.2995	1950	0.0108	-
2.3585	2000	0.0034	-
2.4175	2050	0.0198	-
2.4764	2100	0.0006	-
2.5354	2150	0.0142	-
2.5943	2200	0.0038	-
2.6533	2250	0.0006	-
2.7123	2300	0.0007	-
2.7712	2350	0.0012	-
2.8302	2400	0.0003	-
2.8892	2450	0.0127	-
2.9481	2500	0.0181	-
3.0071	2550	0.006	-
3.0660	2600	0.0006	-
3.125	2650	0.0156	-
3.1840	2700	0.0427	-
3.2429	2750	0.0004	-
3.3019	2800	0.0013	-
3.3608	2850	0.0241	-
3.4198	2900	0.0004	-
3.4788	2950	0.0048	-
3.5377	3000	0.0004	-
3.5967	3050	0.0006	-
3.6557	3100	0.0044	-
3.7146	3150	0.0142	-
3.7736	3200	0.005	-
3.8325	3250	0.0022	-
3.8915	3300	0.0033	-
3.9505	3350	0.0033	-
4.0094	3400	0.0005	-
4.0684	3450	0.0299	-
4.1274	3500	0.0172	-
4.1863	3550	0.0079	-
4.2453	3600	0.0012	-
4.3042	3650	0.0093	-
4.3632	3700	0.0175	-
4.4222	3750	0.0278	-
4.4811	3800	0.0004	-
4.5401	3850	0.0054	-
4.5991	3900	0.002	-
4.6580	3950	0.0248	-
4.7170	4000	0.0173	-
4.7759	4050	0.0004	-
4.8349	4100	0.0154	-
4.8939	4150	0.0162	-
4.9528	4200	0.0052	-
5.0118	4250	0.0142	-
5.0708	4300	0.0109	-
5.1297	4350	0.0003	-
5.1887	4400	0.0002	-
5.2476	4450	0.0003	-
5.3066	4500	0.0081	-
5.3656	4550	0.0005	-
5.4245	4600	0.0229	-
5.4835	4650	0.0002	-
5.5425	4700	0.0004	-
5.6014	4750	0.0233	-
5.6604	4800	0.0086	-
5.7193	4850	0.0084	-
5.7783	4900	0.0177	-
5.8373	4950	0.0102	-
5.8962	5000	0.017	-
5.9552	5050	0.0037	-
6.0142	5100	0.005	-
6.0731	5150	0.0002	-
6.1321	5200	0.0188	-
6.1910	5250	0.0037	-
6.25	5300	0.0003	-
6.3090	5350	0.0137	-
6.3679	5400	0.0107	-
6.4269	5450	0.0045	-
6.4858	5500	0.0002	-
6.5448	5550	0.0238	-
6.6038	5600	0.0209	-
6.6627	5650	0.0003	-
6.7217	5700	0.0002	-
6.7807	5750	0.0029	-
6.8396	5800	0.0177	-
6.8986	5850	0.0165	-
6.9575	5900	0.0045	-
7.0165	5950	0.0203	-
7.0755	6000	0.0048	-
7.1344	6050	0.0251	-
7.1934	6100	0.0147	-
7.2524	6150	0.0033	-
7.3113	6200	0.0166	-
7.3703	6250	0.0129	-
7.4292	6300	0.0169	-
7.4882	6350	0.0001	-
7.5472	6400	0.0002	-
7.6061	6450	0.0029	-
7.6651	6500	0.0264	-
7.7241	6550	0.0079	-
7.7830	6600	0.0002	-
7.8420	6650	0.0157	-
7.9009	6700	0.0116	-
7.9599	6750	0.0031	-
8.0189	6800	0.0055	-
8.0778	6850	0.0113	-
8.1368	6900	0.0004	-
8.1958	6950	0.0301	-
8.2547	7000	0.0002	-
8.3137	7050	0.0169	-
8.3726	7100	0.0001	-
8.4316	7150	0.0165	-
8.4906	7200	0.0201	-
8.5495	7250	0.0168	-
8.6085	7300	0.0197	-
8.6675	7350	0.0165	-
8.7264	7400	0.0165	-
8.7854	7450	0.0002	-
8.8443	7500	0.0134	-
8.9033	7550	0.0037	-
8.9623	7600	0.0043	-
9.0212	7650	0.0001	-
9.0802	7700	0.0034	-
9.1392	7750	0.0036	-
9.1981	7800	0.0001	-
9.2571	7850	0.0069	-
9.3160	7900	0.0304	-
9.375	7950	0.0203	-
9.4340	8000	0.0002	-
9.4929	8050	0.0002	-
9.5519	8100	0.0058	-
9.6108	8150	0.0141	-
9.6698	8200	0.0031	-
9.7288	8250	0.0169	-
9.7877	8300	0.0002	-
9.8467	8350	0.0075	-
9.9057	8400	0.0192	-
9.9646	8450	0.0588	-
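
The step and epoch columns above imply a fixed number of steps per epoch, and that number can be reconstructed from the training sample counts, the batch size of 32, and the oversampling strategy. The calculation below is a sketch under two assumptions that this card does not state: pairs are all unordered combinations of distinct training sentences, and oversampling repeats the smaller of the positive and negative pair sets until it matches the larger.

```python
from math import ceil, comb

# Training sample counts and batch size as reported in this card.
counts = {
    "Synonyms": 99,
    "Copying expression": 26,
    "Tense semantics": 27,
    "Word form transmission": 40,
    "Transliteration": 8,
}
batch_size = 32

total = sum(counts.values())

# Positive pairs: two different sentences with the same label.
positive_pairs = sum(comb(n, 2) for n in counts.values())
# Negative pairs: all remaining unordered pairs.
negative_pairs = comb(total, 2) - positive_pairs

# Assumed oversampling: the smaller set is repeated to match the larger one.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = ceil(pairs_per_epoch / batch_size)

# Steps per epoch implied by one row of the log (step 2650 at epoch 3.125).
logged_ratio = 2650 / 3.125

print(f"positive pairs:  {positive_pairs}")
print(f"negative pairs:  {negative_pairs}")
print(f"pairs per epoch: {pairs_per_epoch}")
print(f"steps per epoch: {steps_per_epoch} (log implies {logged_ratio:.0f})")
```

Under these assumptions the calculation gives 848 steps per epoch, which agrees with the logged rows (for example, step 2650 at epoch 3.125).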

Framework Versions

Python: 3.10.12
SetFit: 1.1.0.dev0
Sentence Transformers: 2.6.1
Transformers: 4.38.2
PyTorch: 2.2.1+cu121
Datasets: 2.18.0
Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
