SetFit with intfloat/multilingual-e5-large-instruct

This is a SetFit model for text classification. It uses intfloat/multilingual-e5-large-instruct as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
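The contrastive step works on pairs of training sentences rather than on single sentences. The sketch below shows one way such pairs can be built: two sentences with the same label get a target of 1.0, two with different labels get 0.0. It is an illustration only, not SetFit's own sampler; `make_contrastive_pairs` is a hypothetical helper, and the sentences are taken from the Model Labels table of this card.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples from labelled examples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration, not part of the SetFit API.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Four example sentences from the Model Labels table
texts = [
    "Ричард Бахтелл",
    "Саксон эпизоды туралы қандай тарихи құжатта мәлімет берілген?",
    "Just a moment, please.",
    "You look tired. Did you sleep well last night?",
]
labels = ["rag", "rag", "no_rag", "no_rag"]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))  # 6 2 4
```

Because the number of pairs grows quadratically with the number of sentences, even a few hundred labelled examples yield tens of thousands of training pairs for the embedding model.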

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: intfloat/multilingual-e5-large-instruct
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 2

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
rag	'Саксон эпизоды туралы қандай тарихи құжатта мәлімет берілген?' 'Uttermost өзінің жарыс мансабында қандай маңызды жетістіктерге қол жеткізді?' 'Ричард Бахтелл'
no_rag	'Just a moment, please.' 'орыс тіліндегі "Я рабочий." сөйлемінің қазақ тіліндегі аудармасы не?' 'You look tired. Did you sleep well last night?'

Evaluation

Metrics

Label	Accuracy
all	0.9955

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("nlp-team-issai/setfit-me5-large-instruct-v3")
# Run inference
preds = model("Сәлем!")
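The model can also be called on a list of strings, which returns one prediction per input. The sketch below groups a batch of queries by predicted label. `split_by_label` and `stub_classifier` are hypothetical names introduced here; the stub only stands in for the model so the example runs offline, and its rule says nothing about how the real model decides. With the model loaded as above, `split_by_label(model, queries)` should work the same way, assuming the predictions come back as the label names listed under Model Labels.

```python
from typing import Callable, Dict, List, Sequence


def split_by_label(
    classify: Callable[[List[str]], Sequence], queries: List[str]
) -> Dict[str, List[str]]:
    """Group queries by the label that `classify` predicts for each of them."""
    predictions = classify(queries)
    groups: Dict[str, List[str]] = {}
    for query, label in zip(queries, predictions):
        groups.setdefault(str(label), []).append(query)
    return groups


def stub_classifier(batch: List[str]) -> List[str]:
    """Offline stand-in for the model (hypothetical rule, for illustration only)."""
    return ["rag" if text.endswith("?") else "no_rag" for text in batch]


queries = ["Just a moment, please.", "Who wrote this document?"]
groups = split_by_label(stub_classifier, queries)
print(groups)
```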

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	1	10.0022	138

Label	Training Sample Count
no_rag	218
rag	241

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (1, 1)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
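The `loss` entry refers to CosineSimilarityLoss from Sentence Transformers, which by default scores each sentence pair by the squared error between the cosine similarity of the two embeddings and the pair's target value. The plain-Python sketch below reproduces that calculation on toy 2-dimensional vectors; the vectors and targets are made up for illustration, and real training operates on batched tensors of model embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and target over all pairs."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


# Toy embeddings (made up): (embedding_a, embedding_b, target)
toy_pairs = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # same label, same direction: error 0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # different labels, orthogonal: error 0
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # different labels, same direction: error 1
]

loss = cosine_similarity_loss(toy_pairs)
print(round(loss, 4))  # 0.3333
```

A loss near zero therefore means that same-label pairs have cosine similarity close to 1 and different-label pairs close to 0.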

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0003	1	0.3567	-
0.0151	50	0.2851	-
0.0302	100	0.0943	-
0.0452	150	0.0123	-
0.0603	200	0.0099	-
0.0754	250	0.0056	-
0.0905	300	0.0011	-
0.1056	350	0.0003	-
0.1207	400	0.0002	-
0.1357	450	0.0001	-
0.1508	500	0.0001	-
0.1659	550	0.0001	-
0.1810	600	0.0001	-
0.1961	650	0.0001	-
0.2112	700	0.0001	-
0.2262	750	0.0001	-
0.2413	800	0.0001	-
0.2564	850	0.0001	-
0.2715	900	0.0001	-
0.2866	950	0.0001	-
0.3017	1000	0.0001	-
0.3167	1050	0.0001	-
0.3318	1100	0.0001	-
0.3469	1150	0.0001	-
0.3620	1200	0.0001	-
0.3771	1250	0.0001	-
0.3922	1300	0.0001	-
0.4072	1350	0.0001	-
0.4223	1400	0.0	-
0.4374	1450	0.0	-
0.4525	1500	0.0	-
0.4676	1550	0.0	-
0.4827	1600	0.0	-
0.4977	1650	0.0	-
0.5128	1700	0.0	-
0.5279	1750	0.0	-
0.5430	1800	0.0	-
0.5581	1850	0.0	-
0.5732	1900	0.0	-
0.5882	1950	0.0	-
0.6033	2000	0.0	-
0.6184	2050	0.0	-
0.6335	2100	0.0	-
0.6486	2150	0.0	-
0.6637	2200	0.0	-
0.6787	2250	0.0	-
0.6938	2300	0.0	-
0.7089	2350	0.0	-
0.7240	2400	0.0	-
0.7391	2450	0.0	-
0.7541	2500	0.0	-
0.7692	2550	0.0	-
0.7843	2600	0.0	-
0.7994	2650	0.0	-
0.8145	2700	0.0	-
0.8296	2750	0.0	-
0.8446	2800	0.0	-
0.8597	2850	0.0	-
0.8748	2900	0.0	-
0.8899	2950	0.0	-
0.9050	3000	0.0	-
0.9201	3050	0.0	-
0.9351	3100	0.0	-
0.9502	3150	0.0	-
0.9653	3200	0.0	-
0.9804	3250	0.0	-
0.9955	3300	0.0	-
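The Epoch and Step columns together imply how long one epoch of embedding fine-tuning is. The sketch below estimates this from three rows of the table. It is only an estimate, since the epoch values are rounded to four decimals, and the pair count assumes that each step consumes one batch of 16 sentence pairs (the first entry of `batch_size`).

```python
# (epoch fraction, step) rows copied from the Training Results table
logged = [(0.0151, 50), (0.4977, 1650), (0.9955, 3300)]
batch_size = 16  # first entry of batch_size in Training Hyperparameters

# Steps per epoch implied by each row: step / epoch fraction
estimates = [step / epoch_fraction for epoch_fraction, step in logged]

for steps_per_epoch in estimates:
    approx_pairs = round(steps_per_epoch) * batch_size
    print(round(steps_per_epoch), approx_pairs)
```

All three rows point to roughly 3,300 steps per epoch, or on the order of 53,000 sentence pairs under the stated assumption. The table also shows that the logged training loss rounds to 0.0 from step 1400 onward, well before the end of the epoch.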

Framework Versions

Python: 3.12.5
SetFit: 1.1.0
Sentence Transformers: 3.2.0
Transformers: 4.45.2
PyTorch: 2.4.0+cu121
Datasets: 3.0.1
Tokenizers: 0.20.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
