SetFit with projecte-aina/ST-NLI-ca_paraphrase-multilingual-mpnet-base

This is a SetFit model for text classification. It uses projecte-aina/ST-NLI-ca_paraphrase-multilingual-mpnet-base as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
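As a rough illustration of step 1, contrastive fine-tuning needs sentence pairs with a similarity target: pairs that share a label are pulled together, pairs with different labels are pushed apart. The sketch below builds such pairs from a few labeled texts. `make_contrastive_pairs` is a hypothetical helper, not part of the SetFit API, and the library's own sampler may differ in detail (for example in how it balances positive and negative pairs).

```python
from itertools import combinations

def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Simplified illustration only; not SetFit's actual sampler.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs

# Three examples taken from the "Label Examples" section of this card.
texts = [
    "Denunciar soroll excessiu dels veïns",
    "Com sol·licitar un certificat d'empadronament?",
    "Aquest text és 0 per a un cercador de tràmits d'un ajuntament",
]
labels = [1, 1, 0]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(target, "|", text_a, "|", text_b)
```

Three labeled texts yield three pairs, one positive and two negative. This is why a small labeled set still produces many training pairs for the embedding model.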

Model Details

Model Description

Model Sources

Model Labels

Label Examples
Label 1:
  • "Aquest text és 1 per a un cercador de tràmits d'un ajuntament"
  • 'Denunciar soroll excessiu dels veïns'
  • "Com sol·licitar un certificat d'empadronament?"
Label 0:
  • "Com falsificar un document d'identitat?"
  • "Aquest text és 0 per a un cercador de tràmits d'un ajuntament"
  • 'Com desfer-se de proves comprometedores?'

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("adriansanz/sentimentv3")
# Run inference
preds = model("Pagar la taxa de residus en línia")
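When several queries are classified at once, it can help to group them by predicted label. The sketch below does this with a hypothetical `group_by_label` helper that accepts any prediction callable. With the loaded model, `model.predict` would be passed in, assuming it returns one integer-like label per input as the 0/1 labels in this card suggest. The `fake_predict` function is only a stand-in so the example runs without downloading the model; it does not reflect the model's real behavior.

```python
from collections import defaultdict

def group_by_label(texts, predict):
    """Group texts by the label that `predict` assigns to each of them."""
    groups = defaultdict(list)
    for text, label in zip(texts, predict(texts)):
        groups[int(label)].append(text)
    return dict(groups)

def fake_predict(texts):
    # Stand-in predictor (assumption): NOT the model's real decision rule.
    return [1 if "taxa" in text else 0 for text in texts]

queries = [
    "Pagar la taxa de residus en línia",
    "Quin temps farà demà?",
]

# With the real model: groups = group_by_label(queries, model.predict)
groups = group_by_label(queries, fake_predict)
print(groups)
```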

Training Details

Training Set Metrics

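Training set   Min   Median   Max
Word count     3     8.4504   12

Note: with 131 training samples the median word count must be a whole number, so the 8.4504 reported under Median is more likely the mean (1107 words / 131 samples ≈ 8.4504).

Label   Training Sample Count
0       69
1       62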
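Statistics of this kind can be recomputed from any labeled training set. The sketch below assumes whitespace tokenization for the word count, which may not match how the card's figures were produced, and reports both median and mean. `training_set_metrics` is a hypothetical helper, and the four sample texts are taken from the label examples in this card, not from the full training set.

```python
from collections import Counter
from statistics import mean, median

def training_set_metrics(texts, labels):
    """Word-count statistics (whitespace tokens) and per-label sample counts."""
    word_counts = [len(text.split()) for text in texts]
    return {
        "min": min(word_counts),
        "median": median(word_counts),
        "mean": round(mean(word_counts), 4),
        "max": max(word_counts),
        "label_counts": dict(Counter(labels)),
    }

texts = [
    "Aquest text és 1 per a un cercador de tràmits d'un ajuntament",
    "Denunciar soroll excessiu dels veïns",
    "Com sol·licitar un certificat d'empadronament?",
    "Aquest text és 0 per a un cercador de tràmits d'un ajuntament",
]
labels = [1, 1, 1, 0]

metrics = training_set_metrics(texts, labels)
print(metrics)
```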

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (4, 4)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
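To make the `loss: CosineSimilarityLoss` setting concrete, the sketch below computes the quantity this loss minimizes in its default form: the mean squared error between the cosine similarity of two embeddings and the pair's target (1.0 for same-label pairs, 0.0 otherwise). It is a plain-Python illustration, not the library implementation, and the 2-D vectors are hypothetical stand-ins for real sentence embeddings.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and target.

    `pairs` is a list of (embedding_a, embedding_b, target) triples.
    """
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)

# Toy embeddings (assumption): identical vectors for a positive pair,
# orthogonal vectors for a negative pair.
well_separated = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),
    ([1.0, 0.0], [0.0, 1.0], 0.0),
]
# Same vectors with the targets swapped, i.e. embeddings that disagree with the labels.
badly_separated = [
    ([1.0, 0.0], [1.0, 0.0], 0.0),
    ([1.0, 0.0], [0.0, 1.0], 1.0),
]

print(cosine_similarity_loss(well_separated))
print(cosine_similarity_loss(badly_separated))
```

The loss is zero when same-label texts have identical directions and different-label texts are orthogonal. It grows as the embeddings disagree with the labels.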

Training Results

Epoch Step Training Loss Validation Loss
0.0018 1 0.2301 -
0.0916 50 0.2223 -
0.1832 100 0.0056 -
0.2747 150 0.001 -
0.3663 200 0.0002 -
0.4579 250 0.0004 -
0.5495 300 0.0001 -
0.6410 350 0.0001 -
0.7326 400 0.0001 -
0.8242 450 0.0001 -
0.9158 500 0.0 -
1.0 546 - 0.0
1.0073 550 0.0001 -
1.0989 600 0.0001 -
1.1905 650 0.0001 -
1.2821 700 0.0001 -
1.3736 750 0.0 -
1.4652 800 0.0001 -
1.5568 850 0.0 -
1.6484 900 0.0 -
1.7399 950 0.0 -
1.8315 1000 0.0 -
1.9231 1050 0.0 -
2.0 1092 - 0.0
2.0147 1100 0.0 -
2.1062 1150 0.0 -
2.1978 1200 0.0 -
2.2894 1250 0.0 -
2.3810 1300 0.0001 -
2.4725 1350 0.0 -
2.5641 1400 0.0 -
2.6557 1450 0.0 -
2.7473 1500 0.0 -
2.8388 1550 0.0 -
2.9304 1600 0.0 -
3.0 1638 - 0.0
3.0220 1650 0.0 -
3.1136 1700 0.0 -
3.2051 1750 0.0 -
3.2967 1800 0.0 -
3.3883 1850 0.0 -
3.4799 1900 0.0 -
3.5714 1950 0.0 -
3.6630 2000 0.0 -
3.7546 2050 0.0 -
3.8462 2100 0.0 -
3.9377 2150 0.0 -
4.0 2184 - 0.0
  • The bold row denotes the saved checkpoint.
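The step counts in the table can be cross-checked against the training set size and batch size. The sketch below assumes that contrastive pairs are drawn from all sample combinations, including each sample paired with itself, and that oversampling repeats the rarer pair type until positive and negative pairs are balanced. Both are assumptions about the sampler, not statements from this card, although together they reproduce the 546 steps per epoch reported above. `steps_per_epoch` is a hypothetical helper name.

```python
from math import ceil, comb

def steps_per_epoch(label_counts, batch_size):
    """Estimate contrastive training steps per epoch under oversampling.

    Assumptions (not confirmed by the card): self-pairs count as positive
    pairs, and the smaller pair set is oversampled to match the larger one.
    """
    n = sum(label_counts.values())
    positives = sum(comb(count, 2) for count in label_counts.values()) + n
    all_pairs = comb(n, 2) + n
    negatives = all_pairs - positives
    balanced_pairs = 2 * max(positives, negatives)
    return ceil(balanced_pairs / batch_size)

# Label counts and batch size as reported in this card.
steps = steps_per_epoch({0: 69, 1: 62}, batch_size=16)
print(steps)      # 546
print(4 * steps)  # 2184, the final step after 4 epochs
```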

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 3.0.1
  • Transformers: 4.39.0
  • PyTorch: 2.4.0+cu121
  • Datasets: 2.21.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 278M params (Safetensors, tensor type F32)