SetFit with sentence-transformers/all-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
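
The data side of step 1 can be sketched in miniature. The snippet below turns a few labeled texts into sentence pairs with a similarity target of 1.0 for same-label pairs and 0.0 for different-label pairs, which is the form a cosine-similarity loss consumes. The function name `make_contrastive_pairs` and the toy texts are hypothetical, and SetFit's own sampler differs in detail (it samples pairs rather than enumerating all of them), so treat this as an illustration under those assumptions.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Enumerate all unordered text pairs with a similarity target.

    The target is 1.0 when both texts share a label and 0.0 otherwise.
    Illustrative only; this is not SetFit's actual sampling code.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(texts), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((a, b, target))
    return pairs


# Toy data invented for this sketch (not taken from the training set).
texts = [
    "flood hits town",
    "wildfire spreads",
    "great pizza tonight",
    "new wine review",
]
labels = [1, 1, 0, 0]

pairs = make_contrastive_pairs(texts, labels)
for a, b, target in pairs:
    print(f"{target:.1f}  {a!r} <-> {b!r}")
```

In step 2, the fine-tuned encoder embeds each text once and the classification head is fitted on those embeddings with the original labels.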

Model Details

Model Description

Model Sources

Model Labels

Label Examples
0
  • 'peanut butter cookie dough blizzard is ??????????????????????'
  • 'Free Ebay Sniping RT? http://t.co/B231Ul1O1K Lumbar Extender Back Stretcher Excellent Condition!! ?Please Favorite & Share'
  • "'13 M. Chapoutier Crozes Hermitage so much purple violets slate crushed gravel white pepper. Yum #france #wine #DC http://t.co/skvWN38HZ7"
1
  • 'DUST IN THE WIND: @82ndABNDIV paratroopers move to a loading zone during a dust storm in support of Operation Fury: http://t.co/uGesKLCn8M'
  • 'Delhi Government to Provide Free Treatment to Acid Attack Victims in Private Hospitals http://t.co/H6PM1W7elL'
  • 'National Briefing

Evaluation

Metrics

Label Accuracy
all 0.8099
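
For reference, accuracy is the fraction of predictions that match the true labels. A minimal sketch follows; the helper name `accuracy` and the toy label lists are made up for illustration and are not the evaluation data behind the 0.8099 figure.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of positions where prediction equals truth."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty inputs")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy labels invented for this sketch.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

score = accuracy(y_true, y_pred)
print(score)
```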

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("pEpOo/catastrophy")
# Run inference
preds = model("Heat wave warning aa? Ayyo dei. Just when I plan to visit friends after a year.")

Training Details

Training Set Metrics

Training set Min Median Max
Word count 2 15.3737 31

Label Training Sample Count
0 222
1 158

Training Hyperparameters

  • batch_size: (8, 8)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
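
These settings are consistent with the 1900 training steps reported under Training Results. The sketch below reproduces that count from the label counts, `num_iterations`, and the batch size. It assumes each training sample contributes `num_iterations` positive and `num_iterations` negative pairs; that assumption is not stated in this card, so read the arithmetic as a plausibility check rather than a description of the library's internals.

```python
import math

# Values taken from this model card.
label_counts = {0: 222, 1: 158}
num_iterations = 20
batch_size = 8      # embedding (body) batch size, first element of (8, 8)
num_epochs = 1      # first element of (1, 1)

num_samples = sum(label_counts.values())

# Assumption: every sample yields num_iterations positive pairs
# and num_iterations negative pairs.
num_pairs = 2 * num_iterations * num_samples

steps_per_epoch = math.ceil(num_pairs / batch_size)
total_steps = steps_per_epoch * num_epochs

print(f"samples={num_samples} pairs={num_pairs} steps={total_steps}")
```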

Training Results

Epoch Step Training Loss Validation Loss
0.0005 1 0.3038 -
0.0263 50 0.1867 -
0.0526 100 0.2578 -
0.0789 150 0.2298 -
0.1053 200 0.1253 -
0.1316 250 0.0446 -
0.1579 300 0.1624 -
0.1842 350 0.0028 -
0.2105 400 0.0059 -
0.2368 450 0.0006 -
0.2632 500 0.0287 -
0.2895 550 0.003 -
0.3158 600 0.0004 -
0.3421 650 0.0014 -
0.3684 700 0.0002 -
0.3947 750 0.0001 -
0.4211 800 0.0002 -
0.4474 850 0.0002 -
0.4737 900 0.0002 -
0.5 950 0.0826 -
0.5263 1000 0.0002 -
0.5526 1050 0.0001 -
0.5789 1100 0.0003 -
0.6053 1150 0.0303 -
0.6316 1200 0.0001 -
0.6579 1250 0.0 -
0.6842 1300 0.0001 -
0.7105 1350 0.0 -
0.7368 1400 0.0001 -
0.7632 1450 0.0002 -
0.7895 1500 0.0434 -
0.8158 1550 0.0001 -
0.8421 1600 0.0 -
0.8684 1650 0.0001 -
0.8947 1700 0.0001 -
0.9211 1750 0.0001 -
0.9474 1800 0.0001 -
0.9737 1850 0.0001 -
1.0 1900 0.0 -
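
The Epoch column in the table above is the step number divided by the total number of steps, rounded to four decimals. A small check, assuming the final logged step (1900) marks the end of the single epoch; the helper name `epoch_at` is hypothetical.

```python
TOTAL_STEPS = 1900  # final step in the table, logged at epoch 1.0


def epoch_at(step, total_steps=TOTAL_STEPS):
    """Return the fractional epoch reached at a given step."""
    return round(step / total_steps, 4)


# A few (step, epoch) rows copied from the table.
logged = {1: 0.0005, 50: 0.0263, 950: 0.5, 1900: 1.0}

for step, epoch in logged.items():
    print(step, epoch_at(step), epoch)
```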

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.1
  • Sentence Transformers: 2.2.2
  • Transformers: 4.35.2
  • PyTorch: 2.1.0+cu121
  • Datasets: 2.15.0
  • Tokenizers: 0.15.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}