SetFit with sentence-transformers/all-mpnet-base-v2

This is a SetFit model that can be used for Text Classification. It uses sentence-transformers/all-mpnet-base-v2 as the Sentence Transformer embedding model, with a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
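
In code, step 2 amounts to fitting a LogisticRegression on embeddings from the fine-tuned body. A minimal sketch with plain sentence-transformers and scikit-learn (not the library's internal code; the texts and labels below are hypothetical stand-ins for the few-shot data):

from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

# Hypothetical few-shot examples standing in for the real training data
train_texts = ["Forest fire near La Ronge Sask. Canada", "What a goal that was!"]
train_labels = [1, 0]

# Step 1 would contrastively fine-tune this body; here it is loaded as-is
body = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

# Step 2: embed the texts and fit the classification head on the features
embeddings = body.encode(train_texts)
head = LogisticRegression()
head.fit(embeddings, train_labels)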

Model Details

Model Description

Model Sources

Model Labels

Label 0
  • "Was '80s New #Wave a #Casualty of #AIDS?: Tweet And Since they'd grown up watching David… http://t.co/qBecjli7cx"
  • "@CharlesDagnall He's getting 50 here I think. Salt. Wounds. Rub. In."
  • 'Navy sidelines 3 newest subs http://t.co/gpVZV0249Y'

Label 1
  • 'The Latest: More Homes Razed by Northern California Wildfire - ABC News http://t.co/bKsYymvIsg #GN'
  • '@Durban_Knight Rescuers are searching for hundreds of migrants in the Mediterranean after a boat carr... http://t.co/cWCVBuBs01 @Nosy_Be'
  • 'NEMA Ekiti distributed relief materials to affected victims of Rain/Windstorm disaster at Ode-Ekiti in Gbonyin LGA.'

Evaluation

Metrics

Label   Accuracy
all     0.8172
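
The accuracy above can be recomputed along these lines; test_texts and test_labels are hypothetical stand-ins, since the evaluation split is not shipped with this card:

from setfit import SetFitModel
from sklearn.metrics import accuracy_score

# Hypothetical held-out examples; replace with the real test split
test_texts = ["The Latest: More Homes Razed by Northern California Wildfire"]
test_labels = [1]

model = SetFitModel.from_pretrained("pEpOo/catastrophy5")
preds = model.predict(test_texts)
print(accuracy_score(test_labels, preds))  # the card reports 0.8172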

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("pEpOo/catastrophy5")
# Run inference
preds = model("Stuart Broad Takes Eight Before Joe Root Runs Riot Against Aussies")
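
Judging by the label examples above, label 1 appears to mark disaster-related tweets and label 0 everything else; that reading is an inference from this card, not a documented mapping. A short extension of the snippet:

# Hypothetical inputs
texts = [
    "Forest fire near La Ronge Sask. Canada",
    "What a goal that was!",
]
preds = model.predict(texts)        # one label id (0 or 1) per input text
probs = model.predict_proba(texts)  # per-class probabilities from the LogisticRegression head
print(preds)
print(probs)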

Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     1     14.9796   54

Label   Training Sample Count
0       1732
1       1313

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
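
These values map one-to-one onto SetFit's TrainingArguments. A minimal sketch of a comparable training run, assuming a hypothetical train_dataset with text and label columns (arguments not shown keep the defaults listed above):

from datasets import Dataset
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import SetFitModel, Trainer, TrainingArguments

# Hypothetical stand-in for the real training data
train_dataset = Dataset.from_dict({
    "text": ["Forest fire near La Ronge Sask. Canada", "What a goal that was!"],
    "label": [1, 0],
})

model = SetFitModel.from_pretrained("sentence-transformers/all-mpnet-base-v2")

args = TrainingArguments(
    batch_size=(16, 16),              # (embedding phase, classifier phase)
    num_epochs=(1, 1),
    sampling_strategy="oversampling",
    num_iterations=20,
    body_learning_rate=(2e-5, 2e-5),
    head_learning_rate=2e-5,
    loss=CosineSimilarityLoss,
    warmup_proportion=0.1,
    seed=42,
)

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()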

Training Results

Epoch Step Training Loss Validation Loss (not logged; shown as -)
0.0001 1 0.3383 -
0.0066 50 0.352 -
0.0131 100 0.3529 -
0.0197 150 0.2286 -
0.0263 200 0.2654 -
0.0328 250 0.2892 -
0.0394 300 0.1808 -
0.0460 350 0.2056 -
0.0525 400 0.0863 -
0.0591 450 0.2034 -
0.0657 500 0.1339 -
0.0722 550 0.1022 -
0.0788 600 0.1083 -
0.0854 650 0.1035 -
0.0919 700 0.1201 -
0.0985 750 0.0626 -
0.1051 800 0.1257 -
0.1117 850 0.1543 -
0.1182 900 0.0367 -
0.1248 950 0.1749 -
0.1314 1000 0.0553 -
0.1379 1050 0.0836 -
0.1445 1100 0.0161 -
0.1511 1150 0.1149 -
0.1576 1200 0.1144 -
0.1642 1250 0.0028 -
0.1708 1300 0.0037 -
0.1773 1350 0.1769 -
0.1839 1400 0.0172 -
0.1905 1450 0.0397 -
0.1970 1500 0.0645 -
0.2036 1550 0.0659 -
0.2102 1600 0.0014 -
0.2167 1650 0.0016 -
0.2233 1700 0.0729 -
0.2299 1750 0.0072 -
0.2364 1800 0.0175 -
0.2430 1850 0.0278 -
0.2496 1900 0.0537 -
0.2561 1950 0.0038 -
0.2627 2000 0.087 -
0.2693 2050 0.0459 -
0.2758 2100 0.0169 -
0.2824 2150 0.0112 -
0.2890 2200 0.001 -
0.2955 2250 0.0204 -
0.3021 2300 0.0796 -
0.3087 2350 0.0592 -
0.3153 2400 0.0003 -
0.3218 2450 0.0033 -
0.3284 2500 0.0309 -
0.3350 2550 0.0065 -
0.3415 2600 0.002 -
0.3481 2650 0.0076 -
0.3547 2700 0.0008 -
0.3612 2750 0.0023 -
0.3678 2800 0.0028 -
0.3744 2850 0.0171 -
0.3809 2900 0.0011 -
0.3875 2950 0.0015 -
0.3941 3000 0.0468 -
0.4006 3050 0.0075 -
0.4072 3100 0.0009 -
0.4138 3150 0.0334 -
0.4203 3200 0.0002 -
0.4269 3250 0.0001 -
0.4335 3300 0.0002 -
0.4400 3350 0.0001 -
0.4466 3400 0.021 -
0.4532 3450 0.0043 -
0.4597 3500 0.0084 -
0.4663 3550 0.0009 -
0.4729 3600 0.0033 -
0.4794 3650 0.0035 -
0.4860 3700 0.0004 -
0.4926 3750 0.0297 -
0.4991 3800 0.0004 -
0.5057 3850 0.0011 -
0.5123 3900 0.0238 -
0.5188 3950 0.0248 -
0.5254 4000 0.0293 -
0.5320 4050 0.0365 -
0.5386 4100 0.0261 -
0.5451 4150 0.0469 -
0.5517 4200 0.0098 -
0.5583 4250 0.0002 -
0.5648 4300 0.0236 -
0.5714 4350 0.0001 -
0.5780 4400 0.0001 -
0.5845 4450 0.0001 -
0.5911 4500 0.0138 -
0.5977 4550 0.0116 -
0.6042 4600 0.0003 -
0.6108 4650 0.0003 -
0.6174 4700 0.0001 -
0.6239 4750 0.0 -
0.6305 4800 0.0246 -
0.6371 4850 0.0001 -
0.6436 4900 0.0543 -
0.6502 4950 0.0001 -
0.6568 5000 0.0093 -
0.6633 5050 0.0001 -
0.6699 5100 0.0 -
0.6765 5150 0.0002 -
0.6830 5200 0.0001 -
0.6896 5250 0.0372 -
0.6962 5300 0.0 -
0.7027 5350 0.0001 -
0.7093 5400 0.0001 -
0.7159 5450 0.0003 -
0.7224 5500 0.0004 -
0.7290 5550 0.0001 -
0.7356 5600 0.0 -
0.7422 5650 0.0 -
0.7487 5700 0.0001 -
0.7553 5750 0.0001 -
0.7619 5800 0.0 -
0.7684 5850 0.0 -
0.7750 5900 0.0 -
0.7816 5950 0.0 -
0.7881 6000 0.0 -
0.7947 6050 0.0 -
0.8013 6100 0.0 -
0.8078 6150 0.0001 -
0.8144 6200 0.0001 -
0.8210 6250 0.0 -
0.8275 6300 0.0 -
0.8341 6350 0.0 -
0.8407 6400 0.0002 -
0.8472 6450 0.0 -
0.8538 6500 0.0001 -
0.8604 6550 0.0 -
0.8669 6600 0.0001 -
0.8735 6650 0.0001 -
0.8801 6700 0.0 -
0.8866 6750 0.0 -
0.8932 6800 0.0373 -
0.8998 6850 0.0 -
0.9063 6900 0.0 -
0.9129 6950 0.0272 -
0.9195 7000 0.0 -
0.9260 7050 0.0 -
0.9326 7100 0.0001 -
0.9392 7150 0.0 -
0.9458 7200 0.0002 -
0.9523 7250 0.0001 -
0.9589 7300 0.0 -
0.9655 7350 0.0 -
0.9720 7400 0.0 -
0.9786 7450 0.0001 -
0.9852 7500 0.0 -
0.9917 7550 0.0 -
0.9983 7600 0.0 -

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.1
  • Sentence Transformers: 2.2.2
  • Transformers: 4.35.2
  • PyTorch: 2.1.0+cu121
  • Datasets: 2.15.0
  • Tokenizers: 0.15.0
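
To approximate this environment, the versions above can be pinned at install time (a sketch; the CUDA-specific PyTorch wheel is left to your setup):

pip install setfit==1.0.1 sentence-transformers==2.2.2 transformers==4.35.2 datasets==2.15.0 tokenizers==0.15.0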

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}