---
library_name: setfit
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
metrics:
  - f1
widget:
  - text: >
      The Democratic Party was totally corrupted by the Clinton Regime, and now
      it is totally insane.
  - text: >
      The media gave scant coverage to Obama’s close relationship with radical
      Reverend Jeremiah “God damn America) Wright who blamed the US for 9/11.
  - text: |
      It’s sharia compliance in New Mexico.
  - text: |
      Are you people serious?
  - text: >
      However, I ask, why were you not involved in the first place, Mr.
      President?
pipeline_tag: text-classification
inference: true
model-index:
  - name: SetFit
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: Unknown
          type: unknown
          split: test
        metrics:
          - type: f1
            value: 0.6720214190093708
            name: F1
---

SetFit

This is a SetFit model that can be used for text classification. It uses a LogisticRegression head on top of a fine-tuned Sentence Transformer for classification.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
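
For illustration, here is a minimal sketch of those two phases using sentence-transformers and scikit-learn directly. The base Sentence Transformer behind this model is not documented, so the checkpoint name below is a placeholder:

from itertools import combinations

from sentence_transformers import InputExample, SentenceTransformer, losses
from sklearn.linear_model import LogisticRegression
from torch.utils.data import DataLoader

texts = ["Are you people serious?", "It's sharia compliance in New Mexico."]
labels = [1.0, 0.0]

# Placeholder base checkpoint; this card does not list the actual body.
body = SentenceTransformer("sentence-transformers/paraphrase-mpnet-base-v2")

# Phase 1: contrastive fine-tuning. Same-label pairs get target 1.0,
# mixed-label pairs get target 0.0.
pairs = [
    InputExample(texts=[texts[i], texts[j]], label=float(labels[i] == labels[j]))
    for i, j in combinations(range(len(texts)), 2)
]
loader = DataLoader(pairs, shuffle=True, batch_size=16)
body.fit(train_objectives=[(loader, losses.CosineSimilarityLoss(body))], epochs=1)

# Phase 2: train the LogisticRegression head on embeddings from the tuned body.
head = LogisticRegression()
head.fit(body.encode(texts), labels)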

Model Details

Model Description

  • Model Type: SetFit
  • Classification head: a LogisticRegression instance
  • Maximum Sequence Length: 512 tokens
  • Number of Classes: 2 classes
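
These details can be verified on the loaded model itself; a small sketch, using the attribute names from setfit 1.0:

from setfit import SetFitModel

model = SetFitModel.from_pretrained("anismahmahi/G3-setfit-model")
print(type(model.model_head).__name__)  # LogisticRegression
print(model.model_body.max_seq_length)  # 512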

Model Sources

  • Repository: https://github.com/huggingface/setfit
  • Paper: https://arxiv.org/abs/2209.11055

Model Labels

Label: 1.0
  • '#ukraine Be careful social media and Google are censoring non-propaganda news story like how Ukrainian defense minister is using video games to give the impression they are defeating the Russians to keep the conflict going ! #biden is warmongering Ceasefire , peace & neutrality NOW HTTPURL'
  • 'https://t.co/CjSFJmng7Z — Sen. Patrick Leahy (@SenatorLeahy) August 1, 2018\n'
  • 'On Monday afternoon, Homeland Security Secretary Kirstjen Nielsen tweeted out photos of CBP officers in riot gear as well as the barbed wire and barriers citing the reports about plans to “rush” the border.\n'

Label: 0.0
  • 'President Trump noted that President Obama and his advisers had information that the Russians had been working to interfere in the election and they ignored it, because they thought Hillary Clinton was going to win.\n'
  • 'Once the truth is accepted that jihadis are inspired and sanctioned by their Islamic texts, it must logically become required that mosques, Islamic schools and groups have to immediately curtail any teaching that motivates sedition, violence, and hatred of unbelievers (i.e.\n'
  • '“However, no nation has a more talented, more dedicated group of law enforcement investigators and prosecutors than the United States.”\n'

Evaluation

Metrics

Label   F1
all     0.6720
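
The evaluation dataset is listed as unknown, but a score like the one above can be reproduced on any labeled test split along these lines (a sketch: the texts and labels below are placeholders, and binary averaging is assumed):

from setfit import SetFitModel
from sklearn.metrics import f1_score

model = SetFitModel.from_pretrained("anismahmahi/G3-setfit-model")

test_texts = ["Are you people serious?"]  # placeholder test sentences
test_labels = [1.0]                       # placeholder gold labels

preds = model.predict(test_texts)
print(f1_score(test_labels, preds, average="binary"))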

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("anismahmahi/G3-setfit-model")
# Run inference
preds = model("Are you people serious?")

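The model is also callable on a batch of texts, and predict_proba (available in SetFit 1.0) exposes the class probabilities from the LogisticRegression head:

# Batch inference
preds = model(["Are you people serious?", "It's sharia compliance in New Mexico."])
# Per-class probabilities from the LogisticRegression head
probs = model.predict_proba(["Are you people serious?"])
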
Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     1     28.3246   129

Label   Training Sample Count
0       2362
1       2518

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (2, 2)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 5
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
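
For reference, these hyperparameters map directly onto setfit's TrainingArguments (a sketch for SetFit 1.0.x; each tuple pairs the embedding phase with the classifier phase):

from setfit import TrainingArguments

args = TrainingArguments(
    batch_size=(16, 16),
    num_epochs=(2, 2),
    max_steps=-1,
    sampling_strategy="oversampling",
    num_iterations=5,
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    # loss=CosineSimilarityLoss and distance_metric=cosine_distance are the defaults
    margin=0.25,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=True,
)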

Training Results

Epoch Step Training Loss Validation Loss
0.0003 1 0.3302 -
0.0164 50 0.2709 -
0.0328 100 0.2545 -
0.0492 150 0.229 -
0.0656 200 0.2463 -
0.0820 250 0.2934 -
0.0984 300 0.2735 -
0.1148 350 0.2837 -
0.1311 400 0.2364 -
0.1475 450 0.2379 -
0.1639 500 0.188 -
0.1803 550 0.2443 -
0.1967 600 0.1274 -
0.2131 650 0.2106 -
0.2295 700 0.3211 -
0.2459 750 0.2443 -
0.2623 800 0.1979 -
0.2787 850 0.1679 -
0.2951 900 0.1208 -
0.3115 950 0.0594 -
0.3279 1000 0.11 -
0.3443 1050 0.0951 -
0.3607 1100 0.1059 -
0.3770 1150 0.1027 -
0.3934 1200 0.0771 -
0.4098 1250 0.0295 -
0.4262 1300 0.0696 -
0.4426 1350 0.104 -
0.4590 1400 0.13 -
0.4754 1450 0.1287 -
0.4918 1500 0.0264 -
0.5082 1550 0.0651 -
0.5246 1600 0.113 -
0.5410 1650 0.07 -
0.5574 1700 0.0016 -
0.5738 1750 0.1001 -
0.5902 1800 0.0116 -
0.6066 1850 0.01 -
0.6230 1900 0.0115 -
0.6393 1950 0.0053 -
0.6557 2000 0.0585 -
0.6721 2050 0.0034 -
0.6885 2100 0.0171 -
0.7049 2150 0.0141 -
0.7213 2200 0.0549 -
0.7377 2250 0.0026 -
0.7541 2300 0.1239 -
0.7705 2350 0.0121 -
0.7869 2400 0.0589 -
0.8033 2450 0.0042 -
0.8197 2500 0.0026 -
0.8361 2550 0.003 -
0.8525 2600 0.0004 -
0.8689 2650 0.0003 -
0.8852 2700 0.1 -
0.9016 2750 0.0567 -
0.9180 2800 0.0311 -
0.9344 2850 0.0404 -
0.9508 2900 0.0002 -
0.9672 2950 0.0008 -
0.9836 3000 0.0006 -
1.0 3050 0.0003 0.3187 (*)
1.0164 3100 0.0003 -
1.0328 3150 0.0002 -
1.0492 3200 0.0002 -
1.0656 3250 0.002 -
1.0820 3300 0.0002 -
1.0984 3350 0.0003 -
1.1148 3400 0.005 -
1.1311 3450 0.0613 -
1.1475 3500 0.0002 -
1.1639 3550 0.0002 -
1.1803 3600 0.0005 -
1.1967 3650 0.0001 -
1.2131 3700 0.0609 -
1.2295 3750 0.0003 -
1.2459 3800 0.0005 -
1.2623 3850 0.0006 -
1.2787 3900 0.0003 -
1.2951 3950 0.0014 -
1.3115 4000 0.0002 -
1.3279 4050 0.0001 -
1.3443 4100 0.0002 -
1.3607 4150 0.001 -
1.3770 4200 0.0004 -
1.3934 4250 0.0004 -
1.4098 4300 0.0002 -
1.4262 4350 0.0612 -
1.4426 4400 0.0613 -
1.4590 4450 0.0002 -
1.4754 4500 0.0603 -
1.4918 4550 0.0001 -
1.5082 4600 0.0011 -
1.5246 4650 0.0576 -
1.5410 4700 0.0001 -
1.5574 4750 0.0002 -
1.5738 4800 0.0002 -
1.5902 4850 0.0012 -
1.6066 4900 0.0003 -
1.6230 4950 0.0001 -
1.6393 5000 0.0001 -
1.6557 5050 0.0001 -
1.6721 5100 0.0001 -
1.6885 5150 0.0001 -
1.7049 5200 0.0002 -
1.7213 5250 0.0001 -
1.7377 5300 0.0002 -
1.7541 5350 0.0001 -
1.7705 5400 0.0001 -
1.7869 5450 0.0001 -
1.8033 5500 0.0001 -
1.8197 5550 0.0003 -
1.8361 5600 0.0001 -
1.8525 5650 0.0001 -
1.8689 5700 0.0001 -
1.8852 5750 0.0001 -
1.9016 5800 0.0002 -
1.9180 5850 0.0 -
1.9344 5900 0.0001 -
1.9508 5950 0.0 -
1.9672 6000 0.0 -
1.9836 6050 0.0001 -
2.0 6100 0.0001 0.3313
  • The row marked (*) denotes the saved checkpoint: with load_best_model_at_end set to True, the epoch 1.0 checkpoint (validation loss 0.3187) was kept over the final epoch 2.0 checkpoint (0.3313).

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.1
  • Sentence Transformers: 2.2.2
  • Transformers: 4.35.2
  • PyTorch: 2.1.0+cu121
  • Datasets: 2.16.1
  • Tokenizers: 0.15.0
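
To match this environment, the versions above can be pinned at install time:

pip install setfit==1.0.1 sentence-transformers==2.2.2 transformers==4.35.2 datasets==2.16.1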

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}