---
base_model: sentence-transformers/all-mpnet-base-v2
library_name: setfit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: >
    It might have been more fun for everyone if the Thruway Authority had
    given individual contracts for each rest stop, with the stipulation that
    each reflect some local regional character. This could interest travelers
    to maybe get off at the next exit and explore some local places. With
    every stop the same, the traveler might as well be in Kansas.
- text: >
    I was scammed by a fake retailer appearing on a Google search for a
    popular product, a Patagonia backpack, offered at a significant discount.
    The website seemed legitimate; I was given a choice of colors and sizes.
    The scammer provided a tracking number from China. I have bought
    discounted items before from China that are sold on eBay and are sent by
    Chinese parcel post, for which tracking information is scant. When
    whatever item that was mailed finally arrived at a completely different
    address in another state several weeks later, I alerted my credit card
    company of the fraud and was refunded the amount, despite the time frame
    it took to determine the scam.
- text: >
    From Matt Stoller's newsletter (edited for flow): LastPass was purchased
    by two private equity firms, Francisco Partners and Evergreen Coast
    Capital Corp. Typically, PE firms raise prices, lower quality, harm
    workers, and reduce customer service. They then decided to charge
    customers $36 to access the cumbersome passwords. This particular pricing
    move sparked a backlash from customers, and the two PE firms pledged to
    spin off the company and make it independent. But that hasn’t happened.
    Poor quality is common within private equity owned software firms, which
    means cybersecurity vulnerabilities quickly follow. We’ve seen this with
    PE-owned software firms facilitating the hacking of the NYC subway,
    nuclear weapons facilities, and criminal ransomware. And now it’s happened
    with LastPass. Lovely.
- text: >
    Maybe for the 'come latelies' this is a big storm, but for folks who have
    lived there, this is not something new. When El Nino dumps in the
    Sierras...THAT, is a snow Storm! In '82-83 the area near Squaw Valley got
    800 inches! 'Dumps' of 4-6+ feet happened about 2x a month...we were
    living like snow moles, mimicing the great snow storms of the early 20th
    century - you may have seen these in historical photos. Homeowners were
    shoveling 3-5 feet of snow off their roofs, to prevent total collapse! We
    always had a good hearty Laugh at those CA flatlanders, driving to Tahoe
    on I 80 in the 'rain' ties, with flakes like silver dollars, blotting out
    visibility. Remember, was it last winter when I 80 was closed and all the
    hip techies turned to their google maps and ended up on closed roads, in
    the boondocks? Like I 80 is closed and some 1 1/2 rural lane road, was
    going to be OPEN??? Hellarious! Of course, down in the flatlands, we've
    seen how folks THOUGHT they had 'amphibious' cars...Any idea how folks
    became so....lame? (BTW: Mt Baker in WA has the record of 1100 inches of
    snow....keeping the smaller Mt St Helens-like volcano, sleeping!) Winter
    is great, if you respect Mother Nature; soooo many havent a clue, putting
    1st Responders, at great risk! And 4 wheel drive, CAN keep you going
    straight, at a CAUTIOUS speed...not good, on icy curves!!
- text: >
    Ethan. The results of that great agricultural revolution are in and not
    much of it is admirable. More Food = More People = More Fossil Fuels =
    More Toxic Pollution = More Disease = More Greenhouse Gases = More Climate
    Change = end-of-the-line. Human population was able to grow as rapidly as
    fossil fuel inputs were increased. But now, we must reduce usage of fossil
    fuels and the resulting population logically goes in the same direction.
    All the green technologies are for naught. It comes down to fossil fuels.
inference: true
model-index:
- name: SetFit with sentence-transformers/all-mpnet-base-v2
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.8
      name: Accuracy
---
# SetFit with sentence-transformers/all-mpnet-base-v2

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
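A minimal sketch of that two-step procedure with the `setfit` API is shown below. The tiny dataset is a placeholder (this card does not publish its training data); only the base model and the "yes"/"no" labels are taken from the card.

```python
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Stand-in few-shot data: placeholders for the real "yes"/"no" examples.
train_dataset = Dataset.from_dict({
    "text": ["a comment the classifier should flag", "a comment it should not"],
    "label": ["yes", "no"],
})

# The body is the Sentence Transformer named on this card; SetFit attaches
# a LogisticRegression head by default.
model = SetFitModel.from_pretrained(
    "sentence-transformers/all-mpnet-base-v2",
    labels=["no", "yes"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(batch_size=16, num_epochs=1),
    train_dataset=train_dataset,
)
# train() runs both steps: contrastive fine-tuning of the embedding body,
# then fitting the classification head on the tuned embeddings.
trainer.train()
```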
## Model Details

### Model Description

- **Model Type:** SetFit
- **Sentence Transformer body:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 384 tokens
- **Number of Classes:** 2 classes
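Both components are exposed on a loaded model, so these details can be inspected directly; `model_body` and `model_head` are the attribute names used by the setfit library:

```python
from setfit import SetFitModel

model = SetFitModel.from_pretrained("davidadamczyk/setfit-model-3")
print(model.model_body)  # SentenceTransformer wrapping all-mpnet-base-v2 (max_seq_length=384)
print(model.model_head)  # sklearn LogisticRegression instance
```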
### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
### Model Labels

| Label | Examples |
|:------|:---------|
| yes | |
| no | |
## Evaluation

### Metrics

| Label | Accuracy |
|:------|:---------|
| all | 0.8 |
## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("davidadamczyk/setfit-model-3")
# Run inference
preds = model("It might have been more fun for everyone if the Thruway Authority had given individual contracts for each rest stop, with the stipulation that each reflect some local regional character. This could interest travelers to maybe get off at the next exit and explore some local places. With every stop the same, the traveler might as well be in Kansas.")
```
## Training Details

### Training Set Metrics

| Training set | Min | Median | Max |
|:-------------|:----|:-------|:----|
| Word count | 43 | 140.9 | 262 |

| Label | Training Sample Count |
|:------|:----------------------|
| no | 18 |
| yes | 22 |
### Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 120
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
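These values map one-to-one onto `setfit.TrainingArguments`. As a sketch, the configuration above could be reconstructed like this; the loss class comes from `sentence_transformers.losses`, while `distance_metric` and `margin` only affect triplet-style losses and are left at their defaults here:

```python
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import TrainingArguments

args = TrainingArguments(
    batch_size=(16, 16),             # (embedding phase, classifier phase)
    num_epochs=(1, 1),
    max_steps=-1,
    sampling_strategy="oversampling",
    num_iterations=120,              # as listed on this card
    body_learning_rate=(2e-05, 2e-05),
    head_learning_rate=2e-05,
    loss=CosineSimilarityLoss,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=False,
)
```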
### Training Results

| Epoch | Step | Training Loss | Validation Loss |
|:------|:-----|:--------------|:----------------|
| 0.0017 | 1 | 0.4637 | - |
| 0.0833 | 50 | 0.2019 | - |
| 0.1667 | 100 | 0.0063 | - |
| 0.25 | 150 | 0.0003 | - |
| 0.3333 | 200 | 0.0002 | - |
| 0.4167 | 250 | 0.0001 | - |
| 0.5 | 300 | 0.0001 | - |
| 0.5833 | 350 | 0.0001 | - |
| 0.6667 | 400 | 0.0001 | - |
| 0.75 | 450 | 0.0001 | - |
| 0.8333 | 500 | 0.0001 | - |
| 0.9167 | 550 | 0.0001 | - |
| 1.0 | 600 | 0.0001 | - |
### Framework Versions
- Python: 3.10.13
- SetFit: 1.1.0
- Sentence Transformers: 3.0.1
- Transformers: 4.45.2
- PyTorch: 2.4.0+cu124
- Datasets: 2.21.0
- Tokenizers: 0.20.0
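To reproduce this environment, the listed versions can be pinned directly; the `+cu124` suffix on the card indicates the CUDA 12.4 build of PyTorch on the original setup:

```bash
pip install setfit==1.1.0 sentence-transformers==3.0.1 transformers==4.45.2 \
    datasets==2.21.0 tokenizers==0.20.0 torch==2.4.0
```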
## Citation

### BibTeX

```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```