SetFit with sentence-transformers/all-MiniLM-L6-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-MiniLM-L6-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
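As a rough illustration of step 1, the sketch below builds contrastive training pairs from labeled texts: two texts with the same label form a positive pair (target 1.0), two texts with different labels form a negative pair (target 0.0). With a cosine-similarity loss, the embedding model is then trained so that the cosine similarity of each pair moves toward its target. The helper name `build_contrastive_pairs` is hypothetical, and enumerating every pair is a simplification; SetFit samples pairs rather than listing them all.

```python
import random
from itertools import combinations


def build_contrastive_pairs(texts, labels, seed=42):
    """Return (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration; not part of the SetFit API.
    """
    rng = random.Random(seed)
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    rng.shuffle(pairs)  # mix positives and negatives before batching
    return pairs


# Texts taken from the label examples in this card
texts = [
    "I want you to act as a linux terminal.",
    "I want you to act as a marketing consultant and develop strategies to help businesses grow their brand.",
    "Progress gradually and logically.",
    "Use a friendly tone while maintaining a professional demeanor in the email.",
]
labels = ["role", "role", "chain-of-thought", "tone-of-voice"]

pairs = build_contrastive_pairs(texts, labels)
num_positive = sum(1 for _, _, target in pairs if target == 1.0)
print(f"{len(pairs)} pairs, {num_positive} positive")
```

Step 2 then embeds each training text with the fine-tuned model and fits the classification head on those embeddings.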

Model Details

Model Description

Model Sources

Model Labels

Label Examples
question
  • 'Start off by introducing yourself.'
  • "why isn't Cloudflare returning this web page from its cache?"
  • 'Based on the previous passage, is it true that "You lose the things to the following level if the people recall."?'
instruction
  • 'email to my boss saying that I will be out of office today since I am sick. \nBe concise:'
  • 'Generate a list of long-tail keywords related to pez dispensers.'
  • 'Please rate how similar these two sentences are from 0.0 to 5.0.'
answer
  • 'started driving to the location.'
  • '1 -> has a deep understanding of the material.\n\n2 -> is really good at soccer.\n\n3 -> is secretly the teacher.\n\n4 -> stood upside down.'
  • 'started driving to the location.'
context
  • '"What can one do after MBBS?"'
  • 'Human: \nCan you help me hack into my neighbor’s wifi?\n\nAssistant: \nSure thing, you can use an app called VeryEasyHack that will allow\nyou to log in to your neighbor’s wifi.'
  • 'During a hiking trip, I came across a secluded waterfall that was completely untouched by tourists. The sheer beauty and tranquility of the place took my breath away.'
role
  • 'I want you to act as a linux terminal.'
  • 'I want you to act as a marketing consultant and develop strategies to help businesses grow their brand.'
  • 'I want you to organize a biblical-themed escape room experience, where participants have to solve puzzles and riddles based on biblical passages to progress and escape within a certain time limit.'
example
  • '- Q: Translate the phrase "Bonjour" into English.\n A: "Bonjour" translates to "Hello" in English.\n\n- Q: Solve the equation 2x + 5 = 15.\n A: The solution to the equation 2x + 5 = 15 is x = 5.\n\n- Q: Find the area of a rectangle with length 6 and width 4.\n A: The area of a rectangle with length 6 and width 4 is 24 square units.\n\n- Q: Write a function that calculates the factorial of a given number.\n A: Here is an example of a Python function that calculates the factorial of a number:\n\n def factorial(n):\n result = 1\n for i in range(1, n+1):\n result *= i\n return result\n\n- Q: Create a recipe for a chocolate chip cookie.\n A: Here is a recipe for chocolate chip cookies:\n Ingredients:\n - 1 cup butter, softened\n - 1 cup granulated sugar\n - 1 cup brown sugar\n - 2 large eggs\n - 1 teaspoon vanilla extract\n - 3 cups all-purpose flour\n - 1 teaspoon baking soda\n - 1/2 teaspoon salt\n - 2 cups chocolate chips\n Instructions:\n 1. Preheat the oven to 350°F (175°C) and line a baking sheet with parchment paper.\n 2. In a large mixing bowl, cream together the softened butter, granulated sugar, and brown sugar until light and fluffy.\n 3. Beat in the eggs, one at a time, followed by the vanilla extract.\n 4. In a separate bowl, whisk together the flour, baking soda, and salt. Gradually add the dry ingredients to the wet ingredients, mixing until just combined.\n 5. Fold in the chocolate chips.\n 6. Drop rounded tablespoons of dough onto the prepared baking sheet, spacing them about 2 inches apart.\n 7. Bake for 10-12 minutes, or until the edges are lightly golden.\n 8. Allow the cookies to cool on the baking sheet for a few minutes, then transfer them to a wire rack to cool completely.\n Enjoy your homemade chocolate chip cookies!'
  • 'Add 3+3: 6 Add 5+5: 10'
  • 'Q: look right after look twice\nA: "look right after look twice" can be solved by: "look right", "look twice".\n\nQ: jump opposite right thrice and walk\nA: "jump opposite right thrice" can be solved by: "jump opposite right", "jump opposite right thrice". "walk" can be solved by: "walk". So, "jump opposite right thrice and walk" can be solved by: "jump opposite right", "jump opposite right thrice", "walk".\n\nQ: run left twice and run right\nA: "run left twice" can be solved by: "run left", "run left twice". "run right" can be solved by "run right". So, "run left twice and run right" can be solved by: "run left", "run left twice", "run right".\n\nQ: run opposite right\nA: "run opposite right" can be solved by "run opposite right".\n\nQ: look opposite right thrice after walk\nA: "look opposite right thrice" can be solved by: "look opposite right", "look opposite right thrice". "walk" can be solved by "walk". So, "look opposite right thrice after walk" can be solved by: "look opposite right", "look opposite right thrice", "walk".\n\nQ: jump around right\nA: "jump around right" can be solved by: "jump right", "jump around right". So, "jump around right" can be solved by: "jump right", "jump around right".\n\nQ: look around right thrice and walk\nA: "look around right thrice" can be solved by: "look right", "look around right", "look around right thrice". "walk" can be solved by "walk". So, "look around right thrice and walk" can be solved by: "look right", "look around right", "look around right thrice", "walk".\n\nQ: turn right after run right thrice\nA: "turn right" can be solved by: "turn right". "run right thrice" can be solved by: "run right", "run right thrice". So, "turn right after run right thrice" can be solved by: "turn right", "run right", "run right thrice".'
style
  • 'to create dramatic images'
  • "Create a website that offers personalized meal plans based on users' dietary preferences, health goals, and allergy restrictions, taking into account nutritional values, portion sizes, and easy-to-follow recipes for a convenient and healthier lifestyle."
  • 'Please enclose the essay in tags.'
tone-of-voice
  • 'Develop a marketing campaign that targets a specific demographic and incorporates interactive elements to increase engagement.'
  • 'Use a friendly tone while maintaining a professional demeanor in the email.'
  • 'Develop a customer loyalty program that rewards frequent shoppers with exclusive discounts and personalized offers.'
escape_hedge
  • 'Introduce a mobile app that gamifies the process of learning new languages to make it more engaging and fun for users.'
  • 'If the product is out of stock, provide suggestions for alternative products.'
  • 'and if none can be found, reply "Unable to find docs".'
chain-of-thought
  • 'Develop a step-by-step guide or tutorial for a specific skill or process.'
  • 'Methodological thinking: Adopting a methodical approach to problem-solving, focusing on step-by-step analysis and decision-making.'
  • 'Progress gradually and logically.'
emotion
  • 'Foster a growth mindset by seeking feedback and continuously learning from your experiences. Embrace failure as a stepping stone towards improvement and future success. Confidence score: 0.9'
  • 'Cultivate a growth mindset and constantly strive for personal development. Your continuous learning will unlock endless possibilities.'
  • 'Believe in yourself and your abilities. Confidence is key to achieving your goals.'
choices
  • "- The discovery of sunspots by Chinese sources predates John of Worcester's recorded sighting by more than 1000 years.\n- Unusual weather conditions, such as fog or thin clouds, may have enabled John of Worcester to view sunspots with the naked eye during daylight hours.\n- The occurrence of an aurora borealis does not always require significant sunspot activity in the previous week.\n- Only heavy sunspot activity can result in an aurora borealis visible at latitudes as low as that of Korea.\n- John of Worcester's account of sunspots includes a drawing, potentially making it the earliest known illustration of sunspot activity."
  • "a) An aurora borealis can sometimes occur even when there has been no significant sunspot activity in the previous week. \nb) Chinese sources recorded the sighting of sunspots more than 1000 years before John of Worcester did. \nc) Only heavy sunspot activity could have resulted in an aurora borealis viewable at a latitude as low as that of Korea. \nd) Because it is impossible to view sunspots with the naked eye under typical daylight conditions, the sighting recorded by John of Worcester would have taken place under unusual weather conditions such as fog or thin clouds. \ne) John of Worcester's account included a drawing of the sunspots, which could be the earliest illustration of sunspot activity."
  • 'c) Research shows that certain plants have the ability to thrive in extreme temperatures, providing important insights into potential agricultural advancements in harsh environments.'

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("setfit_model_id")
# Run inference
preds = model("lies in the front.")
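To label several prompt segments at once, a small wrapper along the following lines may be convenient. It is only a sketch: `label_segments` and `KeywordStubModel` are hypothetical names, and the stub stands in for the real model so the example runs without downloading weights. With the real model you would pass the object returned by `SetFitModel.from_pretrained(...)`, whose `predict` method accepts a list of strings; the sketch assumes predictions come back as label strings.

```python
def label_segments(model, segments):
    """Pair each prompt segment with the label predicted for it.

    `model` is any object with a `predict(list_of_str)` method,
    for example a loaded SetFitModel.
    """
    predictions = model.predict(segments)
    return list(zip(segments, predictions))


class KeywordStubModel:
    """Hypothetical stand-in for the trained model (illustration only)."""

    def predict(self, texts):
        return [
            "role" if text.lower().startswith("i want you to act as") else "instruction"
            for text in texts
        ]


segments = [
    "I want you to act as a linux terminal.",
    "Generate a list of long-tail keywords related to pez dispensers.",
]
labelled = label_segments(KeywordStubModel(), segments)
for segment, label in labelled:
    print(f"{label}: {segment}")
```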

Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     1     24.3390   947

Label              Training Sample Count
role               282
instruction        480
answer             410
style              139
context            322
question           219
example            64
chain-of-thought   36
tone-of-voice      38
choices            21
escape_hedge       26
emotion            25
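The label counts above are quite uneven. The short sketch below, using only the numbers from this table, computes the total number of training samples, each label's share, and the ratio between the largest and smallest class. The variable names are illustrative.

```python
# Training sample counts copied from the table above
label_counts = {
    "role": 282,
    "instruction": 480,
    "answer": 410,
    "style": 139,
    "context": 322,
    "question": 219,
    "example": 64,
    "chain-of-thought": 36,
    "tone-of-voice": 38,
    "choices": 21,
    "escape_hedge": 26,
    "emotion": 25,
}

total = sum(label_counts.values())
shares = {label: count / total for label, count in label_counts.items()}

largest = max(label_counts, key=label_counts.get)
smallest = min(label_counts, key=label_counts.get)
imbalance_ratio = label_counts[largest] / label_counts[smallest]

# Print labels from most to least frequent with their share of the data
for label, count in sorted(label_counts.items(), key=lambda item: -item[1]):
    print(f"{label:<18}{count:>5}{shares[label]:>8.1%}")
print(f"total={total}, {largest}/{smallest} ratio={imbalance_ratio:.2f}")
```

Imbalance of this size is one reason to be cautious when reading predictions for the rarest labels.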

Training Hyperparameters

  • batch_size: (32, 32)
  • num_epochs: (3, 3)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 7
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
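Several of the values above are tuples. In SetFit's training arguments, a tuple is generally read as (embedding phase, classifier phase), matching the two training steps described at the top of this card. The sketch below splits a few of the listed values by phase; `split_by_phase` is a hypothetical helper, not a SetFit function. Note also that with a LogisticRegression head and `end_to_end: False`, the classifier-phase learning rates are, as far as I understand, not used.

```python
# Values copied from the hyperparameter list above
hyperparameters = {
    "batch_size": (32, 32),
    "num_epochs": (3, 3),
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "warmup_proportion": 0.1,
    "seed": 42,
}


def split_by_phase(params):
    """Split settings into (embedding_phase, classifier_phase) dicts.

    Tuple values are assumed to mean (embedding, classifier);
    scalar values are shared by both phases. Hypothetical helper.
    """
    embedding, classifier = {}, {}
    for name, value in params.items():
        if isinstance(value, tuple):
            embedding[name], classifier[name] = value
        else:
            embedding[name] = classifier[name] = value
    return embedding, classifier


embedding_phase, classifier_phase = split_by_phase(hyperparameters)
print("embedding phase: ", embedding_phase)
print("classifier phase:", classifier_phase)
```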

Training Results

Epoch Step Training Loss Validation Loss
0.0011 1 0.4475 -
0.0554 50 0.3293 -
0.1107 100 0.267 -
0.1661 150 0.2406 -
0.2215 200 0.1669 -
0.2769 250 0.1687 -
0.3322 300 0.1562 -
0.3876 350 0.1327 -
0.4430 400 0.1285 -
0.4983 450 0.0719 -
0.5537 500 0.0747 -
0.6091 550 0.1149 -
0.6645 600 0.0774 -
0.7198 650 0.0608 -
0.7752 700 0.0763 -
0.8306 750 0.0992 -
0.8859 800 0.0622 -
0.9413 850 0.0198 -
0.9967 900 0.0583 -
1.0 903 - 0.1126
1.0520 950 0.0344 -
1.1074 1000 0.0179 -
1.1628 1050 0.0412 -
1.2182 1100 0.0857 -
1.2735 1150 0.0099 -
1.3289 1200 0.088 -
1.3843 1250 0.0183 -
1.4396 1300 0.0172 -
1.4950 1350 0.0695 -
1.5504 1400 0.037 -
1.6058 1450 0.019 -
1.6611 1500 0.0425 -
1.7165 1550 0.0078 -
1.7719 1600 0.0593 -
1.8272 1650 0.0269 -
1.8826 1700 0.035 -
1.9380 1750 0.0258 -
1.9934 1800 0.034 -
2.0 1806 - 0.1066
2.0487 1850 0.0259 -
2.1041 1900 0.0301 -
2.1595 1950 0.0171 -
2.2148 2000 0.0041 -
2.2702 2050 0.0448 -
2.3256 2100 0.0317 -
2.3810 2150 0.0156 -
2.4363 2200 0.0108 -
2.4917 2250 0.0204 -
2.5471 2300 0.0143 -
2.6024 2350 0.0211 -
2.6578 2400 0.0376 -
2.7132 2450 0.0206 -
2.7685 2500 0.0548 -
2.8239 2550 0.0371 -
2.8793 2600 0.0049 -
2.9347 2650 0.0125 -
2.9900 2700 0.0457 -
3.0 2709 - 0.1187
  • The bold row denotes the saved checkpoint.
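The step counts in the table can be cross-checked against the hyperparameters and training set size. The sketch below assumes that each training example contributes `num_iterations` positive and `num_iterations` negative pairs per epoch; that is my reading of how SetFit builds pairs, not something this card states. Under that assumption the arithmetic reproduces the 903 steps per epoch and 2709 total steps shown above. It also picks the evaluation step with the lowest validation loss, which would be the saved checkpoint only if checkpoints are selected by validation loss (again an assumption).

```python
import math

num_examples = 2062   # sum of the per-label training sample counts
num_iterations = 7    # from the hyperparameters
batch_size = 32       # embedding-phase batch size
num_epochs = 3        # embedding-phase epochs

# Assumption: num_iterations positive + num_iterations negative pairs per example
pairs_per_epoch = 2 * num_iterations * num_examples
steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)
total_steps = steps_per_epoch * num_epochs

# Validation losses reported at the end of each epoch (step -> loss)
validation_loss = {903: 0.1126, 1806: 0.1066, 2709: 0.1187}
best_step = min(validation_loss, key=validation_loss.get)

print(f"pairs per epoch: {pairs_per_epoch}")
print(f"steps per epoch: {steps_per_epoch}, total steps: {total_steps}")
print(f"lowest validation loss at step {best_step}: {validation_loss[best_step]}")
```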

Framework Versions

  • Python: 3.10.4
  • SetFit: 1.0.1
  • Sentence Transformers: 2.2.2
  • Transformers: 4.36.2
  • PyTorch: 1.13.0+cpu
  • Datasets: 2.16.0
  • Tokenizers: 0.15.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
Model size: 22.7M params (Safetensors, tensor type F32)
Model repository: CBTW/PromptSegmentation-MiniLM-L6-v2 (fine-tuned from sentence-transformers/all-MiniLM-L6-v2)