SetFit with BAAI/bge-base-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
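
The contrastive step relies on turning a handful of labeled texts into many sentence pairs. The following is a minimal sketch of that idea only, not SetFit's actual sampling code: the function name `make_contrastive_pairs` and the toy texts are hypothetical, and SetFit samples pairs rather than enumerating all of them. Pairs with the same label get a target similarity of 1.0 and pairs with different labels get 0.0.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples from labeled texts.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Illustrative only: every unordered pair is enumerated here.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Hypothetical toy data: two examples per class.
texts = [
    "answer is grounded in the document",
    "answer addresses the question",
    "answer is off-topic",
    "answer invents facts",
]
labels = [1, 1, 0, 0]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
```

The embedding model is then fine-tuned so that the cosine similarity of each pair moves toward its target, and the classification head is fitted on the resulting embeddings.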

Model Details

Model Description

Model Sources

Model Labels

Label Examples
0
  • 'Reasoning: \n\n1. Context Grounding: The provided answer lists several services that Kartz Media & PR offers, many of which are mentioned in the document. However, the document does not mention "personalized space travel public relations for interstellar companies," indicating a potential inaccuracy or embellishment.\n \n2. Relevance: Most of the services outlined in the answer are relevant and found within the document. The majority align with what is described in the company profile, but the inclusion of a service related to space travel public relations is extraneous and irrelevant.\n \n3. Conciseness: The answer is relatively concise but incorrect information ("personalized space travel public relations for interstellar companies") detracts from its clarity and precision.\n\nFinal result:'
  • "The provided answer does not address the question regarding the careers that graduates from the School of Communications, Media, Arts and Design have achieved. Instead, it gives information about Charles Sheedy's role at the University of Notre Dame, which is unrelated to the question.\n\n**Reasoning:**\n1. Context Grounding: The answer provided is well-supported by the document but not relevant to the question asked. The document mentions Charles Sheedy in the context of coeducation at Notre Dame, not about graduates' careers.\n2. Relevance: The answer does not answer the specific question about the careers of graduates from the School of Communications, Media, Arts and Design.\n3. Conciseness: While the answer is concise, it fails to include any relevant informationabout the requested topic.\n\nFinal Result: ****"
  • 'Reasoning:\n1. Context Grounding: The answer given does not use any information that relates to horseback riding or arena etiquette. It only discusses measures for enhancing the viewing and playing experience in a darts tournament during the pandemic.\n2. Relevance: The answer does not address the question about following arena etiquette while horseback riding. Instead, it talks about unrelated aspects of a darts tournament.\n3. Conciseness: Despite being concise, the information provided is completely unrelated to the question asked.\n\nFinal result:'
1
  • 'Reasoning: \n\n1. Context Grounding: The answer is generally in line with the context of the document, which discusses signs of infidelity. Both the document and answer cover behavioral changes, physical changes, communication avoidance, and evidence gathering like checking receipts or tickets.\n \n2. Relevance: The response addresses the specific question by listing multiple signs of infidelity, including behaviors and potential evidence to watch for.\n\n3. Conciseness: The answer is somewhat clear but could be more concise. It gets a bit repetitive, especially regarding contacting the partner and checking for physical evidence.\n\nFinal result:'
  • "Reasoning:\n\n1. Context Grounding: The given answer does not make correct use of the provided document's context and includes random numerical sequences, which are irrelevant and incorrect.\n2. Relevance: While the answer somewhat attempts to address the question by outlining steps related to displaying blog posts, the inclusion of random numbers significantly detracts from focus and accuracy.\n3. Conciseness: The presence of irrelevant numerical sequences and unnecessary repetition hinders conciseness.\n4. Correct and Detailed Instructions: The instructions provided are garbled with non-contextual content, making it difficult to follow and implement accurately.\n\nOverall, the answer fails primarily due to the disruption caused by the irrelevant numerical sequences and the lack of clear instructions rooted in the provided document.\n\nResult:"
  • 'Reasoning:\n\n1. Context Grounding: The answer is consistent with the document's content. It discusses selling gold coins both online and in person, which aligns with the document's detailed sections on online sales, local dealers, and auction sites.\n\n2. Relevance: The provided answer specifically answers the question of "How to Sell Gold Coins" by providing relevant methods for selling gold coins online, in person to local dealers, and through auction sites. This matches the guidance provided in the document.\n\n3. Conciseness: The answer is concise and to the point. It avoids unnecessary details and clearly outlines the main methods of selling gold coins, which were detailed extensively in the document.\n\nFinal Result:'

Evaluation

Metrics

Label Accuracy
all 0.6875
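
For reference, accuracy is the fraction of predictions that match the reference labels. The sketch below uses hypothetical labels: the size of the evaluation set is not stated in this card, and 16 examples with 11 correct is just one split that happens to give the same value.

```python
def accuracy(predictions, references):
    """Return the fraction of predictions equal to their reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Hypothetical labels for illustration: 11 of 16 predictions are correct.
references = [1] * 8 + [0] * 8
predictions = [1] * 8 + [0] * 3 + [1] * 5

score = accuracy(predictions, references)
```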

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_wix_qa_gpt-4o_cot-instructions_remove_final_evaluation_e2_larger_train_17")
# Run inference
# A triple-quoted string is needed because the input spans several lines
preds = model("""Reasoning:
1. **Context Grounding**: The answer is well-supported by the provided document. It accurately follows the provided instructions.
2. **Relevance**: The answer directly addresses the question about changing the reservation reference from the service page to the booking calendar.
3. **Conciseness**: The answer is clear and to the point, giving step-by-step instructions without unnecessary details.
4. **Correctness and Detail**: The answer provides the correct steps and detailed instructions on how to achieve the task asked in the question.

Final result:""")
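
To score several texts at once and also get class probabilities, something like the following sketch may help. It assumes the model object exposes `predict` and `predict_proba`, as `SetFitModel` does. The helper names `load_model` and `classify_reasonings` are hypothetical, and `setfit` is imported lazily so the helper can be exercised without downloading the model.

```python
def load_model(model_id):
    """Load a SetFit model from the Hub (requires `pip install setfit`)."""
    from setfit import SetFitModel  # lazy import: only needed when loading

    return SetFitModel.from_pretrained(model_id)


def classify_reasonings(model, texts):
    """Return one (label, probabilities) tuple per input text.

    `model` is anything with SetFit-style `predict` and `predict_proba`
    methods that accept a list of strings.
    """
    labels = model.predict(texts)
    probas = model.predict_proba(texts)
    return [
        (int(label), [float(p) for p in row])
        for label, row in zip(labels, probas)
    ]


# Usage (downloads the model, so it is left commented out):
# model = load_model("Netta1994/setfit_baai_wix_qa_gpt-4o_cot-instructions_remove_final_evaluation_e2_larger_train_17")
# results = classify_reasonings(model, ["Reasoning: ...", "Reasoning: ..."])
```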

Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     33    94.8197   198

Label   Training Sample Count
0       117
1       127

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (2, 2)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
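
The values above map onto SetFit's `TrainingArguments`. The sketch below collects them in a plain dictionary and builds the arguments object on demand. The names `HYPERPARAMETERS` and `build_training_arguments` are hypothetical. `loss` and `distance_metric` are deliberately left out on the assumption that the library defaults already correspond to CosineSimilarityLoss and cosine_distance; pass them explicitly if that assumption does not hold for your version.

```python
# Hyperparameters copied from the list above.
# `loss` and `distance_metric` are omitted (assumed to match library defaults).
HYPERPARAMETERS = {
    "batch_size": (16, 16),               # (embedding phase, classifier phase)
    "num_epochs": (2, 2),
    "max_steps": -1,
    "sampling_strategy": "oversampling",
    "num_iterations": 20,
    "body_learning_rate": (2e-05, 2e-05),
    "head_learning_rate": 2e-05,
    "margin": 0.25,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "l2_weight": 0.01,
    "seed": 42,
    "eval_max_steps": -1,
    "load_best_model_at_end": False,
}


def build_training_arguments():
    """Create a setfit.TrainingArguments from the values above."""
    from setfit import TrainingArguments  # lazy import: requires setfit

    return TrainingArguments(**HYPERPARAMETERS)
```

The resulting object would typically be passed as `args` to `setfit.Trainer` together with the model and a training dataset.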

Training Results

Epoch Step Training Loss Validation Loss
0.0016 1 0.2302 -
0.0820 50 0.2583 -
0.1639 100 0.255 -
0.2459 150 0.2174 -
0.3279 200 0.1165 -
0.4098 250 0.0778 -
0.4918 300 0.0214 -
0.5738 350 0.0045 -
0.6557 400 0.0028 -
0.7377 450 0.0022 -
0.8197 500 0.0019 -
0.9016 550 0.0018 -
0.9836 600 0.0016 -
1.0656 650 0.0015 -
1.1475 700 0.0015 -
1.2295 750 0.0015 -
1.3115 800 0.0014 -
1.3934 850 0.0014 -
1.4754 900 0.0016 -
1.5574 950 0.0013 -
1.6393 1000 0.0013 -
1.7213 1050 0.0013 -
1.8033 1100 0.0012 -
1.8852 1150 0.0012 -
1.9672 1200 0.0013 -
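
The Epoch column can be cross-checked against the training set size and hyperparameters. The sketch below assumes that the number of contrastive pairs is 2 × num_iterations × number of training samples (one positive and one negative pair per sample per iteration). That formula is an assumption, not something stated in this card, but it reproduces the logged epoch fractions. The function name `steps_per_epoch` is hypothetical.

```python
import math


def steps_per_epoch(num_samples, num_iterations, batch_size):
    """Estimate embedding-phase steps per epoch.

    Assumes 2 * num_iterations pairs are generated per training sample.
    """
    num_pairs = 2 * num_iterations * num_samples
    return math.ceil(num_pairs / batch_size)


num_samples = 117 + 127  # label 0 + label 1 training samples
per_epoch = steps_per_epoch(num_samples, num_iterations=20, batch_size=16)
total_steps = per_epoch * 2  # num_epochs = 2

# Epoch fraction reached at a few of the logged steps.
epoch_at_step = {step: round(step / per_epoch, 4) for step in (1, 50, 600, 1200)}
```

Under this assumption one epoch is 610 steps, so step 600 falls at epoch 0.9836 and step 1200 at epoch 1.9672, matching the table.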

Framework Versions

  • Python: 3.10.14
  • SetFit: 1.1.0
  • Sentence Transformers: 3.1.1
  • Transformers: 4.44.0
  • PyTorch: 2.4.0+cu121
  • Datasets: 3.0.0
  • Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}