SetFit with intfloat/multilingual-e5-large

This is a SetFit model for text classification. It uses intfloat/multilingual-e5-large as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
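The first step relies on turning a handful of labeled sentences into sentence pairs. The sketch below is a minimal illustration of that idea, not SetFit's actual sampler: the helper name make_contrastive_pairs is hypothetical, and it enumerates every pair, whereas SetFit samples a limited number of pairs per sentence. The 1.0 / 0.0 targets match the kind of similarity target that a cosine-similarity loss regresses towards.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label (positive pair)
    and 0.0 when their labels differ (negative pair).
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Three sentences taken from the label examples in this card.
texts = [
    "How can I gain share in NCBS?",                      # label 11
    "Which levers to prioritize to gain share in CSDS?",  # label 11
    "What is the industry mix of CSDS",                   # label 4
]
labels = [11, 11, 4]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(target, "|", text_a, "|", text_b)
```

Once the embedding model has been fine-tuned on such pairs, the second step embeds each training sentence once and fits the classification head on those embeddings.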

Model Details

Model Description

Model Sources

Model Labels

Label Examples
0
  • 'what are the top brands contributing to share gain for Jumex in Cuernavaca in 2022'
  • 'Apart from Jugos + Néctares, Which are the top contributing categoriesXconsumo to the share loss for Jumex in Orizaba in 2021?'
  • 'what are the top brands contributing to share gain/loss for KOF in Cuernavaca in2022'
2
  • "What is the trend of Danone's market share in Colas SS in Cuernavaca from 2019 to YTD 2023?"
  • 'Are there any notable shifts in market share for KOF from 2021 to 2022 in TT OP'
  • 'In which categories KOF has gained most share in TT OP Cuernavaca 2021-2022'
3
  • 'What is the avg pack size for an offering within the 12.1-15 price bracket for Agua in TT HM, for top KOF brand vs Top competitor brand?'
  • 'How should KOF gain share in <10 price bracket for NCB in TT HM'
  • 'What is the price range for CSD in TT HM?'
5
  • 'What are the untapped opportunities in Graffon?'
  • 'Help me with new categories to expand in for kof'
  • 'I am a category manager for agua at kof. Tell me what areas to prioritize for category development'
8
  • 'Which month and at what price was my share highest'
  • 'What is the sku range and velocity of KOF in colas'
  • 'distribution wise, which non csd skus are doing the best?'
11
  • 'Which levers to prioritize to gain share in Orizaba Colas MS_PET_RET?'
  • 'Which levers to prioritize to gain share in CSDS?'
  • 'How can I gain share in NCBS?'
9
  • 'How much headroom do I have in AGUA'
  • 'What measures can be taken to maximize headroom in the AGUA market?'
  • 'Which industries to prioritize to gain share in CSDS in TT HM?'
10
  • 'Which pack segment shows opportunities to drive my market share in CSDs Colas MS?'
  • 'What are my priority pack segments to gain share in AGUA Colas SS?'
  • 'What are my priority pack segments to gain share in NCB Colas SS?'
1
  • 'Which levers have led the share loss of KOF in Colas in Q4'
  • 'Why is Resto losing share in Cuernavaca Colas SS RET Original?'
  • 'What are the main factors contributing to the share gain of Jumex in Still Drinks MS in Orizaba for FY 2022?'
7
  • 'Is there any PPL correction scope for Valle Frut within TT OP?'
  • 'Is there a need for PPL correction in the energy drink offerings of Red Bull within the Energy Drinks category?'
  • 'Is CC a premium brand? How premium are its offerings as compared to other brands in Colas?'
4
  • 'What is the industry mix of CSDS'
  • 'How has the csd industry evolved in the last two years?'
  • 'What is the change in industry mix for coca-cola in TT HM Orizaba in 2021 to 2022'
6
  • "I'm interested in launching a new orange flavored offering in new york city in the (TT OP) category. What pack sizes would be most suitable for this market?"
  • 'I want to launch a new pack type in csd for kof. Tell me what'
  • 'Within Colas MS, which pack segments are dominated by Red cola in Cuernavaca? Do we have any offerings to compete with the same?'

Evaluation

Metrics

Label Accuracy
all 0.25
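
For reference, accuracy here is the fraction of evaluation sentences whose predicted label equals the expected label. The sketch below shows that calculation on made-up label ids; the lists are illustrative only and are not the evaluation set behind the 0.25 figure.

```python
def accuracy(predicted, expected):
    """Fraction of positions where the predicted label equals the expected one."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Illustrative label ids only: one of four predictions is correct.
example_accuracy = accuracy([0, 1, 2, 3], [0, 4, 5, 6])
print(example_accuracy)
```

As a point of comparison, uniform random guessing over the 12 labels listed above would score about 1/12, roughly 0.083.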

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("vgarg/fw_identification_model_e5_large_v5_14_12_23")
# Run inference
preds = model("Why is KOF losing share in Cuernavaca Colas MS RET Original?")
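
When several questions need to be classified at once, it can be convenient to bucket them by predicted label. The following is a minimal sketch under stated assumptions: group_by_predicted_label and fake_predict are hypothetical helpers, and the stand-in predictor exists only so that the example runs without downloading the model. With the real model, model.predict could be passed as predict_fn, assuming it returns one integer label id per input text.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Sequence


def group_by_predicted_label(
    predict_fn: Callable[[List[str]], Sequence],
    queries: Iterable[str],
) -> Dict[int, List[str]]:
    """Run one batch prediction and bucket each query under its predicted label id."""
    batch = list(queries)
    if not batch:
        return {}
    predictions = predict_fn(batch)
    grouped: Dict[int, List[str]] = defaultdict(list)
    for query, label in zip(batch, predictions):
        grouped[int(label)].append(query)
    return dict(grouped)


def fake_predict(texts):
    """Stand-in predictor (an assumption for this sketch, not the real model)."""
    return [1 if text.lower().startswith("why") else 0 for text in texts]


queries = [
    "Why is KOF losing share in Cuernavaca Colas MS RET Original?",
    "what are the top brands contributing to share gain for Jumex in Cuernavaca in 2022",
]
grouped = group_by_predicted_label(fake_predict, queries)
print(grouped)
```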

Training Details

Training Set Metrics

Training set  Min  Median   Max
Word count    5    13.8362  33

Label  Training Sample Count
0      10
1      10
2      10
3      10
4      10
5      10
6      10
7      10
8      10
9      10
10     10
11     6

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (3, 3)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False

Training Results

Epoch Step Training Loss Validation Loss
0.0034 1 0.3504 -
0.1724 50 0.1647 -
0.3448 100 0.0301 -
0.5172 150 0.0113 -
0.6897 200 0.0026 -
0.8621 250 0.0012 -
1.0345 300 0.0006 -
1.2069 350 0.001 -
1.3793 400 0.0007 -
1.5517 450 0.0004 -
1.7241 500 0.0006 -
1.8966 550 0.0005 -
2.0690 600 0.0005 -
2.2414 650 0.0004 -
2.4138 700 0.0003 -
2.5862 750 0.0005 -
2.7586 800 0.0004 -
2.9310 850 0.0003 -
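
The Epoch and Step columns can be cross-checked against the training sample counts and hyperparameters listed earlier. The arithmetic below is a sketch that assumes the sampler draws num_iterations positive and num_iterations negative pairs per training sentence; under that assumption the numbers reproduce the table (for example, step 50 falls at epoch 0.1724).

```python
import math

# Values copied from the Training Set Metrics and Training Hyperparameters sections.
label_counts = {0: 10, 1: 10, 2: 10, 3: 10, 4: 10, 5: 10,
                6: 10, 7: 10, 8: 10, 9: 10, 10: 10, 11: 6}
num_iterations = 20
batch_size = 16
num_epochs = 3

num_samples = sum(label_counts.values())
# Assumption: one positive and one negative pair per sentence per iteration.
pairs_per_epoch = 2 * num_iterations * num_samples
steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)
total_steps = steps_per_epoch * num_epochs


def epoch_at(step):
    """Fractional epoch reached after a given number of optimizer steps."""
    return round(step / steps_per_epoch, 4)


print(num_samples, pairs_per_epoch, steps_per_epoch, total_steps)
print(epoch_at(1), epoch_at(50), epoch_at(300))
```

Under these assumptions one epoch is 290 steps and the full run is 870 steps, which is consistent with the last logged row at step 850.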

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.1
  • Sentence Transformers: 2.2.2
  • Transformers: 4.35.2
  • PyTorch: 2.1.0+cu118
  • Datasets: 2.15.0
  • Tokenizers: 0.15.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
Model size: 560M params (Safetensors, tensor type F32)