SetFit with mini1013/master_domain

This is a SetFit model for text classification. It uses mini1013/master_domain as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model was trained with an efficient few-shot learning technique that involves two steps:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
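
The first step relies on sentence pairs built from the few labeled examples. Below is a minimal sketch of that idea, assuming pairs with the same label get a target similarity of 1.0 and pairs with different labels get 0.0, as is usual for a cosine-similarity loss. The function name `make_contrastive_pairs` and the toy data are hypothetical, and this is not SetFit's actual sampler.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples for contrastive fine-tuning.

    Assumption: target is 1.0 when both texts share a label, else 0.0.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy data (hypothetical, not from the training set)
texts = ["alcohol swab 100 pack", "disinfecting swab box", "foam dressing 10x10cm"]
labels = [0, 0, 1]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} / {text_b!r}")
```

The embedding model is then fine-tuned so that the cosine similarity of each pair moves toward its target, and the classification head is fit on the resulting embeddings.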

Model Details

Model Description

Model Sources

Model Labels

Label Examples
2.0
  • '세운 네라톤카테타 #1116 라텍스 멸균 100개 팩 6번 12fr 4.0mm0 트리비즈니스'
  • '세운 바로박(Barovac) PS200C 단위:1개 (주)엠디오씨'
  • '의무실 성인용 고무밴드 네블라이저 마스크 호흡기 흡입마스크 기관지 인사이트쇼핑몰'
1.0
  • 'JW중외제약 하이맘밴드 프리미엄 2매 이지덤(대웅제약)_이지덤씬 2매(+가위) 테크노 제일약국'
  • '메디폼 친수성 폼드레싱 10x10cm (5mm) (2mm) 10매입 1박스 5mm 주식회사 엠퍼러'
  • '메나리니 더마틱스 울트라 겔 15g 1개. 릴리뷰티'
0.0
  • '약국 에탄올스왑 일회용 알콜솜 에프에이 이올스왑 알콜스왑 소독솜 1박스 다팜메디'
  • '[유한양행] 해피홈 소독용 알콜스왑알콜솜 100매입 2개 [0001]기본상품 CJONSTYLE'
  • '일회용 알콜솜 알콜스왑 소독 약국 바른케어 개별포장100매 바른케어 플러스 알콜솜 100매 로그엠(LOGM)'
4.0
  • '가주 비멸균 설압자 1통(100개) 혀누르개 목설압자 의료용 병원용 더블세이프 MinSellAmount 이원헬스케어'
  • '의료용 겸자 12.5cm /곡 모스키토 켈리 포셉 SJ헬스케어'
  • '개부밧드6절(뚜껑있는밧드)소독통/개무밧드/사각트레이/트레이밧드/거어즈캔 신동방메디칼'
3.0
  • '일회용 베드 위생시트 부직포시트 침대커버 1롤 50장 80x180cm 비방수(고급형) 80x180 50장/롤 심비오시스'
  • '부직포자루,육수보자기,다시백,거름망 45x50-300장 봉제 지우씨'
  • '병원침대/환자용침대 매트리스/고탄성 접이식 마사지 지압 의료용 매트 두께 9cm_밤색 평매트리스_900mm X 1900mm 메디칼베드마트'

Evaluation

Metrics

Label Metric
all 0.9571

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("mini1013/master_cate_lh19")
# Run inference
preds = model("[저소음 미세입자] 오므론 네블라이저 NE-C803  꿈꾸는약국")
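
For more than one input, the model can also be called on a list of strings. The sketch below wraps that in a small helper. `load_model` and `classify_batch` are hypothetical names, and the helper only assumes that the model is a callable returning one numeric prediction per input text (the labels of this model are 0.0 to 4.0).

```python
def load_model(model_id="mini1013/master_cate_lh19"):
    """Download the model from the Hub (requires `pip install setfit`)."""
    # Imported lazily so the helper below has no hard dependency on setfit
    from setfit import SetFitModel

    return SetFitModel.from_pretrained(model_id)


def classify_batch(model, texts):
    """Return (text, label) pairs for a list of input texts.

    `model` is any callable that maps a list of texts to one prediction each.
    """
    texts = list(texts)
    preds = model(texts)
    # Predictions may be tensor or array elements; float() normalizes each one
    return [(text, float(pred)) for text, pred in zip(texts, preds)]


# Usage (downloads the model, so it is left commented out here):
# model = load_model()
# for text, label in classify_batch(model, ["[저소음 미세입자] 오므론 네블라이저 NE-C803  꿈꾸는약국"]):
#     print(label, text)
```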

Training Details

Training Set Metrics

Training set Min Median Max
Word count 3 10.084 20

Label Training Sample Count
0.0 50
1.0 50
2.0 50
3.0 50
4.0 50
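
Statistics like these can be recomputed from a labeled dataset in a few lines. This is a minimal sketch, assuming word counts come from whitespace splitting. The sample texts and the helper name `summarize` are hypothetical.

```python
from collections import Counter
from statistics import median


def summarize(texts, labels):
    """Return word-count min/median/max and per-label sample counts."""
    # Assumption: a "word" is a whitespace-separated token
    word_counts = [len(text.split()) for text in texts]
    stats = {
        "min": min(word_counts),
        "median": median(word_counts),
        "max": max(word_counts),
    }
    return stats, Counter(labels)


# Hypothetical samples, not taken from the training set
texts = ["alcohol swab 100 pack", "foam dressing", "nebulizer mask for adults with band"]
labels = [0.0, 1.0, 2.0]

stats, label_counts = summarize(texts, labels)
print(stats)
print(dict(label_counts))
```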

Training Hyperparameters

  • batch_size: (512, 512)
  • num_epochs: (20, 20)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 40
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False

Training Results

Epoch Step Training Loss Validation Loss
0.025 1 0.4162 -
1.25 50 0.2435 -
2.5 100 0.0066 -
3.75 150 0.0054 -
5.0 200 0.0001 -
6.25 250 0.0 -
7.5 300 0.0 -
8.75 350 0.0 -
10.0 400 0.0 -
11.25 450 0.0 -
12.5 500 0.0 -
13.75 550 0.0 -
15.0 600 0.0 -
16.25 650 0.0 -
17.5 700 0.0 -
18.75 750 0.0 -
20.0 800 0.0 -
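
The step counts in this table are consistent with the hyperparameters and per-label sample counts listed earlier. Here is a short check, assuming each iteration yields one positive and one negative pair per training sample. That is an assumption about the pair sampler, not something stated in this card.

```python
import math

num_samples = 5 * 50   # 5 labels x 50 training samples each
num_iterations = 40    # from the hyperparameters
batch_size = 512
num_epochs = 20

# Assumption: one positive and one negative pair per sample per iteration
num_pairs = 2 * num_iterations * num_samples
steps_per_epoch = math.ceil(num_pairs / batch_size)
total_steps = steps_per_epoch * num_epochs

print(num_pairs, steps_per_epoch, total_steps)
print(1 / steps_per_epoch)  # epoch fraction covered by the first step
```

Under that assumption there are 20,000 pairs, 40 steps per epoch, and 800 steps in total, which matches the last row (epoch 20.0, step 800) and the first row (step 1 at epoch 0.025).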

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.1.0.dev0
  • Sentence Transformers: 3.1.1
  • Transformers: 4.46.1
  • PyTorch: 2.4.0+cu121
  • Datasets: 2.20.0
  • Tokenizers: 0.20.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model tree for mini1013/master_cate_lh19

Base model

klue/roberta-base