SetFit with klue/roberta-base

This is a SetFit model for text classification. It uses klue/roberta-base as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
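
Step 1 depends on turning a handful of labeled texts into many sentence pairs. The sketch below illustrates that idea under the simplest possible scheme: every pair of texts is emitted once, with target 1.0 when the labels match and 0.0 otherwise. The function name make_contrastive_pairs and the sample texts are hypothetical, and SetFit's own pair sampler (controlled by sampling_strategy and num_iterations) differs in its details.

```python
import random
from itertools import combinations
from typing import List, Sequence, Tuple


def make_contrastive_pairs(
    texts: Sequence[str], labels: Sequence[int], seed: int = 42
) -> List[Tuple[str, str, float]]:
    """Build (text_a, text_b, target) triples from labeled texts.

    Hypothetical helper: target is 1.0 for same-label pairs and 0.0 for
    different-label pairs, the kind of target a cosine-similarity loss uses.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    # Shuffle deterministically so batches mix positive and negative pairs.
    random.Random(seed).shuffle(pairs)
    return pairs


# Illustrative inputs only; these are not the model's training data.
sample_texts = ["hair pack A", "hair pack B", "shampoo C", "body wash D"]
sample_labels = [1, 1, 0, 0]
sample_pairs = make_contrastive_pairs(sample_texts, sample_labels)
```

With four texts split evenly over two labels, this yields six pairs: two positive and four negative. The fine-tuned body then pulls positive pairs together and pushes negative pairs apart in embedding space before the classification head is trained in step 2.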

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: klue/roberta-base
  • Classification head: a LogisticRegression instance
  • Maximum Sequence Length: 512 tokens
  • Number of Classes: 2

Model Labels

Label 1 examples:
  • '로레알파리 토탈리페어5 트리트먼트 헤어팩 170ml × 1개 LotteOn > 뷰티 > 헤어/바디 > 헤어케어 > 트리트먼트/헤어팩 LotteOn > 뷰티 > 헤어/바디 > 헤어케어 > 트리트먼트/헤어팩'
  • '아모스 녹차실감 인텐시브 팩 250ml 녹차실감 인텐시브팩250g 홈>전체상품;(#M)홈>녹차실감 Naverstore > 화장품/미용 > 헤어케어 > 헤어팩'
  • '프리미엄 헤어클리닉 헤어팩 258ml 베이비파우더 LotteOn > 뷰티 > 헤어케어 > 헤어팩 LotteOn > 뷰티 > 헤어/바디 > 헤어케어 > 트리트먼트/헤어팩'
Label 0 examples:
  • '퓨어시카 트리트먼트 베이비파우더향 1000ml 1개 MinSellAmount 스마일배송 홈>뷰티>바디케어>바디워시;스마일배송 홈>;(#M)스마일배송 홈>뷰티>헤어케어/스타일링>트리트먼트/팩 Gmarket > 뷰티 > 바디/헤어 > 바디케어 > 바디클렌저'
  • '1+1 살림백서 탈모 샴푸 엑티브B7 맥주효모 앤 비오틴 1000ml 남자 여자 바이오틴 4)오푼티아 트리트먼트 유칼립투스 1L (#M)화장품/미용>헤어케어>탈모케어 AD > Naverstore > 화장품/미용 > 가을뷰티 > 각질관리템 > 탈모샴푸'
  • '1+1 살림백서 오푼티아 퍼퓸 샴푸 500ml 약산성 비듬 지성 두피 볼륨 유칼립투스향 13.유칼립투스 트리트먼트 1+1 500ml (#M)화장품/미용>헤어케어>샴푸 AD > Naverstore > 화장품/미용 > 머스크 > 샴푸'

Evaluation

Metrics

Label Accuracy
all 0.8206

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("mini1013/master_cate_top_bt13_9")
# Run inference
preds = model("무코타염색제 7박스+3박스+정품 트리트먼트 50g 1.카키브라운 (#M)바디/헤어>바디케어>바디케어세트 Gmarket > 뷰티 > 바디/헤어 > 바디케어 > 바디케어세트")
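
For more than one product title, it may be convenient to classify a list in a single call. The sketch below wraps the model's predict method; classify_batch and load_hub_model are hypothetical helper names, and the conversion to int assumes the model returns the numeric labels 0 and 1 shown under Model Labels.

```python
from typing import Iterable, List, Tuple


def classify_batch(model, texts: Iterable[str]) -> List[Tuple[str, int]]:
    """Pair each input text with its predicted label.

    `model` is any object with a SetFit-style predict(list_of_texts) method.
    """
    texts = list(texts)
    preds = model.predict(texts)
    # int() handles plain ints, floats, NumPy scalars and 0-d tensors alike.
    return [(text, int(pred)) for text, pred in zip(texts, preds)]


def load_hub_model():
    """Download the model from the Hub (requires `pip install setfit`)."""
    # Imported lazily so classify_batch can be used or tested without SetFit.
    from setfit import SetFitModel

    return SetFitModel.from_pretrained("mini1013/master_cate_top_bt13_9")
```

A call such as classify_batch(load_hub_model(), titles) would then return one (title, label) tuple per input, where titles is a list of strings in the same format as the examples above.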

Training Details

Training Set Metrics

Training set   Min   Median   Max
Word count     14    23.76    98

Label   Training Sample Count
0       50
1       50

Training Hyperparameters

  • batch_size: (64, 64)
  • num_epochs: (30, 30)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 100
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
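
As a rough guide for reproducing a similar run, the values above could be passed to SetFit's TrainingArguments as sketched below. This is a reconstruction from the listed hyperparameters, not the author's training script. distance_metric is omitted on the assumption that cosine_distance is the library default, and the dataset and Trainer setup are left out because the card does not describe them.

```python
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import TrainingArguments

# Tuples are (embedding fine-tuning phase, classifier training phase).
args = TrainingArguments(
    batch_size=(64, 64),
    num_epochs=(30, 30),
    max_steps=-1,
    sampling_strategy="oversampling",
    num_iterations=100,
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    loss=CosineSimilarityLoss,
    margin=0.25,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=False,
)
```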

Training Results

Epoch Step Training Loss Validation Loss
0.0064 1 0.4326 -
0.3185 50 0.3579 -
0.6369 100 0.2616 -
0.9554 150 0.0326 -
1.2739 200 0.0 -
1.5924 250 0.0 -
1.9108 300 0.0 -
2.2293 350 0.0 -
2.5478 400 0.0 -
2.8662 450 0.0 -
3.1847 500 0.0 -
3.5032 550 0.0 -
3.8217 600 0.0 -
4.1401 650 0.0 -
4.4586 700 0.0 -
4.7771 750 0.0 -
5.0955 800 0.0 -
5.4140 850 0.0 -
5.7325 900 0.0 -
6.0510 950 0.0 -
6.3694 1000 0.0 -
6.6879 1050 0.0 -
7.0064 1100 0.0 -
7.3248 1150 0.0 -
7.6433 1200 0.0 -
7.9618 1250 0.0 -
8.2803 1300 0.0 -
8.5987 1350 0.0 -
8.9172 1400 0.0 -
9.2357 1450 0.0 -
9.5541 1500 0.0 -
9.8726 1550 0.0 -
10.1911 1600 0.0 -
10.5096 1650 0.0 -
10.8280 1700 0.0 -
11.1465 1750 0.0 -
11.4650 1800 0.0 -
11.7834 1850 0.0 -
12.1019 1900 0.0 -
12.4204 1950 0.0 -
12.7389 2000 0.0 -
13.0573 2050 0.0 -
13.3758 2100 0.0 -
13.6943 2150 0.0 -
14.0127 2200 0.0 -
14.3312 2250 0.0 -
14.6497 2300 0.0 -
14.9682 2350 0.0 -
15.2866 2400 0.0 -
15.6051 2450 0.0 -
15.9236 2500 0.0 -
16.2420 2550 0.0 -
16.5605 2600 0.0 -
16.8790 2650 0.0 -
17.1975 2700 0.0 -
17.5159 2750 0.0 -
17.8344 2800 0.0 -
18.1529 2850 0.0 -
18.4713 2900 0.0 -
18.7898 2950 0.0 -
19.1083 3000 0.0 -
19.4268 3050 0.0 -
19.7452 3100 0.0 -
20.0637 3150 0.0 -
20.3822 3200 0.0 -
20.7006 3250 0.0 -
21.0191 3300 0.0 -
21.3376 3350 0.0 -
21.6561 3400 0.0 -
21.9745 3450 0.0 -
22.2930 3500 0.0 -
22.6115 3550 0.0 -
22.9299 3600 0.0 -
23.2484 3650 0.0 -
23.5669 3700 0.0 -
23.8854 3750 0.0 -
24.2038 3800 0.0 -
24.5223 3850 0.0 -
24.8408 3900 0.0 -
25.1592 3950 0.0 -
25.4777 4000 0.0 -
25.7962 4050 0.0 -
26.1146 4100 0.0 -
26.4331 4150 0.0 -
26.7516 4200 0.0 -
27.0701 4250 0.0 -
27.3885 4300 0.0 -
27.7070 4350 0.0 -
28.0255 4400 0.0 -
28.3439 4450 0.0 -
28.6624 4500 0.0 -
28.9809 4550 0.0 -
29.2994 4600 0.0 -
29.6178 4650 0.0 -
29.9363 4700 0.0 -
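
The Epoch and Step columns are consistent with about 157 optimizer steps per epoch. One way to arrive at that figure is sketched below. It assumes that each epoch covers num_iterations × training samples = 100 × 100 = 10,000 sentence pairs at batch size 64; this is an inference from the logged numbers, not something the card states.

```python
import math

# Assumption: pairs per epoch = num_iterations * number of training samples.
pairs_per_epoch = 100 * 100
batch_size = 64
num_epochs = 30

steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)
total_steps = steps_per_epoch * num_epochs


def epoch_at(step: int) -> float:
    """Fractional epoch reached after `step` optimizer steps, to 4 decimals."""
    return round(step / steps_per_epoch, 4)


for step in (1, 50, 4700):
    print(step, epoch_at(step))
```

Under this assumption the computed epochs for steps 1, 50 and 4700 match the logged values 0.0064, 0.3185 and 29.9363, and the last logged step sits just short of the 4710 steps that 30 full epochs would take.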

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.1.0
  • Sentence Transformers: 3.3.1
  • Transformers: 4.44.2
  • PyTorch: 2.2.0a0+81ea7a4
  • Datasets: 3.2.0
  • Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}