klue-sbert-v1

This is a sentence-transformers model: it maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search.

This model was obtained by fine-tuning the klue/bert-base model as a Sentence-BERT (SBERT) model.
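A minimal usage sketch, assuming the model is published on the Hugging Face Hub as `bongsoo/klue-sbert-v1.0` (the name used in the results table) and that the `sentence-transformers` package is installed. `rank_by_similarity` and `demo` are hypothetical names introduced here for illustration; the ranking helper accepts any encoder callable, so its logic can be checked without downloading the model.

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(
    encode: Callable[[List[str]], Sequence[Sequence[float]]],
    query: str,
    corpus: List[str],
) -> List[Tuple[str, float]]:
    """Embed the query and corpus together, then sort the corpus by similarity."""
    vectors = encode([query] + list(corpus))
    query_vec, doc_vecs = vectors[0], vectors[1:]
    scored = [(text, cosine(query_vec, vec)) for text, vec in zip(corpus, doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def demo() -> None:
    """Downloads the model on first call; needs `pip install sentence-transformers`."""
    from sentence_transformers import SentenceTransformer  # imported lazily

    model = SentenceTransformer("bongsoo/klue-sbert-v1.0")  # assumed Hub identifier
    corpus = [
        "오늘 날씨가 좋다",      # "The weather is nice today"
        "주식 시장이 하락했다",  # "The stock market fell"
    ]
    # Query: "The weather is clear"
    for text, score in rank_by_similarity(model.encode, "날씨가 맑다", corpus):
        print(f"{score:.4f}  {text}")
```

Calling `demo()` prints the corpus sentences ordered by cosine similarity to the query.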

Evaluation Results

  • Performance was measured on the following Korean (kor) and English (en) evaluation corpora:
    Korean: korsts (1,379 sentence pairs) and klue-sts (519 sentence pairs)
    English: stsb_multi_mt (1,376 sentence pairs) and glue:stsb (1,500 sentence pairs)
  • The metric is cosin.spearman, the Spearman correlation between cosine similarity and the gold scores.
  • See here for the evaluation code.
  • Results:
    Model                                  korsts  klue-sts  glue(stsb)  stsb_multi_mt(en)
    distiluse-base-multilingual-cased-v2   0.7475  0.7855    0.8193      0.8075
    paraphrase-multilingual-mpnet-base-v2  0.8201  0.7993    0.8907      0.8682
    bongsoo/albert-small-kor-sbert-v1      0.8305  0.8588    0.8419      0.7965
    bongsoo/kpf-sbert-v1.0                 0.8590  0.8924    0.8840      0.8531
    bongsoo/klue-sbert-v1.0                0.8529  0.8952    0.8813      0.8469
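The metric above can be sketched in plain Python: compute the cosine similarity of each sentence pair's embeddings, then take the Spearman rank correlation against the gold similarity scores. This is an illustrative re-implementation, not the author's evaluation script; the function names are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; NaN when either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0.0 or std_y == 0.0:
        return float("nan")
    return cov / (std_x * std_y)


def cosine_spearman(emb_a, emb_b, gold: Sequence[float]) -> float:
    """Spearman correlation = Pearson correlation of the rank-transformed values."""
    sims = [cosine(a, b) for a, b in zip(emb_a, emb_b)]
    return pearson(average_ranks(sims), average_ranks(gold))
```

Because only ranks matter, the score is unchanged by any monotonic rescaling of the gold labels (for example 0 to 5 versus 0 to 1).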

For an automated evaluation of this model, see the Sentence Embeddings Benchmark: https://seb.sbert.net

Training

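  • The klue/bert-base model was trained in the sequence sts(10)-distil(10)-nli(3)-sts(10), where the number in parentheses is the number of epochs for that stage.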

The model was trained with the parameters:

Common

  • do_lower_case=1, correct_bias=0, pooling_mode=mean
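Mean pooling, the mode named in the common settings above, turns per-token embeddings into one sentence vector by averaging them. The sketch below shows the usual variant that skips padding positions via the attention mask; it uses plain lists instead of tensors, and `mean_pool` is a hypothetical name, not a function from the training code.

```python
from typing import List, Sequence


def mean_pool(
    token_embeddings: Sequence[Sequence[float]],
    attention_mask: Sequence[int],
) -> List[float]:
    """Average the embeddings of real tokens (mask == 1), skipping padding."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    count = max(count, 1)  # avoid division by zero for an all-padding input
    return [value / count for value in total]
```

For this model, each token vector and the pooled result would have 768 dimensions.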

1. STS

  • Corpus: korsts(5,749) + kluestsV1.1(11,668) + stsb_multi_mt(5,749) + mteb/sickr-sts(9,927) + glue stsb(5,749) (total: 38,842)
  • Param: lr: 1e-4, eps: 1e-6, warm_step=10%, epochs: 10, train_batch: 128, eval_batch: 64, max_token_len: 72
  • See here for the training code.
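As a worked example of what `warm_step=10%` implies for the STS stage, the sketch below derives step counts from the listed figures. It assumes that all 38,842 pairs are used for training, that the final partial batch is kept, that there is one optimizer step per batch, and that the warm-up percentage applies to the total number of steps; none of these is stated in the source.

```python
import math
from typing import Tuple


def warmup_schedule(
    num_examples: int, batch_size: int, epochs: int, warmup_percent: int
) -> Tuple[int, int, int]:
    """Return (steps per epoch, total steps, warm-up steps)."""
    steps_per_epoch = math.ceil(num_examples / batch_size)  # keep the partial batch
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_percent / 100)
    return steps_per_epoch, total_steps, warmup_steps


steps_per_epoch, total_steps, warmup_steps = warmup_schedule(38_842, 128, 10, 10)
print(steps_per_epoch, total_steps, warmup_steps)  # 304 3040 304
```

Under these assumptions the learning rate would ramp up to 1e-4 over roughly the first epoch.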

2. Distillation

  • Teacher model: paraphrase-multilingual-mpnet-base-v2 (max_token_len: 128)
  • Corpus: news_talk_en_ko_train.tsv (English-Korean parallel corpus of dialogue and news: 1.38M)
  • Param: lr: 5e-5, eps: 1e-8, epochs: 10, train_batch: 128, eval/test_batch: 64, max_token_len: 128 (set to match the teacher model's 128)
  • See here for the training code.
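The source names the teacher and the parallel corpus but not the loss. A common recipe for this setup is multilingual knowledge distillation: the student is trained so that its embeddings of both the English sentence and its Korean translation approximate the teacher's embedding of the English sentence, using mean squared error. The sketch below illustrates that objective on plain lists. It is an assumption about the objective, not the author's training code, and `distillation_loss` is a hypothetical name.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def distillation_loss(
    teacher_src: Sequence[float],
    student_src: Sequence[float],
    student_tgt: Sequence[float],
) -> float:
    """Pull both student embeddings toward the teacher's source-language embedding.

    teacher_src: teacher embedding of the English sentence
    student_src: student embedding of the same English sentence
    student_tgt: student embedding of the Korean translation
    """
    return mse(student_src, teacher_src) + mse(student_tgt, teacher_src)
```

The loss reaches zero only when the student places the English sentence and its Korean translation at the teacher's vector, which is what aligns the two languages in one embedding space.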

3. NLI

  • Corpus: training (967,852): kornli(550,152), kluenli(24,998), glue-mnli(392,702) / evaluation (3,519): korsts(1,500), kluests(519), gluests(1,500)
  • Param: lr: 3e-5, eps: 1e-8, warm_step=10%, epochs: 3, train/eval_batch: 64, max_token_len: 128
  • See here for the training code.

Citing & Authors

bongsoo
