SentenceTransformer based on answerdotai/ModernBERT-base
This is a sentence-transformers model finetuned from answerdotai/ModernBERT-base on the korean_nli_dataset dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: answerdotai/ModernBERT-base
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
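Cosine similarity scores two embeddings by the angle between them, independent of vector length. A minimal NumPy sketch with toy 3-dimensional vectors (real embeddings from this model are 768-dimensional):

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors: u.v / (|u||v|)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

u = np.array([1.0, 0.0, 1.0])
v = np.array([1.0, 1.0, 0.0])
print(cosine_similarity(u, v))  # 0.5: dot product 1 over norms sqrt(2)*sqrt(2)
```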
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: ModernBertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': True, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Dense({'in_features': 768, 'out_features': 768, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
)
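The pooling configuration enables only pooling_mode_mean_sqrt_len_tokens, i.e. token embeddings are summed and divided by the square root of the sequence length before passing through the Tanh dense layer. A hedged PyTorch sketch of stages (1)-(2), using random toy tensors in place of real ModernBERT hidden states:

```python
import torch

def mean_sqrt_len_pooling(token_embeddings: torch.Tensor,
                          attention_mask: torch.Tensor) -> torch.Tensor:
    # Sum the embeddings of real (non-padding) tokens, then divide by
    # sqrt(token count) -- the pooling_mode_mean_sqrt_len_tokens behavior.
    mask = attention_mask.unsqueeze(-1).float()
    summed = (token_embeddings * mask).sum(dim=1)
    lengths = mask.sum(dim=1).clamp(min=1e-9)
    return summed / torch.sqrt(lengths)

# Toy stand-in for transformer output: batch of 2 sequences, 4 tokens, 768 dims.
hidden = torch.randn(2, 4, 768)
mask = torch.tensor([[1, 1, 1, 0], [1, 1, 1, 1]])  # first sequence has padding
pooled = mean_sqrt_len_pooling(hidden, mask)

# Stage (2): the 768 -> 768 Dense module with Tanh activation.
dense = torch.nn.Sequential(torch.nn.Linear(768, 768), torch.nn.Tanh())
embedding = dense(pooled)
print(embedding.shape)  # torch.Size([2, 768])
```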
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("x2bee/sts_nli_tune_test")
# Run inference
sentences = [
    '버스가 바쁜 길을 따라 운전한다.',    # "The bus drives along a busy road."
    '녹색 버스가 도로를 따라 내려간다.',  # "A green bus goes down the road."
    '그 여자는 데이트하러 가는 중이다.',  # "The woman is on her way to a date."
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
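Beyond pairwise scores, the same embeddings support semantic search: rank a corpus by cosine similarity to a query. A minimal NumPy sketch on precomputed toy 2-dimensional vectors (with real model output you would pass model.encode results; sentence_transformers.util.semantic_search offers an optimized version):

```python
import numpy as np

def top_matches(query_emb: np.ndarray, corpus_embs: np.ndarray, k: int = 2):
    # Normalize so dot products equal cosine similarities, then rank descending.
    q = query_emb / np.linalg.norm(query_emb)
    c = corpus_embs / np.linalg.norm(corpus_embs, axis=1, keepdims=True)
    scores = c @ q
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]

corpus = np.array([[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]])  # toy corpus embeddings
query = np.array([1.0, 0.1])                             # toy query embedding
print(top_matches(query, corpus))  # index 0 ranks first (closest direction)
```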
Evaluation
Metrics
Semantic Similarity
- Dataset: sts_dev
- Evaluated with EmbeddingSimilarityEvaluator
Metric | Value |
---|---|
pearson_cosine | 0.8273 |
spearman_cosine | 0.8298 |
pearson_euclidean | 0.8112 |
spearman_euclidean | 0.8214 |
pearson_manhattan | 0.8125 |
spearman_manhattan | 0.8226 |
pearson_dot | 0.7648 |
spearman_dot | 0.7648 |
pearson_max | 0.8273 |
spearman_max | 0.8298 |
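These scores are Pearson and Spearman correlations between the model's cosine similarities and the human-annotated gold scores (EmbeddingSimilarityEvaluator delegates to scipy.stats internally). A minimal NumPy sketch of the two headline metrics, on toy values:

```python
import numpy as np

def pearson(x, y) -> float:
    # Linear correlation: covariance over the product of standard deviations.
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / (np.linalg.norm(xc) * np.linalg.norm(yc)))

def spearman(x, y) -> float:
    # Rank correlation: Pearson computed on the ranks (no-ties case).
    rank = lambda a: np.argsort(np.argsort(np.asarray(a))).astype(float)
    return pearson(rank(x), rank(y))

model_sims = [0.9, 0.2, 0.6, 0.4]  # toy cosine similarities from a model
gold       = [1.0, 0.0, 0.8, 0.3]  # toy human similarity labels
print(spearman(model_sims, gold))  # 1.0 -- the two rankings agree exactly
```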
Training Details
Training Dataset
korean_nli_dataset
- Dataset: korean_nli_dataset at ef305ef
- Size: 392,702 training samples
- Columns: sentence1, sentence2, and score
- Approximate statistics based on the first 1000 samples:
| | sentence1 | sentence2 | score |
|---|---|---|---|
| type | string | string | float |
| min | 4 tokens | 4 tokens | 0.0 |
| mean | 35.7 tokens | 19.92 tokens | 0.48 |
| max | 194 tokens | 64 tokens | 1.0 |
- Samples:
| sentence1 | sentence2 | score |
|---|---|---|
| 개념적으로 크림 스키밍은 제품과 지리라는 두 가지 기본 차원을 가지고 있다. | 제품과 지리학은 크림 스키밍을 작동시키는 것이다. | 0.5 |
| 시즌 중에 알고 있는 거 알아? 네 레벨에서 다음 레벨로 잃어버리는 거야 브레이브스가 모팀을 떠올리기로 결정하면 브레이브스가 트리플 A에서 한 남자를 떠올리기로 결정하면 더블 A가 그를 대신하러 올라가고 A 한 명이 그를 대신하러 올라간다. | 사람들이 기억하면 다음 수준으로 물건을 잃는다. | 1.0 |
| 우리 번호 중 하나가 당신의 지시를 세밀하게 수행할 것이다. | 우리 팀의 일원이 당신의 명령을 엄청나게 정확하게 실행할 것이다. | 1.0 |
- Loss: CosineSimilarityLoss with these parameters: {"loss_fct": "torch.nn.modules.loss.MSELoss"}
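CosineSimilarityLoss with an MSELoss loss_fct regresses the cosine similarity of each sentence pair onto its gold score. A minimal PyTorch sketch of the objective (toy 2-dimensional embeddings; in training the embeddings come from the full model above):

```python
import torch
import torch.nn.functional as F

def cosine_similarity_loss(emb1: torch.Tensor, emb2: torch.Tensor,
                           labels: torch.Tensor) -> torch.Tensor:
    # Mean squared error between pairwise cosine similarity and the gold
    # score, i.e. loss_fct = torch.nn.MSELoss applied to cos(u, v) vs. label.
    cos = F.cosine_similarity(emb1, emb2, dim=1)
    return F.mse_loss(cos, labels)

u = torch.tensor([[1.0, 0.0], [0.0, 1.0]])   # toy embeddings, sentence1 side
v = torch.tensor([[1.0, 0.0], [1.0, 0.0]])   # toy embeddings, sentence2 side
labels = torch.tensor([1.0, 0.0])            # gold similarity scores
print(cosine_similarity_loss(u, v, labels))  # 0: cosines already match labels
```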
Evaluation Dataset
sts_dev
- Dataset: sts_dev at 1de0cdf
- Size: 1,500 evaluation samples
- Columns: text, pair, and label
- Approximate statistics based on the first 1000 samples:
| | text | pair | label |
|---|---|---|---|
| type | string | string | float |
| min | 7 tokens | 6 tokens | 0.0 |
| mean | 20.38 tokens | 20.52 tokens | 0.42 |
| max | 52 tokens | 54 tokens | 1.0 |
- Samples:
| text | pair | label |
|---|---|---|
| 안전모를 가진 한 남자가 춤을 추고 있다. | 안전모를 쓴 한 남자가 춤을 추고 있다. | 1.0 |
| 어린아이가 말을 타고 있다. | 아이가 말을 타고 있다. | 0.95 |
| 한 남자가 뱀에게 쥐를 먹이고 있다. | 남자가 뱀에게 쥐를 먹이고 있다. | 1.0 |
- Loss: CosineSimilarityLoss with these parameters: {"loss_fct": "torch.nn.modules.loss.MSELoss"}
Framework Versions
- Python: 3.11.10
- Sentence Transformers: 3.3.1
- Transformers: 4.48.0
- PyTorch: 2.5.1+cu124
- Accelerate: 1.2.1
- Datasets: 3.2.0
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Model tree for CocoRoF/ModernBERT-SimCSE
- Base model: answerdotai/ModernBERT-base