SentenceTransformer based on sentence-transformers/paraphrase-multilingual-mpnet-base-v2

This is a sentence-transformers model finetuned from sentence-transformers/paraphrase-multilingual-mpnet-base-v2 on the bps-query-publication-similarity-pairs dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/paraphrase-multilingual-mpnet-base-v2
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: bps-query-publication-similarity-pairs
  • Model Size: 278M parameters (F32, Safetensors)

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
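
The Pooling module mean-pools the token embeddings into a single 768-dimensional sentence vector. For reference, here is a minimal sketch of the same computation with plain transformers and torch, assuming the checkpoint loads as a standard XLM-RoBERTa model:

import torch
from transformers import AutoTokenizer, AutoModel

# Sketch of the Transformer + mean-pooling pipeline shown above.
tokenizer = AutoTokenizer.from_pretrained("yahyaabd/allstat-semantic-search-paraphrase-mpnet-base-v2-2-sts")
model = AutoModel.from_pretrained("yahyaabd/allstat-semantic-search-paraphrase-mpnet-base-v2-2-sts")

encoded = tokenizer(["Ekspor Menurut Moda Transportasi"], padding=True,
                    truncation=True, max_length=128, return_tensors="pt")
with torch.no_grad():
    token_embeddings = model(**encoded).last_hidden_state  # (batch, seq_len, 768)

# Mean pooling: average over tokens, masking out padding positions.
mask = encoded["attention_mask"].unsqueeze(-1).float()
sentence_embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(sentence_embedding.shape)  # torch.Size([1, 768])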

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("yahyaabd/allstat-semantic-search-paraphrase-mpnet-base-v2-2-sts")
# Run inference
sentences = [
    'KONDISI SOSIAL EKONOMI INDONESIA BULAN MEI',  # "Socio-economic conditions of Indonesia in May"
    'Perkembangan Beberapa Indikator Utama Sosial-Ekonomi Indonesia Edisi Mei',  # "Trends of selected key socio-economic indicators of Indonesia, May edition"
    'Ekspor Menurut Moda Transportasi',  # "Exports by mode of transport"
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
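
Since the model was trained on query-to-publication-title pairs, a typical use is ranking publication titles by cosine similarity to a search query. A minimal sketch, reusing a query and titles from the dataset samples shown under Training Details:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("yahyaabd/allstat-semantic-search-paraphrase-mpnet-base-v2-2-sts")

query = "data harga produsen pertanian, tanaman pangan, hortikultura & perkebunan rakyat"
titles = [
    "Statistik Harga Produsen Pertanian Subsektor Tanaman Pangan, Hortikultura dan Tanaman Perkebunan Rakyat",
    "Buletin Statistik Perdagangan Luar Negeri Ekspor Menurut HS, Desember",
    "Statistik Upah",
]

# Embed the query and the candidate titles, then rank by cosine similarity.
query_embedding = model.encode([query])
title_embeddings = model.encode(titles)
scores = model.similarity(query_embedding, title_embeddings)[0]

for score, title in sorted(zip(scores.tolist(), titles), reverse=True):
    print(f"{score:.3f}  {title}")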

Evaluation

Metrics

Semantic Similarity

Metric           allstat-semantic-dev  allstat-semantic-test
pearson_cosine   0.99                  0.9894
spearman_cosine  0.9525                0.9518
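
These figures are the Pearson and Spearman correlations between the model's cosine similarities and the gold scores on the dev and test splits. A sketch of how such numbers can be computed with EmbeddingSimilarityEvaluator; the three pairs below are placeholders standing in for the real splits:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("yahyaabd/allstat-semantic-search-paraphrase-mpnet-base-v2-2-sts")

# Placeholder pairs; the reported metrics use the full dev/test splits of
# bps-query-publication-similarity-pairs.
evaluator = EmbeddingSimilarityEvaluator(
    sentences1=[
        "ANALISIS MOBILITAS TENAGA KERJA",
        "Statistik perdagangan luar negeri ekspor Desember kde HS",
        "Laporan statistik perkebunan teh Indonesia",
    ],
    sentences2=[
        "Statistik Upah",
        "Buletin Statistik Perdagangan Luar Negeri Ekspor Menurut HS, Desember",
        "Perkembangan Beberapa Indikator Utama Sosial-Ekonomi Indonesia Agustus",
    ],
    scores=[0.1, 0.88, 0.02],
    name="allstat-semantic-dev",
)
results = evaluator(model)
print(results)  # includes pearson_cosine and spearman_cosine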

Training Details

Training Dataset

bps-query-publication-similarity-pairs

  • Dataset: bps-query-publication-similarity-pairs at dc8407f
  • Size: 154,111 training samples
  • Columns: query, doc_title, and score
  • Approximate statistics based on the first 1000 samples:
    • query: string, min 4 tokens, mean 11.15 tokens, max 60 tokens
    • doc_title: string, min 4 tokens, mean 11.25 tokens, max 41 tokens
    • score: float, min 0.0, mean 0.51, max 1.0
  • Samples:
    • query: "LAPORKAN PASAR TENAGA KERJA INDONESIA BULAN DUA" | doc_title: "Buletin Statistik Perdagangan Luar Negeri Impor Februari" | score: 0.15
    • query: "ANALISIS MOBILITAS TENAGA KERJA" | doc_title: "Statistik Upah" | score: 0.1
    • query: "Statistik perdagangan luar negeri ekspor Desember kde HS" | doc_title: "Buletin Statistik Perdagangan Luar Negeri Ekspor Menurut HS, Desember" | score: 0.88
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    
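CosineSimilarityLoss embeds both texts, takes the cosine similarity of the two embeddings, and regresses it onto the gold score with the MSELoss above; conceptually:

import torch
import torch.nn.functional as F

# Conceptual sketch of CosineSimilarityLoss with an MSE objective:
# cosine(query embedding, title embedding) is regressed onto the gold score.
def cosine_similarity_mse(query_emb: torch.Tensor, title_emb: torch.Tensor,
                          gold_score: torch.Tensor) -> torch.Tensor:
    cos = F.cosine_similarity(query_emb, title_emb, dim=-1)
    return F.mse_loss(cos, gold_score)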

Evaluation Dataset

bps-query-publication-similarity-pairs

  • Dataset: bps-query-publication-similarity-pairs at dc8407f
  • Size: 19,264 evaluation samples
  • Columns: query, doc_title, and score
  • Approximate statistics based on the first 1000 samples:
    • query: string, min 4 tokens, mean 11.21 tokens, max 32 tokens
    • doc_title: string, min 4 tokens, mean 11.42 tokens, max 38 tokens
    • score: float, min 0.0, mean 0.52, max 1.0
  • Samples:
    • query: "Laporan statistik perkebunan teh Indonesia" | doc_title: "Perkembangan Beberapa Indikator Utama Sosial-Ekonomi Indonesia Agustus" | score: 0.02
    • query: "Sensus ekonomi : data bisnis Jawa Tengah" | doc_title: "Benchmark Indeks Konstruksi (=100)," | score: 0.07
    • query: "data harga produsen pertanian, tanaman pangan, hortikultura & perkebunan rakyat" | doc_title: "Statistik Harga Produsen Pertanian Subsektor Tanaman Pangan, Hortikultura dan Tanaman Perkebunan Rakyat" | score: 0.9
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 4
  • warmup_ratio: 0.1
  • fp16: True
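
A minimal training sketch under these non-default hyperparameters. The Hub dataset id and split names below are assumptions; the card only names the dataset "bps-query-publication-similarity-pairs" at revision dc8407f:

from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CosineSimilarityLoss

# Start from the base checkpoint named at the top of this card.
model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-mpnet-base-v2")

# Hypothetical Hub id for the dataset described above.
dataset = load_dataset("yahyaabd/bps-query-publication-similarity-pairs", revision="dc8407f")

args = SentenceTransformerTrainingArguments(
    output_dir="allstat-semantic-search-paraphrase-mpnet-base-v2-2-sts",
    num_train_epochs=4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="steps",
)

# The "score" column is used as the label for CosineSimilarityLoss.
trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],  # split name assumed
    loss=CosineSimilarityLoss(model),
)
trainer.train()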

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss allstat-semantic-dev_spearman_cosine allstat-semantic-test_spearman_cosine
0.0208 200 0.0615 0.0418 0.7425 -
0.0415 400 0.0403 0.0317 0.7678 -
0.0623 600 0.0306 0.0262 0.7783 -
0.0831 800 0.0262 0.0241 0.7821 -
0.1038 1000 0.0249 0.0215 0.7865 -
0.1246 1200 0.0216 0.0209 0.7902 -
0.1453 1400 0.0207 0.0196 0.7928 -
0.1661 1600 0.0199 0.0192 0.7950 -
0.1869 1800 0.019 0.0182 0.7963 -
0.2076 2000 0.0188 0.0189 0.8015 -
0.2284 2200 0.0182 0.0177 0.8021 -
0.2492 2400 0.0177 0.0183 0.7980 -
0.2699 2600 0.0185 0.0170 0.8044 -
0.2907 2800 0.0182 0.0173 0.8077 -
0.3115 3000 0.0174 0.0162 0.8089 -
0.3322 3200 0.0174 0.0173 0.8075 -
0.3530 3400 0.0179 0.0173 0.8097 -
0.3738 3600 0.0173 0.0165 0.8082 -
0.3945 3800 0.0163 0.0166 0.8133 -
0.4153 4000 0.0182 0.0169 0.8082 -
0.4360 4200 0.0166 0.0168 0.8061 -
0.4568 4400 0.016 0.0159 0.8169 -
0.4776 4600 0.0161 0.0151 0.8179 -
0.4983 4800 0.0158 0.0169 0.8160 -
0.5191 5000 0.0158 0.0156 0.8175 -
0.5399 5200 0.015 0.0146 0.8232 -
0.5606 5400 0.0152 0.0151 0.8222 -
0.5814 5600 0.0153 0.0151 0.8214 -
0.6022 5800 0.0151 0.0143 0.8269 -
0.6229 6000 0.014 0.0132 0.8293 -
0.6437 6200 0.0133 0.0129 0.8307 -
0.6645 6400 0.0126 0.0132 0.8286 -
0.6852 6600 0.0132 0.0127 0.8335 -
0.7060 6800 0.014 0.0129 0.8326 -
0.7267 7000 0.0137 0.0131 0.8342 -
0.7475 7200 0.0124 0.0120 0.8391 -
0.7683 7400 0.0125 0.0124 0.8360 -
0.7890 7600 0.0132 0.0126 0.8368 -
0.8098 7800 0.0129 0.0130 0.8346 -
0.8306 8000 0.0132 0.0119 0.8427 -
0.8513 8200 0.0115 0.0113 0.8442 -
0.8721 8400 0.0113 0.0114 0.8468 -
0.8929 8600 0.0111 0.0111 0.8490 -
0.9136 8800 0.0115 0.0111 0.8452 -
0.9344 9000 0.011 0.0109 0.8499 -
0.9551 9200 0.0105 0.0103 0.8538 -
0.9759 9400 0.0106 0.0102 0.8549 -
0.9967 9600 0.0108 0.0111 0.8510 -
1.0174 9800 0.0097 0.0103 0.8561 -
1.0382 10000 0.0091 0.0099 0.8575 -
1.0590 10200 0.0087 0.0093 0.8610 -
1.0797 10400 0.0088 0.0097 0.8580 -
1.1005 10600 0.0083 0.0090 0.8644 -
1.1213 10800 0.0088 0.0092 0.8627 -
1.1420 11000 0.0081 0.0089 0.8648 -
1.1628 11200 0.0083 0.0091 0.8619 -
1.1836 11400 0.0084 0.0096 0.8632 -
1.2043 11600 0.008 0.0095 0.8612 -
1.2251 11800 0.008 0.0094 0.8649 -
1.2458 12000 0.0081 0.0092 0.8661 -
1.2666 12200 0.0083 0.0087 0.8705 -
1.2874 12400 0.0077 0.0087 0.8705 -
1.3081 12600 0.0079 0.0085 0.8722 -
1.3289 12800 0.0075 0.0090 0.8698 -
1.3497 13000 0.0086 0.0085 0.8717 -
1.3704 13200 0.0077 0.0083 0.8741 -
1.3912 13400 0.0075 0.0083 0.8751 -
1.4120 13600 0.0071 0.0078 0.8775 -
1.4327 13800 0.008 0.0082 0.8734 -
1.4535 14000 0.0069 0.0084 0.8774 -
1.4743 14200 0.0075 0.0081 0.8764 -
1.4950 14400 0.0074 0.0078 0.8794 -
1.5158 14600 0.0073 0.0087 0.8741 -
1.5365 14800 0.0078 0.0080 0.8810 -
1.5573 15000 0.0067 0.0082 0.8792 -
1.5781 15200 0.0072 0.0080 0.8796 -
1.5988 15400 0.0075 0.0077 0.8832 -
1.6196 15600 0.007 0.0076 0.8840 -
1.6404 15800 0.0073 0.0075 0.8864 -
1.6611 16000 0.0066 0.0072 0.8877 -
1.6819 16200 0.0068 0.0074 0.8873 -
1.7027 16400 0.0067 0.0072 0.8886 -
1.7234 16600 0.0065 0.0074 0.8871 -
1.7442 16800 0.0065 0.0071 0.8915 -
1.7650 17000 0.0072 0.0071 0.8905 -
1.7857 17200 0.0063 0.0068 0.8942 -
1.8065 17400 0.0061 0.0067 0.8961 -
1.8272 17600 0.0059 0.0064 0.8991 -
1.8480 17800 0.0062 0.0065 0.8999 -
1.8688 18000 0.0066 0.0068 0.8968 -
1.8895 18200 0.0059 0.0065 0.8984 -
1.9103 18400 0.0056 0.0063 0.8993 -
1.9311 18600 0.006 0.0061 0.9008 -
1.9518 18800 0.0057 0.0062 0.9006 -
1.9726 19000 0.006 0.0060 0.9011 -
1.9934 19200 0.0062 0.0061 0.9011 -
2.0141 19400 0.0052 0.0060 0.9036 -
2.0349 19600 0.0046 0.0058 0.9057 -
2.0556 19800 0.0046 0.0056 0.9064 -
2.0764 20000 0.0042 0.0057 0.9082 -
2.0972 20200 0.0043 0.0057 0.9075 -
2.1179 20400 0.0043 0.0055 0.9089 -
2.1387 20600 0.0044 0.0060 0.9089 -
2.1595 20800 0.0047 0.0055 0.9079 -
2.1802 21000 0.0047 0.0055 0.9089 -
2.2010 21200 0.0043 0.0053 0.9121 -
2.2218 21400 0.004 0.0053 0.9120 -
2.2425 21600 0.0041 0.0054 0.9108 -
2.2633 21800 0.0041 0.0054 0.9111 -
2.2841 22000 0.0039 0.0053 0.9134 -
2.3048 22200 0.0045 0.0053 0.9118 -
2.3256 22400 0.0044 0.0055 0.9116 -
2.3463 22600 0.0043 0.0053 0.9140 -
2.3671 22800 0.004 0.0052 0.9147 -
2.3879 23000 0.0042 0.0050 0.9149 -
2.4086 23200 0.0041 0.0050 0.9175 -
2.4294 23400 0.004 0.0051 0.9161 -
2.4502 23600 0.0039 0.0050 0.9182 -
2.4709 23800 0.0039 0.0048 0.9204 -
2.4917 24000 0.0037 0.0047 0.9205 -
2.5125 24200 0.0035 0.0048 0.9212 -
2.5332 24400 0.0039 0.0048 0.9218 -
2.5540 24600 0.0038 0.0045 0.9229 -
2.5748 24800 0.0038 0.0047 0.9229 -
2.5955 25000 0.004 0.0046 0.9230 -
2.6163 25200 0.004 0.0045 0.9255 -
2.6370 25400 0.0036 0.0044 0.9251 -
2.6578 25600 0.0036 0.0045 0.9256 -
2.6786 25800 0.0037 0.0044 0.9263 -
2.6993 26000 0.0036 0.0044 0.9273 -
2.7201 26200 0.0037 0.0045 0.9256 -
2.7409 26400 0.0034 0.0044 0.9281 -
2.7616 26600 0.0035 0.0043 0.9285 -
2.7824 26800 0.0034 0.0042 0.9291 -
2.8032 27000 0.0032 0.0041 0.9307 -
2.8239 27200 0.0033 0.0042 0.9304 -
2.8447 27400 0.0032 0.0040 0.9311 -
2.8654 27600 0.0035 0.0042 0.9312 -
2.8862 27800 0.0034 0.0041 0.9327 -
2.9070 28000 0.0035 0.0039 0.9327 -
2.9277 28200 0.0034 0.0039 0.9337 -
2.9485 28400 0.003 0.0039 0.9342 -
2.9693 28600 0.0031 0.0039 0.9341 -
2.9900 28800 0.003 0.0038 0.9362 -
3.0108 29000 0.0026 0.0037 0.9378 -
3.0316 29200 0.0025 0.0038 0.9376 -
3.0523 29400 0.0023 0.0036 0.9378 -
3.0731 29600 0.0024 0.0037 0.9382 -
3.0939 29800 0.0024 0.0037 0.9385 -
3.1146 30000 0.0024 0.0035 0.9381 -
3.1354 30200 0.0023 0.0036 0.9385 -
3.1561 30400 0.0023 0.0035 0.9399 -
3.1769 30600 0.0022 0.0034 0.9407 -
3.1977 30800 0.0023 0.0034 0.9408 -
3.2184 31000 0.0024 0.0034 0.9406 -
3.2392 31200 0.0023 0.0033 0.9417 -
3.2600 31400 0.0022 0.0033 0.9423 -
3.2807 31600 0.0023 0.0034 0.9419 -
3.3015 31800 0.0023 0.0033 0.9428 -
3.3223 32000 0.0021 0.0032 0.9439 -
3.3430 32200 0.0021 0.0032 0.9438 -
3.3638 32400 0.0022 0.0032 0.9442 -
3.3846 32600 0.0023 0.0032 0.9445 -
3.4053 32800 0.0023 0.0031 0.9451 -
3.4261 33000 0.0022 0.0031 0.9453 -
3.4468 33200 0.0021 0.0032 0.9455 -
3.4676 33400 0.002 0.0031 0.9459 -
3.4884 33600 0.0024 0.0030 0.9466 -
3.5091 33800 0.0022 0.0030 0.9468 -
3.5299 34000 0.0022 0.0031 0.9472 -
3.5507 34200 0.0022 0.0030 0.9474 -
3.5714 34400 0.002 0.0030 0.9477 -
3.5922 34600 0.0021 0.0030 0.9480 -
3.6130 34800 0.002 0.0029 0.9485 -
3.6337 35000 0.002 0.0029 0.9489 -
3.6545 35200 0.0019 0.0029 0.9492 -
3.6752 35400 0.002 0.0029 0.9493 -
3.6960 35600 0.002 0.0029 0.9497 -
3.7168 35800 0.0021 0.0028 0.9499 -
3.7375 36000 0.0019 0.0028 0.9501 -
3.7583 36200 0.0019 0.0028 0.9507 -
3.7791 36400 0.0019 0.0028 0.9510 -
3.7998 36600 0.0019 0.0028 0.9514 -
3.8206 36800 0.0019 0.0027 0.9517 -
3.8414 37000 0.0018 0.0028 0.9517 -
3.8621 37200 0.0019 0.0027 0.9519 -
3.8829 37400 0.0017 0.0027 0.9521 -
3.9037 37600 0.0019 0.0027 0.9522 -
3.9244 37800 0.0019 0.0027 0.9522 -
3.9452 38000 0.0019 0.0027 0.9523 -
3.9659 38200 0.0018 0.0027 0.9525 -
3.9867 38400 0.0018 0.0027 0.9525 -
4.0 38528 - - - 0.9518

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.1
  • PyTorch: 2.2.2+cu121
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0
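
To approximately reproduce this environment (CUDA builds may differ; the plain PyPI wheels are shown):

pip install sentence-transformers==3.3.1 transformers==4.47.1 torch==2.2.2 accelerate==1.2.1 datasets==3.2.0 tokenizers==0.21.0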

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}