SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a sentence-transformers model finetuned from sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 on the query-hard-pos-neg-doc-pairs-statictable dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: query-hard-pos-neg-doc-pairs-statictable

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
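
The same two-module stack can also be assembled by hand from the base checkpoint, for example to change the pooling mode or sequence length. A minimal sketch (loading the finetuned model directly, as shown under Usage below, is the normal route):

from sentence_transformers import SentenceTransformer, models

# Transformer module: the multilingual MiniLM base, truncating inputs at 128 tokens.
word_embedding_model = models.Transformer(
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    max_seq_length=128,
)
# Pooling module: mean-pool the 384-dimensional token embeddings.
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),
    pooling_mode_mean_tokens=True,
)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])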

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("yahyaabd/allstats-search-miniLM-v1-3")
# Run inference
sentences = [
    'Arus dana Q3 2006',
    'Ringkasan Neraca Arus Dana, Triwulan III, 2006, (Miliar Rupiah)',
    'Rata-Rata Pengeluaran per Kapita Sebulan di Daerah Perkotaan Menurut Kelompok Barang dan Golongan Pengeluaran per Kapita Sebulan, 2000-2012',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
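
Since the model targets semantic search over statistical table titles, a common pattern is to embed the candidate titles once and rank them against incoming queries. A minimal sketch, reusing the documents from the example above:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("yahyaabd/allstats-search-miniLM-v1-3")

# Embed the candidate table titles once, up front.
corpus = [
    "Ringkasan Neraca Arus Dana, Triwulan III, 2006, (Miliar Rupiah)",
    "Rata-Rata Pengeluaran per Kapita Sebulan di Daerah Perkotaan Menurut Kelompok Barang dan Golongan Pengeluaran per Kapita Sebulan, 2000-2012",
]
corpus_embeddings = model.encode(corpus)

# Score a query against every title and pick the best match.
query_embedding = model.encode("Arus dana Q3 2006")
scores = model.similarity(query_embedding, corpus_embeddings)  # shape [1, len(corpus)]
best = int(scores.argmax())
print(corpus[best], float(scores[0, best]))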

Evaluation

Metrics

Binary Classification

Metric allstats-semantic-mini-v1_test allstats-semantic-mini-v1_dev
cosine_accuracy 0.965 0.9651
cosine_accuracy_threshold 0.6882 0.6834
cosine_f1 0.9462 0.9465
cosine_f1_threshold 0.6882 0.6834
cosine_precision 0.9409 0.9415
cosine_recall 0.9515 0.9515
cosine_ap 0.9858 0.9862
cosine_mcc 0.9203 0.9207
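
The threshold rows are the cosine-similarity cut-offs at which the corresponding metrics were computed: a query-document pair scoring at or above the threshold is predicted as a match (label 1), otherwise as a non-match (label 0). A minimal sketch using the test-split accuracy threshold reported above:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("yahyaabd/allstats-search-miniLM-v1-3")

THRESHOLD = 0.6882  # cosine_accuracy_threshold on the test split (see table above)

query = "Arus dana Q3 2006"
doc = "Ringkasan Neraca Arus Dana, Triwulan III, 2006, (Miliar Rupiah)"

# A single-element [1, 1] similarity matrix; cast to a plain float for the comparison.
score = float(model.similarity(model.encode(query), model.encode(doc)))
print(score, "match" if score >= THRESHOLD else "no match")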

Training Details

Training Dataset

query-hard-pos-neg-doc-pairs-statictable

  • Dataset: query-hard-pos-neg-doc-pairs-statictable at 7b28b96
  • Size: 25,580 training samples
  • Columns: query, doc, and label
  • Approximate statistics based on the first 1000 samples:
    • query: string; min 7, mean 20.14, max 55 tokens
    • doc: string; min 5, mean 24.9, max 47 tokens
    • label: int; 0: ~70.80%, 1: ~29.20%
  • Samples (query | doc | label):
    • Status pekerjaan utama penduduk usia 15+ yang bekerja, 2020 | Jumlah Penghuni Lapas per Kanwil | 0
    • status pekerjaan utama penduduk usia 15+ yang bekerja, 2020 | Jumlah Penghuni Lapas per Kanwil | 0
    • STATUS PEKERJAAN UTAMA PENDUDUK USIA 15+ YANG BEKERJA, 2020 | Jumlah Penghuni Lapas per Kanwil | 0
  • Loss: OnlineContrastiveLoss
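
OnlineContrastiveLoss consumes exactly these (query, doc, label) triples: label 1 pulls a query and its relevant table title together, label 0 pushes hard negatives apart, and the loss is only computed for the hard pairs within each batch. A minimal setup sketch; the Hub ID below is an assumption, since only the dataset name and revision are listed above:

from datasets import load_dataset
from sentence_transformers import SentenceTransformer, losses

# Assumed Hub ID; the card only states the dataset name and revision 7b28b96.
train_dataset = load_dataset(
    "yahyaabd/query-hard-pos-neg-doc-pairs-statictable",
    revision="7b28b96",
    split="train",
)

model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")

# Contrastive loss over (query, doc, label) pairs, restricted to hard positives/negatives.
loss = losses.OnlineContrastiveLoss(model)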

Evaluation Dataset

query-hard-pos-neg-doc-pairs-statictable

  • Dataset: query-hard-pos-neg-doc-pairs-statictable at 7b28b96
  • Size: 5,479 evaluation samples
  • Columns: query, doc, and label
  • Approximate statistics based on the first 1000 samples:
    • query: string; min 7, mean 20.78, max 52 tokens
    • doc: string; min 4, mean 26.28, max 43 tokens
    • label: int; 0: ~71.50%, 1: ~28.50%
  • Samples (query | doc | label):
    • Bagaimana perbandingan PNS pria dan wanita di berbagai golongan tahun 2014? | Rata-rata Pendapatan Bersih Berusaha Sendiri Menurut Provinsi dan Lapangan Pekerjaan Utama (ribu rupiah), 2017 | 0
    • bagaimana perbandingan pns pria dan wanita di berbagai golongan tahun 2014? | Rata-rata Pendapatan Bersih Berusaha Sendiri Menurut Provinsi dan Lapangan Pekerjaan Utama (ribu rupiah), 2017 | 0
    • BAGAIMANA PERBANDINGAN PNS PRIA DAN WANITA DI BERBAGAI GOLONGAN TAHUN 2014? | Rata-rata Pendapatan Bersih Berusaha Sendiri Menurut Provinsi dan Lapangan Pekerjaan Utama (ribu rupiah), 2017 | 0
  • Loss: OnlineContrastiveLoss

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
  • eval_on_start: True
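
Put together with the dataset and loss above, the run can be reproduced approximately via the Sentence Transformers trainer API. A hedged sketch; the output directory, Hub ID, and evaluation split name are assumptions:

from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
data = load_dataset("yahyaabd/query-hard-pos-neg-doc-pairs-statictable", revision="7b28b96")  # Hub ID assumed
loss = losses.OnlineContrastiveLoss(model)

# Mirrors the non-default hyperparameters listed above; everything else stays at its default.
args = SentenceTransformerTrainingArguments(
    output_dir="allstats-search-miniLM-v1-3",  # illustrative path
    eval_strategy="steps",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=1,
    warmup_ratio=0.1,
    fp16=True,
    load_best_model_at_end=True,
    eval_on_start=True,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=data["train"],
    eval_dataset=data["validation"],  # split name assumed
    loss=loss,
)
trainer.train()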

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: True
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss allstats-semantic-mini-v1_test_cosine_ap allstats-semantic-mini-v1_dev_cosine_ap
-1 -1 - - 0.8789 -
0 0 - 0.4455 - 0.8789
0.0125 20 0.4484 0.3363 - 0.8893
0.0250 40 0.1921 0.2230 - 0.9052
0.0375 60 0.1779 0.1435 - 0.9440
0.0500 80 0.1047 0.1269 - 0.9511
0.0625 100 0.0669 0.1498 - 0.9445
0.0750 120 0.1662 0.1028 - 0.9630
0.0876 140 0.0774 0.1115 - 0.9589
0.1001 160 0.0947 0.1204 - 0.9500
0.1126 180 0.1285 0.1464 - 0.9456
0.1251 200 0.0793 0.1024 - 0.9600
0.1376 220 0.0792 0.0992 - 0.9607
0.1501 240 0.0696 0.0931 - 0.9642
0.1626 260 0.0692 0.1205 - 0.9538
0.1751 280 0.1015 0.0980 - 0.9629
0.1876 300 0.0628 0.1001 - 0.9634
0.2001 320 0.0335 0.1094 - 0.9650
0.2126 340 0.0668 0.0941 - 0.9673
0.2251 360 0.0662 0.0765 - 0.9748
0.2376 380 0.0251 0.0674 - 0.9784
0.2502 400 0.0771 0.0667 - 0.9805
0.2627 420 0.0363 0.0576 - 0.9785
0.2752 440 0.0762 0.0787 - 0.9726
0.2877 460 0.0475 0.0649 - 0.9773
0.3002 480 0.0086 0.0692 - 0.9760
0.3127 500 0.0242 0.0636 - 0.9771
0.3252 520 0.0342 0.0700 - 0.9758
0.3377 540 0.0568 0.0547 - 0.9792
0.3502 560 0.0286 0.0508 - 0.9808
0.3627 580 0.0426 0.0518 - 0.9823
0.3752 600 0.03 0.0553 - 0.9806
0.3877 620 0.0146 0.0826 - 0.9748
0.4003 640 0.0417 0.0667 - 0.9779
0.4128 660 0.0081 0.0667 - 0.9775
0.4253 680 0.0094 0.0704 - 0.9798
0.4378 700 0.0225 0.0525 - 0.9841
0.4503 720 0.0217 0.0462 - 0.9861
0.4628 740 0.011 0.0466 - 0.9858
0.4753 760 0.0191 0.0495 - 0.9846
0.4878 780 0.0146 0.0478 - 0.9847
0.5003 800 0.0076 0.0424 - 0.9852
0.5128 820 0.035 0.0549 - 0.9821
0.5253 840 0.0321 0.0551 - 0.9796
0.5378 860 0.0241 0.0559 - 0.9781
0.5503 880 0.0335 0.0525 - 0.9792
0.5629 900 0.0125 0.0539 - 0.9799
0.5754 920 0.0154 0.0512 - 0.9823
0.5879 940 0.0133 0.0497 - 0.9824
0.6004 960 0.0072 0.0532 - 0.9821
0.6129 980 0.0192 0.0520 - 0.9809
0.6254 1000 0.0199 0.0503 - 0.9811
0.6379 1020 0.0069 0.0484 - 0.9824
0.6504 1040 0.0065 0.0514 - 0.9806
0.6629 1060 0.0098 0.0479 - 0.9834
0.6754 1080 0.0 0.0480 - 0.9841
0.6879 1100 0.0247 0.0508 - 0.9835
0.7004 1120 0.0137 0.0481 - 0.9842
0.7129 1140 0.0068 0.0512 - 0.9838
0.7255 1160 0.0182 0.0473 - 0.9851
0.7380 1180 0.0129 0.0442 - 0.9859
0.7505 1200 0.0 0.0436 - 0.9860
0.7630 1220 0.0073 0.0439 - 0.9858
0.7755 1240 0.0081 0.0441 - 0.9859
0.7880 1260 0.0305 0.0460 - 0.9857
0.8005 1280 0.0003 0.0486 - 0.9851
0.8130 1300 0.0218 0.0501 - 0.9852
0.8255 1320 0.0187 0.0435 - 0.9844
0.8380 1340 0.0205 0.0437 - 0.9846
0.8505 1360 0.0094 0.0442 - 0.9851
0.8630 1380 0.0083 0.0426 - 0.9856
0.8755 1400 0.0 0.0423 - 0.9858
0.8881 1420 0.0 0.0424 - 0.9859
0.9006 1440 0.0073 0.0428 - 0.9859
0.9131 1460 0.0075 0.0441 - 0.9859
0.9256 1480 0.0177 0.0447 - 0.9858
0.9381 1500 0.0 0.0438 - 0.9858
0.9506 1520 0.0 0.0438 - 0.9858
0.9631 1540 0.0072 0.0440 - 0.9860
0.9756 1560 0.0101 0.0436 - 0.9861
0.9881 1580 0.0277 0.0437 - 0.9862
-1 -1 - - 0.9858 -
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.4.0
  • Transformers: 4.48.1
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0
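
To reproduce this environment, the same versions can be pinned explicitly (the PyTorch version above refers to the cu124 build; install the variant that matches your CUDA setup):

pip install sentence-transformers==3.4.0 transformers==4.48.1 torch==2.5.1 accelerate==1.3.0 datasets==3.2.0 tokenizers==0.21.0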

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}