base_model: BAAI/bge-m3
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:5520
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: Pagar un rebut o una liquidació pendent de pagament
    sentences:
      - Què és el tràmit per pagar un rebut o liquidació?
      - Quin és el tràmit que permet la inscripció d'una entitat o associació?
      - Quin és el límit de temps per a la instal·lació de tanques provisionals?
  - source_sentence: >-
      Mitjançant decret de data 11/10/2022 núm. 202204494 s'inicia el procés de
      concurrència competitiva per accedir a les parades vacants del mercat de
      les Fonts.
    sentences:
      - >-
        Quin és el mercat on es va iniciar el procés de concurrència competitiva
        per accedir a les parades vacants?
      - >-
        Puc sol·licitar un certificat històric d'empadronament per a una persona
        que ja no viu al municipi?
      - >-
        Necessito obtenir un duplicat del títol de dret funerari perquè he
        perdut l'original
  - source_sentence: >-
      Comunicar les dades per realitzar la notificació electrònica de tots els
      procediments en què l’obligat legal sigui titular o part implicada, i hagi
      de ser notificat o notificada.
    sentences:
      - >-
        Quin és el paper de l'Ajuntament en la inspecció de les condicions
        específiques?
      - Quin és el tràmit relacionat amb la targeta ciutadana de serveis?
      - Qui és el titular o part implicada en els procediments?
  - source_sentence: >-
      Aquest tràmit permet sol·licitar l'informe municipal sobre la integració
      social de persones estrangeres.
    sentences:
      - Puc canviar la concessió del meu dret funerari per una raó específica?
      - Quin és el procediment per a obtenir l'informe d'inserció social?
      - Quin és el propòsit de la formació en higiene alimentària
  - source_sentence: Permet tramitar la baixa de les activitats esportives municipals.
    sentences:
      - Quin és el procés per a donar de baixa una activitat esportiva?
      - On es pot recollir els dorsals el dia de la cursa?
      - Quin és el benefici fiscal que es pot obtenir?
model-index:
  - name: SentenceTransformer based on BAAI/bge-m3
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 1024
          type: dim_1024
        metrics:
          - type: cosine_accuracy@1
            value: 0.1
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.22608695652173913
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.30434782608695654
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.4956521739130435
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.1
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.0753623188405797
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.060869565217391314
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.04956521739130433
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.1
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.22608695652173913
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.30434782608695654
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.4956521739130435
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.2644535096144644
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.19486714975845426
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.21422014718167715
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 768
          type: dim_768
        metrics:
          - type: cosine_accuracy@1
            value: 0.1
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.21304347826086956
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.3
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.49130434782608695
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.1
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.07101449275362319
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.06000000000000001
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.04913043478260868
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.1
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.21304347826086956
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.3
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.49130434782608695
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.2611989525147102
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.19224465148378198
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.21168860407432996
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 512
          type: dim_512
        metrics:
          - type: cosine_accuracy@1
            value: 0.09565217391304348
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.25217391304347825
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.3217391304347826
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.5043478260869565
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.09565217391304348
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.08405797101449275
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.06434782608695652
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.05043478260869564
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.09565217391304348
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.25217391304347825
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.3217391304347826
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.5043478260869565
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.2736727362077943
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.20330400276052454
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.2225493022129085
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 256
          type: dim_256
        metrics:
          - type: cosine_accuracy@1
            value: 0.09130434782608696
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.24347826086956523
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.32608695652173914
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.4782608695652174
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.09130434782608696
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.08115942028985507
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.06521739130434782
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.04782608695652173
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.09130434782608696
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.24347826086956523
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.32608695652173914
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.4782608695652174
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.25842339032219125
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.19112146307798494
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.21262325852877148
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 128
          type: dim_128
        metrics:
          - type: cosine_accuracy@1
            value: 0.09565217391304348
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.2217391304347826
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.32608695652173914
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.5130434782608696
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.09565217391304348
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.07391304347826087
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.06521739130434782
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.05130434782608694
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.09565217391304348
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.2217391304347826
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.32608695652173914
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.5130434782608696
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.2703816814799584
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.1968685300207041
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.21575875323163748
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 64
          type: dim_64
        metrics:
          - type: cosine_accuracy@1
            value: 0.10434782608695652
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.23478260869565218
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.3217391304347826
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.49130434782608695
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.10434782608695652
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.0782608695652174
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.06434782608695652
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.049130434782608694
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.10434782608695652
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.23478260869565218
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.3217391304347826
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.49130434782608695
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.268671836286108
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.20097135955831624
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.22058427749634182
            name: Cosine Map@100

SentenceTransformer based on BAAI/bge-m3

This is a sentence-transformers model fine-tuned from BAAI/bge-m3 on the json dataset. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. Because it was trained with MatryoshkaLoss, the embeddings were also evaluated after truncation to 768, 512, 256, 128, and 64 dimensions.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-m3
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • json

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
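
The architecture above takes the hidden state of the first ([CLS]) token as the sentence embedding and then L2-normalizes it. The following is a minimal sketch of those two steps on toy data in plain Python. The helper names `cls_pool` and `l2_normalize` are illustrative only and are not the library's internals.

```python
import math

def cls_pool(token_embeddings):
    """Return the first token's vector (the [CLS] position) as the sentence embedding."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]

# Toy hidden states: 3 tokens, 4 dimensions each (the real model uses 1024).
hidden_states = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 2.0, 0.0, 1.0],
]

sentence_embedding = l2_normalize(cls_pool(hidden_states))
print(sentence_embedding)  # [0.6, 0.0, 0.8, 0.0]
```

Because the output has unit length, the dot product of two embeddings equals their cosine similarity.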

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("adriansanz/sqv-v5-5ep")
# Run inference
sentences = [
    'Permet tramitar la baixa de les activitats esportives municipals.',
    'Quin és el procés per a donar de baixa una activitat esportiva?',
    'Quin és el benefici fiscal que es pot obtenir?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
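
Since the model was trained with MatryoshkaLoss on the dimensions 1024, 768, 512, 256, 128, and 64, an embedding can be shortened by keeping its leading components and re-normalizing. The sketch below shows that operation on hypothetical 8-dimensional vectors in plain Python. The helper names and the toy values are assumptions for illustration. In Sentence Transformers itself, the `truncate_dim` argument of `SentenceTransformer` should give the same truncation at load time.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0.0 else head

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 8-dimensional embeddings standing in for 1024-dimensional ones.
passage = [0.5, 0.4, 0.3, 0.2, 0.1, 0.1, 0.0, 0.1]
query = [0.4, 0.5, 0.2, 0.3, 0.0, 0.1, 0.1, 0.0]

full_score = cosine_similarity(passage, query)
short_score = cosine_similarity(
    truncate_and_normalize(passage, 4),
    truncate_and_normalize(query, 4),
)
print(f"full: {full_score:.4f}  truncated to 4 dims: {short_score:.4f}")
```

Shorter vectors reduce storage and search cost. The Evaluation section reports how retrieval quality changes at each trained dimension.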

Evaluation

Metrics

Information Retrieval

  • Dataset: dim_1024

Metric Value
cosine_accuracy@1 0.1
cosine_accuracy@3 0.2261
cosine_accuracy@5 0.3043
cosine_accuracy@10 0.4957
cosine_precision@1 0.1
cosine_precision@3 0.0754
cosine_precision@5 0.0609
cosine_precision@10 0.0496
cosine_recall@1 0.1
cosine_recall@3 0.2261
cosine_recall@5 0.3043
cosine_recall@10 0.4957
cosine_ndcg@10 0.2645
cosine_mrr@10 0.1949
cosine_map@100 0.2142

Information Retrieval

  • Dataset: dim_768

Metric Value
cosine_accuracy@1 0.1
cosine_accuracy@3 0.213
cosine_accuracy@5 0.3
cosine_accuracy@10 0.4913
cosine_precision@1 0.1
cosine_precision@3 0.071
cosine_precision@5 0.06
cosine_precision@10 0.0491
cosine_recall@1 0.1
cosine_recall@3 0.213
cosine_recall@5 0.3
cosine_recall@10 0.4913
cosine_ndcg@10 0.2612
cosine_mrr@10 0.1922
cosine_map@100 0.2117

Information Retrieval

  • Dataset: dim_512

Metric Value
cosine_accuracy@1 0.0957
cosine_accuracy@3 0.2522
cosine_accuracy@5 0.3217
cosine_accuracy@10 0.5043
cosine_precision@1 0.0957
cosine_precision@3 0.0841
cosine_precision@5 0.0643
cosine_precision@10 0.0504
cosine_recall@1 0.0957
cosine_recall@3 0.2522
cosine_recall@5 0.3217
cosine_recall@10 0.5043
cosine_ndcg@10 0.2737
cosine_mrr@10 0.2033
cosine_map@100 0.2225

Information Retrieval

  • Dataset: dim_256

Metric Value
cosine_accuracy@1 0.0913
cosine_accuracy@3 0.2435
cosine_accuracy@5 0.3261
cosine_accuracy@10 0.4783
cosine_precision@1 0.0913
cosine_precision@3 0.0812
cosine_precision@5 0.0652
cosine_precision@10 0.0478
cosine_recall@1 0.0913
cosine_recall@3 0.2435
cosine_recall@5 0.3261
cosine_recall@10 0.4783
cosine_ndcg@10 0.2584
cosine_mrr@10 0.1911
cosine_map@100 0.2126

Information Retrieval

  • Dataset: dim_128

Metric Value
cosine_accuracy@1 0.0957
cosine_accuracy@3 0.2217
cosine_accuracy@5 0.3261
cosine_accuracy@10 0.513
cosine_precision@1 0.0957
cosine_precision@3 0.0739
cosine_precision@5 0.0652
cosine_precision@10 0.0513
cosine_recall@1 0.0957
cosine_recall@3 0.2217
cosine_recall@5 0.3261
cosine_recall@10 0.513
cosine_ndcg@10 0.2704
cosine_mrr@10 0.1969
cosine_map@100 0.2158

Information Retrieval

  • Dataset: dim_64

Metric Value
cosine_accuracy@1 0.1043
cosine_accuracy@3 0.2348
cosine_accuracy@5 0.3217
cosine_accuracy@10 0.4913
cosine_precision@1 0.1043
cosine_precision@3 0.0783
cosine_precision@5 0.0643
cosine_precision@10 0.0491
cosine_recall@1 0.1043
cosine_recall@3 0.2348
cosine_recall@5 0.3217
cosine_recall@10 0.4913
cosine_ndcg@10 0.2687
cosine_mrr@10 0.201
cosine_map@100 0.2206
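
The tables above are consistent with each query having exactly one relevant passage: recall@k equals accuracy@k, and precision@k equals accuracy@k divided by k (for example, 0.2261 / 3 ≈ 0.0754). Under that assumption, the metrics can be sketched from the rank at which the relevant passage is retrieved. The function name `retrieval_metrics` and the toy ranks are hypothetical, and this is not the evaluator's actual code.

```python
import math

def retrieval_metrics(ranks, k):
    """Compute accuracy@k, precision@k, recall@k, MRR@k and NDCG@k.

    `ranks` holds, for each query, the 1-based rank of its single relevant
    passage, or None if the passage was not retrieved at all.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    precision = accuracy / k  # one relevant item among k retrieved
    recall = accuracy         # one relevant item in total
    mrr = sum(1.0 / r for r, hit in zip(ranks, hits) if hit) / n
    # With a single relevant item the ideal DCG is 1, so NDCG is 1 / log2(rank + 1).
    ndcg = sum(1.0 / math.log2(r + 1) for r, hit in zip(ranks, hits) if hit) / n
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "mrr": mrr,
        "ndcg": ndcg,
    }

# Four toy queries: relevant passage at rank 1, rank 3, never retrieved, rank 12.
metrics = retrieval_metrics([1, 3, None, 12], k=10)
print(metrics)
```

In this toy run two of the four queries have their passage in the top 10, so accuracy@10 and recall@10 are both 0.5, while the passage ranked 12th contributes nothing to MRR@10 or NDCG@10.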

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 5,520 training samples
  • Columns: positive and anchor
  • Approximate statistics based on the first 1000 samples:
    • positive (string): min 5 tokens, mean 43.7 tokens, max 117 tokens
    • anchor (string): min 9 tokens, mean 20.51 tokens, max 51 tokens
  • Samples:
    • positive: L’Ajuntament vol crear un banc de recursos on recollir tots els oferiments de la població i que servirà per atendre les necessitats de les famílies refugiades acollides al poble.
      anchor: Quin és el paper de l’Ajuntament en la integració de les persones refugiades acollides?
    • positive: Aquest tipus d'actuació requereix la intervenció d'una persona tècnica competent que subscrigui el projecte o la documentació tècnica corresponent i que assumeixi la direcció facultativa de l'execució de les obres.
      anchor: Quin és el requisit per a la intervenció d'una persona tècnica competent en les obres d'intervenció parcial interior en edificis amb elements catalogats?
    • positive: Aquest títol, adreçat a persones empadronades a Sant Quirze del Vallès, es concedirà segons el nivell d’ingressos, la condició d’edat o de discapacitat, en base als criteris específics que recull l’ordenança reguladora del sistema de tarifació social del transport públic municipal en autobús a Sant Quirze del Vallès.
      anchor: Quin és el benefici de la TBUS GRATUÏTA per a les persones majors?
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            1024,
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
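
The configuration above applies MultipleNegativesRankingLoss to the embeddings truncated to each listed dimension and sums the results with equal weights. The following plain-Python sketch illustrates that idea on toy vectors. It assumes cosine similarity scaled by 20 as the similarity function, and the helper names are hypothetical. It is not the library's implementation, which works on batched tensors.

```python
import math

def normalize(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def mnr_loss(anchors, positives, scale=20.0):
    """In-batch-negatives cross-entropy: anchor i should score highest with positive i."""
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [scale * sum(x * y for x, y in zip(anchor, pos)) for pos in p]
        top = max(logits)  # subtract the maximum for numerical stability
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(a)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the base loss over embeddings truncated to each dimension."""
    return sum(
        w * mnr_loss([v[:d] for v in anchors], [v[:d] for v in positives])
        for d, w in zip(dims, weights)
    )

# Toy batch of two (anchor, positive) pairs with 4-dimensional embeddings.
anchors = [[1.0, 0.0, 0.2, 0.0], [0.0, 1.0, 0.0, 0.2]]
positives = [[0.9, 0.1, 0.1, 0.0], [0.1, 0.9, 0.0, 0.1]]

loss = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
print(f"matryoshka loss: {loss:.6f}")
```

Every other positive in the batch acts as a negative for a given anchor, which is why the no_duplicates batch sampler listed under the hyperparameters matters: a duplicated positive would otherwise be treated as a negative.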
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 16
  • learning_rate: 2e-05
  • num_train_epochs: 5
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.2
  • bf16: True
  • tf32: True
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
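
With lr_scheduler_type cosine, warmup_ratio 0.2, and learning_rate 2e-05, the learning rate rises linearly during the first fifth of training and then decays along a half cosine. The sketch below reproduces that shape in plain Python. The function name is hypothetical, and the total of 105 optimizer steps is taken from the Training Logs. It mirrors the general form of a warmup-plus-cosine schedule rather than the trainer's exact code.

```python
import math

def cosine_lr_with_warmup(step, total_steps, base_lr=2e-5, warmup_ratio=0.2):
    """Linear warmup to `base_lr`, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total_steps = 105  # final step in the Training Logs
for step in (0, 10, 21, 63, 105):
    print(step, f"{cosine_lr_with_warmup(step, total_steps):.2e}")
```

Under these assumptions the peak rate is reached at step 21, roughly the end of the first epoch, and the rate falls to half of its peak at step 63, midway through the decay phase.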

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 16
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.2
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: True
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss dim_1024_cosine_map@100 dim_128_cosine_map@100 dim_256_cosine_map@100 dim_512_cosine_map@100 dim_64_cosine_map@100 dim_768_cosine_map@100
0.4638 10 4.122 - - - - - -
0.9275 20 2.7131 - - - - - -
0.9739 21 - 0.2085 0.1973 0.1884 0.2087 0.1886 0.2177
1.3913 30 1.6964 - - - - - -
1.8551 40 1.2311 - - - - - -
1.9942 43 - 0.2148 0.2135 0.2170 0.2351 0.2091 0.2386
2.3188 50 0.9216 - - - - - -
2.7826 60 0.737 - - - - - -
2.9681 64 - 0.2145 0.2058 0.2072 0.2277 0.2127 0.2085
3.2464 70 0.6678 - - - - - -
3.7101 80 0.555 - - - - - -
3.9884 86 - 0.2028 0.2154 0.2117 0.2331 0.2113 0.2028
4.1739 90 0.5542 - - - - - -
4.6377 100 0.5058 - - - - - -
4.8696 105 - 0.2142 0.2158 0.2126 0.2225 0.2206 0.2117
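  • The bold row denotes the saved checkpoint. The cosine_map@100 values reported in the Evaluation section match the final row (epoch 4.8696, step 105).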
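
The Epoch and Step columns in the log can be reproduced from the hyperparameters. The following worked arithmetic assumes training on a single device. The floor division used for steps per epoch is an assumption chosen because it reproduces the logged total of 105 steps.

```python
import math

num_samples = 5520
per_device_train_batch_size = 16
gradient_accumulation_steps = 16
num_train_epochs = 5

# Batches drawn from the dataloader in one pass over the data.
batches_per_epoch = math.ceil(num_samples / per_device_train_batch_size)
# Samples contributing to each optimizer update.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps
# Whole optimizer steps that fit in one epoch, and over the full run.
steps_per_epoch = batches_per_epoch // gradient_accumulation_steps
total_steps = steps_per_epoch * num_train_epochs

def epoch_at(step):
    """Fractional epoch reached after `step` optimizer updates."""
    return step * gradient_accumulation_steps / batches_per_epoch

print(batches_per_epoch, effective_batch_size, steps_per_epoch, total_steps)
# 345 256 21 105
print(round(epoch_at(10), 4), round(epoch_at(21), 4), round(epoch_at(105), 4))
# 0.4638 0.9739 4.8696
```

Each optimizer step therefore covers 256 samples, and the run ends slightly short of five full epochs because 345 batches do not divide evenly by 16.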

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.1.1
  • Transformers: 4.44.2
  • PyTorch: 2.4.1+cu121
  • Accelerate: 0.35.0.dev0
  • Datasets: 3.0.1
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}