metadata
base_model: BAAI/bge-base-en-v1.5
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:14271
  - loss:BatchAllTripletLoss
widget:
  - source_sentence: >-
      In a complex legal scenario involving multiple jurisdictions, how would
      you navigate the differences in laws related to online privacy violations
      and harassment?
    sentences:
      - >-
        How does voluntary admission under the Baker Act impact eligibility for
        a Concealed Weapon Permit?
      - >-
        How do the terms of the account and the circumstances impact the
        potential liability of the Bank of Hawaii in this situation?
      - Can someone run a background check on you without your consent?
  - source_sentence: How long is the Kansas Lemon Law effective for?
    sentences:
      - What should I do to stop my neighbor from using my land and barn?
      - >-
        How does the expungement of an arrest impact the disclosure requirements
        in applications for permits or licenses?
      - >-
        If a policy is canceled due to a denied claim, does the canceled policy
        still cover injuries from the incident?
  - source_sentence: >-
      What are the implications of a guilty plea without corroborating evidence
      in terms of justice and fairness?
    sentences:
      - >-
        How does having a Series 7 license impact the ability of a financial
        planner to sell securities products?
      - >-
        What are the specific state laws that govern the relationship between
        the Baker Act and Concealed Weapon Permits?
      - >-
        How does the duration of copyright protection impact the entry of works
        into the public domain?
  - source_sentence: How can one prove the terms and existence of a verbal contract?
    sentences:
      - >-
        Is it common for search warrants to be obtained under a unique cause
        number?
      - >-
        In what ways can transparency in background check forms contribute to
        national security measures?
      - >-
        What are the potential legal responsibilities of the 14-year-old boy if
        he is determined to be the father of the baby?
  - source_sentence: >-
      How can the person ensure they receive the necessary compensation for
      their work-related injury?
    sentences:
      - >-
        Is there a law in Oklahoma that restricts the distance of a dispensary
        to a baseball field?
      - >-
        Considering the complexities of property rights, due process, and public
        safety, what are the ethical and legal considerations surrounding
        citizens taking possession of unattended animals in public areas, and
        how do these actions intersect with constitutional rights and property
        laws?
      - >-
        What precedent cases or legal doctrines could be relevant in a lawsuit
        against the town council person and the township in this scenario?

SentenceTransformer based on BAAI/bge-base-en-v1.5

This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
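
As a rough illustration of the similarity function listed above, cosine similarity can be sketched in plain Python. The short vectors are toy stand-ins for the model's 768-dimensional embeddings, and the helper name `cosine_similarity` is chosen here for illustration; it is not part of the library.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional vectors (illustrative values, not model outputs).
a = [1.0, 2.0, 2.0]
b = [2.0, 4.0, 4.0]   # same direction as a, twice the length
c = [2.0, -1.0, 0.0]  # orthogonal to a

print(cosine_similarity(a, b))  # 1.0: vector length does not matter
print(cosine_similarity(a, c))  # 0.0: orthogonal vectors
```

Because the score depends only on direction, scaling an embedding leaves its similarity to other embeddings unchanged.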

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
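
The last two modules above can be pictured with a small sketch: CLS pooling keeps only the first token's vector, and normalization rescales it to unit length. The snippet uses toy 4-dimensional token vectors rather than real BERT outputs, and the helper names are hypothetical, so it should be read as an illustration of the idea and not as the library's implementation.

```python
import math

def cls_pool(token_embeddings):
    """CLS pooling: use the first token's vector as the sentence vector."""
    return token_embeddings[0]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]

# Toy per-token outputs for one sentence: 3 tokens, 4 dimensions each.
token_embeddings = [
    [3.0, 4.0, 0.0, 0.0],  # position 0, the [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.0, 2.0, 1.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)  # [0.6, 0.8, 0.0, 0.0]
```

Once embeddings have unit length, their dot product equals their cosine similarity.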

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Muhammad2003/router-embedding")
# Run inference
sentences = [
    'How can the person ensure they receive the necessary compensation for their work-related injury?',
    'Is there a law in Oklahoma that restricts the distance of a dispensary to a baseball field?',
    'Considering the complexities of property rights, due process, and public safety, what are the ethical and legal considerations surrounding citizens taking possession of unattended animals in public areas, and how do these actions intersect with constitutional rights and property laws?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
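
One way to use the similarity matrix from the snippet above is to pick, for a given sentence, the other sentence with the highest score. The sketch below assumes the matrix has been converted to nested Python lists (for example with `similarities.tolist()`); the helper `most_similar` and the matrix values are made up for illustration and are not outputs of this model.

```python
def most_similar(similarities, query_index):
    """Return (index, score) of the entry most similar to the query, excluding the query itself."""
    row = similarities[query_index]
    candidates = [(score, i) for i, score in enumerate(row) if i != query_index]
    best_score, best_index = max(candidates)
    return best_index, best_score

# Hypothetical 3x3 similarity matrix (illustrative values only).
toy = [
    [1.00, 0.42, 0.35],
    [0.42, 1.00, 0.58],
    [0.35, 0.58, 1.00],
]

print(most_similar(toy, 0))  # (1, 0.42)
```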

Training Details

Training Dataset

Unnamed Dataset

  • Size: 14,271 training samples
  • Columns: sentence and label
  • Approximate statistics based on the first 1000 samples:

    |         | sentence | label |
    |---------|----------|-------|
    | type    | string   | int   |
    | details | min: 2 tokens, mean: 23.55 tokens, max: 50 tokens | 0: ~25.00%, 1: ~25.00%, 2: ~25.00%, 3: ~25.00% |

  • Samples:

    | sentence | label |
    |----------|-------|
    | What rights do you have regarding accessing your medical records under HIPAA? | 1 |
    | What should you do if you lose access to your patient portal after being discharged from a healthcare provider? | 1 |
    | How can you address the issue of losing access to your patient portal with the pain management office? | 3 |

  • Loss: BatchAllTripletLoss
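
For readers unfamiliar with the loss named above, here is a minimal pure-Python sketch of the batch-all triplet idea from Hermans et al. (cited below): every valid (anchor, positive, negative) combination in a batch contributes `max(d(a, p) - d(a, n) + margin, 0)`. Several details are assumptions, since the card does not state them: Euclidean distance, a margin of 5.0, and averaging only over triplets that still violate the margin. The function is an illustration, not the library's code.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def batch_all_triplet_loss(embeddings, labels, margin=5.0):
    """Mean hinge loss over all valid triplets that violate the margin.

    margin=5.0 and the averaging convention are assumptions (not stated in the card).
    """
    losses = []
    n = len(labels)
    for a in range(n):
        for p in range(n):
            # A positive shares the anchor's label but is a different sample.
            if p == a or labels[p] != labels[a]:
                continue
            for neg in range(n):
                # A negative has a different label from the anchor.
                if labels[neg] == labels[a]:
                    continue
                loss = (euclidean(embeddings[a], embeddings[p])
                        - euclidean(embeddings[a], embeddings[neg])
                        + margin)
                if loss > 0:
                    losses.append(loss)
    return sum(losses) / len(losses) if losses else 0.0

# Toy batch: two well-separated classes in 2-D (illustrative values).
toy_embeddings = [[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]]
toy_labels = [0, 0, 1, 1]

print(batch_all_triplet_loss(toy_embeddings, toy_labels, margin=1.0))  # 0.0: every triplet satisfies the margin
print(batch_all_triplet_loss(toy_embeddings, toy_labels, margin=5.0))
```

If the margin were indeed 5 and the embeddings unit-length (as the Normalize module suggests), pairwise distances could not exceed 2, so each triplet's loss would fall between 3 and 7. That would be consistent with logged losses around 4.4, but it is an inference, not something the card states.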

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 2
  • warmup_ratio: 0.1
  • bf16: True
  • batch_sampler: no_duplicates
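
To make the schedule implied by these settings concrete, the sketch below works out the step counts and a linear warmup/decay learning rate in plain Python. It assumes a single device, the `linear` scheduler and `gradient_accumulation_steps: 1` from the full hyperparameter list, and a warmup length rounded up from `warmup_ratio`; the function names are hypothetical and the rounding is an assumption about the trainer's behavior.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Optimizer steps per epoch when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)

def linear_warmup_lr(step, total_steps, base_lr, warmup_ratio):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)

# Values taken from this card: 14,271 samples, batch size 16, 2 epochs.
per_epoch = steps_per_epoch(14271, 16)
total_steps = per_epoch * 2
print(per_epoch, total_steps)  # 892 1784

for step in (0, 100, 179, 892, 1784):
    print(step, linear_warmup_lr(step, total_steps, base_lr=2e-05, warmup_ratio=0.1))
```

The figure of 892 steps per epoch agrees with the training log, where step 100 is reported at epoch 0.1121 (100 / 892).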

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

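| Epoch  | Step | Training Loss |
|--------|------|---------------|
| 0.1121 | 100  | 4.9663        |
| 0.2242 | 200  | 4.7529        |
| 0.3363 | 300  | 4.5009        |
| 0.4484 | 400  | 4.4893        |
| 0.5605 | 500  | 4.3914        |
| 0.6726 | 600  | 4.4306        |
| 0.7848 | 700  | 4.5464        |
| 0.8969 | 800  | 4.4952        |
| 1.0090 | 900  | 4.349         |
| 1.1211 | 1000 | 4.4221        |
| 1.2332 | 1100 | 4.4476        |
| 1.3453 | 1200 | 4.4516        |
| 1.4574 | 1300 | 4.3968        |
| 1.5695 | 1400 | 4.3283        |
| 1.6816 | 1500 | 4.3894        |
| 1.7937 | 1600 | 4.42          |
| 1.9058 | 1700 | 4.4457        |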

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.4
  • PyTorch: 2.1.0+cu118
  • Accelerate: 0.33.0.dev0
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

BatchAllTripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification}, 
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}