
SentenceTransformer based on mixedbread-ai/mxbai-embed-large-v1

This is a sentence-transformers model finetuned from mixedbread-ai/mxbai-embed-large-v1. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: mixedbread-ai/mxbai-embed-large-v1
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Model Size: 335M parameters (F32, Safetensors)

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)
Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
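
For reference, the same two-module stack can be assembled by hand from sentence_transformers.models. This is a minimal sketch, equivalent to simply loading the checkpoint; it is shown only to make the architecture listing above concrete:

from sentence_transformers import SentenceTransformer, models

# Assemble the same Transformer + mean-pooling stack by hand
# (equivalent to loading the pretrained checkpoint directly).
transformer = models.Transformer(
    "mixedbread-ai/mxbai-embed-large-v1", max_seq_length=128
)
pooling = models.Pooling(
    transformer.get_word_embedding_dimension(),  # 1024
    pooling_mode="mean",
)
model = SentenceTransformer(modules=[transformer, pooling])
print(model)  # prints the module listing shown above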

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Daxtra/sbert-trained-on-whs")
# Run inference
sentences = [
    'Responsible for lease negotiations and rent collections with the aim of maximising yields for rented properties.\nDealing with freedom of information requests.\nDealing with title issues.\nManage a wide range of insolvency assignments such as Fixed & Floating charge Receiverships, Members Voluntary Liquidations, Creditors Voluntary Liquidations and Court Appointed Liquidations.\nDevelopment and roll out of disposal strategies for all properties under management.\nManage the build out of a number of ghost estates on behalf of NAMA. A current project involves remediation works to 84 residential units in Co. Monaghan with a value of  EUR 10 Million.\nPreparation of tenders for NAMA & other financial institutions.\nAttending meeting with borrowers and financial institutions.\nCoordinate with estate agent to ensure we are receiving maximum yields for rented properties. I am responsible for the management of in excess of 200 rental properties across various Receiverships under my remit.',
    'Preparation of budgets in order to manage cash flow throughout the various assignments.\nPreparation and submission of tax and CRO returns.\nCommunicate with Solicitors, Estate Agents and Assets Managers on a daily basis to ensure properties are brought to market and sold in a timely manner.\nDrafting monthly/quarterly reports for case managers.\nReviewing tenders received and appointment of professional service firms.\nLiaising with NAMA case managers and our internal tax department in order to determine the most tax efficient manner to dispose of properties.',
    'Realization of bedside rounds and teaching.\nProgram implementation and development which include: administrative and HR management; conception and implementation of information system; Conception, implementation and coordination of PMTCT program.\nMonthly report of activities.\nPlanning and Supervision of mortality and morbidity review (MMR).\nResponsible for communication with the pediatric Saint Damien Hospital and other existing programs in the same hospital.\nNote: This program run by NPFS/Saint Damien and funded by Francesca Rava foundation at\nSupervision of the staff (12 Obstetricians, 7 anesthetists, 16 nurse midwives, 6 auxiliary midwives, 1 administrative assistant , 1 data clerk etc.)\nPerformance of ultrasound\nClinical work according to day time schedule\nPerformance of surgical procedures',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
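
Beyond pairwise similarity, the embeddings support semantic search. Below is a minimal sketch using the library's util.semantic_search helper; the corpus and query are hypothetical work-history snippets, not taken from the training data:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("Daxtra/sbert-trained-on-whs")

# Hypothetical corpus of work-history snippets
corpus = [
    "Managed rental properties and lease negotiations across receiverships.",
    "Installed and maintained Windows and Linux network systems.",
    "Prepared and submitted financial, statistical, and tax reports.",
]
query = "property management and rent collection"

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Retrieve the top-2 corpus entries by cosine similarity
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(f"{hit['score']:.3f}", corpus[hit["corpus_id"]])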

Training Details

Training Dataset

Unnamed Dataset

  • Size: 112,464 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples (a sketch for recomputing these follows at the end of this section):
    • sentence_0: string; min: 8 tokens, mean: 64.94 tokens, max: 128 tokens
    • sentence_1: string; min: 6 tokens, mean: 64.91 tokens, max: 128 tokens
  • Samples (the first three sentence_0/sentence_1 pairs; each pair consists of related excerpts from the same work history):

    Sample 1:
    Co-authored standalone Moodle site on sustainability that was marketed by the college and sold to various 3rd parties, making in excess of £ 20,000.
    Expert-level knowledge in eLearning and Virtual Learning Environments through daily use, including Moodle and Mahara with the requirement to produced tailored courses on the technology to a diverse range of teaching professionals;
    Delivered in excess of 20 training sessions to over 600 lecturers on a range of new innovative practices in education including innovations within the VLE which supported their continued professional development and enhanced their classroom performance with a participant satisfaction rate regularly in excess of 95%;
    Administered Moodle across college, supporting learner management and championing the use of core modules to support the tracking and assessment of in excess of 15000 learners.
    Coordinated and managed multiple syllabuses and competency tests for advanced software development course, emphasising quality of resources to promote self-guided learning in addition to more traditional approaches, increasing intake by 400% over a 3 year period, achieving pass and completion rate to in excess of 97%;
    Improved quality of eLearning resources through the delivery of training on the use of screen-recording and simulation software in content development, increasing the use of the Virtual Learning Environment from something that would serve as a repository of worksheets to a more interactive and engaging application, appearing as the top visited pages in weekly reports;
    Promoted to the unique position of Head Judge in Web Design by the Government-backed National Apprenticeship Service, project managing and collaborating with a team of Expert Judges nationally setting up the timetabling and resourcing of events attended by in excess of 100 student competitors;

    Sample 2:
    Advising on best practices.
    Installing and maintaining Windows and Linux network systems.
    Installing, uninstalling, troubleshooting specific Software for hospital based equipment.
    Hardware and software installation and desktop support.
    Solving I.T. issues for hospital staff.
    Backing up Data Systems, setting RAID configurations, wiring, setting up network and proxy servers. Fixing and troubleshooting desktop computers and laptops.

    Sample 3:
    Analysis of data from the manufacture of finished goods, distribution of materials in the production of finished products;
    Preparation of cost of production calculations for finished products;
    Full maintenance of accounting, tax, and management accounting in accordance with the current legislation of Ukraine.
    Accounting for cash transactions;
    Maintenance of personnel documents (orders, contracts, employment records);
    Work with primary documents (billing, acts, account invoices, tax invoices, work in Client Bank and Privat 24, carrying out banking operations, preparation of acts of reconciliation, payroll, accounting of goods and materials);
    Preparation and submission of financial, statistical, and tax reporting.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
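
The token statistics above come from the model's tokenizer applied to the first 1000 pairs. Below is a minimal sketch of how they could be recomputed; pairs is a hypothetical stand-in for the unpublished dataset:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Daxtra/sbert-trained-on-whs")

# Hypothetical stand-in for the unnamed (sentence_0, sentence_1) dataset
pairs = [{"sentence_0": "example anchor", "sentence_1": "example positive"}]

def token_stats(texts, max_seq_length=128):
    # Token counts as the model sees them, capped at the 128-token limit
    lengths = [
        min(len(model.tokenizer(t)["input_ids"]), max_seq_length)
        for t in texts
    ]
    return min(lengths), sum(lengths) / len(lengths), max(lengths)

for column in ("sentence_0", "sentence_1"):
    print(column, token_stats([p[column] for p in pairs[:1000]]))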
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 24
  • per_device_eval_batch_size: 24
  • num_train_epochs: 1
  • multi_dataset_batch_sampler: round_robin
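
MultipleNegativesRankingLoss trains with in-batch negatives: within each batch of 24 pairs, every other sentence_1 serves as a negative for a given sentence_0 anchor. Below is a minimal sketch of how a comparable run could be launched with the SentenceTransformerTrainer API; the dataset is again a hypothetical stand-in:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")

# Hypothetical stand-in for the unnamed (sentence_0, sentence_1) pair dataset
train_dataset = Dataset.from_dict({
    "sentence_0": ["anchor text ..."],
    "sentence_1": ["related positive text ..."],
})

loss = MultipleNegativesRankingLoss(model, scale=20.0)

args = SentenceTransformerTrainingArguments(
    output_dir="sbert-trained-on-whs",
    num_train_epochs=1,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()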

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 24
  • per_device_eval_batch_size: 24
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step Training Loss
0.0999 468 -
0.1067 500 0.432
0.1997 936 -
0.2134 1000 0.2153
0.2996 1404 -
0.3201 1500 0.1997
0.3995 1872 -
0.4268 2000 0.1635
0.4994 2340 -
0.5335 2500 0.1573
0.5992 2808 -
0.6402 3000 0.1518
0.6991 3276 -
0.7469 3500 0.1359
0.7990 3744 -
0.8536 4000 0.1351
0.8988 4212 -
0.9603 4500 0.1187
0.9987 4680 -
1.0 4686 -

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.2.0
  • Transformers: 4.44.2
  • PyTorch: 2.4.1+cu121
  • Accelerate: 0.34.2
  • Datasets: 3.0.1
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}