SentenceTransformer
This is a sentence-transformers model trained on the parquet dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Maximum Sequence Length: 1024 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset:
- parquet
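The similarity function listed above is cosine similarity: the dot product of two vectors divided by the product of their norms. A minimal plain-Python sketch follows; the function name `cosine_similarity` and the toy vectors are illustrative only and are not part of the model's API.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)

# Toy 3-dimensional vectors; real embeddings from this model have 384 dimensions.
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # 1.0 (same direction)
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 0.0 (orthogonal)
```

Because the score depends only on direction, scaling an embedding by a positive constant does not change its similarity to other embeddings.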
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
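The Pooling module above has only `pooling_mode_mean_tokens` enabled, so a sentence embedding is the average of the token embeddings, with padding positions excluded. The sketch below illustrates the idea on plain Python lists; the library itself operates on batched tensors, and the name `mean_pool` and the toy values are assumptions made for illustration.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors where attention_mask is 1; ignore padding."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [s / count for s in summed]

# Three token vectors of width 2; the last one is padding and is skipped.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```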
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("pankajrajdeo/Bioformer-16L-UMLS-Pubmed_PMC-ST-TCE-Epoch-1")
# Run inference
sentences = [
'[YEAR_RANGE] 2021-2025 [TEXT] Combined hyperglycemic crises in adult patients already exist in Latin America.',
'[YEAR_RANGE] 2021-2025 [TEXT] AbstractIntroduction. Diabetes mellitus is one of the most common diseases worldwide, with a high morbidity and mortality rate. Its prevalence has been increasing, as well as its acute complications, such as hyperglycemic crises. Hyperglycemic crises can present with combined features of diabetic ketoacidosis and hyperosmolar state. However, their implications are not fully understood.Objective. To describe the characteristics, outcomes, and complications of the diabetic population with hyperglycemic crises and to value the combined state in the Latin American population.Materials and methods. Retrospective observational study of all hyperglycemic crises treated in the intensive care unit of the Fundación Valle del Lili between January 1, 2015, and December 31, 2020. Descriptive analysis and prevalence ratio estimation for deaths were performed using the robust Poisson regression method.Results. There were 317 patients with confirmed hyperglycemic crises, 43 (13.56%) with diabetic ketoacidosis, 9 (2.83%) in hyperosmolar state, and 265 (83.59%) with combined diabetic ketoacidosis and hyperosmolar state. Infection was the most frequent triggering cause (52.52%). Fatalities due to ketoacidosis occurred in four patients (9.30%) and combined diabetic ketoacidosis/hyperosmolar state in 22 patients (8.30%); no patient had a hyperosmolar state. Mechanical ventilation was associated with death occurrence (adjusted PR = 1.15; 95 % CI 95 = 1.06 - 1.24).Conclusions. The combined state was the most prevalent presentation of the hyperglycemic crisis, with a mortality rate similar to diabetic ketoacidosis. Invasive mechanical ventilation was associated with a higher occurrence of death.',
'[YEAR_RANGE] 2021-2025 [TEXT] Carbon capture and utilization (CCU) covers an array of technologies for valorizing carbon dioxide (CO2). To date, most mature CCU technology conducted with capture agents operates against the CO2 gradient to desorb CO2 from capture agents, exhibiting high energy penalties and thermal degradation due to the requirement for thermal swings. This Perspective presents a concept of Bio-Integrated Carbon Capture and Utilization (BICCU), which utilizes methanogens for integrated release and conversion of CO2 captured with capture agents. BICCU hereby substitutes the energy-intensive desorption with microbial conversion of captured CO2 by the methanogenic CO2-reduction pathway, utilizing green hydrogen to generate non-fossil methane. Existing carbon capture and utilization technologies are hindered by significant energy penalties. Here, the authors discuss the Bio-Integrated Carbon Capture and Utilization (BICCU) technology, which mitigates the energy penalties while generating valuable C1 and C2 products.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
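The example inputs above, like the dataset samples in this card, carry a `[YEAR_RANGE] <start>-<end> [TEXT]` prefix. The card does not document how the range is derived. The sketch below assumes five-year buckets ending on multiples of five, which is inferred only from the two ranges that appear here (1896-1900 and 2021-2025); the helper name `format_input` is hypothetical.

```python
def format_input(text, year):
    """Prefix text with an assumed five-year [YEAR_RANGE] bucket."""
    start = ((year - 1) // 5) * 5 + 1   # e.g. 2023 -> 2021
    end = start + 4                     # e.g. 2021 -> 2025
    return f"[YEAR_RANGE] {start}-{end} [TEXT] {text}"

print(format_input("Hyperglycemic crises in adults", 2023))
# [YEAR_RANGE] 2021-2025 [TEXT] Hyperglycemic crises in adults
```

Strings built this way can be passed to `model.encode` in place of the hand-written sentences, provided the bucketing assumption matches how the training data was prepared.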
Training Details
Training Dataset
parquet
- Dataset: parquet
- Size: 6,150,902 training samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:
 | anchor | positive |
---|---|---|
type | string | string |
details | min: 12 tokens, mean: 39.88 tokens, max: 112 tokens | min: 32 tokens, mean: 318.63 tokens, max: 1024 tokens |
- Samples:
anchor | positive |
---|---|
[YEAR_RANGE] 1896-1900 [TEXT] ON THE PIGMENT OF THE NEGRO'S SKIN AND HAIR | [YEAR_RANGE] 1896-1900 [TEXT] The pigmentary granules of the negro's skin and hair can be freed in several ways from the cells in which they are lodged and collected in any desired amount. As thus obtained, these granules are found to be insoluble in dilute alkalies, dilute hydrochloric acid (hot or cold), alcohol, or other organic solvents when applied in the order named. If, after they have been subjected to the action of dilute hydrochloric acid, they are again treated with dilute alkalies, they are found to give up their pigment, and, on the continued application of heat, the granules dissolve entirely in the alkaline solution, leaving only an insignificant residue. The pigmentary granules are composed of a colourless ground substance or substratum, a pigment, and much inorganic matter. Their inorganic constituents, as thus far determined, are calcium, magnesium, iron, and silicic, phosphoric, and sulphuric acids; and these constituents possibly play an important part in the deposi... |
[YEAR_RANGE] 1896-1900 [TEXT] THE HISTOLOGIGAL LESIONS OF ACUTE GLANDERS IN MAN AND OF EXPERIMENTAL GLANDERS IN THE GUINEA-PIG | [YEAR_RANGE] 1896-1900 [TEXT] The glanders nodule in the class of cases studied by us is in no sense analogous to the miliary tubercle in its histogenesis, and our studies afford no support to Baumgarten's views. The primary effect of the bacillus of glanders on a tissue we found to be not a production of epithelioid cells, which undergo necrosis and invasion by leucocytes, as happens in the cases in which the bacillus of tuberculosis is concerned, but to be the production of primary necrosis of the tissue, followed by inflammatory exudation, often of a suppurative character. Degenerative changes rapidly ensue in the inflammatory products. These conclusions are in harmony with the observations of Tedeschi, above referred to. |
[YEAR_RANGE] 1896-1900 [TEXT] THE EFFECT OF ODOURS, IRRITANT VAPOURS, AND MENTAL WORK UPON THE BLOOD FLOW | [YEAR_RANGE] 1896-1900 [TEXT] The most important of this investigation has been the completion of various improvements in the construction and use of the plethysmograph, by means of which numerous errors attending the use of the instrument have been eliminated. The results of the work show that all olfactory sensations, so far as they produce any effect through the vasomotor system, tend to diminish the volume of the arm, and therefore presumably cause a congestion of the brain. Whenever the stimulation occassions an increase in the volume of the arm, as sometimes happens, it seems to be due to acceleration of the heart rate, which, of course, tends also to increase the supply of blood to the brain. The of odours varies in extent with different individuals, and with the same individual at different times. It was most marked in subjects sensitive to odours. Irritant vapours, such as formic acid, have a marked effect in the same direction—that is, they cause a strong diminution in the vo... |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
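MultipleNegativesRankingLoss treats each anchor's own positive as the correct match and every other positive in the batch as a negative, then applies cross-entropy to the scaled cosine similarities. The following is a plain-Python sketch of that idea using the `scale` of 20.0 listed above; it is an illustration, not the library's implementation, and the names `cos_sim` and `mnr_loss` and the toy vectors are assumptions.

```python
import math

def cos_sim(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        scores = [scale * cos_sim(anchor, p) for p in positives]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        top = max(scores)
        log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
        total += log_norm - scores[i]
    return total / len(anchors)

anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each positive points toward its own anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each positive points toward the other anchor
print(mnr_loss(anchors, aligned) < mnr_loss(anchors, swapped))  # True
```

Under this formulation a larger batch supplies more in-batch negatives per anchor, which is one reason the batch size matters for this loss.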
Evaluation Dataset
parquet
- Dataset: parquet
- Size: 6,150,902 evaluation samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:
 | anchor | positive |
---|---|---|
type | string | string |
details | min: 10 tokens, mean: 28.46 tokens, max: 61 tokens | min: 23 tokens, mean: 309.36 tokens, max: 1024 tokens |
- Samples:
anchor | positive |
---|---|
[YEAR_RANGE] 2021-2025 [TEXT] Construction of Metal/Zeolite Hybrid Nanoframe Reactors via | [YEAR_RANGE] 2021-2025 [TEXT] Metal/zeolite hybrid nanoframes featuring highly accessible compartmental environments, abundant heterogeneous interfaces, and diverse chemical compositions are expected to possess significant potential for heterogeneous catalysis, yet their general synthetic methodology has not yet been established. In this study, we developed a two-step in-situ-kinetics transformation approach to prepare metal/ZSM-5 hybrid nanoframes with exceptionally open nanostructures, tunable metal compositions, and abundant accessible active sites. Initially, the process involved the formation of single-crystalline ZSM-5 nanoframes through an anisotropic etching and recrystallization kinetic transformation process. Subsequently, through an in situ reaction of the Ni2+ ions and the silica species etched from ZSM-5 nanoframes, layered nickel silicate emerged on both the inner and outer surfaces of the zeolite nanoframes. Upon reduction under a hydrogen atmosphere, well-dispersed Ni n... |
[YEAR_RANGE] 2021-2025 [TEXT] Genome-wide sRNA and mRNA transcriptomic profiling insights into carbapenem-resistant | [YEAR_RANGE] 2021-2025 [TEXT] Introduction Acinetobacter baumannii (AB) is rising as a human pathogen of critical priority worldwide as it is the leading cause of opportunistic infections in healthcare settings and carbapenem-resistant AB is listed as a “super bacterium” or “priority pathogen for drug resistance” by the World Health Organization.MethodsClinical isolates of A. baumannii were collected and tested for antimicrobial susceptibility. Among them, carbapenem-resistant and carbapenem-sensitive A. baumannii were subjected to prokaryotic transcriptome sequencing. The change of sRNA and mRNA expression was analyzed by bioinformatics and validated by quantitative reverse transcription-PCR.ResultsA total of 687 clinical isolates were collected, of which 336 strains of A. baumannii were resistant to carbapenem. Five hundred and six differentially expressed genes and nineteen differentially expressed sRNA candidates were discovered through transcriptomic profile analysis between carba... |
[YEAR_RANGE] 2021-2025 [TEXT] Evaluation and modeling of diaphragm displacement using ultrasound imaging for wearable respiratory assistive robot | [YEAR_RANGE] 2021-2025 [TEXT] IntroductionAssessing the influence of respiratory assistive devices on the diaphragm mobility is essential for advancing patient care and improving treatment outcomes. Existing respiratory assistive robots have not yet effectively assessed their impact on diaphragm mobility. In this study, we introduce for the first time a non-invasive, real-time clinically feasible ultrasound method to evaluate the impact of soft wearable robots on diaphragm displacement.MethodsWe measured and compared diaphragm displacement and lung volume in eight participants during both spontaneous and robotic-assisted respiration. Building on these measurements, we proposed a human-robot coupled two-compartment respiratory mechanics model that elucidates the underlying mechanism by which our extracorporeal wearable robots augments respiration. Specifically, the soft robot applies external compression to the abdominal wall muscles, inducing their inward movement, which consequently p... |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 128
- learning_rate: 2e-05
- num_train_epochs: 1
- max_steps: 45651
- log_level: info
- fp16: True
- dataloader_num_workers: 16
- load_best_model_at_end: True
- resume_from_checkpoint: True
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: 45651
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: info
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 16
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: True
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
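The hyperparameters above combine `lr_scheduler_type: linear` with `warmup_steps: 0`, `learning_rate: 2e-05`, and `max_steps: 45651`, so the learning rate should fall in a straight line from 2e-05 to zero over the run. The sketch below reproduces that shape in plain Python; it is an illustration of a linear warmup/decay schedule, not the trainer's own code, and the function name `linear_lr` is hypothetical.

```python
def linear_lr(step, base_lr=2e-05, max_steps=45651, warmup_steps=0):
    """Learning rate after `step` optimizer updates under linear warmup/decay."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr (unused here because warmup_steps is 0).
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, max_steps - step)
    return base_lr * remaining / max(1, max_steps - warmup_steps)

# Learning rate at the start, roughly the midpoint, and the end of training.
for step in (0, 22826, 45651):
    print(step, linear_lr(step))
```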
Training Logs
Epoch | Step | Training Loss | Validation Loss |
---|---|---|---|
0.0000 | 1 | 2.0182 | - |
0.0219 | 1000 | 0.3181 | - |
0.0438 | 2000 | 0.1044 | - |
0.0657 | 3000 | 0.0819 | - |
0.0876 | 4000 | 0.0797 | - |
0.1095 | 5000 | 0.0696 | - |
0.1314 | 6000 | 0.0695 | - |
0.1533 | 7000 | 0.0667 | - |
0.1752 | 8000 | 0.0539 | - |
0.1971 | 9000 | 0.061 | - |
0.2190 | 10000 | 0.0605 | - |
0.2410 | 11000 | 0.0543 | - |
0.2629 | 12000 | 0.0574 | - |
0.2848 | 13000 | 0.0551 | - |
0.3067 | 14000 | 0.053 | - |
0.3286 | 15000 | 0.0478 | - |
0.3505 | 16000 | 0.0536 | - |
0.3724 | 17000 | 0.057 | - |
0.3943 | 18000 | 0.0652 | - |
0.4162 | 19000 | 0.0452 | - |
0.4381 | 20000 | 0.0604 | - |
0.4600 | 21000 | 0.054 | - |
0.4819 | 22000 | 0.0475 | - |
0.5038 | 23000 | 0.0511 | - |
0.5257 | 24000 | 0.0486 | - |
0.5476 | 25000 | 0.0485 | - |
0.5695 | 26000 | 0.0427 | - |
0.5914 | 27000 | 0.0406 | - |
0.6133 | 28000 | 0.0353 | - |
0.6352 | 29000 | 0.0413 | - |
0.6571 | 30000 | 0.0362 | - |
0.6791 | 31000 | 0.0382 | - |
0.7010 | 32000 | 0.0384 | - |
0.7229 | 33000 | 0.0409 | - |
0.7448 | 34000 | 0.0373 | - |
0.7667 | 35000 | 0.0368 | - |
0.7886 | 36000 | 0.0347 | - |
0.8105 | 37000 | 0.0392 | - |
0.8324 | 38000 | 0.0365 | - |
0.8543 | 39000 | 0.036 | - |
0.8762 | 40000 | 0.033 | - |
0.8981 | 41000 | 0.0364 | - |
0.9200 | 42000 | 0.0391 | - |
0.9419 | 43000 | 0.0354 | - |
0.9638 | 44000 | 0.0354 | - |
0.9857 | 45000 | 0.0362 | - |
1.0000 | 45651 | - | 0.0047 |
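The Epoch column in the log above is consistent with dividing the step number by the 45651 steps that make up the single epoch. The short sketch below checks that arithmetic and also estimates how many training pairs were consumed; the device count is not stated in this card, so `devices=1` is an assumption, and the function names are illustrative.

```python
MAX_STEPS = 45651   # max_steps, reached at epoch 1.0000 in the log
BATCH_SIZE = 128    # per_device_train_batch_size
GRAD_ACCUM = 1      # gradient_accumulation_steps

def epoch_at(step, steps_per_epoch=MAX_STEPS):
    """Fraction of the single epoch completed after `step` updates."""
    return round(step / steps_per_epoch, 4)

def samples_seen(step, devices=1):
    """Training pairs consumed after `step` updates (devices is an assumption)."""
    return step * BATCH_SIZE * GRAD_ACCUM * devices

print(epoch_at(1000))           # 0.0219, matching the log row for step 1000
print(samples_seen(MAX_STEPS))  # 5843328 when a single device is assumed
```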
Framework Versions
- Python: 3.11.11
- Sentence Transformers: 3.4.1
- Transformers: 4.48.2
- PyTorch: 2.6.0+cu124
- Accelerate: 1.3.0
- Datasets: 3.2.0
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}