SentenceTransformer based on sentence-transformers/all-distilroberta-v1

This is a sentence-transformers model finetuned from sentence-transformers/all-distilroberta-v1 on the pub_med_qa_instruction_fotmatted dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
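The Pooling and Normalize stages listed above can be illustrated with a small pure-Python sketch. This is not the library's implementation: the helper names `mean_pool` and `l2_normalize` and the 2-dimensional toy vectors are assumptions made for illustration (the real model works with 768-dimensional token vectors).

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask is 1, so padding is ignored."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]

def l2_normalize(vector):
    """Scale the vector to unit length (what a Normalize step is assumed to do)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)

# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [5.0, 2.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
```

Because the final vector has unit length, the dot product of two such embeddings equals their cosine similarity.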


Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("subhrajit-mohanty/distilroberta-PubMedQAinstruction-embeddings")
# Run inference
sentences = [
    'Do transport properties of pancreatic cancer describe gemcitabine delivery and response?',
    'The therapeutic resistance of pancreatic ductal adenocarcinoma (PDAC) is partly ascribed to ineffective delivery of chemotherapy to cancer cells. We hypothesized that physical properties at vascular, extracellular, and cellular scales influence delivery of and response to gemcitabine-based therapy. We developed a method to measure mass transport properties during routine contrast-enhanced CT scans of individual human PDAC tumors. Additionally, we evaluated gemcitabine infusion during PDAC resection in 12 patients, measuring gemcitabine incorporation into tumor DNA and correlating its uptake with human equilibrative nucleoside transporter (hENT1) levels, stromal reaction, and CT-derived mass transport properties. We also studied associations between CT-derived transport properties and clinical outcomes in patients who received preoperative gemcitabine-based chemoradiotherapy for resectable PDAC. Transport modeling of 176 CT scans illustrated striking differences in transport properties between normal pancreas and tumor, with a wide array of enhancement profiles. Reflecting the interpatient differences in contrast enhancement, resected tumors exhibited dramatic differences in gemcitabine DNA incorporation, despite similar intravascular pharmacokinetics. Gemcitabine incorporation into tumor DNA was inversely related to CT-derived transport parameters and PDAC stromal score, after accounting for hENT1 levels. Moreover, stromal score directly correlated with CT-derived parameters. Among 110 patients who received preoperative gemcitabine-based chemoradiotherapy, CT-derived parameters correlated with pathological response and survival.',
    'Atopy and systemic onset juvenile idiopathic arthritis (SoJIA) are two potential outcomes of a dysregulated immune system. Although rare, SoJIA causes 60% of the morbidity of JIA patients which exhibit a wide heterogeneity of prognosis and treatment. Co-morbidities can complicate the responses to therapy. To study the influence of co-existing atopy on the prognosis of SoJIA. Patients diagnosed with SoJIA between Jan 2006 and Sep 2010 were screened, enrolled in this prospective cohort study, and followed for 2 years. Management of SoJIA patients was assessed by ACR Pedi30/50/70 criteria, laboratory variables, and systemic feature score. At disease onset, 61 SoJIA patients (34 male and 27 female) were enrolled and were divided into SoJIA patients with atopy (n\u2009=\u200927) or those without atopy (n\u2009=\u200934). Atopic group at disease onset had significantly higher numbers of affected joints, ferritin levels and IgE serum levels than the non-atopic group. At 3 and 6 months, fewer SoJIA patients with atopy reached the ACR Pedi50 criteria (p\u2009<\u20090.02). During the 2 years of follow-up time, the number of infections and the number of flares were significantly higher in the SoJIA with atopy group (p\u2009<\u20090.01).',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
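For intuition about the similarity matrix in the snippet above, here is a minimal library-free sketch, assuming the similarity function is cosine similarity. The helper names and the 2-dimensional toy vectors are hypothetical and stand in for real 768-dimensional embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def similarity_matrix(embeddings):
    """Pairwise cosine similarities, shape [n, n] for n embeddings."""
    return [[cosine_similarity(a, b) for b in embeddings] for a in embeddings]

def rank_by_similarity(query, candidates):
    """Candidate indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)

# Toy stand-ins: one "question" vector followed by two "abstract" vectors.
toy = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
matrix = similarity_matrix(toy)
ranking = rank_by_similarity(toy[0], toy[1:])
```

Ranking candidates against a query this way is the core of the semantic search use case mentioned in the introduction.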




Metric           Value
cosine_accuracy  1.0
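As a rough guide to reading this metric, the sketch below assumes cosine_accuracy is the fraction of (anchor, positive, negative) triplets in which the anchor is closer, by cosine similarity, to the positive than to the negative. The function names and toy vectors are illustrative assumptions, not the evaluator's code.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def triplet_cosine_accuracy(triplets):
    """triplets: list of (anchor, positive, negative) embedding vectors."""
    correct = 0
    for anchor, positive, negative in triplets:
        if cosine(anchor, positive) > cosine(anchor, negative):
            correct += 1
    return correct / len(triplets) if triplets else 0.0

toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),  # positive is closer: counted as correct
    ([1.0, 0.0], [0.0, 1.0], [1.0, 0.1]),  # negative is closer: counted as wrong
]
accuracy = triplet_cosine_accuracy(toy_triplets)
```

Under this reading, a value of 1.0 means every evaluation triplet was ordered correctly.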

Training Details

Training Dataset


  • Dataset: pub_med_qa_instruction_fotmatted at 66d9473
  • Size: 8,000 training samples
  • Columns: instruction, context, and context_neg
  • Approximate statistics based on the first 1000 samples:
    column       type    min        mean           max
    instruction  string  12 tokens  26.21 tokens   56 tokens
    context      string  20 tokens  320.41 tokens  512 tokens
    context_neg  string  24 tokens  319.78 tokens  512 tokens
  • Samples:
    instruction context context_neg
    Do adipose-derived mesenchymal stem cells ameliorate STZ-induced pancreas damage in type 1 diabetes? To investigate the possibility of adipose-derived mesenchymal stem cells (ADSC) in the treatment of type 1 diabetes (T1D). ADSC were isolated from the adipotic tissue of abdomen in Sprague-Dawley rats (4-6 week-old,female) and expanded in vitro. Cells were then identified by testing their phenotypes through flow cytometry. Balb/c mice (8 week-old, male) were divided into 3 groups: T1D group, ADSC group and control group. Streptozocin (50 mg/kg·d) were injected intraperitoneally into mice of T1D group and ADSC group for 5 consecutive days to establish the T1D model. In ADSC group, ADSC were injected intravenously on day 3 of STZ injection. In control group, only PBS was injected. Fasting blood glucose (FGB) level was examined once a week. At the end of the 4th week, animals were killed. The pathological changes of islet were showed by histochemistry through hematoxylin-eosin staining (HE staining). β cell insulin expression was detected by quantum dots immunofluorescence histochemistry.... We conducted this pooled analysis to assess the prognostic value of pretreatment Quality of Life (QOL) assessments on overall survival (OS) in advanced non-small cell lung cancer (NSCLC). Four hundred twenty patients with advanced NSCLC (stages IIIB with pleural effusion and IV) from six North Central Cancer Treatment Group trials were included in this study. QOL assessments included the single-item Uniscale (355 patients), Lung Cancer Symptom Scale (217 patients), and Functional Assessment of Cancer Therapy-Lung (197 patients). QOL scores were transformed to a 0 to 100 scale with higher scores representing better status and categorized using the sample median or clinically deficient score (CDS, 50). 
Cox proportional hazards models stratified by study were used to evaluate the prognostic importance of QOL on OS alone and in the presence of other prognostic factors such as performance status, age, gender, body mass index, and laboratory parameters. Pretreatment QOL access...
    Is prevalence of asthma and allergic diseases in Croatian children increasing : survey study? To estimate the prevalence of asthma, allergic rhinitis, and atopic dermatitis among school children in the region of Primorsko-goranska County in Croatia, and compare the results with data from other countries. The study was conducted during the 2001-2002 school year, in complete adherence to the Phase One protocol of the International Study of Asthma and Allergies in Childhood (ISAAC). The target population comprised two age groups (6-7 and 13-14 years) in the region of Primorsko-Goranska County in Croatia. Data were collected using standardized ISAAC written questionnaire and asthma video questionnaire. There were 1,634 participating children in the 6-7 age group (response rate 80.3%) and 2,194 participating children in the 13-14 age group (response rate 89.8%). Estimated 12-month prevalence rates of symptoms were: wheezing 9.7% and 8.4%, allergic rhinitis symptoms 16.9% and 17.5%, allergic rhinoconjunctivitis symptoms 5.6% and 6.7%, and atopic dermatitis symptoms 5.4% and 3.4%, for... Leaks of the blood-brain barrier can be detected on postcontrast-enhanced T1-weighted MRIs. Although early disruptions of the blood-brain barrier appear to be an important risk factor for tissue plasminogen activator-related hemorrhages in rodents, little is known about their incidence and consequences in human stroke. This is a retrospective analysis of a prospectively collected stroke database over the past 6 years. In 52 patients, multimodal MRI (including diffusion-weighted, perfusion-weighted, and postcontrast-enhanced T1-weighted MRI to detect blood-brain barrier changes) had been performed immediately before systemic thrombolysis and in 48 patients within a median of 30 minutes (interquartile range: 30 to 60 minutes) after recombinant tissue plasminogen activator treatment. 
The incidence of symptomatic hemorrhage (SICH), defined as any parenchymal hemorrhage leading to deterioration in the patient's clinical condition, was related to several clinical and imaging variables, inclu...
    Does synovial fluid glycosaminoglycan concentration correlate with severity of chondropathy or predict progression of osteoarthritis in a canine cruciate deficiency model? Considerable interest exists today in biochemical or immunochemical tests for monitoring the progression of osteoarthritis (OA). It has been suggested that measurements made on synovial fluid (SF) will more accurately reflect the magnitude of cartilage destruction in an index joint than those performed on serum. However, we have shown that the synovitis that occurs in OA affects the rate of protein clearance from the joint. We tested the hypothesis that if adjusted for clearance rate, the SF concentration of cartilage proteoglycans (PG) estimates severity of chondropathy and predicts progression of cartilage damage more accurately than if clearance is not taken into account. Clearance of radioiodinated serum albumin (RISA), a surrogate for the clearance of PG, was measured in 19 adult dogs at baseline and again 16 weeks and 32 weeks after anterior cruciate ligament transection (ACLT). Severity of chondropathy was determined arthroscopically after 16 weeks of instability and at postmort... Previous research has demonstrated that chronic cigarette smoking and major depressive disorder (MDD) are each associated with cognitive decrements. Further, these conditions co-occur commonly, though mechanisms in the comorbid condition are poorly understood. There may be distinct, additive, or overlapping factors underlying comorbid cigarette smoking and MDD. The present study investigated the impact of smoking and MDD on executive function and emotion processing. Participants (N=198) were grouped by diagnostic category (MDD and healthy controls, HC) and smoking status (ever-smokers, ES and never-smokers, NS). Participants completed the Facial Emotion Perception Test (FEPT), a measure of emotional processing, and the parametric Go/No-go task (PGNG), a measure of executive function. 
FEPT performance was analyzed using ANCOVA with accuracy and reaction time as separate dependent variables. Repeated measures MANCOVA was conducted for PGNG with performance measure and task level as depen...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }

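To make the loss parameters concrete, here is a pure-Python sketch of a multiple-negatives ranking objective with scale 20.0 and cosine similarity. It assumes that each instruction is scored against every context and every context_neg in the batch and that cross-entropy is taken with the matching context as the target. The function names and toy vectors are hypothetical, and this is not the library's code.

```python
import math

def cos_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mnr_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy where anchor i should rank positive i first."""
    candidates = positives + negatives  # in-batch positives plus hard negatives
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]
good = mnr_loss(anchors, positives, negatives)       # positives aligned with anchors
bad = mnr_loss(anchors, positives[::-1], negatives)  # positives deliberately swapped
```

The loss is near zero when each anchor is most similar to its own positive and grows when another candidate scores higher.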
Evaluation Dataset


  • Dataset: pub_med_qa_instruction_fotmatted at 66d9473
  • Size: 1,000 evaluation samples
  • Columns: instruction, context, and context_neg
  • Approximate statistics based on the first 1000 samples:
    column       type    min        mean           max
    instruction  string  11 tokens  26.49 tokens   67 tokens
    context      string  70 tokens  317.54 tokens  512 tokens
    context_neg  string  45 tokens  317.42 tokens  512 tokens
  • Samples:
    instruction context context_neg
    Is serum 25-hydroxyvitamin D3 related to physical activity and ethnicity but not obesity in a multicultural workforce? Recent research suggests that body vitamin D levels are decreased in coronary heart disease and diabetes, but it is unclear which cardiovascular risk factors are related to vitamin D status. To examine the relation between vitamin D status and major cardiovascular risk factors. Serum 25-hydroxyvitamin D3, a marker of recent sun exposure and vitamin D status, was measured in 390 New Zealand residents (95 Pacific Islanders, 74 Maori and 221 others mostly of European descent), who were part of a larger cross-sectional survey of a workforce (n = 5677) aged 40-64 years. Serum 25-hydroxyvitamin D3 levels were significantly lower in Pacific Islanders (mean (SE) = 56 (3) nmol/L; p = 0.0001) and Maoris (68 (3) nmol/L; p = 0.036) compared with Europeans (75 (2) nmol/L) after adjusting for age, sex and time of year. Also adjusting for ethnic group, 25-hydroxyvitamin D3 was higher in people doing vigorous (aerobic) leisure physical activities (71 (2) nmol/L; p = 0.0066) and moderate (non-aerobic) ... Previous follow-up studies indicate that increased visual cortical, ventral cingulate and subcortical responses of depressed individuals to sad facial stimuli, but not happy stimuli could represent reversible markers of disease severity. We hypothesized that greater responses in these areas to sad stimuli, but not happy stimuli, would predict better subsequent clinical outcome. We also explored areas that would predict a poor outcome. Twelve melancholically depressed individuals in the early stages of antidepressant treatment in a secondary care setting participated in two experiments comparing responses to varying intensities of sad and happy facial stimuli, respectively, using event related functional MRI. They repeated the experiments after a mean delay of 12 weeks of treatment. There was a variation in response to treatment. 
Greater right visual cortex and right subgenual cingulate (R-BA25) responses to sad stimuli, but not happy stimuli, in the early stages of treatment were assoc...
    Does image subtraction facilitate assessment of volume and density change in ground-glass opacities in chest CT? To study the impact of image subtraction of registered images on the detection of change in pulmonary ground-glass nodules identified on chest CT. A cohort of 33 individuals (25 men, 8 women; age range 51-75 years) with 37 focal ground-glass opacities (GGO) were recruited from a lung cancer screening trial. For every participant, 1 to 3 follow-up scans were available (total number of pairs, 84). Pairs of scans of the same nodule were registered nonrigidly and then subtracted to enhance differences in size and density. Four observers rated size and density change of the GGO between pairs of scans by visual comparison alone and with additional availability of a subtraction image and indicated their confidence. An independent experienced chest radiologist served as an arbiter having all reader data, clinical data, and follow-up examinations available. Nodule pairs for which the arbiter could not establish definite progression, regression, or stability were excluded from further evaluation... Betaine serves as a methyl donor in a reaction converting homocysteine to methionine. It is commonly used for the treatment of hyperhomocysteinemia in humans, which indicates it may be associated with reduced risk of atherosclerosis. However, there have been few data regarding its vascular effect. To investigate the effect of betaine supplementation on atherosclerotic lesion in apolipoprotein (apo) E-deficient mice. Four groups of apoE-deficient mice were fed AIN-93G diets supplemented with 0, 1, 2, or 4 g betaine/100 g diet (no, 1, 2, and 4% betaine, respectively). Wild-type C57BL/6 J mice were fed AIN-93G diet (wild-type). Mice were sacrificed after 0, 7, or 14 weeks of the experimental diets. 
Atherosclerotic lesion area in the aortic sinus, levels of tumor necrosis factor (TNF)-alpha and monocyte chemoattractant protein (MCP)-1 in aorta and serum, serum lipids, and methylation status of TNF-alpha promoter in aorta were determined. Linear regression analysis showed that the higher do...
    Is overexpression of peptidyl-prolyl isomerase-like 1 associated with the growth of colon cancer cells? To discover novel therapeutic targets for colon cancers, we previously surveyed expression patterns among 23,000 genes in colon cancer tissues using a cDNA microarray. Among the genes that were up-regulated in the tumors, we selected for this study peptidyl-prolyl isomerase-like 1 (PPIL1) encoding PPIL1, a cyclophilin-related protein. Western blot analysis and immunohistochemical staining using PPIL1-specific antibody showed that PPIL1 protein was frequently overexpressed in colon cancer cells compared with noncancerous epithelial cells of the colon mucosa. Colony formation assay showed a growth-promoting effect of wild-type PPIL1 on NIH3T3 and HEK293 cells. Consistently, transfection of short-interfering RNA specific to PPIL1 into SNUC4 and SNUC5 cells effectively reduced expression of the gene and retarded growth of the colon cancer cells. We further identified two PPIL1-interacting proteins, SNW1/SKIP (SKI-binding protein) and stathmin. SNW1/SKIP is involved in the regulation of tra... Protocols call for the start of hormonal therapy with levothyroxine after the declaration of brain death. As the hormonal perturbations occur during the process of brain death, the role of the early initiation of levothyroxine therapy (LT) to salvage organs is not well defined. The aim of this study was to evaluate the impact of early LT (before the declaration of brain death) on the number of solid organs procured per donor. We performed an 8-year retrospective analysis of all trauma patients who progressed to brain death. Patients who consented for organ donation, received LT, and donated solid organs were included. Patients were dichotomized into two groups: early LT group, patients who received LT before the declaration of brain death, and late LT group, those who received LT after brain death. 
The two groups were compared for differences in demographics, clinical characteristics, need for vasopressor, and number of solid organ donation. A total of 100 solid organ donors were ident...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • batch_sampler: no_duplicates
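The learning-rate settings above (learning_rate 2e-05, warmup_ratio 0.1, and the linear scheduler listed under All Hyperparameters) can be sketched as a small function. The total of 500 steps is taken from the training logs (8,000 samples at batch size 16 for one epoch); the function name and the exact rounding of the warmup length are assumptions.

```python
import math

def linear_schedule_lr(step, total_steps=500, base_lr=2e-5, warmup_ratio=0.1):
    """Linear warmup to base_lr, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # 50 steps here
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at a few points during the single training epoch.
checkpoints = {step: linear_schedule_lr(step) for step in (0, 25, 50, 275, 500)}
```

Under these assumptions the rate peaks at step 50 and falls back to zero at step 500.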

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch  Step  Training Loss  Validation Loss  ai-pubmed-validation_cosine_accuracy
-1     -1    -              -                1.0
0.2    100   0.0041         0.0031           1.0
0.4    200   0.0023         0.0032           1.0
0.6    300   0.0027         0.0033           1.0
0.8    400   0.007          0.0030           1.0
1.0    500   0.0024         0.0028           1.0

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.48.3
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0



Sentence Transformers

    Nils Reimers and Iryna Gurevych. "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks". In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, November 2019.

MultipleNegativesRankingLoss

    Matthew Henderson, Rami Al-Rfou, Brian Strope, Yun-hsuan Sung, Laszlo Lukacs, Ruiqi Guo, Sanjiv Kumar, Balint Miklos, and Ray Kurzweil. "Efficient Natural Language Response Suggestion for Smart Reply".

Model size: 82.1M params
Model ID: subhrajit-mohanty/distilroberta-PubMedQAinstruction-embeddings