Training in progress, step 82, checkpoint
base_model: bobox/DeBERTa-small-ST-v1-test-step3
datasets: []
language: []
library_name: sentence-transformers
metrics:
  - pearson_cosine
  - spearman_cosine
  - pearson_manhattan
  - spearman_manhattan
  - pearson_euclidean
  - spearman_euclidean
  - pearson_dot
  - spearman_dot
  - pearson_max
  - spearman_max
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:260034
  - loss:CachedGISTEmbedLoss
widget:
  - source_sentence: who used to present one man and his dog
    sentences:
      - >-
        One Man and His Dog One Man and His Dog is a BBC television series in
        the United Kingdom featuring sheepdog trials, originally presented by
        Phil Drabble, with commentary by Eric Halsall and, later, by Ray
        Ollerenshaw. It was first aired on 17 February 1976 and continues today
        (since 2013) as a special annual edition of Countryfile. In 1994, Robin
        Page replaced Drabble as the main presenter. Gus Dermody took over as
        commentator until 2012.
      - "animal adjectives [was: ratto, Ratte, raton] - Google Groups animal adjectives [was: ratto, Ratte, raton] Showing 1-9 of 9 messages While trying find the pronunciation of the word \"munger\", I encountered the nearby word \_ \_murine [MYOO-ryn] = relating to mice or rats \_ \_[from Latin _murinus_, which derives from _mus_, \_ \_mouse, whose genetive form is _muris_] So if you need an adjective to refer to lab rodents like _ratto_ or _mausu_, \"murine\" it is. (I would never have discovered this except in an alphabetically arranged dictionary.) There are a lot of animal adjectives of this type, such as ovine (sheep), equine (horse), bovine (bull, cow, calf), aquiline (eagle), murine (rats and mice). \_ But what is needed is a way to lookup an animal and find what the proper adjective is. \_For example, is there an adjective form for \"goat\"? for \"seal\"? for \"elephant\"? for \"whale\"? for \"walrus\"? By the way, I never did find out how \"munger\" is pronounced; the answer is not found in"
      - >-
        A boat is docked and filled with bicycles next to a grassy area on a
        body of water.
  - source_sentence: There were 29 Muslims fatalities in the Cave of the Patriarchs massacre .
    sentences:
      - >-
        Urban Dictionary: Dog and Bone Dog and Bone Cockney rhyming slang for
        phone - the telephone. ''Pick up the dog and bone now'' by Brendan April
        05, 2003 Create a mug The Urban Dictionary Mug One side has the word,
        one side has the definition. Microwave and dishwasher safe. Lotsa space
        for your liquids. Buy the t-shirt The Urban Dictionary T-Shirt Smooth,
        soft, slim fit American Apparel shirt. Custom printed. 100% fine jersey
        cotton, except for heather grey (90% cotton). ^Same as above except can
        be shortened further to 'Dogs' or just 'dog' Get on the dogs and give us
        a bell when your ready. by Phaze October 14, 2004
      - >-
        RAF College Cranwell - Local Area Information RAF College Cranwell Local
        Area Information Local Area Information RAF College Cranwell is situated
        in the North Kesteven District Council area in the heart of rural
        Lincolnshire, 5 miles from Sleaford and 14 miles from the City of
        Lincoln, surrounded by bustling market towns, picturesque villages and
        landscapes steeped in aviation history. Lincolnshire is currently home
        to several operational RAF airfields and was a key location during WWII
        for bomber stations. Museums, memorials, former airfields, heritage and
        visitor centres bear witness to the bravery of the men and women of this
        time. The ancient City of Lincoln dates back at least to Roman times and
        boasts a spectacular Cathedral and Castle area, whilst Sleaford is the
        home to the National Centre for Craft & Design. Please click on the Logo
        to access website
      - >-
        29 Muslims were killed and more than 100 others wounded . [   Settlers
        remember gunman Goldstein ; Hebron riots continue ] .
  - source_sentence: What requires energy for growth?
    sentences:
      - >-
        an organism requires energy for growth. Fish Fish are the ultimate
        aquatic organism. 
         a fish require energy for growth
      - >-
        In August , after the end of the war in June 1902 , Higgins Southampton
        left the `` SSBavarian '' and returned to Cape Town the following month
        .
      - >-
        Rhinestone Cowboy "Rhinestone Cowboy" is a song written by Larry Weiss
        and most famously recorded by American country music singer Glen
        Campbell. The song enjoyed huge popularity with both country and pop
        audiences when it was released in 1975.
  - source_sentence: Burning wood is used to produce what type of energy?
    sentences:
      - >-
        Shawnee Trails Council was formed from the merger of the Four Rivers
        Council and the Audubon Council .
      - A Mercedes parked next to a parking meter on a street.
      - |-
        burning wood is used to produce heat. Heat is kinetic energy. 
         burning wood is used to produce kinetic energy.
  - source_sentence: >-
      As of March , more than 413,000 cases have been confirmed in more than 190
      countries with more than 107,000 recoveries .
    sentences:
      - >-
        As of 24 March , more than 414,000 cases of COVID-19 have been reported
        in more than 190 countries and territories , resulting in more than
        18,500 deaths and more than 108,000 recoveries .
      - >-
        Pope Francis makes first visit as head of state to Italy's president -
        YouTube Pope Francis makes first visit as head of state to Italy's
        president Want to watch this again later? Sign in to add this video to a
        playlist. Need to report the video? Sign in to report inappropriate
        content. The interactive transcript could not be loaded. Loading...
        Rating is available when the video has been rented. This feature is not
        available right now. Please try again later. Published on Nov 14, 2013
        Pope Francis stepped out of the Vatican, several hundred feet into the
        heart of Rome, to meet with Italian President Giorgio Napolitano, and
        the country's Council of Ministers. . --------------------- Subscribe
        to the channel: http://smarturl.it/RomeReports Visit our website:
        http://www.romereports.com/ ROME REPORTS, www.romereports.com, is an
        independent international TV News Agency based in Rome covering the
        activity of the Pope, the life of the Vatican and current social,
        cultural and religious debates. Reporting on the Catholic Church
        requires proximity to the source, in-depth knowledge of the Institution,
        and a high standard of creativity and technical excellence. As few
        broadcasters have a permanent correspondent in Rome, ROME REPORTS is
        geared to inform the public and meet the needs of television
        broadcasting companies around the world through daily news packages,
        weekly newsprograms and documentaries. ---------------------
      - >-
        German shepherds and retrievers are commonly used, but the Belgian
        Malinois has proven to be one of the most outstanding working dogs used
        in military service. Around 85 percent of military working dogs are
        purchased in Germany or the Netherlands, where they have been breeding
        dogs for military purposes for hundreds of years. In addition, the Air
        Force Security Forces Center, Army Veterinary Corps and the 341st
        Training Squadron combine efforts to raise their own dogs; nearly 15
        percent of all military working dogs are now bred here.
model-index:
  - name: SentenceTransformer based on bobox/DeBERTa-small-ST-v1-test-step3
    results:
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: sts test
          type: sts-test
        metrics:
          - type: pearson_cosine
            value: 0.882074513745531
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.9067824582257844
            name: Spearman Cosine
          - type: pearson_manhattan
            value: 0.9093974331692458
            name: Pearson Manhattan
          - type: spearman_manhattan
            value: 0.9064308923162935
            name: Spearman Manhattan
          - type: pearson_euclidean
            value: 0.9081794284297221
            name: Pearson Euclidean
          - type: spearman_euclidean
            value: 0.9051085820447002
            name: Spearman Euclidean
          - type: pearson_dot
            value: 0.8709046425878566
            name: Pearson Dot
          - type: spearman_dot
            value: 0.8757477717096785
            name: Spearman Dot
          - type: pearson_max
            value: 0.9093974331692458
            name: Pearson Max
          - type: spearman_max
            value: 0.9067824582257844
            name: Spearman Max

SentenceTransformer based on bobox/DeBERTa-small-ST-v1-test-step3

This is a sentence-transformers model finetuned from bobox/DeBERTa-small-ST-v1-test-step3 on the bobox/enhanced_nli-50_k dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: bobox/DeBERTa-small-ST-v1-test-step3
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • bobox/enhanced_nli-50_k

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DebertaV2Model 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
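
As the Pooling configuration above shows (`pooling_mode_mean_tokens: True`), sentence embeddings are produced by mean-pooling the transformer's token embeddings over non-padding positions. A minimal NumPy sketch of that pooling step, using toy random arrays in place of the actual DeBERTa outputs:

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    # Average token embeddings over real (non-padding) positions only
    mask = attention_mask[:, :, None].astype(float)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / counts

# Toy batch: 2 sequences of 4 token slots, 768-dim token embeddings
emb = np.random.randn(2, 4, 768)
mask = np.array([[1, 1, 1, 0],   # 3 real tokens, 1 padding slot
                 [1, 1, 0, 0]])  # 2 real tokens, 2 padding slots
pooled = mean_pool(emb, mask)
print(pooled.shape)  # (2, 768): one 768-dim sentence vector per input
```

Padding positions contribute nothing to the average, which is why two sequences of different true lengths still yield comparable fixed-size vectors.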

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("bobox/DeBERTa-small-ST-v1-test-UnifiedDatasets-Ft2-checkpoints-tmp")
# Run inference
sentences = [
    'As of March , more than 413,000 cases have been confirmed in more than 190 countries with more than 107,000 recoveries .',
    'As of 24 March , more than 414,000 cases of COVID-19 have been reported in more than 190 countries and territories , resulting in more than 18,500 deaths and more than 108,000 recoveries .',
    'German shepherds and retrievers are commonly used, but the Belgian Malinois has proven to be one of the most outstanding working dogs used in military service. Around 85 percent of military working dogs are purchased in Germany or the Netherlands, where they have been breeding dogs for military purposes for hundreds of years. In addition, the Air Force Security Forces Center, Army Veterinary Corps and the 341st Training Squadron combine efforts to raise their own dogs; nearly 15 percent of all military working dogs are now bred here.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
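
The `model.similarity` call above defaults to cosine similarity (the Similarity Function listed in the model details). If you only have the raw embeddings, the same matrix can be computed by hand; a small NumPy sketch, with random vectors standing in for real `model.encode(...)` output:

```python
import numpy as np

def cosine_similarity_matrix(embeddings):
    # Normalize each row to unit length; the dot product of unit vectors
    # is exactly their cosine similarity
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return normed @ normed.T

emb = np.random.randn(3, 768)  # stand-in for three encoded sentences
sims = cosine_similarity_matrix(emb)
print(sims.shape)  # (3, 3); diagonal entries are 1.0 (each vector vs itself)
```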

Evaluation

Metrics

Semantic Similarity

Metric Value
pearson_cosine 0.8821
spearman_cosine 0.9068
pearson_manhattan 0.9094
spearman_manhattan 0.9064
pearson_euclidean 0.9082
spearman_euclidean 0.9051
pearson_dot 0.8709
spearman_dot 0.8757
pearson_max 0.9094
spearman_max 0.9068
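
The Pearson metrics above measure linear correlation between predicted similarity scores and gold labels, while the Spearman metrics measure rank correlation; Spearman is simply Pearson applied to the ranks. A toy sketch with hypothetical scores, ignoring tie handling:

```python
import numpy as np

def pearson(x, y):
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def spearman(x, y):
    # Spearman = Pearson computed on the ranks (no tie correction here)
    rank = lambda a: np.argsort(np.argsort(a)).astype(float)
    return pearson(rank(x), rank(y))

model_scores = np.array([0.9, 0.2, 0.6, 0.4])  # hypothetical cosine scores
gold_scores = np.array([5.0, 1.0, 4.0, 2.0])   # hypothetical human labels
# The two orderings agree exactly, so Spearman is 1.0 even though the
# linear (Pearson) correlation is slightly below 1.
print(pearson(model_scores, gold_scores), spearman(model_scores, gold_scores))
```

This is why the table can show a higher Spearman than Pearson: STS evaluation cares about ranking sentence pairs correctly, not about the scores being linearly calibrated.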

Training Details

Training Dataset

bobox/enhanced_nli-50_k

  • Dataset: bobox/enhanced_nli-50_k
  • Size: 260,034 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1: string; min: 4 tokens, mean: 39.12 tokens, max: 344 tokens
    • sentence2: string; min: 2 tokens, mean: 60.17 tokens, max: 442 tokens
  • Samples:
    • sentence1: Temple Meads Railway Station is in which English city?
      sentence2: Bristol Temple Meads station roof to be replaced - BBC News BBC News Bristol Temple Meads station roof to be replaced 17 October 2013 Image caption Bristol Temple Meads was designed by Isambard Kingdom Brunel Image caption It will cost Network Rail £15m to replace the station's roof Image caption A pact has been signed to redevelop the station over the next 25 years The entire roof on Bristol Temple Meads railway station is to be replaced. Network Rail says it has secured £15m to carry out maintenance of the roof and install new lighting and cables. The announcement was made as a pact was signed to "significantly transform" the station over the next 25 years. Network Rail, Bristol City Council, the West of England Local Enterprise Partnership, Homes and Communities Agency and English Heritage are supporting the plan. Each has signed the 25-year memorandum of understanding to redevelop the station. Patrick Hallgate, of Network Rail Western, said: "Our plans for Bristol will see the railway significantly transformed by the end of the decade, with more seats, better connections and more frequent services." The railway station was designed by Isambard Kingdom Brunel and opened in 1840.
    • sentence1: Where do most of the digestion reactions occur?
      sentence2: Most of the digestion reactions occur in the small intestine.
    • sentence1: Sacko, 22, joined Sporting from French top-flight side Bordeaux in 2014, but has so far been limited to playing for the Portuguese club's B team. The former France Under-20 player joined Ligue 2 side Sochaux on loan in February and scored twice in 14 games. He is Leeds' third signing of the transfer window, following the arrivals of Marcus Antonsson and Kyle Bartley. Find all the latest football transfers on our dedicated page.
      sentence2: Leeds have signed Sporting Lisbon forward Hadi Sacko on a season-long loan with a view to a permanent deal.
  • Loss: CachedGISTEmbedLoss with these parameters:
    {'guide': SentenceTransformer(
      (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
      (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
      (2): Normalize()
    ), 'temperature': 0.025}
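
At its core, CachedGISTEmbedLoss is a temperature-scaled in-batch contrastive objective: the guide model shown above is used to filter out false negatives, and caching keeps memory flat at large batch sizes. A simplified NumPy sketch of the underlying contrastive (InfoNCE-style) term only, with the guide-based filtering and the caching deliberately omitted:

```python
import numpy as np

def in_batch_contrastive_loss(anchors, positives, temperature=0.025):
    # Cosine-similarity logits between every anchor and every positive,
    # sharpened by the temperature; each anchor's target is its own
    # positive (the diagonal), all other in-batch positives act as negatives.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.diag(log_probs).mean())

rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 768))
aligned = in_batch_contrastive_loss(anchors, anchors)             # perfect pairs
shuffled = in_batch_contrastive_loss(anchors, rng.normal(size=(8, 768)))
print(aligned < shuffled)  # True: matched pairs give a much lower loss
```

The small temperature (0.025) sharpens the softmax so that even modest cosine gaps between the true pair and the in-batch negatives translate into a strong training signal.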
    

Evaluation Dataset

bobox/enhanced_nli-50_k

  • Dataset: bobox/enhanced_nli-50_k
  • Size: 1,506 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1: string; min: 3 tokens, mean: 31.16 tokens, max: 340 tokens
    • sentence2: string; min: 2 tokens, mean: 62.3 tokens, max: 455 tokens
  • Samples:
    • sentence1: Interestingly, snakes use their forked tongues to smell.
      sentence2: Snakes use their tongue to smell things.
    • sentence1: A voltaic cell generates an electric current through a reaction known as a(n) spontaneous redox.
      sentence2: A voltaic cell uses what type of reaction to generate an electric current
    • sentence1: As of March 22 , there were more than 321,000 cases with over 13,600 deaths and more than 96,000 recoveries reported worldwide .
      sentence2: As of 22 March , more than 321,000 cases of COVID-19 have been reported in over 180 countries and territories , resulting in more than 13,600 deaths and 96,000 recoveries .
  • Loss: CachedGISTEmbedLoss with these parameters:
    {'guide': SentenceTransformer(
      (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
      (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
      (2): Normalize()
    ), 'temperature': 0.025}
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 320
  • per_device_eval_batch_size: 128
  • learning_rate: 2e-05
  • weight_decay: 0.0001
  • num_train_epochs: 1
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_kwargs: {'num_cycles': 3}
  • warmup_ratio: 0.25
  • save_safetensors: False
  • fp16: True
  • push_to_hub: True
  • hub_model_id: bobox/DeBERTa-small-ST-v1-test-UnifiedDatasets-Ft2-checkpoints-tmp
  • hub_strategy: all_checkpoints
  • batch_sampler: no_duplicates
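
With `lr_scheduler_type: cosine_with_restarts`, `warmup_ratio: 0.25`, and `num_cycles: 3`, the learning rate warms up linearly over the first 25% of steps and then runs three cosine decay cycles with hard restarts. A sketch of the schedule shape, mirroring the formula used by the `transformers` scheduler of that name (`total = 800` is an illustrative step count, not this run's actual length):

```python
import math

def lr_at(step, total_steps, base_lr=2e-5, warmup_ratio=0.25, num_cycles=3):
    # Linear warmup over the first warmup_ratio of steps, then num_cycles
    # cosine decay cycles with hard restarts back to base_lr
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * ((num_cycles * progress) % 1.0)))

total = 800  # illustrative total step count
print(lr_at(0, total), lr_at(200, total), lr_at(800, total))
# 0.0 at the start, the full 2e-05 when warmup ends, 0.0 at the final step
```

The long warmup (a quarter of training) is notable: at checkpoint step 82 of this run, the model is still well inside the warmup phase.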

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 320
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0001
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_kwargs: {'num_cycles': 3}
  • warmup_ratio: 0.25
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: False
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: bobox/DeBERTa-small-ST-v1-test-UnifiedDatasets-Ft2-checkpoints-tmp
  • hub_strategy: all_checkpoints
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss sts-test_spearman_cosine
0.0012 1 0.3208 - -
0.0025 2 0.1703 - -
0.0037 3 0.3362 - -
0.0049 4 0.3346 - -
0.0062 5 0.2484 - -
0.0074 6 0.2249 - -
0.0086 7 0.2724 - -
0.0098 8 0.251 - -
0.0111 9 0.2413 - -
0.0123 10 0.382 - -
0.0135 11 0.2695 - -
0.0148 12 0.2392 - -
0.0160 13 0.3603 - -
0.0172 14 0.3282 - -
0.0185 15 0.2878 - -
0.0197 16 0.3046 - -
0.0209 17 0.3946 - -
0.0221 18 0.2038 - -
0.0234 19 0.3542 - -
0.0246 20 0.2369 - -
0.0258 21 0.1967 0.1451 0.9081
0.0271 22 0.2368 - -
0.0283 23 0.263 - -
0.0295 24 0.3595 - -
0.0308 25 0.3073 - -
0.0320 26 0.2232 - -
0.0332 27 0.1822 - -
0.0344 28 0.251 - -
0.0357 29 0.2677 - -
0.0369 30 0.3252 - -
0.0381 31 0.2058 - -
0.0394 32 0.3083 - -
0.0406 33 0.2109 - -
0.0418 34 0.2751 - -
0.0431 35 0.2269 - -
0.0443 36 0.2333 - -
0.0455 37 0.2747 - -
0.0467 38 0.1285 - -
0.0480 39 0.3659 - -
0.0492 40 0.3991 - -
0.0504 41 0.2647 - -
0.0517 42 0.3627 0.1373 0.9084
0.0529 43 0.2026 - -
0.0541 44 0.1923 - -
0.0554 45 0.2369 - -
0.0566 46 0.2268 - -
0.0578 47 0.2975 - -
0.0590 48 0.1922 - -
0.0603 49 0.1906 - -
0.0615 50 0.2379 - -
0.0627 51 0.3796 - -
0.0640 52 0.1821 - -
0.0652 53 0.1257 - -
0.0664 54 0.2368 - -
0.0677 55 0.294 - -
0.0689 56 0.2594 - -
0.0701 57 0.2972 - -
0.0713 58 0.2297 - -
0.0726 59 0.1487 - -
0.0738 60 0.182 - -
0.0750 61 0.2516 - -
0.0763 62 0.2809 - -
0.0775 63 0.1371 0.1308 0.9068
0.0787 64 0.2149 - -
0.0800 65 0.1806 - -
0.0812 66 0.1458 - -
0.0824 67 0.249 - -
0.0836 68 0.2787 - -
0.0849 69 0.288 - -
0.0861 70 0.1461 - -
0.0873 71 0.2304 - -
0.0886 72 0.3505 - -
0.0898 73 0.2227 - -
0.0910 74 0.1746 - -
0.0923 75 0.1484 - -
0.0935 76 0.1346 - -
0.0947 77 0.2112 - -
0.0959 78 0.3138 - -
0.0972 79 0.2675 - -
0.0984 80 0.2849 - -
0.0996 81 0.1719 - -
0.1009 82 0.2749 - -

Framework Versions

  • Python: 3.10.14
  • Sentence Transformers: 3.0.1
  • Transformers: 4.44.0
  • PyTorch: 2.4.0
  • Accelerate: 0.33.0
  • Datasets: 2.21.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}