Training in progress, step 82, checkpoint
base_model: bobox/DeBERTa-small-ST-v1-test-step3
datasets: []
language: []
library_name: sentence-transformers
metrics:
  - pearson_cosine
  - spearman_cosine
  - pearson_manhattan
  - spearman_manhattan
  - pearson_euclidean
  - spearman_euclidean
  - pearson_dot
  - spearman_dot
  - pearson_max
  - spearman_max
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:260034
  - loss:CachedGISTEmbedLoss
widget:
  - source_sentence: who used to present one man and his dog
    sentences:
      - >-
        One Man and His Dog One Man and His Dog is a BBC television series in
        the United Kingdom featuring sheepdog trials, originally presented by
        Phil Drabble, with commentary by Eric Halsall and, later, by Ray
        Ollerenshaw. It was first aired on 17 February 1976 and continues today
        (since 2013) as a special annual edition of Countryfile. In 1994, Robin
        Page replaced Drabble as the main presenter. Gus Dermody took over as
        commentator until 2012.
      - "animal adjectives [was: ratto, Ratte, raton] - Google Groups animal adjectives [was: ratto, Ratte, raton] Showing 1-9 of 9 messages While trying find the pronunciation of the word \"munger\", I encountered the nearby word \_ \_murine [MYOO-ryn] = relating to mice or rats \_ \_[from Latin _murinus_, which derives from _mus_, \_ \_mouse, whose genetive form is _muris_] So if you need an adjective to refer to lab rodents like _ratto_ or _mausu_, \"murine\" it is. (I would never have discovered this except in an alphabetically arranged dictionary.) There are a lot of animal adjectives of this type, such as ovine (sheep), equine (horse), bovine (bull, cow, calf), aquiline (eagle), murine (rats and mice). \_ But what is needed is a way to lookup an animal and find what the proper adjective is. \_For example, is there an adjective form for \"goat\"? for \"seal\"? for \"elephant\"? for \"whale\"? for \"walrus\"? By the way, I never did find out how \"munger\" is pronounced; the answer is not found in"
      - >-
        A boat is docked and filled with bicycles next to a grassy area on a
        body of water.
  - source_sentence: There were 29 Muslims fatalities in the Cave of the Patriarchs massacre .
    sentences:
      - >-
        Urban Dictionary: Dog and Bone Dog and Bone Cockney rhyming slang for
        phone - the telephone. ''Pick up the dog and bone now'' by Brendan April
        05, 2003 Create a mug The Urban Dictionary Mug One side has the word,
        one side has the definition. Microwave and dishwasher safe. Lotsa space
        for your liquids. Buy the t-shirt The Urban Dictionary T-Shirt Smooth,
        soft, slim fit American Apparel shirt. Custom printed. 100% fine jersey
        cotton, except for heather grey (90% cotton). ^Same as above except can
        be shortened further to 'Dogs' or just 'dog' Get on the dogs and give us
        a bell when your ready. by Phaze October 14, 2004
      - >-
        RAF College Cranwell - Local Area Information RAF College Cranwell Local
        Area Information Local Area Information RAF College Cranwell is situated
        in the North Kesteven District Council area in the heart of rural
        Lincolnshire, 5 miles from Sleaford and 14 miles from the City of
        Lincoln, surrounded by bustling market towns, picturesque villages and
        landscapes steeped in aviation history. Lincolnshire is currently home
        to several operational RAF airfields and was a key location during WWII
        for bomber stations. Museums, memorials, former airfields, heritage and
        visitor centres bear witness to the bravery of the men and women of this
        time. The ancient City of Lincoln dates back at least to Roman times and
        boasts a spectacular Cathedral and Castle area, whilst Sleaford is the
        home to the National Centre for Craft & Design. Please click on the Logo
        to access website
      - >-
        29 Muslims were killed and more than 100 others wounded . [   Settlers
        remember gunman Goldstein ; Hebron riots continue ] .
  - source_sentence: What requires energy for growth?
    sentences:
      - >-
        an organism requires energy for growth. Fish Fish are the ultimate
        aquatic organism. 
         a fish require energy for growth
      - >-
        In August , after the end of the war in June 1902 , Higgins Southampton
        left the `` SSBavarian '' and returned to Cape Town the following month
        .
      - >-
        Rhinestone Cowboy "Rhinestone Cowboy" is a song written by Larry Weiss
        and most famously recorded by American country music singer Glen
        Campbell. The song enjoyed huge popularity with both country and pop
        audiences when it was released in 1975.
  - source_sentence: Burning wood is used to produce what type of energy?
    sentences:
      - >-
        Shawnee Trails Council was formed from the merger of the Four Rivers
        Council and the Audubon Council .
      - A Mercedes parked next to a parking meter on a street.
      - |-
        burning wood is used to produce heat. Heat is kinetic energy. 
         burning wood is used to produce kinetic energy.
  - source_sentence: >-
      As of March , more than 413,000 cases have been confirmed in more than 190
      countries with more than 107,000 recoveries .
    sentences:
      - >-
        As of 24 March , more than 414,000 cases of COVID-19 have been reported
        in more than 190 countries and territories , resulting in more than
        18,500 deaths and more than 108,000 recoveries .
      - >-
        Pope Francis makes first visit as head of state to Italy's president -
        YouTube Pope Francis makes first visit as head of state to Italy's
        president Want to watch this again later? Sign in to add this video to a
        playlist. Need to report the video? Sign in to report inappropriate
        content. The interactive transcript could not be loaded. Loading...
        Rating is available when the video has been rented. This feature is not
        available right now. Please try again later. Published on Nov 14, 2013
        Pope Francis stepped out of the Vatican, several hundred feet into the
        heart of Rome, to meet with Italian President Giorgio Napolitano, and
        the country's Council of Ministers. . --------------------- Subscribe
        to the channel: http://smarturl.it/RomeReports Visit our website:
        http://www.romereports.com/ ROME REPORTS, www.romereports.com, is an
        independent international TV News Agency based in Rome covering the
        activity of the Pope, the life of the Vatican and current social,
        cultural and religious debates. Reporting on the Catholic Church
        requires proximity to the source, in-depth knowledge of the Institution,
        and a high standard of creativity and technical excellence. As few
        broadcasters have a permanent correspondent in Rome, ROME REPORTS is
        geared to inform the public and meet the needs of television
        broadcasting companies around the world through daily news packages,
        weekly newsprograms and documentaries. ---------------------
      - >-
        German shepherds and retrievers are commonly used, but the Belgian
        Malinois has proven to be one of the most outstanding working dogs used
        in military service. Around 85 percent of military working dogs are
        purchased in Germany or the Netherlands, where they have been breeding
        dogs for military purposes for hundreds of years. In addition, the Air
        Force Security Forces Center, Army Veterinary Corps and the 341st
        Training Squadron combine efforts to raise their own dogs; nearly 15
        percent of all military working dogs are now bred here.
model-index:
  - name: SentenceTransformer based on bobox/DeBERTa-small-ST-v1-test-step3
    results:
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: sts test
          type: sts-test
        metrics:
          - type: pearson_cosine
            value: 0.882074513745531
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.9067824582257844
            name: Spearman Cosine
          - type: pearson_manhattan
            value: 0.9093974331692458
            name: Pearson Manhattan
          - type: spearman_manhattan
            value: 0.9064308923162935
            name: Spearman Manhattan
          - type: pearson_euclidean
            value: 0.9081794284297221
            name: Pearson Euclidean
          - type: spearman_euclidean
            value: 0.9051085820447002
            name: Spearman Euclidean
          - type: pearson_dot
            value: 0.8709046425878566
            name: Pearson Dot
          - type: spearman_dot
            value: 0.8757477717096785
            name: Spearman Dot
          - type: pearson_max
            value: 0.9093974331692458
            name: Pearson Max
          - type: spearman_max
            value: 0.9067824582257844
            name: Spearman Max

SentenceTransformer based on bobox/DeBERTa-small-ST-v1-test-step3

This is a sentence-transformers model finetuned from bobox/DeBERTa-small-ST-v1-test-step3 on the bobox/enhanced_nli-50_k dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: bobox/DeBERTa-small-ST-v1-test-step3
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • bobox/enhanced_nli-50_k

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DebertaV2Model 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
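
As the Pooling configuration above shows (`pooling_mode_mean_tokens: True`), sentence embeddings are produced by mean-pooling the transformer's token embeddings over non-padding positions. A minimal NumPy sketch of that pooling step, using toy random arrays in place of the actual DeBERTa outputs:

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    # Average token embeddings over real (non-padding) positions only
    mask = attention_mask[:, :, None].astype(float)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / counts

# Toy batch: 2 sequences of 4 token slots, 768-dim token embeddings
emb = np.random.randn(2, 4, 768)
mask = np.array([[1, 1, 1, 0],   # 3 real tokens, 1 padding slot
                 [1, 1, 0, 0]])  # 2 real tokens, 2 padding slots
pooled = mean_pool(emb, mask)
print(pooled.shape)  # (2, 768): one 768-dim sentence vector per input
```

Padding positions contribute nothing to the average, which is why two sequences of different true lengths still yield comparable fixed-size vectors.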

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("bobox/DeBERTa-small-ST-v1-test-UnifiedDatasets-Ft2-checkpoints-tmp")
# Run inference
sentences = [
    'As of March , more than 413,000 cases have been confirmed in more than 190 countries with more than 107,000 recoveries .',
    'As of 24 March , more than 414,000 cases of COVID-19 have been reported in more than 190 countries and territories , resulting in more than 18,500 deaths and more than 108,000 recoveries .',
    'German shepherds and retrievers are commonly used, but the Belgian Malinois has proven to be one of the most outstanding working dogs used in military service. Around 85 percent of military working dogs are purchased in Germany or the Netherlands, where they have been breeding dogs for military purposes for hundreds of years. In addition, the Air Force Security Forces Center, Army Veterinary Corps and the 341st Training Squadron combine efforts to raise their own dogs; nearly 15 percent of all military working dogs are now bred here.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
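
The `model.similarity` call above defaults to cosine similarity (the Similarity Function listed in the model details). If you only have the raw embeddings, the same matrix can be computed by hand; a small NumPy sketch, with random vectors standing in for real `model.encode(...)` output:

```python
import numpy as np

def cosine_similarity_matrix(embeddings):
    # Normalize each row to unit length; the dot product of unit vectors
    # is exactly their cosine similarity
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return normed @ normed.T

emb = np.random.randn(3, 768)  # stand-in for three encoded sentences
sims = cosine_similarity_matrix(emb)
print(sims.shape)  # (3, 3); diagonal entries are 1.0 (each vector vs itself)
```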

Evaluation

Metrics

Semantic Similarity

Metric Value
pearson_cosine 0.8821
spearman_cosine 0.9068
pearson_manhattan 0.9094
spearman_manhattan 0.9064
pearson_euclidean 0.9082
spearman_euclidean 0.9051
pearson_dot 0.8709
spearman_dot 0.8757
pearson_max 0.9094
spearman_max 0.9068
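
The Pearson metrics above measure linear correlation between predicted similarity scores and gold labels, while the Spearman metrics measure rank correlation; Spearman is simply Pearson applied to the ranks. A toy sketch with hypothetical scores, ignoring tie handling:

```python
import numpy as np

def pearson(x, y):
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def spearman(x, y):
    # Spearman = Pearson computed on the ranks (no tie correction here)
    rank = lambda a: np.argsort(np.argsort(a)).astype(float)
    return pearson(rank(x), rank(y))

model_scores = np.array([0.9, 0.2, 0.6, 0.4])  # hypothetical cosine scores
gold_scores = np.array([5.0, 1.0, 4.0, 2.0])   # hypothetical human labels
# The two orderings agree exactly, so Spearman is 1.0 even though the
# linear (Pearson) correlation is slightly below 1.
print(pearson(model_scores, gold_scores), spearman(model_scores, gold_scores))
```

This is why the table can show a higher Spearman than Pearson: STS evaluation cares about ranking sentence pairs correctly, not about the scores being linearly calibrated.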

Training Details

Training Dataset

bobox/enhanced_nli-50_k

  • Dataset: bobox/enhanced_nli-50_k
  • Size: 260,034 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1: string; min: 4 tokens, mean: 39.12 tokens, max: 344 tokens
    • sentence2: string; min: 2 tokens, mean: 60.17 tokens, max: 442 tokens
  • Samples:
    • sentence1: Temple Meads Railway Station is in which English city?
      sentence2: Bristol Temple Meads station roof to be replaced - BBC News BBC News Bristol Temple Meads station roof to be replaced 17 October 2013 Image caption Bristol Temple Meads was designed by Isambard Kingdom Brunel Image caption It will cost Network Rail £15m to replace the station's roof Image caption A pact has been signed to redevelop the station over the next 25 years The entire roof on Bristol Temple Meads railway station is to be replaced. Network Rail says it has secured £15m to carry out maintenance of the roof and install new lighting and cables. The announcement was made as a pact was signed to "significantly transform" the station over the next 25 years. Network Rail, Bristol City Council, the West of England Local Enterprise Partnership, Homes and Communities Agency and English Heritage are supporting the plan. Each has signed the 25-year memorandum of understanding to redevelop the station. Patrick Hallgate, of Network Rail Western, said: "Our plans for Bristol will see the railway significantly transformed by the end of the decade, with more seats, better connections and more frequent services." The railway station was designed by Isambard Kingdom Brunel and opened in 1840.
    • sentence1: Where do most of the digestion reactions occur?
      sentence2: Most of the digestion reactions occur in the small intestine.
    • sentence1: Sacko, 22, joined Sporting from French top-flight side Bordeaux in 2014, but has so far been limited to playing for the Portuguese club's B team. The former France Under-20 player joined Ligue 2 side Sochaux on loan in February and scored twice in 14 games. He is Leeds' third signing of the transfer window, following the arrivals of Marcus Antonsson and Kyle Bartley. Find all the latest football transfers on our dedicated page.
      sentence2: Leeds have signed Sporting Lisbon forward Hadi Sacko on a season-long loan with a view to a permanent deal.
  • Loss: CachedGISTEmbedLoss with these parameters:
    {'guide': SentenceTransformer(
      (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
      (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
      (2): Normalize()
    ), 'temperature': 0.025}
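
At its core, CachedGISTEmbedLoss is a temperature-scaled in-batch contrastive objective: the guide model shown above is used to filter out false negatives, and caching keeps memory flat at large batch sizes. A simplified NumPy sketch of the underlying contrastive (InfoNCE-style) term only, with the guide-based filtering and the caching deliberately omitted:

```python
import numpy as np

def in_batch_contrastive_loss(anchors, positives, temperature=0.025):
    # Cosine-similarity logits between every anchor and every positive,
    # sharpened by the temperature; each anchor's target is its own
    # positive (the diagonal), all other in-batch positives act as negatives.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.diag(log_probs).mean())

rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 768))
aligned = in_batch_contrastive_loss(anchors, anchors)             # perfect pairs
shuffled = in_batch_contrastive_loss(anchors, rng.normal(size=(8, 768)))
print(aligned < shuffled)  # True: matched pairs give a much lower loss
```

The small temperature (0.025) sharpens the softmax so that even modest cosine gaps between the true pair and the in-batch negatives translate into a strong training signal.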
    

Evaluation Dataset

bobox/enhanced_nli-50_k

  • Dataset: bobox/enhanced_nli-50_k
  • Size: 1,506 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1: string; min: 3 tokens, mean: 31.16 tokens, max: 340 tokens
    • sentence2: string; min: 2 tokens, mean: 62.3 tokens, max: 455 tokens
  • Samples:
    • sentence1: Interestingly, snakes use their forked tongues to smell.
      sentence2: Snakes use their tongue to smell things.
    • sentence1: A voltaic cell generates an electric current through a reaction known as a(n) spontaneous redox.
      sentence2: A voltaic cell uses what type of reaction to generate an electric current
    • sentence1: As of March 22 , there were more than 321,000 cases with over 13,600 deaths and more than 96,000 recoveries reported worldwide .
      sentence2: As of 22 March , more than 321,000 cases of COVID-19 have been reported in over 180 countries and territories , resulting in more than 13,600 deaths and 96,000 recoveries .
  • Loss: CachedGISTEmbedLoss with these parameters:
    {'guide': SentenceTransformer(
      (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
      (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
      (2): Normalize()
    ), 'temperature': 0.025}
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 320
  • per_device_eval_batch_size: 128
  • learning_rate: 2e-05
  • weight_decay: 0.0001
  • num_train_epochs: 1
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_kwargs: {'num_cycles': 3}
  • warmup_ratio: 0.25
  • save_safetensors: False
  • fp16: True
  • push_to_hub: True
  • hub_model_id: bobox/DeBERTa-small-ST-v1-test-UnifiedDatasets-Ft2-checkpoints-tmp
  • hub_strategy: all_checkpoints
  • batch_sampler: no_duplicates
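
With `lr_scheduler_type: cosine_with_restarts`, `warmup_ratio: 0.25`, and `num_cycles: 3`, the learning rate warms up linearly over the first 25% of steps and then runs three cosine decay cycles with hard restarts. A sketch of the schedule shape, mirroring the formula used by the `transformers` scheduler of that name (`total = 800` is an illustrative step count, not this run's actual length):

```python
import math

def lr_at(step, total_steps, base_lr=2e-5, warmup_ratio=0.25, num_cycles=3):
    # Linear warmup over the first warmup_ratio of steps, then num_cycles
    # cosine decay cycles with hard restarts back to base_lr
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * ((num_cycles * progress) % 1.0)))

total = 800  # illustrative total step count
print(lr_at(0, total), lr_at(200, total), lr_at(800, total))
# 0.0 at the start, the full 2e-05 when warmup ends, 0.0 at the final step
```

The long warmup (a quarter of training) is notable: at checkpoint step 82 of this run, the model is still well inside the warmup phase.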

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 320
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0001
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_kwargs: {'num_cycles': 3}
  • warmup_ratio: 0.25
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: False
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: bobox/DeBERTa-small-ST-v1-test-UnifiedDatasets-Ft2-checkpoints-tmp
  • hub_strategy: all_checkpoints
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss sts-test_spearman_cosine
0.0012 1 0.3208 - -
0.0025 2 0.1703 - -
0.0037 3 0.3362 - -
0.0049 4 0.3346 - -
0.0062 5 0.2484 - -
0.0074 6 0.2249 - -
0.0086 7 0.2724 - -
0.0098 8 0.251 - -
0.0111 9 0.2413 - -
0.0123 10 0.382 - -
0.0135 11 0.2695 - -
0.0148 12 0.2392 - -
0.0160 13 0.3603 - -
0.0172 14 0.3282 - -
0.0185 15 0.2878 - -
0.0197 16 0.3046 - -
0.0209 17 0.3946 - -
0.0221 18 0.2038 - -
0.0234 19 0.3542 - -
0.0246 20 0.2369 - -
0.0258 21 0.1967 0.1451 0.9081
0.0271 22 0.2368 - -
0.0283 23 0.263 - -
0.0295 24 0.3595 - -
0.0308 25 0.3073 - -
0.0320 26 0.2232 - -
0.0332 27 0.1822 - -
0.0344 28 0.251 - -
0.0357 29 0.2677 - -
0.0369 30 0.3252 - -
0.0381 31 0.2058 - -
0.0394 32 0.3083 - -
0.0406 33 0.2109 - -
0.0418 34 0.2751 - -
0.0431 35 0.2269 - -
0.0443 36 0.2333 - -
0.0455 37 0.2747 - -
0.0467 38 0.1285 - -
0.0480 39 0.3659 - -
0.0492 40 0.3991 - -
0.0504 41 0.2647 - -
0.0517 42 0.3627 0.1373 0.9084
0.0529 43 0.2026 - -
0.0541 44 0.1923 - -
0.0554 45 0.2369 - -
0.0566 46 0.2268 - -
0.0578 47 0.2975 - -
0.0590 48 0.1922 - -
0.0603 49 0.1906 - -
0.0615 50 0.2379 - -
0.0627 51 0.3796 - -
0.0640 52 0.1821 - -
0.0652 53 0.1257 - -
0.0664 54 0.2368 - -
0.0677 55 0.294 - -
0.0689 56 0.2594 - -
0.0701 57 0.2972 - -
0.0713 58 0.2297 - -
0.0726 59 0.1487 - -
0.0738 60 0.182 - -
0.0750 61 0.2516 - -
0.0763 62 0.2809 - -
0.0775 63 0.1371 0.1308 0.9068
0.0787 64 0.2149 - -
0.0800 65 0.1806 - -
0.0812 66 0.1458 - -
0.0824 67 0.249 - -
0.0836 68 0.2787 - -
0.0849 69 0.288 - -
0.0861 70 0.1461 - -
0.0873 71 0.2304 - -
0.0886 72 0.3505 - -
0.0898 73 0.2227 - -
0.0910 74 0.1746 - -
0.0923 75 0.1484 - -
0.0935 76 0.1346 - -
0.0947 77 0.2112 - -
0.0959 78 0.3138 - -
0.0972 79 0.2675 - -
0.0984 80 0.2849 - -
0.0996 81 0.1719 - -
0.1009 82 0.2749 - -

Framework Versions

  • Python: 3.10.14
  • Sentence Transformers: 3.0.1
  • Transformers: 4.44.0
  • PyTorch: 2.4.0
  • Accelerate: 0.33.0
  • Datasets: 2.21.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}