
SentenceTransformer based on bobox/DeBERTa-ST-AllLayers-v3-checkpoints-tmp

This is a sentence-transformers model fine-tuned from bobox/DeBERTa-ST-AllLayers-v3-checkpoints-tmp on the following datasets: nli-pairs, sts-label, vitaminc-pairs, qnli-contrastive, scitail-pairs-qa, scitail-pairs-pos, xsum-pairs, compression-pairs, compression-pairs2, compression-pairs3, sciq_pairs, qasc_pairs, openbookqa_pairs, msmarco_pairs, msmarco_pairs2, nq_pairs, nq_pairs2, trivia_pairs, quora_pairs, gooaq_pairs, gooaq_pairs2 and mrpc_pairs. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DebertaV2Model 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
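The Pooling module above has only `pooling_mode_mean_tokens` enabled, so each sentence embedding is the average of its token vectors. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name `mean_pool` and the toy 2-dimensional vectors are hypothetical (the real model uses 768 dimensions).

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose attention-mask entry is 1."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [value / count for value in summed]


# Toy example: three tokens of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

Masking matters because padded positions would otherwise pull the average toward whatever the padding token happens to encode.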

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("bobox/DeBERTa-ST-AllLayers-v3.1")
# Run inference
sentences = [
    'Hydrogen is in gas form at room temperature.',
    'In what form of matter is hydrogen at room temperature?',
    'How hot is it on the surface of the Sun?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
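For intuition about what the 3 x 3 similarity matrix holds, here is a small sketch that computes pairwise cosine similarity by hand. It assumes the model is configured for cosine similarity, which this card does not state explicitly; the helper names and the toy vectors are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(embeddings: List[Sequence[float]]) -> List[List[float]]:
    """Pairwise cosine similarity between every pair of rows."""
    return [[cosine_similarity(a, b) for b in embeddings] for a in embeddings]


# Toy 2-dimensional "embeddings": the first two point the same way.
toy = [[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]]
scores = similarity_matrix(toy)
print(len(scores), len(scores[0]))  # 3 3
```

Entry `[i][j]` compares sentence `i` with sentence `j`, so the diagonal compares each sentence with itself.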

Evaluation

Metrics

Semantic Similarity

Metric Value
pearson_cosine 0.7886
spearman_cosine 0.8083
pearson_manhattan 0.7125
spearman_manhattan 0.7063
pearson_euclidean 0.7191
spearman_euclidean 0.7111
pearson_dot 0.5099
spearman_dot 0.4911
pearson_max 0.7886
spearman_max 0.8083
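Each row above pairs a correlation type (Pearson or Spearman) with a similarity function; the `_max` rows match the best value across the four functions. As a rough sketch of how such a score could be computed from gold labels and predicted similarities, assuming nothing about the evaluation set (the card does not name it), the following uses hypothetical helper names and toy numbers:

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: gold similarity labels vs. predicted cosine similarities.
gold = [0.0, 0.2, 0.5, 1.0]
predicted = [0.1, 0.3, 0.35, 0.9]
print(spearman(gold, predicted))  # 1.0, because the ordering is identical
```

Spearman only looks at ordering, so it can be higher than Pearson when the predicted scores rank pairs correctly but are not linearly related to the labels.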

Training Details

Training Datasets

nli-pairs

  • Dataset: nli-pairs at d482672
  • Size: 18,000 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    column     type    min       mean          max
    sentence1  string  5 tokens  16.62 tokens  62 tokens
    sentence2  string  4 tokens  9.46 tokens   29 tokens
  • Samples:
    sentence1: A person on a horse jumps over a broken down airplane.
    sentence2: A person is outdoors, on a horse.

    sentence1: Children smiling and waving at camera
    sentence2: There are children present

    sentence1: A boy is jumping on skateboard in the middle of a red bridge.
    sentence2: The boy does a skateboarding trick.
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
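The wrapped loss named in these parameters, GISTEmbedLoss, uses a guide model to avoid penalizing in-batch "negatives" that are probably true matches. The sketch below is a simplified illustration of that selection step only, not the library's code: it assumes precomputed similarity matrices, ignores the extra anchor-anchor and positive-positive comparisons, and uses a hypothetical function name and a `temperature` default chosen for readability.

```python
import math
from typing import List


def guided_in_batch_loss(
    student_sim: List[List[float]],
    guide_sim: List[List[float]],
    temperature: float = 1.0,  # assumption: illustrative value only
) -> float:
    """Mean cross-entropy over in-batch candidates.

    student_sim[i][j] and guide_sim[i][j] are similarities between
    anchor i and candidate j; candidate i is the true positive.
    """
    n = len(student_sim)
    total = 0.0
    for i in range(n):
        threshold = guide_sim[i][i]
        logits = []
        for j in range(n):
            # Skip negatives the guide rates above the true pair:
            # they are likely false negatives.
            if j != i and guide_sim[i][j] > threshold:
                continue
            logits.append(student_sim[i][j] / temperature)
        peak = max(logits)
        log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_sum - student_sim[i][i] / temperature
    return total / n


student = [[0.9, 0.8], [0.1, 0.9]]
guide_flags_false_negative = [[0.9, 0.95], [0.1, 0.9]]
guide_keeps_all = [[0.9, 0.1], [0.1, 0.9]]
masked = guided_in_batch_loss(student, guide_flags_false_negative)
unmasked = guided_in_batch_loss(student, guide_keeps_all)
print(masked < unmasked)  # True
```

When the guide flags candidate 1 as a likely match for anchor 0, that candidate no longer contributes to the loss, so the student is not pushed away from it.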
    

sts-label

  • Dataset: sts-label at ab7a5ac
  • Size: 5,749 training samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
    column     type    min       mean         max
    sentence1  string  6 tokens  9.81 tokens  27 tokens
    sentence2  string  5 tokens  9.74 tokens  25 tokens
    score      float   0.0       0.54         1.0
  • Samples:
    sentence1: A plane is taking off.
    sentence2: An air plane is taking off.
    score: 1.0

    sentence1: A man is playing a large flute.
    sentence2: A man is playing a flute.
    score: 0.76

    sentence1: A man is spreading shreded cheese on a pizza.
    sentence2: A man is spreading shredded cheese on an uncooked pizza.
    score: 0.76
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
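To make the `scale` parameter concrete, here is a small sketch of the CoSENT objective as it is commonly written: for every two pairs where the first has the higher gold score, the loss grows when the model's cosine similarity orders them the other way round. This is an illustrative re-implementation under that assumption, with a hypothetical function name and toy numbers, not the training code used for this model.

```python
import math
from typing import Sequence


def cosent_loss(
    predicted_cos: Sequence[float],
    gold_scores: Sequence[float],
    scale: float = 20.0,
) -> float:
    """log(1 + sum of exp(scale * (cos_j - cos_i))) over all i, j with gold_i > gold_j."""
    terms = [0.0]  # exp(0) = 1 supplies the "1 +" inside the logarithm
    n = len(predicted_cos)
    for i in range(n):
        for j in range(n):
            if gold_scores[i] > gold_scores[j]:
                terms.append(scale * (predicted_cos[j] - predicted_cos[i]))
    # Numerically stable log-sum-exp.
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


gold = [1.0, 0.76, 0.2]
well_ordered = [0.95, 0.70, 0.10]
badly_ordered = [0.10, 0.70, 0.95]
print(cosent_loss(well_ordered, gold) < cosent_loss(badly_ordered, gold))  # True
```

A larger `scale` sharpens the penalty for misordered pairs; only the relative order of the gold scores matters, not their absolute values.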
    

vitaminc-pairs

  • Dataset: vitaminc-pairs at be6febb
  • Size: 18,000 training samples
  • Columns: claim and evidence
  • Approximate statistics based on the first 1000 samples:
    column    type    min       mean          max
    claim     string  6 tokens  16.71 tokens  54 tokens
    evidence  string  8 tokens  37.14 tokens  260 tokens
  • Samples:
    claim evidence
    Manchester had a population of more than 540,000 in 2017 and was the 5th most populous English district . Manchester ( ) is a major city and metropolitan borough in Greater Manchester , England , with a population of 545,500 as of 2017 ( 5th most populous English district ) .
    Manchester had a population of less than 540,000 in 2018 and was the 4th most populous English district . Manchester ( ) is a major city and metropolitan borough in Greater Manchester , England , with a population of 534,982 as of 2018 ( 4th most populous English district ) .
    Traditional Chinese medicine is founded on more than 4000 years of ancient Chinese medical science and practice . Traditional Chinese medicine ( TCM ; ) is an ancient system of medical diagnosis and treatment of illnesses with a holistic focus on disease prevention through diet , healthy lifestyle changes , exercise and is built on a patient centered clinically oriented foundation of more than 6,500 years of ancient Chinese medical science and practice that includes various forms of herbal medicine , acupuncture , massage ( tui na ) , exercise ( qigong ) , and dietary therapy , but recently also influenced by modern Western medicine .
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

qnli-contrastive

  • Dataset: qnli-contrastive at bcdcba7
  • Size: 17,000 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    column     type    min       mean          max
    sentence1  string  6 tokens  13.7 tokens   27 tokens
    sentence2  string  6 tokens  35.42 tokens  499 tokens
    label      int     0: 100.00%
  • Samples:
    sentence1 sentence2 label
    What kind of sound did Kanye abandon a rap and hiphop one for with his fourth album? West's fourth studio album, 808s & Heartbreak (2008), marked an even more radical departure from his previous releases, largely abandoning rap and hip hop stylings in favor of a stark electropop sound composed of virtual synthesis, the Roland TR-808 drum machine, and explicitly auto-tuned vocal tracks. 0
    When did hostilities in the Persian Gulf War begin? The lead up to the war began with the Iraqi invasion of Kuwait in August 1990 which was met with immediate economic sanctions by the United Nations against Iraq. 0
    What were two of Marvel's comic heroes in fantasy, swords and magic settings? Once again, Marvel attempted to diversify, and with the updating of the Comics Code achieved moderate to strong success with titles themed to horror (The Tomb of Dracula), martial arts, (Shang-Chi: Master of Kung Fu), sword-and-sorcery (Conan the Barbarian, Red Sonja), satire (Howard the Duck) and science fiction (2001: 0
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "OnlineContrastiveLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 0.75,
        "prior_layers_weight": 1,
        "kl_div_weight": 0.9,
        "kl_temperature": 0.75
    }
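OnlineContrastiveLoss, the wrapped loss here, applies a contrastive penalty only to the "hard" pairs in each batch. The sketch below illustrates that selection on precomputed distances. It is a simplification under stated assumptions: the `margin` default, the function name, and the fallback cutoffs used when a batch contains only one label (as in this dataset, where every label is 0) are illustrative choices, not values taken from this card.

```python
from typing import Sequence


def online_contrastive_loss(
    distances: Sequence[float],
    labels: Sequence[int],
    margin: float = 0.5,  # assumption: illustrative margin
) -> float:
    """Contrastive loss restricted to hard positives and hard negatives."""
    positives = [d for d, y in zip(distances, labels) if y == 1]
    negatives = [d for d, y in zip(distances, labels) if y == 0]
    if not positives and not negatives:
        return 0.0
    # Hard negatives are closer than the farthest positive
    # (or closer than the mean negative when there are no positives).
    neg_cutoff = max(positives) if positives else sum(negatives) / len(negatives)
    # Hard positives are farther than the closest negative
    # (or farther than the mean positive when there are no negatives).
    pos_cutoff = min(negatives) if negatives else sum(positives) / len(positives)
    hard_negatives = [d for d in negatives if d < neg_cutoff]
    hard_positives = [d for d in positives if d > pos_cutoff]
    positive_loss = sum(d ** 2 for d in hard_positives)
    negative_loss = sum(max(0.0, margin - d) ** 2 for d in hard_negatives)
    return positive_loss + negative_loss


# All-negative batch, mirroring the label distribution reported above.
loss = online_contrastive_loss([0.1, 0.3, 0.9], [0, 0, 0])
print(round(loss, 6))  # 0.2
```

In the all-negative case only the two pairs closer than the mean distance are penalized, each by how far it falls inside the margin.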
    

scitail-pairs-qa

  • Dataset: scitail-pairs-qa at 0cc4353
  • Size: 14,312 training samples
  • Columns: sentence2 and sentence1
  • Approximate statistics based on the first 1000 samples:
    column     type    min       mean          max
    sentence2  string  7 tokens  15.92 tokens  41 tokens
    sentence1  string  6 tokens  15.03 tokens  34 tokens
  • Samples:
    sentence2: Thermal energy constitutes the total kinetic energy of all the atoms that make up an object.
    sentence1: What kind of energy constitutes the total kinetic energy of all the atoms that make up an object?

    sentence2: Overharvesting is a serious threat particularly to aquatic species.
    sentence1: Overharvesting is a serious threat particularly to which species?

    sentence2: Cellulose is created by the polymerization of glucose.
    sentence1: What is created by the polymerization of glucose?
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

scitail-pairs-pos

  • Dataset: scitail-pairs-pos at 0cc4353
  • Size: 8,600 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    column     type    min       mean          max
    sentence1  string  5 tokens  23.59 tokens  60 tokens
    sentence2  string  7 tokens  15.53 tokens  41 tokens
  • Samples:
    sentence1: Most of the ozone in the Earth's atmosphere lies in the stratosphere, the layer above the troposphere.
    sentence2: The stratosphere is the layer above the troposphere.

    sentence1: Exocytosis, fertilization of an egg by sperm and transport of waste products to the lysozome are a few of the many eukaryotic processes that rely on some form of fusion.
    sentence2: The cell expels waste and other particles through a process called exocytosis.

    sentence1: Mercury, the smallest planet in the Solar System, has the most eccentric orbit.
    sentence2: Mercury is the smallest planet in our solar system.
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

xsum-pairs

  • Dataset: xsum-pairs at 788ddaf
  • Size: 4,000 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    column     type    min        mean          max
    sentence1  string  2 tokens   197.9 tokens  400 tokens
    sentence2  string  10 tokens  25.48 tokens  65 tokens
  • Samples:
    sentence1:
    Mitchell Starc took 4-36 after captain Steve Smith hit 165 as Australia declared their first innings on 624-8.
    Australia had 68 overs to force a win. They reduced Pakistan to 91-5 at tea before sealing victory in 53.2 overs to take an unassailable 2-0 series lead.
    Pakistan captain Misbah-ul-Haq was fined 40% of his match fee and the rest of his team 20% for a slow over rate.
    "It finished a lot better than it started," Starc said. "It's fantastic for us to get that result. The belief was there and it was a fantastic way to finish.
    "We knew we only had two sessions to get those 10 wickets and together as a bowling unit we've done really well."
    The third Test starts in Sydney on 3 January but Misbah has not committed to playing in the match after a poor series so far.
    The 42-year-old, who was dismissed for a two-ball duck after managing just 11 in the first innings, only scored nine runs in total in the first test in Brisbane.
    "I haven't decided about that [Sydney] but let's see," he said. "[If I'm not contributing] there's no point in hanging around."
    sentence2:
    Australia bowled Pakistan out for 163 in Melbourne to win the second Test by an innings and 18 runs on day five.

    sentence1:
    The Competition and Markets Authority (CMA) had ordered all insurance companies to split out the extra charges for the additional protection.
    But the Co-op was the only firm which missed a deadline to do so, in August last year.
    As a result around 120,000 customers received quotations that were unclear.
    From 1 February, the Co-op will provide two separate quotations - one with no claims bonus protection, and one without.
    "It is very disappointing that a major company such as Co-op Insurance has taken so long to provide this vital information to its customers," said Adam Land, senior director of remedies, business and financial analysis at the CMA.
    "Before the order came into force, the price and benefits of NCB [no claims bonus] protection were often unclear to drivers.
    "We expect the Co-op to fully comply with the terms of our directions immediately, so that motorists can search more easily for the best deal for them, and decide whether or not they want this optional cover."
    The Co-op said most of its quotations do now provide separate details of no claims bonus charges.
    "For 90% of our new business customers we are already fully compliant with this order," a spokesperson said.
    "We are part way through a major transformation programme, which when complete will allow us to be fully compliant and enable us to provide best in class service to our members."
    sentence2:
    The Co-op has been ordered to provide clearer insurance quotations, after it failed to tell motorists about separate charges for no claims bonuses.

    sentence1:
    The 21-year-old spent last season with the League Two club, scoring six times Argyle reached the play-off final.
    Tanner was part of the Reading squad which defeated Derek Adams' side 2-0 in the EFL Cup first round on 9 August.
    Tanner, who signed a two-year contract extension with Reading in January 2015, is eligible for Saturday's home fixture against Mansfield Town.
    Find all the latest football transfers on our dedicated page.
    sentence2:
    Plymouth Argyle have re-signed Reading midfielder Craig Tanner on a loan deal until January.
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "MultipleNegativesSymmetricRankingLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 0.75,
        "prior_layers_weight": 1,
        "kl_div_weight": 0.9,
        "kl_temperature": 0.75
    }
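MultipleNegativesSymmetricRankingLoss treats every other item in the batch as a negative and scores the match in both directions (sentence1 to sentence2 and back). The following is a minimal sketch of that idea on a precomputed similarity matrix; the function names and the `scale` default are assumptions for illustration, since the card lists only the AdaptiveLayerLoss parameters.

```python
import math
from typing import List


def diagonal_cross_entropy(scores: List[List[float]], scale: float) -> float:
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(scores):
        logits = [scale * s for s in row]
        peak = max(logits)
        log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_sum - logits[i]
    return total / len(scores)


def symmetric_ranking_loss(sim: List[List[float]], scale: float = 20.0) -> float:
    """Average of the row-wise and column-wise in-batch ranking losses."""
    n = len(sim)
    transposed = [[sim[j][i] for j in range(n)] for i in range(n)]
    forward = diagonal_cross_entropy(sim, scale)          # sentence1 -> sentence2
    backward = diagonal_cross_entropy(transposed, scale)  # sentence2 -> sentence1
    return (forward + backward) / 2


# sim[i][j]: similarity between sentence1 of pair i and sentence2 of pair j.
aligned = [[0.9, 0.1], [0.2, 0.8]]
swapped = [[0.1, 0.9], [0.8, 0.2]]
print(symmetric_ranking_loss(aligned) < symmetric_ranking_loss(swapped))  # True
```

The backward term is what distinguishes the symmetric variant: a summary should retrieve its article as well as the article retrieving its summary.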
    

compression-pairs

  • Dataset: compression-pairs at 605bc91
  • Size: 10,125 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    column     type    min       mean          max
    sentence1  string  9 tokens  32.07 tokens  512 tokens
    sentence2  string  5 tokens  10.27 tokens  25 tokens
  • Samples:
    sentence1: Democrat State Senate President John Cullerton will endorse Democrat State Treasurer Alexi Giannoulias for US Senate today.
    sentence2: Cullerton to endorse Giannoulias

    sentence1: Barbara Walters hospitalised after fall Updated: 18:47, Monday January 21, 2013 Veteran ABC newswoman Barbara Walters has been hospitalised after falling at an inauguration party at the residence of Britain's ambassador to the United States.
    sentence2: Barbara Walters hospitalised after fall

    sentence1: Girl Next-Door star and Hugh Hefner's ex Bridget Marquardt recently moved out of his Playboy mansion.
    sentence2: Bridget Marquardt moves out of Playboy mansion
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "MultipleNegativesSymmetricRankingLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 0.75,
        "prior_layers_weight": 1,
        "kl_div_weight": 0.9,
        "kl_temperature": 0.75
    }
    

compression-pairs2

  • Dataset: compression-pairs2 at 605bc91
  • Size: 4,701 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    column     type    min        mean          max
    sentence1  string  11 tokens  32.18 tokens  232 tokens
    sentence2  string  5 tokens   10.24 tokens  29 tokens
  • Samples:
    sentence1: The actress, Natasha Richardson, has died in hospital in New York after suffering a serious head injury in a skiing accident in Canada earlier this week.
    sentence2: Actress Natasha Richardson dies

    sentence1: 'WHISPERING' Ted Lowe - the most recognisable voice in the history of snooker broadcasting - died four hours before the balls were broken in the final of the sport's most important championship
    sentence2: 'Whispering' Ted Lowe dies

    sentence1: Nairobi - Somalia's Shabaab Islamists said on Thursday they have executed a French agent they have held since 2009, as France said the hostage was likely killed several days ago in a failed rescue attempt.
    sentence2: Shabaab say they have executed French hostage
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": 3,
        "last_layer_weight": 0.25,
        "prior_layers_weight": 2,
        "kl_div_weight": 0.75,
        "kl_temperature": 0.75
    }
    

compression-pairs3

  • Dataset: compression-pairs3 at 605bc91
  • Size: 4,700 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    column     type    min        mean          max
    sentence1  string  12 tokens  31.18 tokens  99 tokens
    sentence2  string  5 tokens   10.02 tokens  23 tokens
  • Samples:
    sentence1: Pakistan State Oil has started supply of furnace oil to Karachi Electric Supply Company on discounted rates.
    sentence2: Pso starts supply of oil on discounted rates:

    sentence1: European stocks were little changed as the region's finance chiefs meet today to work on a new strategy to contain the sovereign-debt crisis.
    sentence2: European stocks little changed as finance chiefs meet;

    sentence1: The body of a missing boater has been found in Lake Okeechobee in southwest Florida.
    sentence2: Body of missing boater found in Lake Okeechobee
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 0.01,
        "prior_layers_weight": 10,
        "kl_div_weight": 3,
        "kl_temperature": 0.25
    }
    

sciq_pairs

  • Dataset: sciq_pairs at 2c94ad3
  • Size: 11,153 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    column     type    min       mean          max
    sentence1  string  7 tokens  17.12 tokens  61 tokens
    sentence2  string  2 tokens  86.03 tokens  512 tokens
  • Samples:
    sentence1 sentence2
    Temperature and what other environmental factor are important in the activity of an enzyme?
    The sum of the superscripts in an electron configuration is equal to the number of electrons in that atom, which is in turn equal to what number? Electron configuration notation eliminates the boxes and arrows of orbital filling diagrams. Each occupied sublevel designation is written followed by a superscript that is the number of electrons in that sublevel. For example, the hydrogen configuration is 1 s 1 , while the helium configuration is 1 s 2 . Multiple occupied sublevels are written one after another. The electron configuration of lithium is 1 s 2 2 s 1 . The sum of the superscripts in an electron configuration is equal to the number of electrons in that atom, which is in turn equal to its atomic number.
    What is the most common type of brain injury? The most common type of brain injury is a concussion. This is a bruise on the surface of the brain. It may cause temporary symptoms such as headache and confusion. Most concussions heal on their own in a few days or weeks. However, repeated concussions can lead to permanent changes in the brain. More serious brain injuries also often cause permanent brain damage.
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

qasc_pairs

  • Dataset: qasc_pairs at a34ba20
  • Size: 7,767 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    column     type    min        mean          max
    sentence1  string  5 tokens   11.52 tokens  29 tokens
    sentence2  string  15 tokens  34.76 tokens  75 tokens
  • Samples:
    sentence1: How are organisms able to grow and repair their cells?
    sentence2: Cell division is how organisms grow and repair themselves.. Mitosis is cell division.. Mitosis is how organisms grow and repair themselves.

    sentence1: What cannot absorb light energy?
    sentence2: chlorophyll is used for absorbing light energy by plants. Fungi have no chlorophyll.. Fungi cannot absorb light energy.

    sentence1: what effects the colligative properties of solids
    sentence2: adding salt to a solid decreases the freezing point of that solid. Freezing point depression is a colligative property.. salts effect the colligative properties of solids
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

openbookqa_pairs

  • Dataset: openbookqa_pairs
  • Size: 4,505 training samples
  • Columns: question and fact
  • Approximate statistics based on the first 1000 samples:
    column    type    min       mean          max
    question  string  3 tokens  13.81 tokens  78 tokens
    fact      string  4 tokens  11.49 tokens  30 tokens
  • Samples:
    question: What is animal competition?
    fact: if two animals eat the same prey then those animals compete for that pey

    question: If you wanted to make a metal bed frame, where would you start?
    fact: alloys are made of two or more metals

    question: Places lacking warmth have few what
    fact: cold environments contain few organisms
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

msmarco_pairs

  • Dataset: msmarco_pairs at 28ff31e
  • Size: 10,314 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    column     type    min        mean          max
    sentence1  string  4 tokens   8.7 tokens    41 tokens
    sentence2  string  17 tokens  77.31 tokens  258 tokens
  • Samples:
    sentence1 sentence2
    otc medication meaning Over-the-counter (OTC) medicines are drugs you can buy without a prescription. Some OTC medicines relieve aches, pains and itches. Some prevent or cure diseases, like tooth decay and athlete's foot. Others help manage recurring problems, like migraines.
    every how many weeks should you get a hair cut “Somehow people have been taught you need to cut your hair every 4 to 6 weeks and I think that’s way too soon,” she tells InStyle. “If you have a great cut and don’t mind a little added length, the style can last up to 6 months and still look great.”. RELATED: We Tried It: Cindy Crawford’s Cleanse.
    how does a pedometer work This is pretty much how a pedometer works. Photo: Pedometers can measure your steps because your body swings from side to side as you walk. Each swing counts as one step. Multiplying the number of swings by the average length of your steps tells you how far you've gone.
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

msmarco_pairs2

  • Dataset: msmarco_pairs2 at 28ff31e
  • Size: 6,876 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    column     type    min        mean          max
    sentence1  string  4 tokens   8.73 tokens   25 tokens
    sentence2  string  20 tokens  76.57 tokens  268 tokens
  • Samples:
    sentence1 sentence2
    what should house humidity percentage be? Lower humidity can cause your heating system to work harder in the winter months. There must be a balance of humidity in your home to make it a comfortable place and preserve your interior as well as you heating and air conditioning systems. During summer months, homes should have between 30 and 45 percent humidity. During the winter, humidity levels should be kept between 45 and 55 percent humidity. How can I regulate my home humidity? The following are some ideas on how you can help to balance your home’s humidity: Install an inside weather station that measures humidity in your home.
    which electrolyte is missing if there is tetany Hypocalcemia is not a term for tetany but is rather a cause of tetany. Causes. The usual cause of tetany is lack of calcium. An excess of phosphate (high phosphate-to-calcium ratio) can also trigger the spasms. Underfunction of the parathyroid gland can lead to tetany. Low levels of carbon dioxide cause tetany by altering the albumin binding of calcium such that the ionized (physiologically influencing) fraction of calcium is reduced; one common reason for low carbon dioxide levels is hyperventilation. Low levels of magnesium can lead to tetany.
    dimensions of the sea of galilee Sea of Galilee From Wikipedia, the free encyclopedia. The Sea of Galilee is Israel's largest freshwater lake, approximately 53 kilometers (33 miles) in circumference, about 21 km (13 miles) long, and 13 km (8 miles) wide; it has a total area of 166 sq km, and a maximum depth of approximately 48 meters. At 213
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": 3,
        "last_layer_weight": 0.25,
        "prior_layers_weight": 2,
        "kl_div_weight": 0.75,
        "kl_temperature": 0.75
    }
    

nq_pairs

  • Dataset: nq_pairs at f9e894e
  • Size: 12,892 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    column     type    min        mean           max
    sentence1  string  10 tokens  11.88 tokens   27 tokens
    sentence2  string  7 tokens   130.01 tokens  502 tokens
  • Samples:
    sentence1 sentence2
    who did richmond play in the grand final Richmond Football Club Richmond began 2017 with 5 straight wins, a feat it had not achieved since 1995. A series of close losses hampered the Tigers throughout the middle of the season, including a 5-point loss to the Western Bulldogs, 2-point loss to Fremantle, and a 3-point loss to the Giants. Richmond ended the season strongly with convincing victories over Fremantle and St Kilda in the final two rounds, elevating the club to 3rd on the ladder. Richmond's first final of the season – their qualifying final against the Cats at the MCG attracted a record qualifying final crowd of 95,028; the Tigers won by 51 points. In their first preliminary final since 2001, Richmond defeated Greater Western Sydney by 36 points in front of a crowd of 94,258 to progress to the Grand Final against Adelaide, their first Grand Final appearance since 1982. The attendance was 100,021, the largest crowd for a Grand Final since 1986. The Crows led at quarter time and led by as many as 13, but the Tigers took over the game as it progressed and kicked seven straight goals at one point. They eventually would win by 48 points – 16.12 (108) to Adelaide's 8.12 (60) – to end their 37-year flag drought.[23] Dustin Martin also became the first player to win a Premiership medal, the Brownlow Medal and the Norm Smith Medal in the same season, while Damien Hardwick was named AFL Coaches Association Coach of the Year. Richmond's jump from 13th to premiers also marked the biggest jump from one AFL season to the next.
    who was the first european to translate bhagwat gita into english Charles Wilkins Sir Charles Wilkins, KH, FRS (1749 – 13 May 1836), was an English typographer and Orientalist, and founding member of The Asiatic Society. He is notable as the first translator of Bhagavad Gita into English, and as the creator, alongside Panchanan Karmakar,[1] of the first Bengali typeface.[2] In 1788, Wilkins was elected a member of the Royal Society.[3]
    who played bruce willis's girlfriend in pulp fiction Maria de Medeiros Among Medeiros' most memorable film appearances are three early 1990s roles. Her considerable resemblance to Anaïs Nin landed her the primary role in Henry & June (1990), in which she played the author. In 1990, she played the role of Maria in Ken McMullen's film about the rise of the Paris Commune, 1871. In 1994, Medeiros appeared in Quentin Tarantino's Pulp Fiction playing Fabienne, the girlfriend of Butch Coolidge (Bruce Willis).
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

nq_pairs2

  • Dataset: nq_pairs2 at f9e894e
  • Size: 4,298 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    column     type    min        mean           max
    sentence1  string  10 tokens  11.75 tokens   21 tokens
    sentence2  string  14 tokens  127.94 tokens  512 tokens
  • Samples:
    sentence1 sentence2
    when does daylight saving time come into effect Daylight saving time Start and end dates vary with location and year. Since 1996, European Summer Time has been observed from the last Sunday in March to the last Sunday in October; previously the rules were not uniform across the European Union.[39] Starting in 2007, most of the United States and Canada observe DST from the second Sunday in March to the first Sunday in November, almost two-thirds of the year.[43] The 2007 U.S. change was part of the Energy Policy Act of 2005; previously, from 1987 through 2006, the start and end dates were the first Sunday in April and the last Sunday in October, and Congress retains the right to go back to the previous dates now that an energy-consumption study has been done.[44] Proponents for permanently retaining November as the month for ending DST point to Halloween as a reason to delay the change—to provide extra daylight on October 31.
    when was the words under god added to the pledge of allegiance Pledge of Allegiance (United States) The phrase "under God" was incorporated into the Pledge of Allegiance on June 14, 1954, by a Joint Resolution of Congress amending § 4 of the Flag Code enacted in 1942.[28]
    who provides the funds for a loan guaranteed by the veterans' administration VA loan The basic intention of the VA home loan program is to supply home financing to eligible veterans and to help veterans purchase properties with no down payment. The loan may be issued by qualified lenders.
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": 3,
        "last_layer_weight": 0.25,
        "prior_layers_weight": 2,
        "kl_div_weight": 0.75,
        "kl_temperature": 0.75
    }
    

trivia_pairs

  • Dataset: trivia_pairs at a7c36e3
  • Size: 17,190 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    column     type    min        mean           max
    sentence1  string  8 tokens   17.38 tokens   62 tokens
    sentence2  string  25 tokens  444.36 tokens  512 tokens
  • Samples:
    sentence1 sentence2
    In which 2004 animated Pixar movie does Violet have powers of invisibility? Violet - The Incredibles - Pixar movie - Character Profile - Writeups.org Jump to the game stats Content Violet is one of the main characters of the 2004 Pixar animation movie The Incredibles. The movie encountered both critical and commercial success. The Incredibles deals with super-heroism, family life and emotions, violence, and being different. Background Real Name: Violet Parr Other Aliases: Violet. It is very obvious from the film’s plot that a secret ID is incredibly important to a super’s life, so it is thus unlikely that Violet will continue to adventure as “Violet”. I suggest “Invisigirl” because it follows the film’s naming tradition of simple, descriptive names. Marital Status: Single Known Relatives: Robert Parr (Aka Mr. Incredible, father), Helen Parr (Aka Elastigirl, mother), Dashiell Robert Parr (Aka Dash, brother), Jack-Jack (Possibly Jacob) Parr (brother). Group Affiliation: The Incredibles Base Of Operations: Metroville Height: 5’ (Note:Violet’s official height is listed at 4’ 6”. but that is clearly not accurate, by comparing her height to her mother’s and to her boyfriend’s. I went with 5’, which seems accurate, but is still on the short side for a girl her age. Weight: 90lbs Age: 14 Eyes: Dark Blue Hair: Black (probably dyed) Powers and Abilities A pretty much normal teenaged girl otherwise, Violet/Invisigirl can become invisible (partially or fully) and can erect force fields with a high degree of imperviousness to harm (One of her fields protected her family from Syndrome’s crashing plane). Violet has difficulty using her Force Field when she is under stress. Video HD version of the official trailer. History Mr Incredible was his world’s most famed and lauded super-hero (supers, as they are called popularly), until a mishap while preventing a potential suicide led to a lawsuit for damages. 
This triggered not only an avalanche of personal-injury lawsuits against Mr Incredible personally, but a public backlash of opinion against supers in general, and most of them were forced to go underground to keep from being sued to death. Mr. Incredible and his new wife, Elastigirl , retired and became simple Mr. and Mrs. Parr, and started raising a family. Cut to 15 years later. Bob Parr is an insurance claims specialist with a midlife crisis and a desire to go back to the “old days”. He’s fed up with his pushy boss and his immoral profession. He and his best buddy Lucius Best (aka Frozone ), spend Wednesday nights cruising the city in a car, listening to the police scanner, and saving people on the sly. Helen is trying hard to be a mom to 3 kids, two of whom have superpowers of their own and fight constantly. She has worked too hard to build a normal life for her family to abide his nostalgia for heroism. Violet, their daughter, is having problems relating to people and is withdrawn and moody. Dash , their son, is chafing at the restrictions placed on him, and getting into trouble at school. When Mr. Incredible is offered the chance to play the role of hero again by a mysterious informant, he jumps at the opportunity. But when it turns out to be a trap set by an old nemesis he had a hand in corrupting, the whole family must reveal themselves to save Mr. Incredible and countless innocents. The Incredibles discover that their real power comes from their unity, rather than their superpowers. Description Violet is a tall, skinny girl with a retiring demeanor. She wears a duplicate of her mother’s Iincredisuit in red & black, with the Incredible’s logo on the chest. Before she came to terms with her “differentness”, she wore her hair so that it fell over her face, a figurative shield to hide behind, and wore mostly drab colors. She now wears it pulled back, and wears more bright colors. 
Personality Violet is a shy, worried teen who wants nothing more then to be normal. She grew up being taught to repress her abilities. Violet wants to be normal so badly that she has difficulty calling on her force field in anything but absolutely safe conditions… ie, at home, in the presence of her family. Her invisibility, a pur
    Where in the body is the Malleus bone? Malleus Bone Definition, Function & Anatomy
    Who composed The Resurrection Symphony and The Symphony of a Thousand? Mahler: The Genius Who Composed the Resurrection Symphony Mahler: The Genius Who Composed the Resurrection Symphony by DavidPaulWagner Gustav Mahler was an outstanding composer and conductor of the post-Romantic era. His lush symphonies included folk music and pastoral elements. Gustav Mahler's romantic music is often heard in modern movies (such as "Death in Venice"). His lush and often world-weary music included nine large-scale symphonies (including the "Resurrection") and cycles of orchestral songs including "The Song of the Earth" and "Songs on the Death of Children". Mahler's music is regarded as the peak of the post-Romantic period of classical music. Mahler's Life Gustav Mahler was born into a large Jewish family (he was one of 14 children) in Kaliste, Bohemia (modern-day Czech Republic) in 1860.  He showed early musical talent (his first public performance was when he was ten) and studied at the Vienna Conservatoire (conservatorium of music). While there he attended occasional lectures by the composer, Anton Bruckner. He was greatly influenced by the music of Richard Wagner. In 1878 he enrolled in Vienna University. He came under the influence of such continental philosophers as Schopenhauer, Nietzsche, Lotse and Fechner. Conducting Career Mahler now commenced his career as an orchestral conductor. He was conductor of (in succession) the Budapest Opera, the Hamburg Opera, the Vienna Court Opera, the Metropolitan Opera (New York) and the New York Philharmonic Society.  As a conductor he showed the influences particularly of Beethoven, Schubert, Wagner, Bruckner and Bach.  Mahler as a Composer Many critics see Mahler's career as a composer as falling into thee sections. 
In the first part of his composing career (1880-1901), he composed four symphonies, the Lieder eines fahrenden Gesellen (Song of a Wayfarer) song cycle, and other song cycles, which include songs from his Des Knaben Wunderhorn (Youth's Wonder Horn) cycle. In the second part (1901-07), he composed three instrumental symphonies (the 5th, 6th and 7th Symphonies), his Rückert songs (settings of poems by Friedrich Ruckert), his Kindertotenlieder (Songs on the Death of Children), more Wunderhorn arrangements, and finally his choral symphony (8th Symphony). In the third and final part of his composing career (1907-11), Mahler composed Das Lied von der Erde (The Song of the Earth), his 9th Symphony and his unfinished 10th Symphony. These final works show the composer's quiet resignation as he approached his death in 1911. Gustav Mahler's 5th Symphony in Visconti's 1971 film Death in Venice This film was based on Thomas Mann's novella of the same name Mahler's Life (continued) Mahler's Changing Fortunes During his lifetime Mahler's symphonies received wide interest, although he suffered anti-semiticism from various quarters. His songs generally received praise. The premiere of his Eighth Symphony in 1910 was a triumph with applause lasting half an hour. After his death, Mahler's works suffered a decline in popularity and were banned under the Nazi regime. However, since 1960 audiences have been more receptive to romanticism in music and to Mahler. Mahler has influenced a number of major composers including Schoenberg, Berg, Webern, Shostakovich and Britten. Mahler - Symphony No. 2 ("Resurrection") Works of Gustav Mahler
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
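
A minimal sketch of how a loss with these settings might be constructed with the sentence-transformers `losses` module. The helper name `build_adaptive_gist_loss` and its `guide` argument are illustrative assumptions: this card does not name the guide model used by GISTEmbedLoss or list any GISTEmbedLoss settings, so library defaults are used for the inner loss.

```python
# Parameters copied from the loss block above.
ADAPTIVE_GIST_KWARGS = {
    "n_layers_per_step": -1,
    "last_layer_weight": 2,
    "prior_layers_weight": 0.25,
    "kl_div_weight": 1.25,
    "kl_temperature": 0.9,
}


def build_adaptive_gist_loss(model, guide):
    """Wrap GISTEmbedLoss in AdaptiveLayerLoss using the settings listed above.

    `model` is the SentenceTransformer being trained. `guide` is a second
    SentenceTransformer required by GISTEmbedLoss (hypothetical here, since
    the card does not say which guide model was used).
    """
    # Imported lazily so the parameter dictionary can be inspected even when
    # sentence-transformers is not installed.
    from sentence_transformers import losses

    inner = losses.GISTEmbedLoss(model, guide)
    return losses.AdaptiveLayerLoss(model=model, loss=inner, **ADAPTIVE_GIST_KWARGS)
```

Keeping the keyword arguments in one dictionary makes it easy to compare this configuration against the other AdaptiveLayerLoss blocks in the card.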
    

quora_pairs

  • Dataset: quora_pairs at 451a485
  • Size: 4,059 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 6 tokens, mean 13.3 tokens, max 42 tokens
    • sentence2 (string): min 6 tokens, mean 13.3 tokens, max 37 tokens
  • Samples:
    sentence1 sentence2
    What is the salary of indian president? What is the President's salary in India?
    As a developer, how can I create a Bitcoin wallet? Where should I create a bitcoin wallet?
    Do you regularly enjoy anal sex? Do you like anal sex?
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": 3,
        "last_layer_weight": 0.25,
        "prior_layers_weight": 2,
        "kl_div_weight": 0.75,
        "kl_temperature": 0.75
    }
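
This block uses `n_layers_per_step: 3`, whereas other blocks in the card use `-1`. The sketch below illustrates the documented meaning of that setting (a non-positive value uses every prior layer, a positive value samples that many prior layers per step). It is a simplified illustration, not the library's implementation; the function name `pick_prior_layers` and the 12-layer depth are assumptions made for the example.

```python
import random


def pick_prior_layers(num_layers, n_layers_per_step, rng=random):
    """Return indices of the non-final layers that receive a loss term this step."""
    prior = list(range(num_layers - 1))  # every layer except the last one
    if n_layers_per_step <= 0 or n_layers_per_step >= len(prior):
        return prior  # e.g. -1: use all prior layers
    # Otherwise draw a random subset of the requested size.
    return sorted(rng.sample(prior, n_layers_per_step))


# Hypothetical 12-layer encoder.
all_layers = pick_prior_layers(12, -1)
three_layers = pick_prior_layers(12, 3, random.Random(0))
```

Sampling only a few layers per step reduces the number of extra loss evaluations, at the cost of each prior layer being trained less often.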
    

gooaq_pairs

  • Dataset: gooaq_pairs at b089f72
  • Size: 12,892 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 8 tokens, mean 11.44 tokens, max 21 tokens
    • sentence2 (string): min 14 tokens, mean 57.37 tokens, max 154 tokens
  • Samples:
    sentence1 sentence2
    are mac addresses case sensitive? A MAC address consists of six groups of two characters (numbers or letters). ... As you might have noticed, MAC is not case-sensitive, but it tends to appear either all lower case or all upper case. Each time you change a digit or a letter, you'll get a new MAC.
    is azek better than trex? Trex has a core made of 95% recycled material which includes things like ground up plastic, sawdust, and reclaimed wood. Trex is a much more natural material than Azek. On the other hand, Azek is made of entirely PVC, which is not a natural material. ... It's because of this resistance that Azek is the clear winner.
    what is the best shampoo to use for highlighted hair? ['Olaplex No. ... ', 'Love Beauty And Planet Blooming Color Shampoo. ... ', 'Rahua Color Full Shampoo. ... ', 'Kerastase Blond Absolu Bain Lumiere Shampoo. ... ', 'David Mallett Shampoo No.2: Le Volume. ... ', 'Fanola No Orange Shampoo. ... ', 'Shu Uemura Color Lustre Shampoo. ... ', 'Living Proof Restore Shampoo.']
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

gooaq_pairs2

  • Dataset: gooaq_pairs2 at b089f72
  • Size: 4,298 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 8 tokens, mean 11.45 tokens, max 21 tokens
    • sentence2 (string): min 13 tokens, mean 57.94 tokens, max 135 tokens
  • Samples:
    sentence1 sentence2
    how much money do you make for uber? Uber drivers make an average of $364 a month and a median of $155 a month driving for the ride-sharing company, according to the analysis.
    how do you know when your energizer batteries are charged? Lift the prongs of the AC plug until fully extended. Plug charger into a standard 110-120 volt AC outlet. The green LED will light during charging. - It is normal for batteries to become warm while charging and is no cause for alarm.
    how long should you run your air conditioner? An Air Conditioner Should Run for 15-20 Minutes at a Time. In a perfect situation, an air conditioner should run for 15-20 minutes at a time in mild temperatures. Any less than that and your AC could be too large for your home – more on that below.
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": 3,
        "last_layer_weight": 0.25,
        "prior_layers_weight": 2,
        "kl_div_weight": 0.75,
        "kl_temperature": 0.75
    }
    

mrpc_pairs

  • Dataset: mrpc_pairs at bcdcba7
  • Size: 2,474 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 9 tokens, mean 26.61 tokens, max 51 tokens
    • sentence2 (string): min 11 tokens, mean 26.54 tokens, max 52 tokens
  • Samples:
    sentence1 sentence2
    In the second quarter last year , the company experienced a net loss of $ 185 million , or 54 cents a share , on sales of $ 600 million . The company posted a net loss of $ 185 million , or 54 cents per share , in the year-earlier period , it said in a statement Wednesday .
    U.S. District Judge Denny Chin said Fox 's claim was " wholly without merit , both factually and legally . " " This case is wholly without merit , both factually and legally , " Chin said .
    Pope John Paul has health problems but is still at the helm of the Roman Catholic Church , the pope 's top aide has told Reuters . Pope John Paul has health problems but is firmly in charge of the Roman Catholic Church , the pope 's top aide told Reuters on Friday .
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "MultipleNegativesSymmetricRankingLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 0.75,
        "prior_layers_weight": 1,
        "kl_div_weight": 0.9,
        "kl_temperature": 0.75
    }
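
As a rough illustration of what a symmetric in-batch ranking loss computes, the sketch below takes a matrix of cosine similarities (row i = anchor i, column j = candidate j, with the true pair on the diagonal) and averages the cross-entropy over rows and over columns. It is a pure-Python approximation written for this card, not the library's code; the function names and the `scale=20.0` default are assumptions, since the block above does not list a scale.

```python
import math


def cross_entropy_diagonal(scores):
    """Mean cross-entropy where row i is expected to select column i."""
    total = 0.0
    for i, row in enumerate(scores):
        peak = max(row)  # subtract the maximum for numerical stability
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in row))
        total += log_norm - row[i]
    return total / len(scores)


def symmetric_ranking_loss(cos_sims, scale=20.0):
    """Average of anchor-to-candidate and candidate-to-anchor cross-entropy."""
    scaled = [[scale * v for v in row] for row in cos_sims]
    transposed = [list(col) for col in zip(*scaled)]
    return 0.5 * (cross_entropy_diagonal(scaled) + cross_entropy_diagonal(transposed))


# True pairs are most similar: loss is close to zero.
well_matched = symmetric_ranking_loss([[0.9, 0.1], [0.2, 0.8]])
# True pairs are least similar: loss is large.
mismatched = symmetric_ranking_loss([[0.1, 0.9], [0.8, 0.2]])
```

The symmetric form penalises a pair both when the anchor prefers another candidate and when the candidate prefers another anchor, which suits paraphrase data where the two sides are interchangeable.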
    

Evaluation Datasets

nli-pairs

  • Dataset: nli-pairs at d482672
  • Size: 100 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor (string): min 7 tokens, mean 17.26 tokens, max 36 tokens
    • positive (string): min 5 tokens, mean 9.6 tokens, max 21 tokens
  • Samples:
    anchor positive
    Two women are embracing while holding to go packages. Two woman are holding packages.
    Two young children in blue jerseys, one with the number 9 and one with the number 2 are standing on wooden steps in a bathroom and washing their hands in a sink. Two kids in numbered jerseys wash their hands.
    A man selling donuts to a customer during a world exhibition event held in the city of Angeles A man selling donuts to a customer.
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

vitaminc-pairs

  • Dataset: vitaminc-pairs at be6febb
  • Size: 85 evaluation samples
  • Columns: claim and evidence
  • Approximate statistics based on the first 1000 samples:
    • claim (string): min 9 tokens, mean 22.41 tokens, max 41 tokens
    • evidence (string): min 12 tokens, mean 39.36 tokens, max 79 tokens
  • Samples:
    claim evidence
    Dragon Con had over 5000 guests . Among the more than 6000 guests and musical performers at the 2009 convention were such notables as Patrick Stewart , William Shatner , Leonard Nimoy , Terry Gilliam , Bruce Boxleitner , James Marsters , and Mary McDonnell .
    COVID-19 has reached more than 185 countries . As of , more than cases of COVID-19 have been reported in more than 190 countries and 200 territories , resulting in more than deaths .
    In March , Italy had 3.6x times more cases of coronavirus than China . As of 12 March , among nations with at least one million citizens , Italy has the world 's highest per capita rate of positive coronavirus cases at 206.1 cases per million people ( 3.6x times the rate of China ) and is the country with the second-highest number of positive cases as well as of deaths in the world , after China .
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

sts-label

  • Dataset: sts-label at ab7a5ac
  • Size: 100 evaluation samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 7 tokens, mean 9.8 tokens, max 18 tokens
    • sentence2 (string): min 7 tokens, mean 9.59 tokens, max 16 tokens
    • score (float): min 0.0, mean 0.55, max 1.0
  • Samples:
    sentence1 sentence2 score
    A man with a hard hat is dancing. A man wearing a hard hat is dancing. 1.0
    A young child is riding a horse. A child is riding a horse. 0.95
    A man is feeding a mouse to a snake. The man is feeding a mouse to the snake. 1.0
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
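
The CoSENT objective can be sketched in a few lines: for every two sentence pairs where one has a lower gold score than the other, the loss grows when the lower-scored pair receives the higher cosine similarity. The pure-Python version below is an illustrative approximation using the `scale` of 20.0 listed above; the function name `cosent_loss` and the toy inputs are assumptions, and the library operates on tensors rather than lists.

```python
import math


def cosent_loss(cos_sims, gold_scores, scale=20.0):
    """log(1 + sum of exp(scale * (cos_i - cos_j))) over all i, j with gold_i < gold_j."""
    terms = [0.0]  # exp(0) supplies the constant 1 inside the logarithm
    for i, gold_i in enumerate(gold_scores):
        for j, gold_j in enumerate(gold_scores):
            if gold_i < gold_j:
                # Pair j should be scored higher than pair i.
                terms.append(scale * (cos_sims[i] - cos_sims[j]))
    peak = max(terms)  # log-sum-exp with the maximum subtracted for stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# Predicted similarities agree with the gold ordering: loss near zero.
agree = cosent_loss([0.9, 0.2], [1.0, 0.0])
# Predicted similarities contradict the gold ordering: loss near 14.
disagree = cosent_loss([0.2, 0.9], [1.0, 0.0])
```

Because only the ordering of pairs matters, the loss does not require the cosine similarities to match the gold scores numerically.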
    

qnli-contrastive

  • Dataset: qnli-contrastive at bcdcba7
  • Size: 100 evaluation samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 7 tokens, mean 14.06 tokens, max 26 tokens
    • sentence2 (string): min 10 tokens, mean 37.95 tokens, max 115 tokens
    • label (int): 0: 100.00%
  • Samples:
    sentence1 sentence2 label
    What came into force after the new constitution was herald? As of that day, the new constitution heralding the Second Republic came into force. 0
    What is the first major city in the stream of the Rhine? The most important tributaries in this area are the Ill below of Strasbourg, the Neckar in Mannheim and the Main across from Mainz. 0
    What is the minimum required if you want to teach in Canada? In most provinces a second Bachelor's Degree such as a Bachelor of Education is required to become a qualified teacher. 0
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "OnlineContrastiveLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 0.75,
        "prior_layers_weight": 1,
        "kl_div_weight": 0.9,
        "kl_temperature": 0.75
    }
    

scitail-pairs-qa

  • Dataset: scitail-pairs-qa at 0cc4353
  • Size: 100 evaluation samples
  • Columns: sentence2 and sentence1
  • Approximate statistics based on the first 1000 samples:
    • sentence2 (string): min 7 tokens, mean 16.0 tokens, max 33 tokens
    • sentence1 (string): min 9 tokens, mean 15.37 tokens, max 26 tokens
  • Samples:
    sentence2 sentence1
    A vas deferens is the name of the tube that carries sperm from the epididymis to the urethra. What is the name of the tube that carries sperm from the epididymis to the urethra?
    A(n) increase in length happens to metal railroad tracks during the heat of a summer day. What happens to metal railroad tracks during the heat of a summer day?
    Each lymph organ has a different job in the immune system. Each lymph organ has a different job in what system?
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

scitail-pairs-pos

  • Dataset: scitail-pairs-pos at 0cc4353
  • Size: 100 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 9 tokens, mean 23.38 tokens, max 61 tokens
    • sentence2 (string): min 8 tokens, mean 15.59 tokens, max 36 tokens
  • Samples:
    sentence1 sentence2
    An introduction to atoms and elements, compounds, atomic structure and bonding, the molecule and chemical reactions. Replace another in a molecule happens to atoms during a substitution reaction.
    Wavelength The distance between two consecutive points on a sinusoidal wave that are in phase; Wavelength is the distance between two corresponding points of adjacent waves called.
    humans normally have 23 pairs of chromosomes. Humans typically have 23 pairs pairs of chromosomes.
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

xsum-pairs

  • Dataset: xsum-pairs at 788ddaf
  • Size: 100 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 58 tokens, mean 196.79 tokens, max 342 tokens
    • sentence2 (string): min 7 tokens, mean 24.65 tokens, max 45 tokens
  • Samples:
    sentence1 sentence2
    18 January 2015 Last updated at 10:03 GMT The 11-year-old found fame on American TV by appearing on The Ellen DeGeneres Show with cousin, Rosie McClelland, three years ago. They were plucked from Essex and flown to America after the chat show host fell in love with their version of a Nicki Minaj track. The pair's YouTube video has been seen by 47 million people. Sophia Grace and Rosie soon became very popular in the US - interviewing stars like Katy Perry and Taylor Swift. They also went on to make a movie and tour Australia. But now, Sophia Grace is going it alone and has just entered America's Billboard music chart for best-selling songs. She's been telling BBC Radio 1's Newsbeat all about her new song Best Friends and revealed what she would like to do next. Sophia Grace Brownlee's face is probably one you recognise.
    Four people were taken to hospital on Thursday after a three-vehicle accident in the early hours near the A40 turn for Oxford (junction 8). On Saturday, a car ended up on its roof on a verge after a four-vehicle crash at junction 10 at 15:22 GMT. John Callaway from Oxfordshire Fire and Rescue Service said it was "remarkable" that no-one was killed. He added it was the seventh road accident his Banbury team had been called to in under a week. Mr Callaway said: "This was a high-speed collision on a fast road, it is remarkable that nobody lost their life. "Out of the four casualties, two were transferred to hospital by ambulance and two were treated at the scene. Fortunately none of the injuries appear life-threatening. "With the onset of winter and more difficult driving conditions, I urge drivers to allow more time for their journey and adjust their driving accordingly." Drivers are being urged to leave more time for their journeys after two crashes on the M40 in recent days.
    Evans is expected to be named on Friday in Michael O'Neill's squad for the World Cup qualifier with Norway in Belfast on 26 March. The 26-year-old has been sidelined since 2 January because of a chronic groin problem. Evans has started his country's last two World Cup qualifiers. New Rovers manager Mowbray made it clear there would be no club-versus-country row if Evans were to face the Norwegians. "I don't really get involved in the international set-ups. What I do know is footballers like to play for their countries, they want to play for their countries," he said. "If he gets called up, there will no problem. If anything it will be a benefit if he gets some game time and some intense training to build him up. "He has trained with us for almost a week now. I would have to say he looks a very talented footballer, my type of footballer. He picks really lovely passes, he's got quick feet and a really good appreciation of the football. "He needs to get fit. If he gets called up, there will no problem." Former Celtic boss Mowbray was appointed Blackburn manager on 22 February, succeeding Owen Coyle. Blackburn boss Tony Mowbray says he has no problem with Corry Evans playing for Northern Ireland despite the midfielder being injured for the past two months.
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "MultipleNegativesSymmetricRankingLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 0.75,
        "prior_layers_weight": 1,
        "kl_div_weight": 0.9,
        "kl_temperature": 0.75
    }
    

compression-pairs

  • Dataset: compression-pairs at 605bc91
  • Size: 100 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 16 tokens, mean 32.39 tokens, max 73 tokens
    • sentence2 (string): min 6 tokens, mean 10.18 tokens, max 20 tokens
  • Samples:
    sentence1 sentence2
    One of America's most famous newspaper publishers, Arthur Ochs Sulzberger, whose family owns the New York Times, died on Saturday at the age of 86. Arthur Ochs Sulzberger, publisher of the New York Times, dies at 86
    ``Since vitamin D is generally lower in persons with obesity, it is possible that low vitamin D could account, in part, for the link between obesity and diseases such as cancer, heart disease and diabetes,'' said Caitlin Mason, Ph.D., lead author of the paper, published online May 25 in the American Journal of Clinical Nutrition. Low vitamin D could account for link between obesity and cancer, heart disease, diabetes
    A 32-year-old Clovis man was sentenced to nine years in prison Wednesday for attempted murder, according to a news release from the office 9th Judicial District Attorney Matthew Chandler. Clovis man sentenced to nine years
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "MultipleNegativesSymmetricRankingLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 0.75,
        "prior_layers_weight": 1,
        "kl_div_weight": 0.9,
        "kl_temperature": 0.75
    }
    

sciq_pairs

  • Dataset: sciq_pairs at 2c94ad3
  • Size: 100 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 7 tokens, mean 16.76 tokens, max 43 tokens
    • sentence2 (string): min 2 tokens, mean 78.22 tokens, max 330 tokens
  • Samples:
    sentence1 sentence2
    What forms when oceanic crust subducts into the mantle at convergent plate boundaries? Volcanic mountain ranges form when oceanic crust subducts into the mantle at convergent plate boundaries. The Andes Mountains are a chain of coastal volcanic mountains. They are forming as the Nazca plate subducts beneath the South American plate ( Figure below ).
    Are the joints between the vertebrae contained in your backbone fully movable, partially movable, or unmovable? Partly movable joints allow only a little movement. Your backbone has partly movable joints between the vertebrae ( Figure below ).
    What rod provides stiffness to counterbalance the pull of muscles? The notochord lies between the dorsal nerve cord and the digestive tract. It provides stiffness to counterbalance the pull of muscles.
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

qasc_pairs

  • Dataset: qasc_pairs at a34ba20
  • Size: 100 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 6 tokens, mean 11.41 tokens, max 21 tokens
    • sentence2 (string): min 20 tokens, mean 35.37 tokens, max 54 tokens
  • Samples:
    sentence1 sentence2
    a living thing with four divisions can be what with woody trunks? Most modern gymnosperms are trees with woody trunks.. Gymnosperms comprise four divisions.. a living thing with four divisions can be trees with woody trunks.
    What is hair not considered? Hair helps to insulate and protect the body.. Lean body mass and body fat are derived from total body water.. Hair is not considered fat.
    What can learn behavior that is intended to cause harm or pain? Aggression is behavior that is intended to cause harm or pain.. If they are around aggressive dogs, they learn to be aggressive.. dogs can learn behavior that is intended to cause harm or pain
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

openbookqa_pairs

  • Dataset: openbookqa_pairs
  • Size: 100 evaluation samples
  • Columns: question and fact
  • Approximate statistics based on the first 1000 samples:
    • question (string): min 3 tokens, mean 13.68 tokens, max 45 tokens
    • fact (string): min 4 tokens, mean 11.67 tokens, max 28 tokens
  • Samples:
    question fact
    The thermal production of a stove is generically used for a stove generates heat for cooking usually
    What creates a valley? a valley is formed by a river flowing
    when it turns day and night on a planet, what cause this? a planet rotating causes cycles of day and night on that planet
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

msmarco_pairs

  • Dataset: msmarco_pairs at 28ff31e
  • Size: 100 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 4 tokens, mean 8.51 tokens, max 16 tokens
    • sentence2 (string): min 34 tokens, mean 78.92 tokens, max 190 tokens
  • Samples:
    sentence1 sentence2
    how big is a medium sized dog Medium-size kennels are around 36 inches long, and can accommodate dogs in the 40- to 70-pound range. Bulldogs, cocker spaniels, and American Eskimo dogs at a normal adult size all fit well in these size kennels. Large-size kennels are around 42 inches long, and can accommodate dogs in the 70 to 90 pound range.
    types of wounds A National Athletic Trainers' Association answered. The five types of wounds are abrasion, avulsion, incision, laceration, and puncture. An abrasion is a wound caused by friction when a body scrapes across a rough surface. An avulsion is characterized by a flap. An incision is a cut with clean edges. A laceration is a cut with jagged edges.
    did mike tyson ever lose In 2002, Tyson fought for the world heavyweight title at the age of 35, losing by knockout to Lennox Lewis. Tyson retired from professional boxing in 2006, after being knocked out in consecutive matches against Danny Williams and Kevin McBride.
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

nq_pairs

  • Dataset: nq_pairs at f9e894e
  • Size: 100 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 10 tokens, mean 11.91 tokens, max 22 tokens
    • sentence2 (string): min 19 tokens, mean 134.49 tokens, max 512 tokens
  • Samples:
    sentence1 sentence2
    who is playing the fa cup final 2018 2018 FA Cup Final The 2018 FA Cup Final was the final match of the 2017–18 FA Cup and the 137th final of the FA Cup, the world's oldest football cup competition. It was played at Wembley Stadium in London, England[3] on 19 May 2018 between Manchester United and Chelsea. It was the second successive final for Chelsea following their defeat by Arsenal the previous year.
    what type of audio file does itunes use iTunes iTunes keeps track of songs by creating a virtual library, allowing users to access and edit a song's attributes. These attributes, known as metadata, are stored in a binary library file called iTunes Library, which uses a proprietary file format ("ITL"). It caches information like artist and genre from the audio format's tag capabilities (the ID3 tag, for example) and stores iTunes-specific information like play count and rating. iTunes typically reads library data only from this file.[27] A second file can also be created if users activate a preference; the iTunes Music Library.xml file is refreshed whenever information in iTunes is changed. It uses an XML format, allowing third-party apps to access the library information (including play count, last played date, and rating, which are not standard fields in the ID3v2.3 format). Apple's own iDVD, iMovie, and iPhoto applications all access the library.[28] If the first file exists but is corrupted, such as by making it zero-length, iTunes will attempt to reconstruct it from the XML file. Detailed third-party instructions regarding this can be found elsewhere.[29] Beginning with iTunes 10.5.3 this behavior has been changed such that the XML file is not read automatically to recreate the database when the database is corrupted. Rather, the user should load the iTunes Library.xml file via File > Library > Import Playlist....
    who played the werewolf in the old movies The Wolf Man (1941 film) The Wolf Man is a 1941 American horror film written by Curt Siodmak and produced and directed by George Waggner. The film features Lon Chaney Jr. in the title role, and also features Claude Rains, Warren William, Ralph Bellamy, Patric Knowles, and Bela Lugosi; with Evelyn Ankers, and Maria Ouspenskaya in supporting roles. The title character has had a great deal of influence on Hollywood's depictions of the legend of the werewolf.[2] The film is the second Universal Pictures werewolf film, preceded six years earlier by the less commercially successful Werewolf of London (1935).
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

trivia_pairs

  • Dataset: trivia_pairs at a7c36e3
  • Size: 100 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 9 tokens, mean 17.55 tokens, max 39 tokens
    • sentence2 (string): min 37 tokens, mean 452.6 tokens, max 512 tokens
  • Samples:
    sentence1 | sentence2
    Italian musician David Rizzio was private secretary to which British monarch? | Rizzio - Memidex dictionary/thesaurus Rizzio David Rizzio
    ‘Full English’ relates to which meal? | Your Guide to a Full English Breakfast (Fry-Up)
    What went with Blood and Sweat in the name of the 60s rock band? | Chicago Songs, History, and Biography Does Anybody Really Know What Time It Is? If You Leave Me Now Old Days Hard Habit to Break Where you might have heard them Their early '70s rock hits remain staples on classic rock radio; ditto for their late '70s and early '80s ballads on adult contemporary playlists. Occasionally, however, Chicago's catalog interacts with other realms of entertainment, like the highly ironic use of "If You Leave Me Now" in the classic Gulf War film Three Kings and the zombie spoof Shaun of the Dead, or "Saturday in the Park" being featured in an episode of "The Sopranos," or "Old Days" popping up in the films This is 40 and Starsky & Hutch. continue reading below our video 5 Urban Myths That Rule the Ages Formed 1967 (Chicago, IL) Styles Jazz-rock, Pop-rock, Classic Rock, Soft-rock, Adult Contemporary, Prog-rock Claims to fame: Did more than any other group to create a commercial fusion of jazz, classical, pop, and rock Their signature sound was the result of several multi-talented singers, songwriters and musicians A socially aware rock band whose lyrical activist sensibilities lasted longer than most Lead guitarist Terry Kath, who died tragically young, is considered one of the most underrated rock guitarists of the era Survived a series of setbacks to re-emerge in the '80s as a successful soft-rock group The classic Chicago lineup: Robert Lamm (born October 13, 1944, Brooklyn, NY): lead and backing vocals, piano, organ, guitar Peter Cetera (born September 13, 1944, Chicago, IL): lead and backing vocals, bass, guitar Terry Kath (born January 31, 1946, Chicago, IL; died January 23, 1978, Woodland Hills, CA): lead and backing vocals, lead guitar, bass Lee Loughnane (born October 21, 1946, Chicago, IL): trumpet, flugelhorn, guitar, percussion, lead and backing vocals James Pankow (born August 20, 1947, St. Louis, MO): trombone, keyboards, percussion, lead and backing vocals Walter Parazaider (born March 14, 1945, Chicago, IL): alto and tenor saxophones, flute, clarinet, backing vocals Danny Seraphine (born August 28, 1948, Chicago, IL) drums, percussion, keyboards The History of Chicago Early years Anyone even casually familiar with the band Chicago won't be surprised to learn they were a bunch of guys from the Windy City who took up their instruments at an early age, learning jazz and classical music before being seduced by the money (and women) available to rock and soul party bands. In fact, the members of Chicago, all but two of whom were born and raised in the city or its suburbs, formed the band that was to be their legacy after meeting at the city's famed DePaul University. Walter Parazaider, a classically trained clarinetist who had discovered the joys of the saxophone, was heading up a local rock band called the Missing Links, which at times included Terry Kath, Lee Loughnane and Danny Seraphine. Embolded by the Beatles' recent use of horn sections on songs like "Got to Get You Into My Life," Parazaider began to merge his two loves, expanding the band into a large jazz-rock outfit; Fellow student James Pankow soon joined, then organist and vocalist Robert Lamm, recruited from another local group. As Kath moved from bass to guitar, and with a tenor needed to complete the group's harmony, Peter Cetera was invited to join. Due to the unconventional nature of both their size and scope, they went by the name The Big Thing. Success Parazider's longtime musician friend James William Guercio, by 1967 a producer at Columbia Records, loved the concept and agreed to manage the band. Moving them out to Los Angeles, the group, now renamed Chicago Transit Authority after their hometown's bus line, rehearsed night and day while Guercio produced the second album by Blood, Sweat & Tears, another big rock band with similar ideas. When that album became a Grammy-winning smash, spinning off three hit singles, the stage was set for Chicago. The album Chicago Transit Authority was only successful on the new, free-form FM stations at first, but two years of buzz finally got them a hit with "25 or 6 to 4," and
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

quora_pairs

  • Dataset: quora_pairs at 451a485
  • Size: 100 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 7 tokens, mean 13.48 tokens, max 37 tokens
    • sentence2 (string): min 7 tokens, mean 13.96 tokens, max 36 tokens
  • Samples:
    sentence1 | sentence2
    What is it like to attend your high school reunion? | What was it like to go to your high school reunion?
    How do I concentrate in studies? | How can I concentrate in my daily studies?
    How do I become mentally stronger? | How do I become mentally strong?
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": 3,
        "last_layer_weight": 0.25,
        "prior_layers_weight": 2,
        "kl_div_weight": 0.75,
        "kl_temperature": 0.75
    }
    

gooaq_pairs

  • Dataset: gooaq_pairs at b089f72
  • Size: 100 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 8 tokens, mean 11.81 tokens, max 21 tokens
    • sentence2 (string): min 14 tokens, mean 55.42 tokens, max 107 tokens
  • Samples:
    sentence1 | sentence2
    can drinking too much water make you lose your period? | Not drinking enough water: Keeping yourself dehydrated during periods can lead to cramps and discomfort. During periods, you experience hormonal fluctuations and a bloated belly. As your estrogen and progesterone levels recede, your body retains more water.
    how long do side effects last after stopping medication? | Symptoms of Antidepressant Discontinuation Symptoms of antidepressant withdrawal depend on the specific medication you have been taking. Symptoms most often occur within three days of stopping the antidepressant. They are usually mild and go away within about two weeks.
    is jedediah a biblical name? | In the Hebrew Bible, Jedidiah (Jeddedi in Brenton's Septuagint Translation) was the second or "blessing" name given by God through the prophet Nathan in infancy to Solomon, second son of King David and Bathsheba.
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 2,
        "prior_layers_weight": 0.25,
        "kl_div_weight": 1.25,
        "kl_temperature": 0.9
    }
    

mrpc_pairs

  • Dataset: mrpc_pairs at bcdcba7
  • Size: 100 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 14 tokens, mean 26.86 tokens, max 42 tokens
    • sentence2 (string): min 14 tokens, mean 26.2 tokens, max 41 tokens
  • Samples:
    sentence1 | sentence2
    A former employee of a local power company pleaded guilty Wednesday to setting off a bomb that knocked out a power substation during the Winter Olympics last year . | A former Utah Power meter reader pleaded guilty Wednesday to bombing a power substation during the 2002 Winter Olympics .
    Metro , bus and local rail services in France 's four largest towns -- Paris , Lyon , Lille and Marseille -- were severely disrupted , Europe 1 radio reported . | Subway , bus and suburban rail services in France 's four largest cities -- Paris , Lyon , Lille and Marseille -- were severely disrupted , transport authorities said .
    The U.N. troops are in Congo to protect U.N. installations and personnel , and they can only fire in self defense and have been unable to stem the violence . | The troops - whose mandate is to protect U.N. installations and personnel - can only fire in self-defense and have been unable to stem the violence .
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "MultipleNegativesSymmetricRankingLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 0.75,
        "prior_layers_weight": 1,
        "kl_div_weight": 0.9,
        "kl_temperature": 0.75
    }
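
The loss settings listed for these evaluation datasets could be rebuilt roughly as follows with the Sentence Transformers loss classes. This is a minimal sketch, not the author's training script: `ADAPTIVE_LAYER_KWARGS`, `BASE_LOSS` and `build_loss` are illustrative names, and the guide model that GISTEmbedLoss requires is not named in this section, so `guide` is a placeholder the caller has to supply. The library is imported lazily so the parameter tables can be inspected without it installed.

```python
# AdaptiveLayerLoss settings copied from the per-dataset blocks of this card.
ADAPTIVE_LAYER_KWARGS = {
    "trivia_pairs": dict(n_layers_per_step=-1, last_layer_weight=2,
                         prior_layers_weight=0.25, kl_div_weight=1.25,
                         kl_temperature=0.9),
    "quora_pairs": dict(n_layers_per_step=3, last_layer_weight=0.25,
                        prior_layers_weight=2, kl_div_weight=0.75,
                        kl_temperature=0.75),
    "gooaq_pairs": dict(n_layers_per_step=-1, last_layer_weight=2,
                        prior_layers_weight=0.25, kl_div_weight=1.25,
                        kl_temperature=0.9),
    "mrpc_pairs": dict(n_layers_per_step=-1, last_layer_weight=0.75,
                       prior_layers_weight=1, kl_div_weight=0.9,
                       kl_temperature=0.75),
}

# Name of the inner loss wrapped by AdaptiveLayerLoss, per dataset.
BASE_LOSS = {
    "trivia_pairs": "GISTEmbedLoss",
    "quora_pairs": "GISTEmbedLoss",
    "gooaq_pairs": "GISTEmbedLoss",
    "mrpc_pairs": "MultipleNegativesSymmetricRankingLoss",
}


def build_loss(model, dataset_name, guide=None):
    """Wrap the dataset's base loss in AdaptiveLayerLoss (illustrative helper).

    `guide` is an assumption: the card section does not say which guide
    model was used for GISTEmbedLoss.
    """
    from sentence_transformers import losses  # lazy import

    if BASE_LOSS[dataset_name] == "GISTEmbedLoss":
        if guide is None:
            raise ValueError("GISTEmbedLoss needs a guide SentenceTransformer")
        base = losses.GISTEmbedLoss(model, guide)
    else:
        base = losses.MultipleNegativesSymmetricRankingLoss(model)
    return losses.AdaptiveLayerLoss(model, base, **ADAPTIVE_LAYER_KWARGS[dataset_name])
```

In the library's documentation, `n_layers_per_step=-1` means every layer contributes at each step, while a positive value such as the 3 used for quora_pairs samples that many layers per step.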
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 3e-05
  • weight_decay: 0.0001
  • num_train_epochs: 2
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_kwargs: {'num_cycles': 2}
  • warmup_ratio: 0.075
  • save_safetensors: False
  • fp16: True
  • push_to_hub: True
  • hub_model_id: bobox/DeBERTa-ST-AllLayers-v3.1-checkpoints-tmp
  • hub_strategy: all_checkpoints
  • batch_sampler: no_duplicates
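
To make the scheduler settings above concrete, the following sketch computes the learning rate implied by `lr_scheduler_type: cosine_with_restarts`, `num_cycles: 2`, `warmup_ratio: 0.075` and `learning_rate: 3e-05`. It is modelled on the hard-restart cosine schedule with linear warmup and is not taken from the training code. `lr_multiplier` is an illustrative name, and `TOTAL_STEPS` is an estimate inferred from the training log (step 6400 is logged at epoch 1.0030, so roughly 6,381 steps per epoch over 2 epochs), not a value stated in the card.

```python
import math


def lr_multiplier(step, total_steps, warmup_ratio=0.075, num_cycles=2):
    """Factor applied to the peak learning rate at a given optimizer step."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to the peak learning rate.
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    # Cosine decay that jumps back to the peak at the start of each cycle.
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * ((num_cycles * progress) % 1.0))))


PEAK_LR = 3e-05
TOTAL_STEPS = 12_762  # assumption: about 6,381 steps per epoch x 2 epochs

for step in (0, 479, 958, 3909, 6859, 6860, 12_761):
    print(step, f"{PEAK_LR * lr_multiplier(step, TOTAL_STEPS):.2e}")
```

Under that step estimate, warmup ends at step 958 and the second cycle restarts at the peak rate at step 6860, a little after the start of the second epoch.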

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 3e-05
  • weight_decay: 0.0001
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_kwargs: {'num_cycles': 2}
  • warmup_ratio: 0.075
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: False
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: bobox/DeBERTa-ST-AllLayers-v3.1-checkpoints-tmp
  • hub_strategy: all_checkpoints
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
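
The two sampler settings at the end of the list (`batch_sampler: no_duplicates` and `multi_dataset_batch_sampler: proportional`) can be pictured with the toy sketch below. It is a simplified illustration in plain Python, not the library's implementation; the function names and the example batch counts are hypothetical. The idea is that no text appears twice within a batch, which matters for losses that treat other in-batch items as negatives, and that each dataset contributes batches in proportion to how many batches it has.

```python
import random


def no_duplicates_batches(pairs, batch_size):
    """Greedy batching: defer a pair if one of its texts is already in the batch."""
    remaining = list(pairs)
    while remaining:
        batch, seen, deferred = [], set(), []
        for pair in remaining:
            if len(batch) < batch_size and not (set(pair) & seen):
                batch.append(pair)
                seen.update(pair)
            else:
                deferred.append(pair)
        yield batch
        remaining = deferred


def proportional_batch_order(batches_per_dataset, seed=42):
    """Shuffled schedule in which each dataset appears once per batch it holds."""
    order = [name for name, count in batches_per_dataset.items() for _ in range(count)]
    random.Random(seed).shuffle(order)
    return order


pairs = [("a", "b"), ("a", "c"), ("d", "e")]
print(list(no_duplicates_batches(pairs, batch_size=2)))

# Hypothetical batch counts, chosen only for illustration.
print(proportional_batch_order({"quora_pairs": 2, "gooaq_pairs": 5, "mrpc_pairs": 1}))
```

Unlike a production sampler, this sketch keeps the final short batch and does not shuffle samples within a dataset.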

Training Logs
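
Each row of the log has 21 whitespace-separated fields: epoch, step, training loss, then 17 per-dataset evaluation losses and the sts-test Spearman cosine score, with `-` marking steps where no evaluation ran. A small parser such as the sketch below can turn the rows into records. `EVAL_COLUMNS` and `parse_log_row` are illustrative names, and the sample rows are copied from steps 32, 320 and 640 of the log.

```python
EVAL_COLUMNS = [
    "trivia pairs loss", "scitail-pairs-pos loss", "vitaminc-pairs loss",
    "qasc pairs loss", "scitail-pairs-qa loss", "msmarco pairs loss",
    "nq pairs loss", "quora pairs loss", "qnli-contrastive loss",
    "nli-pairs loss", "sts-label loss", "compression-pairs loss",
    "xsum-pairs loss", "sciq pairs loss", "mrpc pairs loss",
    "openbookqa pairs loss", "gooaq pairs loss", "sts-test_spearman_cosine",
]


def parse_log_row(line):
    """Parse one log row into a dict; '-' entries become None."""
    fields = line.split()
    if len(fields) != 3 + len(EVAL_COLUMNS):
        raise ValueError(f"expected {3 + len(EVAL_COLUMNS)} fields, got {len(fields)}")
    row = {
        "epoch": float(fields[0]),
        "step": int(fields[1]),
        "training_loss": float(fields[2]),
    }
    for name, value in zip(EVAL_COLUMNS, fields[3:]):
        row[name] = None if value == "-" else float(value)
    return row


SAMPLE_ROWS = [
    "0.0050 32 0.6329 " + " ".join(["-"] * 18),
    "0.0501 320 0.6312 0.7809 0.5141 4.6926 0.2045 0.0761 0.5139 0.2351 0.0392 "
    "0.1608 1.0158 3.5502 0.0981 0.2558 0.2562 0.0550 1.7538 0.4713 0.7990",
    "0.1003 640 0.6372 0.8748 0.5404 4.7194 0.2102 0.0754 0.5103 0.2447 0.0782 "
    "0.1520 1.0653 3.6123 0.1007 0.2596 0.2645 0.0549 1.7905 0.5236 0.7997",
]

rows = [parse_log_row(line) for line in SAMPLE_ROWS]
evaluated = [r for r in rows if r["sts-test_spearman_cosine"] is not None]
best = max(evaluated, key=lambda r: r["sts-test_spearman_cosine"])
print(best["step"], best["sts-test_spearman_cosine"])
```

The same filter on `sts-test_spearman_cosine` picks out the evaluation rows, which in this log occur every 320 steps.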

Epoch | Step | Training Loss | trivia pairs loss | scitail-pairs-pos loss | vitaminc-pairs loss | qasc pairs loss | scitail-pairs-qa loss | msmarco pairs loss | nq pairs loss | quora pairs loss | qnli-contrastive loss | nli-pairs loss | sts-label loss | compression-pairs loss | xsum-pairs loss | sciq pairs loss | mrpc pairs loss | openbookqa pairs loss | gooaq pairs loss | sts-test_spearman_cosine
0.0050 32 0.6329 - - - - - - - - - - - - - - - - - -
0.0100 64 0.9693 - - - - - - - - - - - - - - - - - -
0.0150 96 0.6548 - - - - - - - - - - - - - - - - - -
0.0201 128 1.1279 - - - - - - - - - - - - - - - - - -
0.0251 160 1.0017 - - - - - - - - - - - - - - - - - -
0.0301 192 0.7571 - - - - - - - - - - - - - - - - - -
0.0351 224 0.7304 - - - - - - - - - - - - - - - - - -
0.0401 256 0.7636 - - - - - - - - - - - - - - - - - -
0.0451 288 0.482 - - - - - - - - - - - - - - - - - -
0.0501 320 0.6312 0.7809 0.5141 4.6926 0.2045 0.0761 0.5139 0.2351 0.0392 0.1608 1.0158 3.5502 0.0981 0.2558 0.2562 0.0550 1.7538 0.4713 0.7990
0.0552 352 0.5791 - - - - - - - - - - - - - - - - - -
0.0602 384 0.6413 - - - - - - - - - - - - - - - - - -
0.0652 416 0.4319 - - - - - - - - - - - - - - - - - -
0.0702 448 0.6672 - - - - - - - - - - - - - - - - - -
0.0752 480 0.459 - - - - - - - - - - - - - - - - - -
0.0802 512 0.7621 - - - - - - - - - - - - - - - - - -
0.0853 544 0.864 - - - - - - - - - - - - - - - - - -
0.0903 576 0.5081 - - - - - - - - - - - - - - - - - -
0.0953 608 0.654 - - - - - - - - - - - - - - - - - -
0.1003 640 0.6372 0.8748 0.5404 4.7194 0.2102 0.0754 0.5103 0.2447 0.0782 0.1520 1.0653 3.6123 0.1007 0.2596 0.2645 0.0549 1.7905 0.5236 0.7997
0.1053 672 0.9292 - - - - - - - - - - - - - - - - - -
0.1103 704 1.3108 - - - - - - - - - - - - - - - - - -
0.1153 736 0.9674 - - - - - - - - - - - - - - - - - -
0.1204 768 0.9226 - - - - - - - - - - - - - - - - - -
0.1254 800 0.789 - - - - - - - - - - - - - - - - - -
0.1304 832 0.5186 - - - - - - - - - - - - - - - - - -
0.1354 864 0.6726 - - - - - - - - - - - - - - - - - -
0.1404 896 0.5381 - - - - - - - - - - - - - - - - - -
0.1454 928 0.581 - - - - - - - - - - - - - - - - - -
0.1504 960 0.9038 0.9380 0.4760 4.7749 0.2327 0.0806 0.5809 0.2808 0.0913 0.1943 1.2173 3.1986 0.1009 0.2758 0.2688 0.0580 1.8053 0.5808 0.8006
0.1555 992 0.7964 - - - - - - - - - - - - - - - - - -
0.1605 1024 0.8213 - - - - - - - - - - - - - - - - - -
0.1655 1056 0.5396 - - - - - - - - - - - - - - - - - -
0.1705 1088 0.9297 - - - - - - - - - - - - - - - - - -
0.1755 1120 1.169 - - - - - - - - - - - - - - - - - -
0.1805 1152 0.7486 - - - - - - - - - - - - - - - - - -
0.1856 1184 0.6821 - - - - - - - - - - - - - - - - - -
0.1906 1216 0.6125 - - - - - - - - - - - - - - - - - -
0.1956 1248 0.8061 - - - - - - - - - - - - - - - - - -
0.2006 1280 0.6918 0.9222 0.4943 4.6790 0.1935 0.0747 0.5744 0.3035 0.0385 0.2037 1.1552 3.5581 0.0974 0.2828 0.2762 0.0562 1.6875 0.5284 0.8014
0.2056 1312 0.9421 - - - - - - - - - - - - - - - - - -
0.2106 1344 0.8641 - - - - - - - - - - - - - - - - - -
0.2156 1376 1.157 - - - - - - - - - - - - - - - - - -
0.2207 1408 0.8772 - - - - - - - - - - - - - - - - - -
0.2257 1440 1.0496 - - - - - - - - - - - - - - - - - -
0.2307 1472 0.589 - - - - - - - - - - - - - - - - - -
0.2357 1504 0.8234 - - - - - - - - - - - - - - - - - -
0.2407 1536 0.7365 - - - - - - - - - - - - - - - - - -
0.2457 1568 0.5076 - - - - - - - - - - - - - - - - - -
0.2507 1600 1.0329 0.8292 0.5018 4.7846 0.2132 0.0755 0.5719 0.2738 0.0800 0.1859 1.0577 3.6113 0.0974 0.2749 0.2648 0.0551 1.7900 0.5998 0.8019
0.2558 1632 1.4006 - - - - - - - - - - - - - - - - - -
0.2608 1664 0.5963 - - - - - - - - - - - - - - - - - -
0.2658 1696 0.7488 - - - - - - - - - - - - - - - - - -
0.2708 1728 0.8548 - - - - - - - - - - - - - - - - - -
0.2758 1760 1.3324 - - - - - - - - - - - - - - - - - -
0.2808 1792 0.5804 - - - - - - - - - - - - - - - - - -
0.2858 1824 0.7827 - - - - - - - - - - - - - - - - - -
0.2909 1856 0.5448 - - - - - - - - - - - - - - - - - -
0.2959 1888 0.7368 - - - - - - - - - - - - - - - - - -
0.3009 1920 0.5657 0.9042 0.4916 4.6983 0.1963 0.0701 0.5171 0.2481 0.0360 0.1133 1.0244 3.1822 0.0920 0.2594 0.2914 0.0498 1.7960 0.5626 0.7984
0.3059 1952 0.7425 - - - - - - - - - - - - - - - - - -
0.3109 1984 0.7819 - - - - - - - - - - - - - - - - - -
0.3159 2016 0.5937 - - - - - - - - - - - - - - - - - -
0.3210 2048 0.8133 - - - - - - - - - - - - - - - - - -
0.3260 2080 1.0674 - - - - - - - - - - - - - - - - - -
0.3310 2112 0.6288 - - - - - - - - - - - - - - - - - -
0.3360 2144 0.5866 - - - - - - - - - - - - - - - - - -
0.3410 2176 0.6962 - - - - - - - - - - - - - - - - - -
0.3460 2208 0.5562 - - - - - - - - - - - - - - - - - -
0.3510 2240 0.8871 0.8545 0.5766 4.6446 0.1923 0.0664 0.5325 0.2721 0.0844 0.1122 1.0148 3.9153 0.0919 0.2596 0.2843 0.0509 1.6234 0.5712 0.8015
0.3561 2272 0.6805 - - - - - - - - - - - - - - - - - -
0.3611 2304 1.0451 - - - - - - - - - - - - - - - - - -
0.3661 2336 1.0603 - - - - - - - - - - - - - - - - - -
0.3711 2368 0.8142 - - - - - - - - - - - - - - - - - -
0.3761 2400 1.7211 - - - - - - - - - - - - - - - - - -
0.3811 2432 0.7523 - - - - - - - - - - - - - - - - - -
0.3861 2464 0.8053 - - - - - - - - - - - - - - - - - -
0.3912 2496 0.8427 - - - - - - - - - - - - - - - - - -
0.3962 2528 0.8204 - - - - - - - - - - - - - - - - - -
0.4012 2560 0.5343 0.8588 0.4768 4.7234 0.1924 0.0674 0.5183 0.3077 0.0798 0.1508 1.0022 3.8186 0.0960 0.2572 0.2735 0.0512 1.6290 0.6571 0.8039
0.4062 2592 0.9709 - - - - - - - - - - - - - - - - - -
0.4112 2624 0.708 - - - - - - - - - - - - - - - - - -
0.4162 2656 0.4083 - - - - - - - - - - - - - - - - - -
0.4213 2688 0.8732 - - - - - - - - - - - - - - - - - -
0.4263 2720 1.2616 - - - - - - - - - - - - - - - - - -
0.4313 2752 1.3324 - - - - - - - - - - - - - - - - - -
0.4363 2784 0.6244 - - - - - - - - - - - - - - - - - -
0.4413 2816 0.6176 - - - - - - - - - - - - - - - - - -
0.4463 2848 0.6926 - - - - - - - - - - - - - - - - - -
0.4513 2880 0.8158 0.8937 0.4763 4.7561 0.1805 0.0703 0.5720 0.2748 0.0799 0.1333 1.2103 3.4280 0.0984 0.2574 0.2824 0.0539 1.5709 0.6211 0.8024
0.4564 2912 1.4753 - - - - - - - - - - - - - - - - - -
0.4614 2944 0.5735 - - - - - - - - - - - - - - - - - -
0.4664 2976 1.2261 - - - - - - - - - - - - - - - - - -
0.4714 3008 0.6085 - - - - - - - - - - - - - - - - - -
0.4764 3040 0.8766 - - - - - - - - - - - - - - - - - -
0.4814 3072 1.1824 - - - - - - - - - - - - - - - - - -
0.4864 3104 0.7192 - - - - - - - - - - - - - - - - - -
0.4915 3136 0.6131 - - - - - - - - - - - - - - - - - -
0.4965 3168 0.7407 - - - - - - - - - - - - - - - - - -
0.5015 3200 0.5857 0.8825 0.5107 4.7481 0.1826 0.0667 0.5865 0.2539 0.0626 0.1034 1.1432 3.9403 0.0910 0.2657 0.3079 0.0503 1.5945 0.5953 0.8013
0.5065 3232 0.6212 - - - - - - - - - - - - - - - - - -
0.5115 3264 1.1408 - - - - - - - - - - - - - - - - - -
0.5165 3296 0.6898 - - - - - - - - - - - - - - - - - -
0.5215 3328 0.9827 - - - - - - - - - - - - - - - - - -
0.5266 3360 0.9518 - - - - - - - - - - - - - - - - - -
0.5316 3392 0.5584 - - - - - - - - - - - - - - - - - -
0.5366 3424 1.3362 - - - - - - - - - - - - - - - - - -
0.5416 3456 0.4418 - - - - - - - - - - - - - - - - - -
0.5466 3488 0.5896 - - - - - - - - - - - - - - - - - -
0.5516 3520 0.7951 0.9037 0.5180 4.6285 0.1791 0.0601 0.5547 0.2480 0.0573 0.1186 1.0017 3.6985 0.0899 0.2575 0.2898 0.0476 1.6558 0.5602 0.8003
0.5567 3552 0.5201 - - - - - - - - - - - - - - - - - -
0.5617 3584 0.6351 - - - - - - - - - - - - - - - - - -
0.5667 3616 0.8652 - - - - - - - - - - - - - - - - - -
0.5717 3648 0.6407 - - - - - - - - - - - - - - - - - -
0.5767 3680 0.9435 - - - - - - - - - - - - - - - - - -
0.5817 3712 0.9295 - - - - - - - - - - - - - - - - - -
0.5867 3744 0.6829 - - - - - - - - - - - - - - - - - -
0.5918 3776 0.8683 - - - - - - - - - - - - - - - - - -
0.5968 3808 1.115 - - - - - - - - - - - - - - - - - -
0.6018 3840 1.0936 0.7936 0.4620 4.7559 0.1861 0.0611 0.5555 0.2324 0.0594 0.1389 0.9106 3.4692 0.0881 0.2492 0.2829 0.0463 1.6010 0.5736 0.8005
0.6068 3872 0.8689 - - - - - - - - - - - - - - - - - -
0.6118 3904 0.8692 - - - - - - - - - - - - - - - - - -
0.6168 3936 0.9083 - - - - - - - - - - - - - - - - - -
0.6218 3968 1.0782 - - - - - - - - - - - - - - - - - -
0.6269 4000 0.7711 - - - - - - - - - - - - - - - - - -
0.6319 4032 1.0005 - - - - - - - - - - - - - - - - - -
0.6369 4064 0.7229 - - - - - - - - - - - - - - - - - -
0.6419 4096 0.4871 - - - - - - - - - - - - - - - - - -
0.6469 4128 0.7853 - - - - - - - - - - - - - - - - - -
0.6519 4160 0.9271 0.8565 0.4259 4.7202 0.1750 0.0614 0.5042 0.2456 0.0526 0.1426 0.8980 3.9554 0.0884 0.2565 0.2947 0.0471 1.5555 0.5816 0.8043
0.6570 4192 0.5223 - - - - - - - - - - - - - - - - - -
0.6620 4224 1.0498 - - - - - - - - - - - - - - - - - -
0.6670 4256 0.6791 - - - - - - - - - - - - - - - - - -
0.6720 4288 0.8836 - - - - - - - - - - - - - - - - - -
0.6770 4320 0.6035 - - - - - - - - - - - - - - - - - -
0.6820 4352 0.5167 - - - - - - - - - - - - - - - - - -
0.6870 4384 0.981 - - - - - - - - - - - - - - - - - -
0.6921 4416 0.4873 - - - - - - - - - - - - - - - - - -
0.6971 4448 0.4762 - - - - - - - - - - - - - - - - - -
0.7021 4480 0.8201 0.7997 0.4325 4.7335 0.1771 0.0596 0.5347 0.2483 0.0307 0.1156 0.8704 3.5892 0.0871 0.2489 0.2863 0.0446 1.5271 0.5037 0.8043
0.7071 4512 0.7799 - - - - - - - - - - - - - - - - - -
0.7121 4544 0.8006 - - - - - - - - - - - - - - - - - -
0.7171 4576 0.5123 - - - - - - - - - - - - - - - - - -
0.7221 4608 0.7421 - - - - - - - - - - - - - - - - - -
0.7272 4640 0.9477 - - - - - - - - - - - - - - - - - -
0.7322 4672 0.5021 - - - - - - - - - - - - - - - - - -
0.7372 4704 0.931 - - - - - - - - - - - - - - - - - -
0.7422 4736 0.7777 - - - - - - - - - - - - - - - - - -
0.7472 4768 0.9462 - - - - - - - - - - - - - - - - - -
0.7522 4800 0.5846 0.8120 0.4563 4.6871 0.1704 0.0585 0.5062 0.2288 0.0621 0.1415 0.9292 3.8014 0.0868 0.2348 0.2816 0.0438 1.5671 0.4848 0.8044
0.7572 4832 0.6735 - - - - - - - - - - - - - - - - - -
0.7623 4864 1.1569 - - - - - - - - - - - - - - - - - -
0.7673 4896 0.9749 - - - - - - - - - - - - - - - - - -
0.7723 4928 0.6581 - - - - - - - - - - - - - - - - - -
0.7773 4960 0.6979 - - - - - - - - - - - - - - - - - -
0.7823 4992 0.7582 - - - - - - - - - - - - - - - - - -
0.7873 5024 1.0082 - - - - - - - - - - - - - - - - - -
0.7924 5056 0.6206 - - - - - - - - - - - - - - - - - -
0.7974 5088 0.5165 - - - - - - - - - - - - - - - - - -
0.8024 5120 0.4914 0.7989 0.4786 4.7251 0.1739 0.0556 0.5132 0.2343 0.0558 0.1053 0.8841 3.6535 0.0838 0.2356 0.2816 0.0417 1.5919 0.4890 0.8042
0.8074 5152 1.098 - - - - - - - - - - - - - - - - - -
0.8124 5184 0.821 - - - - - - - - - - - - - - - - - -
0.8174 5216 0.9351 - - - - - - - - - - - - - - - - - -
0.8224 5248 0.8784 - - - - - - - - - - - - - - - - - -
0.8275 5280 0.8326 - - - - - - - - - - - - - - - - - -
0.8325 5312 0.7551 - - - - - - - - - - - - - - - - - -
0.8375 5344 0.8234 - - - - - - - - - - - - - - - - - -
0.8425 5376 1.0922 - - - - - - - - - - - - - - - - - -
0.8475 5408 1.0925 - - - - - - - - - - - - - - - - - -
0.8525 5440 1.099 0.7568 0.4714 4.6969 0.1696 0.0573 0.5048 0.2302 0.0575 0.1292 0.8909 3.7946 0.0850 0.2318 0.2747 0.0438 1.5425 0.4945 0.8088
0.8575 5472 0.5396 - - - - - - - - - - - - - - - - - -
0.8626 5504 0.6636 - - - - - - - - - - - - - - - - - -
0.8676 5536 1.0095 - - - - - - - - - - - - - - - - - -
0.8726 5568 0.631 - - - - - - - - - - - - - - - - - -
0.8776 5600 0.5415 - - - - - - - - - - - - - - - - - -
0.8826 5632 0.9227 - - - - - - - - - - - - - - - - - -
0.8876 5664 0.8991 - - - - - - - - - - - - - - - - - -
0.8927 5696 0.5068 - - - - - - - - - - - - - - - - - -
0.8977 5728 1.2134 - - - - - - - - - - - - - - - - - -
0.9027 5760 0.4651 0.7480 0.4753 4.7176 0.1701 0.0550 0.4862 0.2252 0.0606 0.1183 0.8765 3.7598 0.0831 0.2294 0.2765 0.0418 1.5488 0.4697 0.8071
0.9077 5792 0.6346 - - - - - - - - - - - - - - - - - -
0.9127 5824 1.1103 - - - - - - - - - - - - - - - - - -
0.9177 5856 0.7667 - - - - - - - - - - - - - - - - - -
0.9227 5888 0.9174 - - - - - - - - - - - - - - - - - -
0.9278 5920 0.7609 - - - - - - - - - - - - - - - - - -
0.9328 5952 0.8993 - - - - - - - - - - - - - - - - - -
0.9378 5984 0.7587 - - - - - - - - - - - - - - - - - -
0.9428 6016 0.935 - - - - - - - - - - - - - - - - - -
0.9478 6048 0.8551 - - - - - - - - - - - - - - - - - -
0.9528 6080 1.4247 0.7455 0.4640 4.7075 0.1663 0.0553 0.4886 0.2193 0.0537 0.1200 0.8808 3.7687 0.0825 0.2277 0.2470 0.0418 1.5344 0.4775 0.8078
0.9578 6112 0.3377 - - - - - - - - - - - - - - - - - -
0.9629 6144 1.163 - - - - - - - - - - - - - - - - - -
0.9679 6176 1.1638 - - - - - - - - - - - - - - - - - -
0.9729 6208 0.7428 - - - - - - - - - - - - - - - - - -
0.9779 6240 0.3827 - - - - - - - - - - - - - - - - - -
0.9829 6272 1.0739 - - - - - - - - - - - - - - - - - -
0.9879 6304 0.7049 - - - - - - - - - - - - - - - - - -
0.9929 6336 0.9298 - - - - - - - - - - - - - - - - - -
0.9980 6368 0.6243 - - - - - - - - - - - - - - - - - -
1.0030 6400 0.8693 0.7522 0.4743 4.6926 0.1651 0.0544 0.4899 0.2182 0.0266 0.1122 0.8765 3.7495 0.0813 0.2270 0.2293 0.0412 1.5506 0.4720 0.8073
1.0080 6432 0.731 - - - - - - - - - - - - - - - - - -
1.0130 6464 0.7662 - - - - - - - - - - - - - - - - - -
1.0180 6496 0.5362 - - - - - - - - - - - - - - - - - -
1.0230 6528 0.9786 - - - - - - - - - - - - - - - - - -
1.0281 6560 0.9213 - - - - - - - - - - - - - - - - - -
1.0331 6592 0.7601 - - - - - - - - - - - - - - - - - -
1.0381 6624 0.4821 - - - - - - - - - - - - - - - - - -
1.0431 6656 0.73 - - - - - - - - - - - - - - - - - -
1.0481 6688 0.4139 - - - - - - - - - - - - - - - - - -
1.0531 6720 0.5152 0.7513 0.4694 4.6802 0.1659 0.0549 0.4901 0.2205 0.0250 0.1132 0.8771 3.7476 0.0817 0.2276 0.2293 0.0415 1.5460 0.4723 0.8081
1.0581 6752 0.4684 - - - - - - - - - - - - - - - - - -
1.0632 6784 0.445 - - - - - - - - - - - - - - - - - -
1.0682 6816 0.4288 - - - - - - - - - - - - - - - - - -
1.0732 6848 0.3797 - - - - - - - - - - - - - - - - - -
1.0782 6880 0.4304 - - - - - - - - - - - - - - - - - -
1.0832 6912 0.8562 - - - - - - - - - - - - - - - - - -
1.0882 6944 0.4902 - - - - - - - - - - - - - - - - - -
1.0932 6976 0.4285 - - - - - - - - - - - - - - - - - -
1.0983 7008 0.4782 - - - - - - - - - - - - - - - - - -
1.1033 7040 0.7503 0.9699 0.5473 4.5217 0.1793 0.0636 0.4798 0.2459 0.0316 0.1796 0.9263 3.8786 0.0929 0.2405 0.2890 0.0485 1.7124 0.5500 0.8109
1.1083 7072 1.0828 - - - - - - - - - - - - - - - - - -
1.1133 7104 0.6206 - - - - - - - - - - - - - - - - - -
1.1183 7136 0.8111 - - - - - - - - - - - - - - - - - -
1.1233 7168 0.49 - - - - - - - - - - - - - - - - - -
1.1283 7200 0.5289 - - - - - - - - - - - - - - - - - -
1.1334 7232 0.2983 - - - - - - - - - - - - - - - - - -
1.1384 7264 0.5183 - - - - - - - - - - - - - - - - - -
1.1434 7296 0.3254 - - - - - - - - - - - - - - - - - -
1.1484 7328 0.5142 - - - - - - - - - - - - - - - - - -
1.1534 7360 0.5605 0.9398 0.4742 4.8611 0.1884 0.0625 0.5194 0.2714 0.0587 0.2063 1.0348 3.8329 0.0926 0.2374 0.2771 0.0474 1.8109 0.6362 0.8067
1.1584 7392 0.6993 - - - - - - - - - - - - - - - - - -
1.1635 7424 0.3437 - - - - - - - - - - - - - - - - - -
1.1685 7456 0.3281 - - - - - - - - - - - - - - - - - -
1.1735 7488 1.0286 - - - - - - - - - - - - - - - - - -
1.1785 7520 0.6668 - - - - - - - - - - - - - - - - - -
1.1835 7552 0.3861 - - - - - - - - - - - - - - - - - -
1.1885 7584 0.4096 - - - - - - - - - - - - - - - - - -
1.1935 7616 0.5836 - - - - - - - - - - - - - - - - - -
1.1986 7648 0.2649 - - - - - - - - - - - - - - - - - -
1.2036 7680 0.5884 0.9296 0.5000 4.7865 0.1851 0.0548 0.5506 0.2425 0.0315 0.1508 1.0354 3.8022 0.0859 0.2493 0.2833 0.0428 1.6866 0.5460 0.8000
1.2086 7712 0.7018 - - - - - - - - - - - - - - - - - -
1.2136 7744 0.7082 - - - - - - - - - - - - - - - - - -
1.2186 7776 0.7527 - - - - - - - - - - - - - - - - - -
1.2236 7808 0.4255 - - - - - - - - - - - - - - - - - -
1.2286 7840 0.7488 - - - - - - - - - - - - - - - - - -
1.2337 7872 0.3364 - - - - - - - - - - - - - - - - - -
1.2387 7904 0.6963 - - - - - - - - - - - - - - - - - -
1.2437 7936 0.2829 - - - - - - - - - - - - - - - - - -
1.2487 7968 0.7504 - - - - - - - - - - - - - - - - - -
1.2537 8000 0.7759 0.7162 0.4485 4.8030 0.1880 0.0619 0.4859 0.2364 0.0622 0.2386 0.9781 4.1984 0.0850 0.2367 0.2703 0.0472 1.7419 0.6093 0.8107
1.2587 8032 0.5297 - - - - - - - - - - - - - - - - - -
1.2638 8064 0.4933 - - - - - - - - - - - - - - - - - -
1.2688 8096 0.3868 - - - - - - - - - - - - - - - - - -
1.2738 8128 0.9955 - - - - - - - - - - - - - - - - - -
1.2788 8160 0.5548 - - - - - - - - - - - - - - - - - -
1.2838 8192 0.4924 - - - - - - - - - - - - - - - - - -
1.2888 8224 0.3422 - - - - - - - - - - - - - - - - - -
1.2938 8256 0.4707 - - - - - - - - - - - - - - - - - -
1.2989 8288 0.3956 - - - - - - - - - - - - - - - - - -
1.3039 8320 0.547 0.8857 0.4749 4.7629 0.1739 0.0527 0.5004 0.2118 0.0293 0.1351 0.9302 3.5312 0.0791 0.2362 0.2984 0.0405 1.8043 0.5669 0.8020
1.3089 8352 0.5412 - - - - - - - - - - - - - - - - - -
1.3139 8384 0.3885 - - - - - - - - - - - - - - - - - -
1.3189 8416 0.4274 - - - - - - - - - - - - - - - - - -
1.3239 8448 0.893 - - - - - - - - - - - - - - - - - -
1.3289 8480 0.3456 - - - - - - - - - - - - - - - - - -
1.3340 8512 0.4292 - - - - - - - - - - - - - - - - - -
1.3390 8544 0.4275 - - - - - - - - - - - - - - - - - -
1.3440 8576 0.3236 - - - - - - - - - - - - - - - - - -
1.3490 8608 0.3961 - - - - - - - - - - - - - - - - - -
1.3540 8640 0.5146 0.8409 0.4793 4.7572 0.1706 0.0500 0.4634 0.2150 0.0247 0.1045 0.9968 3.5627 0.0777 0.2310 0.2708 0.0391 1.7370 0.5490 0.8042
1.3590 8672 0.7562 - - - - - - - - - - - - - - - - - -
1.3640 8704 0.7881 - - - - - - - - - - - - - - - - - -
1.3691 8736 0.6117 - - - - - - - - - - - - - - - - - -
1.3741 8768 1.3083 - - - - - - - - - - - - - - - - - -
1.3791 8800 0.5359 - - - - - - - - - - - - - - - - - -
1.3841 8832 0.45 - - - - - - - - - - - - - - - - - -
1.3891 8864 0.6022 - - - - - - - - - - - - - - - - - -
1.3941 8896 0.6664 - - - - - - - - - - - - - - - - - -
1.3992 8928 0.3255 - - - - - - - - - - - - - - - - - -
1.4042 8960 0.6036 0.8256 0.3827 4.6999 0.1707 0.0539 0.4949 0.2415 0.0267 0.1461 0.8724 3.5730 0.0786 0.2278 0.2640 0.0412 1.7219 0.5385 0.8082
1.4092 8992 0.4723 - - - - - - - - - - - - - - - - - -
1.4142 9024 0.2569 - - - - - - - - - - - - - - - - - -
1.4192 9056 0.5794 - - - - - - - - - - - - - - - - - -
1.4242 9088 1.022 - - - - - - - - - - - - - - - - - -
1.4292 9120 1.0539 - - - - - - - - - - - - - - - - - -
1.4343 9152 0.4634 - - - - - - - - - - - - - - - - - -
1.4393 9184 0.3755 - - - - - - - - - - - - - - - - - -
1.4443 9216 0.4033 - - - - - - - - - - - - - - - - - -
1.4493 9248 0.522 - - - - - - - - - - - - - - - - - -
1.4543 9280 1.1067 0.7743 0.4368 4.6339 0.1731 0.0541 0.4894 0.2487 0.0238 0.1774 1.0503 3.7515 0.0801 0.2187 0.2320 0.0442 1.6407 0.5432 0.8089
1.4593 9312 0.6612 - - - - - - - - - - - - - - - - - -
1.4643 9344 0.5152 - - - - - - - - - - - - - - - - - -
1.4694 9376 0.7975 - - - - - - - - - - - - - - - - - -
1.4744 9408 0.574 - - - - - - - - - - - - - - - - - -
1.4794 9440 0.8784 - - - - - - - - - - - - - - - - - -
1.4844 9472 0.807 - - - - - - - - - - - - - - - - - -
1.4894 9504 0.4858 - - - - - - - - - - - - - - - - - -
1.4944 9536 0.542 - - - - - - - - - - - - - - - - - -
1.4995 9568 0.4288 - - - - - - - - - - - - - - - - - -
1.5045 9600 0.3218 0.8712 0.4019 4.7045 0.1558 0.0489 0.4834 0.2272 0.0451 0.0879 0.9722 3.9334 0.0725 0.2280 0.2603 0.0378 1.7225 0.5696 0.8058
1.5095 9632 0.7936 - - - - - - - - - - - - - - - - - -
1.5145 9664 0.5664 - - - - - - - - - - - - - - - - - -
1.5195 9696 0.7019 - - - - - - - - - - - - - - - - - -
1.5245 9728 0.6887 - - - - - - - - - - - - - - - - - -
1.5295 9760 0.5558 - - - - - - - - - - - - - - - - - -
1.5346 9792 0.7874 - - - - - - - - - - - - - - - - - -
1.5396 9824 0.6661 - - - - - - - - - - - - - - - - - -
1.5446 9856 0.314 - - - - - - - - - - - - - - - - - -
1.5496 9888 0.6541 - - - - - - - - - - - - - - - - - -
1.5546 9920 0.3876 0.8161 0.4282 4.6019 0.1474 0.0478 0.5195 0.2488 0.0193 0.1276 0.9812 3.7592 0.0711 0.2148 0.2558 0.0373 1.7022 0.5342 0.8063
1.5596 9952 0.4225 - - - - - - - - - - - - - - - - - -
1.5646 9984 0.5979 - - - - - - - - - - - - - - - - - -
1.5697 10016 0.4349 - - - - - - - - - - - - - - - - - -
1.5747 10048 0.8265 - - - - - - - - - - - - - - - - - -
1.5797 10080 0.4669 - - - - - - - - - - - - - - - - - -
1.5847 10112 0.6543 - - - - - - - - - - - - - - - - - -
1.5897 10144 0.5953 - - - - - - - - - - - - - - - - - -
1.5947 10176 0.7695 - - - - - - - - - - - - - - - - - -
1.5997 10208 1.0416 - - - - - - - - - - - - - - - - - -
1.6048 10240 0.582 0.8168 0.3981 4.6404 0.1526 0.0492 0.4880 0.2256 0.0500 0.1247 0.8930 3.7979 0.0761 0.2191 0.2541 0.0378 1.6163 0.5045 0.8082
1.6098 10272 0.4853 - - - - - - - - - - - - - - - - - -
1.6148 10304 0.7606 - - - - - - - - - - - - - - - - - -
1.6198 10336 0.7573 - - - - - - - - - - - - - - - - - -
1.6248 10368 0.8745 - - - - - - - - - - - - - - - - - -
1.6298 10400 0.5335 - - - - - - - - - - - - - - - - - -
1.6349 10432 0.8592 - - - - - - - - - - - - - - - - - -
1.6399 10464 0.5884 - - - - - - - - - - - - - - - - - -
1.6449 10496 0.5912 - - - - - - - - - - - - - - - - - -
1.6499 10528 0.4696 - - - - - - - - - - - - - - - - - -
1.6549 10560 0.6711 0.7470 0.3632 4.6748 0.1483 0.0474 0.4563 0.2126 0.0224 0.1501 0.8485 4.1075 0.0723 0.2246 0.2475 0.0358 1.5900 0.5061 0.8069
1.6599 10592 0.6604 - - - - - - - - - - - - - - - - - -
1.6649 10624 0.7325 - - - - - - - - - - - - - - - - - -
1.6700 10656 0.5003 - - - - - - - - - - - - - - - - - -
1.6750 10688 0.7602 - - - - - - - - - - - - - - - - - -
1.6800 10720 0.3509 - - - - - - - - - - - - - - - - - -
1.6850 10752 0.5256 - - - - - - - - - - - - - - - - - -
1.6900 10784 0.72 - - - - - - - - - - - - - - - - - -
1.6950 10816 0.3566 - - - - - - - - - - - - - - - - - -
1.7000 10848 0.4914 - - - - - - - - - - - - - - - - - -
1.7051 10880 0.803 0.7336 0.3736 4.7360 0.1498 0.0480 0.4678 0.2350 0.0248 0.1196 0.8494 3.8989 0.0745 0.2213 0.2422 0.0366 1.5623 0.4660 0.8067
1.7101 10912 0.631 - - - - - - - - - - - - - - - - - -
1.7151 10944 0.4674 - - - - - - - - - - - - - - - - - -
1.7201 10976 0.59 - - - - - - - - - - - - - - - - - -
1.7251 11008 0.6661 - - - - - - - - - - - - - - - - - -
1.7301 11040 0.5495 - - - - - - - - - - - - - - - - - -
1.7352 11072 0.4449 - - - - - - - - - - - - - - - - - -
1.7402 11104 0.9734 - - - - - - - - - - - - - - - - - -
1.7452 11136 0.8756 - - - - - - - - - - - - - - - - - -
1.7502 11168 0.5044 - - - - - - - - - - - - - - - - - -
1.7552 11200 0.4335 0.7242 0.3770 4.7027 0.1496 0.0460 0.4625 0.2057 0.0449 0.1332 0.8772 4.0420 0.0724 0.2132 0.2311 0.0358 1.5884 0.4569 0.8063
1.7602 11232 0.9002 - - - - - - - - - - - - - - - - - -
1.7652 11264 0.7993 - - - - - - - - - - - - - - - - - -
1.7703 11296 0.7534 - - - - - - - - - - - - - - - - - -
1.7753 11328 0.505 - - - - - - - - - - - - - - - - - -
1.7803 11360 0.5255 - - - - - - - - - - - - - - - - - -
1.7853 11392 1.1055 - - - - - - - - - - - - - - - - - -
1.7903 11424 0.4554 - - - - - - - - - - - - - - - - - -
1.7953 11456 0.4593 - - - - - - - - - - - - - - - - - -
1.8003 11488 0.3412 - - - - - - - - - - - - - - - - - -
1.8054 11520 0.5286 0.7080 0.3906 4.7211 0.1465 0.0450 0.4674 0.2045 0.0527 0.1149 0.8453 3.9840 0.0704 0.2134 0.2339 0.0343 1.5864 0.4692 0.8070
1.8104 11552 1.1054 - - - - - - - - - - - - - - - - - -
1.8154 11584 0.8731 - - - - - - - - - - - - - - - - - -
1.8204 11616 0.7774 - - - - - - - - - - - - - - - - - -
1.8254 11648 0.7425 - - - - - - - - - - - - - - - - - -
1.8304 11680 0.4233 - - - - - - - - - - - - - - - - - -
1.8354 11712 1.0839 - - - - - - - - - - - - - - - - - -
1.8405 11744 1.0086 - - - - - - - - - - - - - - - - - -
1.8455 11776 0.9838 - - - - - - - - - - - - - - - - - -
1.8505 11808 1.0228 - - - - - - - - - - - - - - - - - -
1.8555 11840 0.5337 0.6966 0.3860 4.7110 0.1449 0.0454 0.4610 0.2012 0.0451 0.1217 0.8483 4.0680 0.0720 0.2120 0.2290 0.0352 1.5618 0.4564 0.8091
1.8605 11872 0.4719 - - - - - - - - - - - - - - - - - -
1.8655 11904 0.9254 - - - - - - - - - - - - - - - - - -
1.8706 11936 0.4605 - - - - - - - - - - - - - - - - - -
1.8756 11968 0.5605 - - - - - - - - - - - - - - - - - -
1.8806 12000 0.804 - - - - - - - - - - - - - - - - - -
1.8856 12032 0.8148 - - - - - - - - - - - - - - - - - -
1.8906 12064 0.6428 - - - - - - - - - - - - - - - - - -
1.8956 12096 0.764 - - - - - - - - - - - - - - - - - -
1.9006 12128 0.8099 - - - - - - - - - - - - - - - - - -
1.9057 12160 0.3568 0.6930 0.3947 4.7134 0.1439 0.0448 0.4563 0.1980 0.0221 0.1216 0.8413 4.0270 0.0713 0.2107 0.2297 0.0346 1.5734 0.4509 0.8084
1.9107 12192 0.6994 - - - - - - - - - - - - - - - - - -
1.9157 12224 1.102 - - - - - - - - - - - - - - - - - -
1.9207 12256 0.7589 - - - - - - - - - - - - - - - - - -
1.9257 12288 0.8421 - - - - - - - - - - - - - - - - - -
1.9307 12320 0.6796 - - - - - - - - - - - - - - - - - -
1.9357 12352 0.8515 - - - - - - - - - - - - - - - - - -
1.9408 12384 0.6122 - - - - - - - - - - - - - - - - - -
1.9458 12416 1.1603 - - - - - - - - - - - - - - - - - -
1.9508 12448 1.2334 - - - - - - - - - - - - - - - - - -
1.9558 12480 0.6642 0.6915 0.3929 4.7102 0.1436 0.0447 0.4545 0.1974 0.0493 0.1189 0.8452 4.0444 0.0710 0.2108 0.2221 0.0345 1.5630 0.4530 0.8084
1.9608 12512 0.747 - - - - - - - - - - - - - - - - - -
1.9658 12544 0.9231 - - - - - - - - - - - - - - - - - -
1.9709 12576 1.1242 - - - - - - - - - - - - - - - - - -
1.9759 12608 0.5239 - - - - - - - - - - - - - - - - - -
1.9809 12640 0.697 - - - - - - - - - - - - - - - - - -
1.9859 12672 0.9842 - - - - - - - - - - - - - - - - - -
1.9909 12704 0.8476 - - - - - - - - - - - - - - - - - -
1.9959 12736 0.6754 - - - - - - - - - - - - - - - - - -
2.0 12762 - 0.6916 0.3938 4.7109 0.1435 0.0447 0.4545 0.1969 0.0197 0.1176 0.8446 4.0425 0.0709 0.2108 0.2206 0.0344 1.5648 0.4528 0.8083
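
The log rows above can be loaded for plotting or filtering with a few lines of Python. The following is a minimal sketch, not part of the original training code. It assumes the first three columns are epoch, step, and training loss, that the remaining 18 columns are the per-dataset evaluation values in header order, and that "-" marks a value that was not logged at that step. The names `parse_log_row` and `eval_values` are hypothetical.

```python
def parse_log_row(line: str) -> dict:
    """Split one whitespace-separated log row into typed fields."""
    fields = line.split()

    def to_float(token: str):
        # "-" is assumed to mean "not logged at this step".
        return None if token == "-" else float(token)

    return {
        "epoch": float(fields[0]),
        "step": int(fields[1]),
        "training_loss": to_float(fields[2]),
        # Evaluation columns are kept positional because their names
        # live in the table header, which this sketch does not hard-code.
        "eval_values": [to_float(tok) for tok in fields[3:]],
    }


# Three rows copied verbatim from the table above.
rows = [
    "1.2989 8288 0.3956 - - - - - - - - - - - - - - - - - -",
    "1.3039 8320 0.547 0.8857 0.4749 4.7629 0.1739 0.0527 0.5004 0.2118 "
    "0.0293 0.1351 0.9302 3.5312 0.0791 0.2362 0.2984 0.0405 1.8043 0.5669 0.8020",
    "2.0 12762 - 0.6916 0.3938 4.7109 0.1435 0.0447 0.4545 0.1969 0.0197 "
    "0.1176 0.8446 4.0425 0.0709 0.2108 0.2206 0.0344 1.5648 0.4528 0.8083",
]
parsed = [parse_log_row(r) for r in rows]

# Keep only the rows where an evaluation pass was recorded.
evaluated = [p for p in parsed if p["eval_values"][0] is not None]
for p in evaluated:
    print(p["step"], p["training_loss"], p["eval_values"][-1])
```

Keeping the evaluation columns as a positional list avoids guessing column names; they can be zipped with the header row if named access is needed.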

Framework Versions

  • Python: 3.10.13
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.3
  • PyTorch: 2.1.2
  • Accelerate: 0.32.1
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1
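
To check whether a local environment matches the versions listed above, something like the following sketch may help. It uses only the standard-library `importlib.metadata`. The distribution names in `EXPECTED` (for example `torch` for PyTorch) are assumptions about how the packages were installed from PyPI, and `compare_versions` is a hypothetical helper, not part of any of these libraries.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above; keys are assumed PyPI distribution names.
EXPECTED = {
    "sentence-transformers": "3.0.1",
    "transformers": "4.42.3",
    "torch": "2.1.2",
    "accelerate": "0.32.1",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def compare_versions(expected: dict) -> dict:
    """Map each package name to (expected version, installed version or None)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions(EXPECTED).items():
    if installed is None:
        status = "missing"
    elif installed == wanted:
        status = "match"
    else:
        status = "differs"
    print(f"{name}: expected {wanted}, installed {installed} ({status})")
```

An exact string comparison is deliberately strict: a local build tag such as a CUDA suffix on the PyTorch version would be reported as "differs" even when the release is the same.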

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

AdaptiveLayerLoss

@misc{li20242d,
    title={2D Matryoshka Sentence Embeddings}, 
    author={Xianming Li and Zongxi Li and Jing Li and Haoran Xie and Qing Li},
    year={2024},
    eprint={2402.14776},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

CoSENTLoss

@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}

GISTEmbedLoss

@misc{solatorio2024gistembed,
    title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning}, 
    author={Aivin V. Solatorio},
    year={2024},
    eprint={2402.16829},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}