diff --git "a/checkpoint-657/README.md" "b/checkpoint-657/README.md"
new file mode 100644
--- /dev/null
+++ "b/checkpoint-657/README.md"
@@ -0,0 +1,1264 @@
+---
+language:
+- en
+tags:
+- sentence-transformers
+- sentence-similarity
+- feature-extraction
+- generated_from_trainer
+- dataset_size:220804
+- loss:GISTEmbedLoss
+base_model: BAAI/bge-m3
+widget:
+- source_sentence: when was the venetian built in las vegas
+ sentences:
+ - The Venetian Las Vegas In April 1996, Sheldon Adelson announced plans to create
+ on the property the largest resort on the Strip. This project would be situated
+ on the former Sands property. On November 26, 1996, eight years after it was purchased
+ by the owners of The Interface Group—Adelson, Richard Katzeff, Ted Cutler, Irwin
+ Chafetz and Jordan Shapiro, the Sands Hotel was imploded to make way for The Venetian
+ Resort Hotel Casino. Groundbreaking for the hotel began on April 14, 1997.
+ - 8 Out of 10 Cats The show is currently recorded at Pinewood Studios, previously
+ at BBC Television Centre, typically the day before transmission. However, in the
+ past few years, due to Jimmy Carr's stand-up schedule, certain episodes are pretaped
+ and broadcast later as "special" episodes dealing with a particular subject.[5]
+ - Hurricane Harvey The eighth named storm, third hurricane, and the first major
+ hurricane of the extremely active 2017 Atlantic hurricane season, Harvey developed
+ from a tropical wave to the east of the Lesser Antilles, reaching tropical storm
+ status on August 17. The storm crossed through the Windward Islands on the following
+ day, passing just south of Barbados and later near Saint Vincent. Upon entering
+ the Caribbean Sea, Harvey began to weaken due to moderate wind shear and degenerated
+ into a tropical wave north of Colombia early on August 20. The remnants were monitored
+ for regeneration as it continued west-northwestward across the Caribbean and the
+ Yucatán Peninsula, before redeveloping over the Bay of Campeche on August 23.
+ Harvey then began to rapidly intensify on August 24, regaining tropical storm
+ status and becoming a hurricane later that day. While the storm moved generally
+ northwest, Harvey's intensification phase stalled slightly overnight from August 24–25;
+ however, Harvey soon resumed strengthening and quickly became a major hurricane
+ and attained Category 4 intensity later that day. Hours later, Harvey made landfall
+ near Rockport, Texas, at peak intensity. Afterwards, rapid weakening ensued, and
+ Harvey had downgraded to a tropical storm as it stalled near the coastline of
+ the state, dropping torrential and unprecedented amounts of rainfall over the
+ Lone Star state. On August 28, it emerged back over the Gulf of Mexico, strengthening
+ slightly before making a third and final landfall in Louisiana on August 29. As
+ Harvey drifted inland, it quickly weakened again as it became extratropical on
+ September 1, before dissipating two days later.
+- source_sentence: Over 435,000 coronavirus cases had been confirmed around the world
+ by March 25 , while more than 110,000 people had recovered .
+ sentences:
+ - As of 25 March , more than 445,000 cases of COVID-19 have been reported in more
+ than 190 countries and territories , resulting in more than 19,700 deaths and
+ more than 112,000 recoveries .
+ - As of 25 March , more than 435,000 cases of COVID-19 have been reported in more
+ than 190 countries and territories , resulting in more than 19,600 deaths and
+ more than 111,000 recoveries .
+ - As of 23 March , more than 372,000 cases of COVID-19 have been reported in over
+ 190 countries and territories , resulting in more than 16,300 deaths and over
+ 101,000 recoveries .
+- source_sentence: When it rains, some animals will move to seek shelter.
+ sentences:
+ - What is the term for cellular eating?
+ - What kind of family is haumea part of?
+ - When it rains, some animals will
+- source_sentence: Pyramid ecosystem modeling can also be used to show energy flow
+ through the trophic levels; pyramids of energy are always upright since energy
+ decreases at each trophic level.
+ sentences:
+ - Pyramid ecosystem is used to show energy flow through the trophic levels.
+ - The extensive property is a property that depends on the amount of matter in a
+ sample.
+ - Other than gametes, normal human cells have a total of 46 chromosomes per cell.
+- source_sentence: Where is the White Desert National Park situated?
+ sentences:
+ - 'Victoria, the Capital of British Columbia, Canada Updated: 12/16/2014 About the
+ City of Victoria, British Columbia Victoria is the capital city of the province
+ of British Columbia , Canada. Victoria is a gateway to the Pacific Rim, is close
+ to U.S. Markets, and has many sea and air links that make it a business hub. With
+ the mildest climate in Canada, Victoria is known for its gardens and is a clean
+ and charming city. Victoria holds many reminders of both its native and British
+ heritage, and views of totem poles combine with afternoon tea. The focus of downtown
+ Victoria is the inner harbour, overlooked by the Parliament Buildings and the
+ historic Fairmont Empress Hotel. Location of Victoria, British Columbia Victoria
+ is located on the southern tip of Vancouver Island.'
+ - Egyptian Tourism Authority HOME > Adventure The White Desert The White Desert
+ is justifiably the most well-known desert destination in Egypt – and for a good
+ reason. The quantity of unearthly and beautiful wind-carved rock formations shaped
+ in the form of giant mushrooms or pebbles is unequalled in any desert in the world.
+ Farafra is nearer than Bahariya to this 300 kilometres protectorate, yet it offers
+ a more limited choice of tours and safaris. However, it is still the perfect starting
+ point for an overnight journey into the infinite whiteness.
+ - Madness Concert Music Dubai Madness live concert, Dubai Concert, Dubai live entertainment,Dubai
+ Duty Free Tennis Stadium British music legends Madness have been confirmed to
+ play at the Dubai Duty Free Tennis Stadium on Thursday October 6. One of the most
+ prominent bands of the late 1970s and early 1980s, Madness are an English SKA
+ band that formed in Camden, London and were part of the 2Tone Ska revival. They
+ were hugely popular in the 1980s, spending over 214 weeks in the UK singles chart
+ between 1980 and 1986, including 15 singles in the Top 10. They had only one No.1
+ during their career, but House of Fun is still a current radio favourite. Led
+ by gang leader Suggs, Madness are still a band in demand and in February 2015
+ the band announced the Grandslam tour, taking in 20 outdoor venues.The band shot
+ back into prominence in 2012 when they performed at the Queen's Diamond Jubilee
+ Concert at Buckingham Palace, and then played at the closing Ceremony for the
+ London Olympics. Madness also announced their new album Can't Touch Us Now which
+ is due to be released 21st October 2016 and a tour to follow in December 2016.
+ Doors open at 7PM. Tickets for the show on October 6 are available from Platinumlist.net,
+ with seated tickets AED250 and standing AED300.
+datasets:
+- bobox/enhanced_NLI-50K
+- tals/vitaminc
+- allenai/scitail
+- bobox/xSum-processed
+- allenai/sciq
+- allenai/qasc
+- bobox/OpenbookQA-4ST
+- sentence-transformers/natural-questions
+- sentence-transformers/trivia-qa
+- sentence-transformers/gooaq
+- google-research-datasets/paws
+pipeline_tag: sentence-similarity
+library_name: sentence-transformers
+metrics:
+- pearson_cosine
+- spearman_cosine
+- cosine_accuracy
+- cosine_accuracy_threshold
+- cosine_f1
+- cosine_f1_threshold
+- cosine_precision
+- cosine_recall
+- cosine_ap
+model-index:
+- name: SentenceTransformer based on BAAI/bge-m3
+ results:
+ - task:
+ type: semantic-similarity
+ name: Semantic Similarity
+ dataset:
+ name: sts test
+ type: sts-test
+ metrics:
+ - type: pearson_cosine
+ value: 0.8245725507043073
+ name: Pearson Cosine
+ - type: spearman_cosine
+ value: 0.8556805260032072
+ name: Spearman Cosine
+ - task:
+ type: binary-classification
+ name: Binary Classification
+ dataset:
+ name: allNLI dev
+ type: allNLI-dev
+ metrics:
+ - type: cosine_accuracy
+ value: 0.73828125
+ name: Cosine Accuracy
+ - type: cosine_accuracy_threshold
+ value: 0.8462234139442444
+ name: Cosine Accuracy Threshold
+ - type: cosine_f1
+ value: 0.6362545018007203
+ name: Cosine F1
+ - type: cosine_f1_threshold
+ value: 0.7372293472290039
+ name: Cosine F1 Threshold
+ - type: cosine_precision
+ value: 0.5364372469635628
+ name: Cosine Precision
+ - type: cosine_recall
+ value: 0.7817109144542773
+ name: Cosine Recall
+ - type: cosine_ap
+ value: 0.6244503911184303
+ name: Cosine Ap
+ - task:
+ type: binary-classification
+ name: Binary Classification
+ dataset:
+ name: Qnli dev
+ type: Qnli-dev
+ metrics:
+ - type: cosine_accuracy
+ value: 0.7177734375
+ name: Cosine Accuracy
+ - type: cosine_accuracy_threshold
+ value: 0.7129597663879395
+ name: Cosine Accuracy Threshold
+ - type: cosine_f1
+ value: 0.7221719457013575
+ name: Cosine F1
+ - type: cosine_f1_threshold
+ value: 0.6851584911346436
+ name: Cosine F1 Threshold
+ - type: cosine_precision
+ value: 0.6435483870967742
+ name: Cosine Precision
+ - type: cosine_recall
+ value: 0.822680412371134
+ name: Cosine Recall
+ - type: cosine_ap
+ value: 0.7715845808061284
+ name: Cosine Ap
+---
+
+# SentenceTransformer based on BAAI/bge-m3
+
+This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) on the [negation-triplets](https://huggingface.co/datasets/bobox/enhanced_NLI-50K), [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc), [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail), [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail), [xsum-pairs](https://huggingface.co/datasets/bobox/xSum-processed), [sciq_pairs](https://huggingface.co/datasets/allenai/sciq), [qasc_pairs](https://huggingface.co/datasets/allenai/qasc), [openbookqa_pairs](https://huggingface.co/datasets/bobox/OpenbookQA-4ST), [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions), [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa), [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq), [paws-pos](https://huggingface.co/datasets/google-research-datasets/paws) and global_dataset datasets. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+## Model Details
+
+### Model Description
+- **Model Type:** Sentence Transformer
+- **Base model:** [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3)
+- **Maximum Sequence Length:** 8192 tokens
+- **Output Dimensionality:** 1024 dimensions
+- **Similarity Function:** Cosine Similarity
+- **Training Datasets:**
+ - [negation-triplets](https://huggingface.co/datasets/bobox/enhanced_NLI-50K)
+ - [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc)
+ - [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail)
+ - [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail)
+ - [xsum-pairs](https://huggingface.co/datasets/bobox/xSum-processed)
+ - [sciq_pairs](https://huggingface.co/datasets/allenai/sciq)
+ - [qasc_pairs](https://huggingface.co/datasets/allenai/qasc)
+ - [openbookqa_pairs](https://huggingface.co/datasets/bobox/OpenbookQA-4ST)
+ - [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions)
+ - [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa)
+ - [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq)
+ - [paws-pos](https://huggingface.co/datasets/google-research-datasets/paws)
+ - global_dataset
+- **Language:** en
+
+
+### Model Sources
+
+- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+### Full Model Architecture
+
+```
+SentenceTransformer(
+ (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
+ (1): AdvancedWeightedPooling(
+ (mha): MultiheadAttention(
+ (out_proj): NonDynamicallyQuantizableLinear(in_features=1024, out_features=1024, bias=True)
+ )
+ )
+)
+```
+
+## Usage
+
+### Direct Usage (Sentence Transformers)
+
+First install the Sentence Transformers library:
+
+```bash
+pip install -U sentence-transformers
+```
+
+Then you can load this model and run inference.
+```python
+from sentence_transformers import SentenceTransformer
+
+# Download from the 🤗 Hub
+model = SentenceTransformer("bobox/XLMRoBERTaM3-CustomPoolin-v1-s1-checkpoints-tmp")
+# Run inference
+sentences = [
+ 'Where is the White Desert National Park situated?',
+ 'Egyptian Tourism Authority HOME > Adventure The White Desert The White Desert is justifiably the most well-known desert destination in Egypt – and for a good reason. The quantity of unearthly and beautiful wind-carved rock formations shaped in the form of giant mushrooms or pebbles is unequalled in any desert in the world. Farafra is nearer than Bahariya to this 300 kilometres protectorate, yet it offers a more limited choice of tours and safaris. However, it is still the perfect starting point for an overnight journey into the infinite whiteness.',
+ "Madness Concert Music Dubai Madness live concert, Dubai Concert, Dubai live entertainment,Dubai Duty Free Tennis Stadium British music legends Madness have been confirmed to play at the Dubai Duty Free Tennis Stadium on Thursday October 6. One of the most prominent bands of the late 1970s and early 1980s, Madness are an English SKA band that formed in Camden, London and were part of the 2Tone Ska revival. They were hugely popular in the 1980s, spending over 214 weeks in the UK singles chart between 1980 and 1986, including 15 singles in the Top 10. They had only one No.1 during their career, but House of Fun is still a current radio favourite. Led by gang leader Suggs, Madness are still a band in demand and in February 2015 the band announced the Grandslam tour, taking in 20 outdoor venues.The band shot back into prominence in 2012 when they performed at the Queen's Diamond Jubilee Concert at Buckingham Palace, and then played at the closing Ceremony for the London Olympics. Madness also announced their new album Can't Touch Us Now which is due to be released 21st October 2016 and a tour to follow in December 2016. Doors open at 7PM. Tickets for the show on October 6 are available from Platinumlist.net, with seated tickets AED250 and standing AED300.",
+]
+embeddings = model.encode(sentences)
+print(embeddings.shape)
+# [3, 1024]
+
+# Get the similarity scores for the embeddings
+similarities = model.similarity(embeddings, embeddings)
+print(similarities.shape)
+# [3, 3]
+```
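+
+For reference, since this card lists cosine similarity as the similarity function, each entry of `similarities` should be the normalized dot product of two embeddings. Writing $e_i$ for the embedding of sentence $i$ (notation introduced here, not part of the generated card):
+
+$$
+\mathrm{sim}(i, j) = \frac{e_i \cdot e_j}{\lVert e_i \rVert \, \lVert e_j \rVert}
+$$
+
+Under that assumption the diagonal of the 3×3 matrix is 1 up to floating-point error, and the off-diagonal entries lie in $[-1, 1]$.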
+
+
+
+
+
+
+
+## Evaluation
+
+### Metrics
+
+#### Semantic Similarity
+
+* Dataset: `sts-test`
+* Evaluated with [`EmbeddingSimilarityEvaluator`](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
+
+| Metric | Value |
+|:--------------------|:-----------|
+| pearson_cosine | 0.8246 |
+| **spearman_cosine** | **0.8557** |
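+
+As a reading aid, the two metrics compare the model's cosine scores $s_k$ against the gold similarity labels $y_k$ of the evaluation pairs (symbols introduced here for illustration). `pearson_cosine` is the linear correlation of the raw values, while `spearman_cosine` is the same statistic computed on their ranks, so it only rewards getting the ordering right:
+
+$$
+r = \frac{\sum_k (s_k - \bar{s})(y_k - \bar{y})}{\sqrt{\sum_k (s_k - \bar{s})^2}\,\sqrt{\sum_k (y_k - \bar{y})^2}},
+\qquad
+\rho = r\big(\operatorname{rank}(s), \operatorname{rank}(y)\big)
+$$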
+
+#### Binary Classification
+
+* Datasets: `allNLI-dev` and `Qnli-dev`
+* Evaluated with [`BinaryClassificationEvaluator`](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)
+
+| Metric | allNLI-dev | Qnli-dev |
+|:--------------------------|:-----------|:-----------|
+| cosine_accuracy | 0.7383 | 0.7178 |
+| cosine_accuracy_threshold | 0.8462 | 0.713 |
+| cosine_f1 | 0.6363 | 0.7222 |
+| cosine_f1_threshold | 0.7372 | 0.6852 |
+| cosine_precision | 0.5364 | 0.6435 |
+| cosine_recall | 0.7817 | 0.8227 |
+| **cosine_ap** | **0.6245** | **0.7716** |
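+
+The precision and recall rows appear to be measured at the F1-optimal threshold (`cosine_f1_threshold`) rather than at the accuracy threshold. That is an assumption about the evaluator, but the table is consistent with it, because F1 is the harmonic mean of precision $P$ and recall $R$:
+
+$$
+F_1 = \frac{2PR}{P + R}
+$$
+
+$$
+\text{allNLI-dev: } \frac{2 \cdot 0.5364 \cdot 0.7817}{0.5364 + 0.7817} \approx 0.636,
+\qquad
+\text{Qnli-dev: } \frac{2 \cdot 0.6435 \cdot 0.8227}{0.6435 + 0.8227} \approx 0.722
+$$
+
+Both values agree with the `cosine_f1` row to three decimal places.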
+
+
+
+
+
+## Training Details
+
+### Training Datasets
+
+#### negation-triplets
+
+* Dataset: [negation-triplets](https://huggingface.co/datasets/bobox/enhanced_NLI-50K) at [d43e6fe](https://huggingface.co/datasets/bobox/enhanced_NLI-50K/tree/d43e6fe7f1e171f916502c123235d4b9ec997cb4)
+* Size: 5,025 training samples
+* Columns: `anchor`, `entailment`, and `negative`
+* Approximate statistics based on the first 1000 samples:
+  |      | anchor | entailment | negative |
+  |:-----|:-------|:-----------|:---------|
+  | type | string | string     | string   |
+* Samples:
+  | anchor | entailment | negative |
+  |:-------|:-----------|:---------|
+  | He apparently often tells the joke that he had better make his money at AOL, because after he leaves, no one will ever hire him because he has been so obnoxious for so long. | He tells a joke about how much money he could make at AOL. | He tells a joke about how little money he could make at AOL. |
+  | Though, if it is as we suspect, it seems a clear enough case. | It seems like a clear case. | It doesn`t seem like a clear case. |
+  | In the apartments, in addition to fine 16th-century Flemish tapestries and French, Italian, and Spanish furniture, you'll see Diane's neatly kept household accounts. | Diane kept neat household accounts along with having fine 16th-century Flemish tapestries. | Diane did not keep messy household accounts along with not having fine 16th-century Flemish tapestries. |
+* Loss: [`GISTEmbedLoss`](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
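+
+The parameters above configure guided in-batch negative selection: the `guide` model scores every in-batch candidate, and candidates it rates as more similar to the anchor than the labeled positive are dropped from the contrastive denominator as likely false negatives. A simplified sketch of the per-example loss follows. The notation is introduced here (student cosine similarity $s$, guide cosine similarity $g$, anchor $a_i$, positive $p_i$), and the sketch is restricted to anchor-candidate terms, so it is not the library's exact implementation:
+
+$$
+\mathcal{L}_i = -\log \frac{\exp\big(s(a_i, p_i)/\tau\big)}{\exp\big(s(a_i, p_i)/\tau\big) + \sum_{j \in \mathcal{N}_i} \exp\big(s(a_i, p_j)/\tau\big)},
+\qquad
+\mathcal{N}_i = \{\, j \ne i : g(a_i, p_j) \le g(a_i, p_i) \,\}
+$$
+
+With $\tau = 0.025$, similarities are scaled by $1/\tau = 40$ before the softmax, which sharpens the distribution over candidates.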
+
+#### vitaminc-pairs
+
+* Dataset: [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc) at [be6febb](https://huggingface.co/datasets/tals/vitaminc/tree/be6febb761b0b2807687e61e0b5282e459df2fa0)
+* Size: 5,025 training samples
+* Columns: `claim` and `evidence`
+* Approximate statistics based on the first 1000 samples:
+  |      | claim  | evidence |
+  |:-----|:-------|:---------|
+  | type | string | string   |
+* Samples:
+  | claim | evidence |
+  |:------|:---------|
+  | Culver City is bordered by Mar Vista and Psalms to the south . | The city is surrounded by the Los Angeles neighborhoods of Mar Vista and Palms to the north ; Westchester to the south ; Mid-City and West Adams to the east ; the Baldwin Hills and Ladera Heights unincorporated areas to the southeast ; and the |
+  | Demarai Gray 's manager thought it was not right to book him for falling . | Against Leeds United in September , he was denied a penalty and booked for diving when apparently fouled by Giuseppe Bellusci ; both managers thought it the wrong decision . |
+  | Ian Simpson stage name is Kevin Abstract . | Ian Simpson ( born July 16 , 1996 ) , known by his stage name Kevin Abstract , is an American rapper , singer-songwriter , and director . |
+* Loss: [`GISTEmbedLoss`](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### scitail-pairs-qa
+
+* Dataset: [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
+* Size: 5,025 training samples
+* Columns: `sentence1` and `sentence2`
+* Approximate statistics based on the first 1000 samples:
+  |      | sentence1 | sentence2 |
+  |:-----|:----------|:----------|
+  | type | string    | string    |
+* Samples:
+  | sentence1 | sentence2 |
+  |:----------|:----------|
+  | Individual is the term for an organism, or single living thing. | What is the term for an organism, or single living thing? |
+  | A(n) increase in length happens to metal railroad tracks during the heat of a summer day. | What happens to metal railroad tracks during the heat of a summer day? |
+  | Most red algae species live in oceans. | Where do most red algae species live? |
+* Loss: [`GISTEmbedLoss`](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### scitail-pairs-pos
+
+* Dataset: [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
+* Size: 5,025 training samples
+* Columns: `sentence1` and `sentence2`
+* Approximate statistics based on the first 1000 samples:
+  |      | sentence1 | sentence2 |
+  |:-----|:----------|:----------|
+  | type | string    | string    |
+* Samples:
+  | sentence1 | sentence2 |
+  |:----------|:----------|
+  | Density is the amount of mass in a specified space. | Density describes how much matter is in a certain amount of space. |
+  | The Pupil is the round hole that connects the front chamber of the eye to the interior chamber of the eye. | The pupil is the opening in the front of the eye. |
+  | 95% of its DNA encodes for proteins for stable RNA molecules. | Dna encodes instructions for proteins molecules. |
+* Loss: [`GISTEmbedLoss`](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### xsum-pairs
+
+* Dataset: [xsum-pairs](https://huggingface.co/datasets/bobox/xSum-processed) at [044020f](https://huggingface.co/datasets/bobox/xSum-processed/tree/044020f516c1830da392e567474cd5452971366f)
+* Size: 131,779 training samples
+* Columns: `summary` and `document`
+* Approximate statistics based on the first 1000 samples:
+  |      | summary | document |
+  |:-----|:--------|:---------|
+  | type | string  | string   |
+* Samples:
+  | summary | document |
+  |:--------|:---------|
+  | Welsh Lib Dems would cut the basic rate of income tax to help "ordinary workers" once the power passes to Wales, its leader has said. | Kirsty Williams promised a costed plan to cut the 20% starting rate to 19%.<br>She said the Lib Dems had cut taxes for low and middle-income earners during the UK coalition government.<br>She denounced the Welsh Tories for promising to prioritise tax cuts for higher earners, and expected Labour and Plaid Cymru to defend the status quo.<br>During his Autumn Statement last Wednesday, Chancellor George Osborne said control of some of the income tax levied in Wales could be devolved without a referendum.<br>The sharing of tax powers between ministers in Cardiff and London would mean the Welsh government controlling £3bn of taxes a year by 2020. |
+  | The 2017 BBC Sports Personality of the Year ceremony will be held on Sunday, 17 December at Liverpool's Echo Arena. | The prestigious awards event, first staged in 1954, was last held at the venue in 2008.<br>Gary Lineker, Clare Balding and Gabby Logan will host a celebration of the best sporting achievements of 2017, in front of an audience of nearly 11,000.<br>Britain's world number one tennis player Andy Murray has won the main prize in three of the past four years.<br>The Scot is the only person to win the award more than twice, while other former winners include Bobby Moore, Sir Henry Cooper, Virginia Wade and Daley Thompson, plus Princess Anne and daughter Zara Phillips.<br>Ticket details for this year's event will be announced later in the year.<br>Barbara Slater, director of BBC Sport, said: "2017 marks a very exciting year of sport, from England winning the Six Nations to Chelsea winning the Premier League, Arsenal scooping the FA Cup at Wembley to Anthony Joshua's nail-biting fight against Wladimir Klitschko."<br>Liverpool's mayor Joe Anderson said: "We're honoured and excited to be rolling out the red carpet... |
+  | UKIP leader Nigel Farage has insisted his party will win seats at next year's Welsh assembly election but has ruled out standing himself. | Speaking to BBC's Sunday Politics Wales, Mr Farage said Wales was now a "top priority" for the party.<br>He said: "The people who are standing for the assembly in Cardiff are not doing so as a protest movement.<br>"We're doing so with a positive frame of mind and... to do our very best for the people in Wales who elect us."<br>He added: "If that means becoming a constructive opposition, or if it did mean in some way helping in government, we'd be quite prepared to fulfil either role."<br>When asked if he would consider standing in the assembly election, he said: "It's a lovely thought and great part of the world to live in.<br>"I think in some ways life would be more comfortable living in Wales than living on the edge of London as I am but unfortunately, I'll have to rule it out".<br>He also called for a return of grammar schools in Wales, saying: "We want to make sure that bright kids who come from poor backgrounds have the opportunity to do as well as kids that come from the richest families in Wales... |
+* Loss: [`GISTEmbedLoss`](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### sciq_pairs
+
+* Dataset: [sciq_pairs](https://huggingface.co/datasets/allenai/sciq) at [2c94ad3](https://huggingface.co/datasets/allenai/sciq/tree/2c94ad3e1aafab77146f384e23536f97a4849815)
+* Size: 5,025 training samples
+* Columns: `sentence1` and `sentence2`
+* Approximate statistics based on the first 1000 samples:
+  |      | sentence1 | sentence2 |
+  |:-----|:----------|:----------|
+  | type | string    | string    |
+* Samples:
+  | sentence1 | sentence2 |
+  |:----------|:----------|
+  | What happens to energy when an atom gains an electron? | A: Energy is released when an atom gains an electron. Halogens release the most energy when they form ions. As a result, they are very reactive elements. |
+  | The most common two-lens telescope, like the simple microscope, uses lenses of what shape? | The most common two-lens telescope, like the simple microscope, uses two convex lenses and is shown in Figure 26.23(b). The object is so far away from the telescope that it is essentially at infinity compared with the focal lengths of the lenses ( d o ≈ ∞ ). The first image is thus produced at. |
+  | What is the wheeled robot developed by nasa to explore the surface of mars? | The Mars Rover pictured here is a wheeled robot developed by NASA. Its job is to explore the surface of Mars. The rover contains a lot of complex modern technology. But how it moves by rolling on wheels is a very old invention. The wheel was probably invented many times in different cultures, beginning at least 10,000 years ago. In addition to wheeled carts and chariots, early wheels were used for water wheels, grinding wheels, and wheels for spinning pottery. Wheels really changed human life. They revolutionized transportation and made it much easier to do many different kinds of work. |
+* Loss: [`GISTEmbedLoss`](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### qasc_pairs
+
+* Dataset: [qasc_pairs](https://huggingface.co/datasets/allenai/qasc) at [a34ba20](https://huggingface.co/datasets/allenai/qasc/tree/a34ba204eb9a33b919c10cc08f4f1c8dae5ec070)
+* Size: 5,025 training samples
+* Columns: `sentence1` and `sentence2`
+* Approximate statistics based on the first 1000 samples:
+  |      | sentence1 | sentence2 |
+  |:-----|:----------|:----------|
+  | type | string    | string    |
+* Samples:
+  | sentence1 | sentence2 |
+  |:----------|:----------|
+  | what carries deoxygenated blood? | Veins generally carry deoxygenated blood.. Arteries and veins are blood vessels.<br>vessels carry deoxygenated blood |
+  | what gets affected when worldwide temperature increases? | global warming is when worldwide temperature increases. Global warming affects wildlife .<br>it affects wildlife when worldwide temperature increases |
+  | Incandescent bulbs convert electricity into what in the electric field? | an incandescent light bulb converts electricity into light by sending electricity through a filament. Light is the oscillations in the electric field.<br>Incandescent bulbs convert electricity into oscillations in the electric field. |
+* Loss: [`GISTEmbedLoss`](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
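The loss parameters above pair a guide model with a temperature of 0.025. As a rough illustration of what those two settings do, the sketch below uses the guide's similarities to drop in-batch negatives that the guide rates above the true positive, then applies a temperature-scaled cross-entropy. It is a simplified, pure-Python approximation covering only anchor-to-positive similarities; the function names `cosine_matrix` and `guided_in_batch_loss` are hypothetical and are not part of the sentence-transformers API.

```python
import math


def cosine_matrix(rows_a, rows_b):
    """Cosine similarity between every vector in rows_a and every vector in rows_b."""
    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    return [
        [sum(x * y for x, y in zip(u, v)) / (norm(u) * norm(v)) for v in rows_b]
        for u in rows_a
    ]


def guided_in_batch_loss(anchors, positives, guide_anchors, guide_positives,
                         temperature=0.025):
    """Simplified guided contrastive loss (illustrative assumption, not the library code).

    anchors/positives: embeddings from the model being trained.
    guide_anchors/guide_positives: embeddings of the same texts from the guide model.
    """
    sim = cosine_matrix(anchors, positives)
    guide_sim = cosine_matrix(guide_anchors, guide_positives)
    losses = []
    for i, row in enumerate(sim):
        # The guide's score for the true pair is the cut-off for this anchor.
        threshold = guide_sim[i][i]
        # Keep the positive, plus every negative the guide rates no higher than it.
        logits = [
            s / temperature
            for j, s in enumerate(row)
            if j == i or guide_sim[i][j] <= threshold
        ]
        # Numerically stable log-sum-exp, then cross-entropy against the positive.
        peak = max(logits)
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_z - sim[i][i] / temperature)
    return sum(losses) / len(losses)
```

Dividing by 0.025 multiplies every cosine similarity by 40, so small similarity gaps between the positive and the remaining negatives turn into large logit gaps.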
+
+#### openbookqa_pairs
+
+* Dataset: [openbookqa_pairs](https://huggingface.co/datasets/bobox/OpenbookQA-4ST) at [3c2724c](https://huggingface.co/datasets/bobox/OpenbookQA-4ST/tree/3c2724cbd7a9828685de0976c8a6ea6491b2e326)
+* Size: 4,957 training samples
+* Columns: question and fact
+* Approximate statistics based on the first 1000 samples:
+ | | question | fact |
+ |:--------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | What requires nutrients to grow? | sharp beaks are a kind of adaptation for catching prey |
+ | | Which kind of behavior has a fixed action pattern? | migration is an instinctive behavior |
+ | | Which of the following would be excluded from a list of ecosystems? | the Earth contains many ecosystems |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
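The guide configuration above enables only `pooling_mode_cls_token` and follows it with `Normalize()`. A minimal sketch of what that combination amounts to, assuming the transformer output is a list of per-token vectors with the CLS token first; the helper name `cls_pool_and_normalize` is hypothetical and the zero-vector handling is an assumption of this sketch:

```python
import math


def cls_pool_and_normalize(token_embeddings):
    """Take the first (CLS) token vector and scale it to unit L2 length."""
    cls_vector = token_embeddings[0]
    norm = math.sqrt(sum(x * x for x in cls_vector))
    if norm == 0.0:
        # Assumption for this sketch: return a zero vector unchanged.
        return list(cls_vector)
    return [x / norm for x in cls_vector]
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity.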
+
+#### nq_pairs
+
+* Dataset: [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions) at [f9e894e](https://huggingface.co/datasets/sentence-transformers/natural-questions/tree/f9e894e1081e206e577b4eaa9ee6de2b06ae6f17)
+* Size: 5,025 training samples
+* Columns: sentence1 and sentence2
+* Approximate statistics based on the first 1000 samples:
+ | | sentence1 | sentence2 |
+ |:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | who wrote with great power comes great responsibility | Uncle Ben The origins of the phrase pre-date its use in Spider-Man. Most famously quoted by Albert Einstein in the early 20th century, this phrase has questionable origins. In 1817, member of British parliament William Lamb is recorded saying, "the possession of great power necessarily implies great responsibility."[33] In 1906, Under-Secretary of the Colonial Office Winston Churchill said, "Where there is great power there is great responsibility",[34] even indicating that it was already a cultural maxim invoked toward government at the time. |
+ | | who votes on the ap college football poll | AP Poll The Associated Press (AP Poll) provides weekly rankings of the top 25 NCAA teams in one of three Division I college sports: football, men's basketball and women's basketball. The rankings are compiled by polling 65 sportswriters and broadcasters from across the nation.[1] Each voter provides his own ranking of the top 25 teams, and the individual rankings are then combined to produce the national ranking by giving a team 25 points for a first place vote, 24 for a second place vote, and so on down to 1 point for a twenty-fifth place vote. Ballots of the voting members in the AP Poll are made public.[2] |
+ | | what is the definition of man made environment | Built environment In social science, the term built environment, or built world, refers to the human-made surroundings that provide the setting for human activity, ranging in scale from buildings to parks. It has been defined as "the human-made space in which people live, work, and recreate on a day-to-day basis."[1] The "built environment encompasses places and spaces created or modified by people including buildings, parks, and transportation systems." In recent years,[when?] public health research has expanded the definition of "built environment" to include healthy food access, community gardens, mental health,[2][3] "walkability", and "bikeability".[4] |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### trivia_pairs
+
+* Dataset: [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa) at [a7c36e3](https://huggingface.co/datasets/sentence-transformers/trivia-qa/tree/a7c36e3c8c8c01526bc094d79bf80d4c848b0ad0)
+* Size: 5,025 training samples
+* Columns: query and answer
+* Approximate statistics based on the first 1000 samples:
+ | | query | answer |
+ |:--------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | Antares is the brightest star in which zodiacal constellation? | Antares - definition of Antares by The Free Dictionary Antares - definition of Antares by The Free Dictionary http://www.thefreedictionary.com/Antares (ăn-târ′ēz, -tăr′-) n. A giant red binary star, the brightest in the constellation Scorpio, about 424 light years from Earth. [Greek Antarēs : anti-, rival of; see anti- + Arēs, Ares, the planet Mars.] Antares (ænˈtɛəriːz) n (Astronomy) the brightest star in the constellation Scorpius. It is a variable binary star whose main component, a red supergiant, is associated with a fainter green component. Visual magnitude: 1.2 (red), 6.8 (green); spectral type: M1.5Ib (red); distance: 600 light years [from Greek Antarēs, literally: simulating Mars (in colour), from anti- + Arēs Mars] An•tar•es |
+ | | Which famous politician is married to Miriam Gonzales Valladold and has three sons? | Nick Clegg - Bio, Facts, Family \| Famous Birthdays Nick Clegg Capricorn Politician#7 About English politician who was installed as the Deputy Prime Minister of the U.K. in 2010. He began representing Sheffield Hallam in Parliament in 2005. Before Fame He graduated from the University of Minnesota and Cambridge, where he was active in the school's theater scene. Trivia He was named the Leader of the Liberal Democrats from 2007 to 2010 and is a fluent speaker of five European languages. Family Life Has three sons, Antonio, Alberto and Miguel, with wife Miriam Gonzalez Durantez, whom he married in 2000. |
+ | | In Greek mythology, who fell in love with a statue called Galatea? | Pygmalion \| Article about Pygmalion by The Free Dictionary Pygmalion \| Article about Pygmalion by The Free Dictionary http://encyclopedia2.thefreedictionary.com/Pygmalion Also found in: Dictionary , Thesaurus , Medical , Wikipedia . Pygmalion (pĭgmāl`yən). 1 In Greek mythology, king of Cyprus. He fell in love with a beautiful statue of a woman. When he prayed to Aphrodite for a wife like it, the goddess brought the statue to life and Pygmalion married her. In one version of the legend, the statue becomes Aphrodite; another states that Pygmalion sculpted the statue himself and that after coming to life it was called Galatea. 2 In Vergil's Aeneid, king of Tyre. He was the brother of Dido Dido , in Roman mythology, queen of Carthage, also called Elissa. She was the daughter of a king of Tyre. After her brother Pygmalion murdered her husband, she fled to Libya, where she founded and ruled Carthage. ..... Click the link for more information. and killed her husband, Sychaeus, to get his ric... |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### gooaq_pairs
+
+* Dataset: [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
+* Size: 5,025 training samples
+* Columns: sentence1 and sentence2
+* Approximate statistics based on the first 1000 samples:
+ | | sentence1 | sentence2 |
+ |:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | what data gathering technique includes focus groups? | Combining two or more data collections methods, for instance interviews as well as focus groups ('data triangulation') enhances the credibility of the study. |
+ | | what is borax acid good for? | Borax, also called sodium tetraborate, is a powdery white mineral that has been used as a cleaning product for several decades. It has many uses: It helps get rid of stains, mold, and mildew around the house. It can kill insects such as ants. |
+ | | what are the main causes of mangrove loss in florida? | Mangroves are victims of dredging, filling, and diking, water pollution from oil spills and herbicides, and urban development within the state of Florida. |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### paws-pos
+
+* Dataset: [paws-pos](https://huggingface.co/datasets/google-research-datasets/paws) at [161ece9](https://huggingface.co/datasets/google-research-datasets/paws/tree/161ece9501cf0a11f3e48bd356eaa82de46d6a09)
+* Size: 5,025 training samples
+* Columns: sentence1 and sentence2
+* Approximate statistics based on the first 1000 samples:
+ | | sentence1 | sentence2 |
+ |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | The expressway continues for another mile and crosses several buildings in short tunnels before crossing the Harlem River into the Bronx via the Alexander Hamilton Bridge . | The expressway continues another mile , crossing under several buildings in short tunnels , before crossing the Harlem River via the Alexander Hamilton Bridge into the Bronx . |
+ | | This French-supported production with John Eliot Gardiner , conductor , and his orchestra was directed by Jean Louis Martinoty . | This French supported production with John Eliot Gardiner , conductor , and his Orchestra was directed by Jean Louis Martinoty . |
+ | | However , mice with a single copy of the non-working TWIST gene survived . | However , mice with a single copy of the non-working TWIST - Gens survived . |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### global_dataset
+
+* Dataset: global_dataset
+* Size: 33,818 training samples
+* Columns: sentence1 and sentence2
+* Approximate statistics based on the first 1000 samples:
+ | | sentence1 | sentence2 |
+ |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | Kid Ink is a singer and songwriter . | Brian Todd Collins ( born April 1 , 1986 ) , best known by his stage name Kid Ink is an American rapper , singer , songwriter and record producer from Los Angeles , California . |
+ | | It is easy to rationalize avoiding or deferring taking action to address a problem if you do not know how big the problem is. | If you misjudge the size of a problem it is easier to justify not addressing it. |
+ | | Epithelial tissue covers external and internal surfaces of the body organs. | Epithelial? tissue consists of cells that cover inner and outer body surfaces. |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+### Evaluation Datasets
+
+#### vitaminc-pairs
+
+* Dataset: [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc) at [be6febb](https://huggingface.co/datasets/tals/vitaminc/tree/be6febb761b0b2807687e61e0b5282e459df2fa0)
+* Size: 256 evaluation samples
+* Columns: claim and evidence
+* Approximate statistics based on the first 256 samples:
+ | | claim | evidence |
+ |:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | Dragon Con had over 5000 guests . | Among the more than 6000 guests and musical performers at the 2009 convention were such notables as Patrick Stewart , William Shatner , Leonard Nimoy , Terry Gilliam , Bruce Boxleitner , James Marsters , and Mary McDonnell . |
+ | | COVID-19 has reached more than 185 countries . | As of , more than cases of COVID-19 have been reported in more than 190 countries and 200 territories , resulting in more than deaths . |
+ | | In March , Italy had 3.6x times more cases of coronavirus than China . | As of 12 March , among nations with at least one million citizens , Italy has the world 's highest per capita rate of positive coronavirus cases at 206.1 cases per million people ( 3.6x times the rate of China ) and is the country with the second-highest number of positive cases as well as of deaths in the world , after China . |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### negation-triplets
+
+* Dataset: [negation-triplets](https://huggingface.co/datasets/bobox/enhanced_NLI-50K) at [d43e6fe](https://huggingface.co/datasets/bobox/enhanced_NLI-50K/tree/d43e6fe7f1e171f916502c123235d4b9ec997cb4)
+* Size: 256 evaluation samples
+* Columns: anchor, entailment, and negative
+* Approximate statistics based on the first 256 samples:
+ | | anchor | entailment | negative |
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
+ | type | string | string | string |
+ | details | The British Rail Class 502 was a type of electric multiple unit originally built by the London Midland and Scottish Railway at their Derby Works . | The British Rail Class 502 was a type of electric multiple unit originally built by the London Midland and Scottish Railway at their Derby Works workshop . | The British Rail Class 502 was a type of steam multiple unit originally built by the Southern Railway at their Brighton Works workshop. |
+ | | A group of kids is splashing in deep water nearby a rock formation. | The kids are in deep water | The kids are on dry land. |
+ | | The Mongolic languages are a group of languages spoken in Central Asia , notably including Mongolian . | The Mongolic languages are a group of languages that are spoken in Central Asia . | The Mongolic languages are a group of languages that are spoken in Western Europe. |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### scitail-pairs-pos
+
+* Dataset: [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
+* Size: 256 evaluation samples
+* Columns: sentence1 and sentence2
+* Approximate statistics based on the first 256 samples:
+ | | sentence1 | sentence2 |
+ |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | Wavelength The distance between two consecutive points on a sinusoidal wave that are in phase; | Wavelength is the distance between two corresponding points of adjacent waves called. |
+ | | humans normally have 23 pairs of chromosomes. | Humans typically have 23 pairs pairs of chromosomes. |
+ | | kinetic energy the energy of motion. | Kinetic energy is the energy of motion. |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### scitail-pairs-qa
+
+* Dataset: [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
+* Size: 256 evaluation samples
+* Columns: sentence1 and sentence2
+* Approximate statistics based on the first 256 samples:
+ | | sentence1 | sentence2 |
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | Coal can be mined from earth and used as an energy source. | Which of these can be mined from Earth and used as an energy source? |
+ | | Air is made of atoms. | Which of the following is made of atoms? |
+ | | The term macroevolution refers to larger evolutionary changes that result in new species. | What term refers to larger evolutionary changes that result in new species? |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### xsum-pairs
+
+* Dataset: [xsum-pairs](https://huggingface.co/datasets/bobox/xSum-processed) at [044020f](https://huggingface.co/datasets/bobox/xSum-processed/tree/044020f516c1830da392e567474cd5452971366f)
+* Size: 131,779 evaluation samples
+* Columns: summary and document
+* Approximate statistics based on the first 1000 samples:
+ | | summary | document |
+ |:--------|:-----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | More than 50 Conservative election posters have been vandalised in Great Yarmouth since the start of the election campaign, the party has said. | Brandon Lewis, standing for Great Yarmouth, said the worst incident was outside Scratby where a poster had been cut to look like a swastika. "These mindless people would rather vandalise election posters than engage in political discussion," he said. Mr Lewis has reported the vandalism to police due to the scale of the attacks. Swear words had also been scrawled across other posters. His election agent is due to meet police officers. "We are replacing the posters as soon as we find one has been vandalised and we have plenty of them," said Mr Lewis, who won the seat in 2010 and has been a local government minister. Party workers discovered 22 defaced or destroyed banners over the weekend, with most on private land. In addition, a further 31 have been attacked since posters were put up at the start of the election campaign. Other candidates who have declared as standing in the constituency are Alan Grey (UKIP), Lara Norris (Labour), Harry Webb (Green Party) and James Joyce (Liberal Democ... |
+ | | Moy Park has met officials from two executive departments over the botched Renewable Heat Incentive (RHI) scheme. | Hundreds of farmers supplying chickens to the poultry processing company are recipients of the heating subsidy. Earlier Finance Minister Máirtín Ó Muilleoir said the meeting had raised "fresh concerns" over the operation of the scheme. Poultry farmers use the wood chip boilers to heat the buildings where chicks are housed. Many took advantage of the scheme to replace LPG heating systems. In his statement, Mr Ó Muilleoir said the Moy Park briefing with his officials had raised "further issues". But in a statement the company made no reference to any concerns about the operation of the scheme. It said it had met officials from the departments of economy and finance in order to help find a solution "to secure the Northern Ireland RHI scheme within budget". It said it had extensive experience of "benchmarked energy use" in the poultry industry. That appears to be a reference to attempts to establish what would be considered acceptable levels of heat use by poultry farms. Moy Park said it a... |
+ | | A bouncer was killed by a single punch thrown by a clubber he had just escorted from a bar, a jury has heard. | James Darrah, 54, died after the incident outside the Stone House bar in Hertford on 23 August 2014. St Albans Crown Court heard William Wade, 27, became loud and angry and, after he was escorted out of the bar, threw a punch. Mr Wade, from High Cross, near Ware in Hertfordshire, denies manslaughter and causing Mr Darrah actual bodily harm. Prosecutor Michael Speak said: "The prosecution say that the defendant William Wade punched James Darrah. "As it happens, Mr Darrah suffered from a heart condition and, as a result of being punched by the defendant, Mr Darrah died very shortly afterwards. "The charge is not murder. We don't say he intended to kill Mr Darrah." Mr Speak told the jury Mr Darrah, a registered door supervisor for a number of years, was doing his first shift at the Stone House club on the night he died. He said Mr Darrah was dealing with a girl who had had too much to drink, and told her friends she could stay in the club as long as she did not have any more alcohol. "The... |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### sciq_pairs
+
+* Dataset: [sciq_pairs](https://huggingface.co/datasets/allenai/sciq) at [2c94ad3](https://huggingface.co/datasets/allenai/sciq/tree/2c94ad3e1aafab77146f384e23536f97a4849815)
+* Size: 256 evaluation samples
+* Columns: sentence1 and sentence2
+* Approximate statistics based on the first 256 samples:
+ | | sentence1 | sentence2 |
+ |:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | Which is the least massive outer planet? | Uranus is the least massive outer planet. Its mass is only about 14 times the mass of Earth. Like all of the outer planets, Uranus is much less dense than Earth. Gravity is actually weaker than on Earth’s surface. If you were at the top of the clouds on Uranus, you would weigh about 10 percent less than what you weigh on Earth. |
+ | | Gametes are products through meiosis in which organs? | At the end of meiosis, haploid cells are produced. These cells need to further develop into mature gametes capable of fertilization, a process called gametogenesis ( Figure below ). Animals produce gametes directly through meiosis in organs called gonads. Gametogenesis differs between the sexes. In the male, the production of mature sperm cells, or spermatogenesis , results in four haploid gametes, whereas, in the female, the production of a mature egg cell, oogenesis , results in just one mature gamete. |
+ | | What is the process of making an observation in terms of a numerical scale and recording the value? | Measurement is the process of making an observation in terms of a numerical scale and recording the value. |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### qasc_pairs
+
+* Dataset: [qasc_pairs](https://huggingface.co/datasets/allenai/qasc) at [a34ba20](https://huggingface.co/datasets/allenai/qasc/tree/a34ba204eb9a33b919c10cc08f4f1c8dae5ec070)
+* Size: 256 evaluation samples
+* Columns: sentence1 and sentence2
+* Approximate statistics based on the first 256 samples:
+ | | sentence1 | sentence2 |
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | What is formed by deposition of sediment at the mouth of a water supply fanning out? | a delta is formed by deposition of sediment at the mouth of a river by water fanning out. Rivers provide a water supply. a delta is formed by deposition of sediment at the mouth of a water supply fanning out |
+ | | what does energy require? | healing requires rest. Healing requires energy. energy requires rest |
+ | | How does a battery work? | a battery converts chemical energy into electrical energy. Electricity is a form of energy and is sometimes called electrical energy. Battery electricity comes conversion |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### openbookqa_pairs
+
+* Dataset: [openbookqa_pairs](https://huggingface.co/datasets/bobox/OpenbookQA-4ST) at [3c2724c](https://huggingface.co/datasets/bobox/OpenbookQA-4ST/tree/3c2724cbd7a9828685de0976c8a6ea6491b2e326)
+* Size: 500 evaluation samples
+* Columns: question and fact
+* Approximate statistics based on the first 500 samples:
+ | | question | fact |
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | The thermal production of a stove is generically used for | a stove generates heat for cooking usually |
+ | | What creates a valley? | a valley is formed by a river flowing |
+ | | when it turns day and night on a planet, what cause this? | a planet rotating causes cycles of day and night on that planet |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### nq_pairs
+
+* Dataset: [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions) at [f9e894e](https://huggingface.co/datasets/sentence-transformers/natural-questions/tree/f9e894e1081e206e577b4eaa9ee6de2b06ae6f17)
+* Size: 256 evaluation samples
+* Columns: `sentence1` and `sentence2`
+* Approximate statistics based on the first 256 samples:
+  |         | sentence1 | sentence2 |
+  |:--------|:----------|:----------|
+  | type    | string    | string    |
+* Samples:
+  | sentence1 | sentence2 |
+  |:----------|:----------|
+  | where do they film 8 out of 10 cats | 8 Out of 10 Cats The show is currently recorded at Pinewood Studios, previously at BBC Television Centre, typically the day before transmission. However, in the past few years, due to Jimmy Carr's stand-up schedule, certain episodes are pretaped and broadcast later as "special" episodes dealing with a particular subject.[5] |
+  | when is marvel's the punisher coming out on netflix | The Punisher (TV series) The Punisher is scheduled to be released on November 17, 2017. |
+  | what is it called when you vote a president out of office | Recall election A recall election (also called a recall referendum or representative recall) is a procedure by which voters can remove an elected official from office through a direct vote before that official's term has ended. Recalls, which are initiated when sufficient voters sign a petition, have a history dating back to ancient Athenian democracy[1] and feature in several contemporary constitutions. In indirect or representative democracy people's representatives are elected and these representatives rule for a specific period of time. But if any representative comes to be perceived as not properly discharging their responsibilities, then they can be called back with the written request of specific number or proportion of voters. |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### trivia_pairs
+
+* Dataset: [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa) at [a7c36e3](https://huggingface.co/datasets/sentence-transformers/trivia-qa/tree/a7c36e3c8c8c01526bc094d79bf80d4c848b0ad0)
+* Size: 256 evaluation samples
+* Columns: `query` and `answer`
+* Approximate statistics based on the first 256 samples:
+  |         | query  | answer |
+  |:--------|:-------|:-------|
+  | type    | string | string |
+* Samples:
+  | query | answer |
+  |:------|:-------|
+  | Which TV detective was assisted by Inspector Mike Burden? | The Ruth Rendell Mysteries - Show News, Reviews, Recaps and Photos - TV.com The Ruth Rendell Mysteries EDIT Welcome to The Ruth Rendell Mysteries guide at TV.com. The guide takes in over thirty hours of crime drama productions, based on the novels and short stories of Ruth Rendell. In most of these, veteran actor George Baker plays Chief Inspector Reg Wexford of Kingsmarkham (a fictional town in the real English county of Hampshire), with Christopher Ravenscroft as his partner-in-detection, Inspector Mike Burden. Wexford's wife Dora is played by Louie Ramsay. She had known George Baker for many years, but they became close while working on the series and married in real life. The Inspector Wexford stories were filmed on location in and around Romsey, Hampshire, and most were broadcast under The Ruth Rendell Mysteries banner. In 1992, having exhausted the supply of Wexford stories, Meridian started adapting other Rendell mysteries, thus belatedly justifying the use of the confusing umbr... |
+  | Who set a new land speed record in October 1983, driving Thrust 2 at the Black Rock Desert in Nevada, USA? | Land Speed Record - Black Rock Desert Nevada wiki Land Speed Record Jump to: navigation , search The Black Rock Desert has been the site of several land speed records. As of May, 2013, the world land speed record title is held by Andy Green in his 1997 run. Contents 4 October 1983, Richard Noble, Thrust2, 633.468 mph 25 September 1997, Andy Green, ThrustSSC, 714.144 mph 15 October 1997, Andy Green, ThrustSSC, 763.035 mph (Mach 1.016) Thrust2 BBC Motion Gallery - Search Results "Thrust 2" Derrel S. Fulwider, " From Resource Management to People Management: Reflections of a Federal Land Manager ," p 4, Winter-Spring 1986, The Humboldt Historian. Discussion about the May 11, 1983 public permit hearings in Gerlach and Reno. Map that shows the land speed record site. Thrust SSC Reno Gazette Journal, " Land Speed Record 10th Anniversary ," October 15, 2007. Includes photos. Steve Fossett CNN, " Search continues for aviation adventurer Steve Fossett ," September 4, 2007. Maj. Cynthia S. Ryan ... |
+  | New Zealand celebrates Matariki, Maori New Year in late May or early June, marking the rising in the southern hemisphere of the seven stars of what? | Matariki – Māori New Year – Te Ara Encyclopedia of New Zealand What is Matariki? Matariki is the Māori name for the cluster of stars also known as the Pleiades. It rises in mid-winter – late May or early June. For many Māori, it heralds the start of a new year. Matariki literally means the ‘eyes of god’ (mata ariki ) or ‘little eyes’ (mata riki). According to myth, when Ranginui, the sky father, and Papatūānuku, the earth mother, were separated by their children, the god of the winds, Tāwhirimātea, became so angry that he tore out his eyes and threw them into the heavens. Cycles of life and death Traditionally, Matariki was a time to remember those who had died in the last year. But it was also a happy event – crops had been harvested and seafood and birds had been collected. With plenty of food in the storehouses, Matariki was a time for singing, dancing and feasting. Modern Matariki Matariki, or Māori New Year celebrations were once popular, but stopped in the 1940s. In 2000, they we... |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### gooaq_pairs
+
+* Dataset: [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
+* Size: 256 evaluation samples
+* Columns: `sentence1` and `sentence2`
+* Approximate statistics based on the first 256 samples:
+  |         | sentence1 | sentence2 |
+  |:--------|:----------|:----------|
+  | type    | string    | string    |
+* Samples:
+  | sentence1 | sentence2 |
+  |:----------|:----------|
+  | actor who crashed his plane? | US actor Harrison Ford is being investigated over an incident last week at an airport in southern California. The Federal Aviation Administration (FAA) said he was piloting a small plane that wrongly crossed a runway where another aircraft was landing. |
+  | why are triglycerides not classed as polymers? | The definition of a polymer is a long chain of monomers held together by chemical bonds. ... That is to say, nothing but polarity and weak van der Waals' attraction is holding the triglyceride molecules together and it is because the “monomers” aren't joined together that they can't be considered a polymer. |
+  | is it okay to eat ice cream when you have a cold? | In fact, Dr. Steckelberg recommends that cold sufferers drink or eat dairy products such as cream-based soups, ice cream, pudding, or milk, as they are soothing on sore throats and provide calories they otherwise might not eat while they're feeling so lousy. |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### paws-pos
+
+* Dataset: [paws-pos](https://huggingface.co/datasets/google-research-datasets/paws) at [161ece9](https://huggingface.co/datasets/google-research-datasets/paws/tree/161ece9501cf0a11f3e48bd356eaa82de46d6a09)
+* Size: 256 evaluation samples
+* Columns: `sentence1` and `sentence2`
+* Approximate statistics based on the first 256 samples:
+  |         | sentence1 | sentence2 |
+  |:--------|:----------|:----------|
+  | type    | string    | string    |
+* Samples:
+  | sentence1 | sentence2 |
+  |:----------|:----------|
+  | They were there to enjoy us and they were there to pray for us . | They were there for us to enjoy and they were there for us to pray . |
+  | After the end of the war in June 1902 , Higgins left Southampton in the `` SSBavarian '' in August , returning to Cape Town the following month . | In August , after the end of the war in June 1902 , Higgins Southampton left the `` SSBavarian '' and returned to Cape Town the following month . |
+  | From the merger of the Four Rivers Council and the Audubon Council , the Shawnee Trails Council was born . | Shawnee Trails Council was formed from the merger of the Four Rivers Council and the Audubon Council . |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+#### global_dataset
+
+* Dataset: global_dataset
+* Size: 1,224 evaluation samples
+* Columns: `sentence1` and `sentence2`
+* Approximate statistics based on the first 1000 samples:
+  |         | sentence1 | sentence2 |
+  |:--------|:----------|:----------|
+  | type    | string    | string    |
+* Samples:
+  | sentence1 | sentence2 |
+  |:----------|:----------|
+  | Man on a bicycle riding on the road next to a river. | A man peddling a bike along side a lake on the road |
+  | What do you call the temperature at which a liquid boils and starts changing to a gas? | The temperature at which a liquid boils and starts changing to a gas is called its boiling point. The boiling point of pure water is 100°C. |
+  | when did the us change the legal drinking age | U.S. history of alcohol minimum purchase age by state From 1976 to 1983, several states voluntarily raised their purchase ages to 19 (or, less commonly, 20 or 21), in part to combat drunk driving fatalities.[citation needed] In 1984, Congress passed the National Minimum Drinking Age Act, which required states to raise their ages for purchase and public possession to 21 by October 1986 or lose 10% of their federal highway funds. By mid-1988, all 50 states and the District of Columbia had raised their purchase ages to 21 (but not Puerto Rico, Guam, or the Virgin Islands, see Additional Notes below). South Dakota and Wyoming were the final two states to comply with the age 21 mandate. The current drinking age of 21 remains a point of contention among many Americans, because of it being higher than the age of majority (18 in most states) and higher than the drinking ages of most other countries. The National Minimum Drinking Age Act is also seen as a congressional sidestep of the tenth ame... |
+* Loss: [GISTEmbedLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#gistembedloss) with these parameters:
+ ```json
+ {'guide': SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ ), 'temperature': 0.025}
+ ```
+
+### Training Hyperparameters
+#### Non-Default Hyperparameters
+
+- `eval_strategy`: steps
+- `per_device_train_batch_size`: 128
+- `per_device_eval_batch_size`: 256
+- `learning_rate`: 0.001
+- `weight_decay`: 0.001
+- `lr_scheduler_type`: cosine_with_min_lr
+- `lr_scheduler_kwargs`: {'num_cycles': 0.5, 'min_lr': 0.0001}
+- `warmup_ratio`: 0.25
+- `save_safetensors`: False
+- `fp16`: True
+- `remove_unused_columns`: False
+- `push_to_hub`: True
+- `hub_model_id`: bobox/XLMRoBERTaM3-CustomPoolin-v1-s1-checkpoints-tmp
+- `hub_strategy`: all_checkpoints
+- `hub_private_repo`: False
+- `batch_sampler`: no_duplicates
+
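As a rough illustration of how the `learning_rate`, `warmup_ratio`, and `lr_scheduler_kwargs` values listed above interact, the sketch below computes a learning rate per step. It assumes the `cosine_with_min_lr` schedule follows the usual linear-warmup-then-cosine formula with a floor at `min_lr`; the function name `cosine_with_min_lr` as written here and `total_steps = 1000` are hypothetical, since the total number of training steps is not stated in this section.

```python
import math


def cosine_with_min_lr(step, total_steps, base_lr=0.001, min_lr=0.0001,
                       warmup_ratio=0.25, num_cycles=0.5):
    """Learning rate at `step`, under the assumed warmup + cosine formula."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress))
    # Rescale so the curve decays to min_lr instead of 0.
    min_rate = min_lr / base_lr
    return base_lr * max(0.0, cosine * (1.0 - min_rate) + min_rate)


total_steps = 1000  # hypothetical; not taken from the model card
lr_start = cosine_with_min_lr(0, total_steps)
lr_mid_warmup = cosine_with_min_lr(125, total_steps)
lr_peak = cosine_with_min_lr(250, total_steps)
lr_end = cosine_with_min_lr(total_steps, total_steps)
```

Under these assumptions the rate climbs linearly to 0.001 over the first quarter of training, then follows half a cosine cycle down to the 0.0001 floor.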
+#### All Hyperparameters
+