	---
	base_model: bobox/DeBERTa-small-ST-v1-test-step3
	datasets:
	- tals/vitaminc
	- allenai/scitail
	- allenai/sciq
	- allenai/qasc
	- sentence-transformers/msmarco-msmarco-distilbert-base-v3
	- sentence-transformers/natural-questions
	- sentence-transformers/trivia-qa
	- sentence-transformers/gooaq
	- google-research-datasets/paws
	language:
	- en
	library_name: sentence-transformers
	metrics:
	- pearson_cosine
	- spearman_cosine
	- pearson_manhattan
	- spearman_manhattan
	- pearson_euclidean
	- spearman_euclidean
	- pearson_dot
	- spearman_dot
	- pearson_max
	- spearman_max
	- cosine_accuracy
	- cosine_accuracy_threshold
	- cosine_f1
	- cosine_f1_threshold
	- cosine_precision
	- cosine_recall
	- cosine_ap
	- dot_accuracy
	- dot_accuracy_threshold
	- dot_f1
	- dot_f1_threshold
	- dot_precision
	- dot_recall
	- dot_ap
	- manhattan_accuracy
	- manhattan_accuracy_threshold
	- manhattan_f1
	- manhattan_f1_threshold
	- manhattan_precision
	- manhattan_recall
	- manhattan_ap
	- euclidean_accuracy
	- euclidean_accuracy_threshold
	- euclidean_f1
	- euclidean_f1_threshold
	- euclidean_precision
	- euclidean_recall
	- euclidean_ap
	- max_accuracy
	- max_accuracy_threshold
	- max_f1
	- max_f1_threshold
	- max_precision
	- max_recall
	- max_ap
	pipeline_tag: sentence-similarity
	tags:
	- sentence-transformers
	- sentence-similarity
	- feature-extraction
	- generated_from_trainer
	- dataset_size:19500
	- loss:CachedGISTEmbedLoss
	widget:
	- source_sentence: what documents do you need for the nj driving test
	sentences:
	- 'Rating Newest Oldest. Best Answer: tea is an old term for gossip, which came
	from the idea of what goes on at a tea party. Tea party is an old term for a gossip
	fest, and spilling tea means accidentally( on purpose) dropping an irresistible
	tidbit of gossip. True tea would still be gossip, but the gossiper believes they
	have the true story.'
	- The United States declared war on Germany for many reasons. One of them was the
	sinking on neutral ships and the lusitania which killed about 148 Americans. This
	was not the immediate cause, another is propaganda which persuaded public opinion.
	- When you apply for NJ Driver's Permit, you will need to bring the $10 fee, a primary
	and secondary ID, your Social Security Number and proof of your address. These
	documents must total 6 points on New Jersey's verification scale.
	- source_sentence: More than 515,000 coronavirus cases and more than 23,300 fatalities
	had been registered around the world by 26 March
	sentences:
	- more than 680,000 cases of COVID-19 have been reported in over 190 countries and
	territories , resulting in approximately 31,900 deaths .
	- As of 26 March , more than 519,000 cases of COVID-19 have been reported in over
	200 countries and territories , resulting in approximately 23,500 deaths and more
	than 123,000 recoveries .
	- more than 4,900 deaths have been attributed to COVID-19 .
	- source_sentence: Scientists know that some mountains were once at the bottom of
	an ocean because marine fossils have been found on the peaks of some mountains.
	sentences:
	- Different media affect what property of light?
	- Which of these is an example of liquid water?
	- How do scientists know that some mountains were once at the bottom of an ocean?
	- source_sentence: Exposure to ultraviolet radiation can increase the amount of pigment
	in the skin and make it appear darker.
	sentences:
	- Early in the development of a human fetus, the skeleton is made entirely of what?
	- After infecting a host, what inactive state do some viruses enter?
	- Exposure to what can increase the amount of pigment in the skin and make it appear
	darker?
	- source_sentence: Which 2011 Nobel Prize was jointly awarded to Ellen Johnson Sirleaf,
	Leymah Gbowee and Tawakkol Karman?
	sentences:
	- 'The Nobel Peace Prize 2011 The Nobel Peace Prize 2011 Ellen Johnson Sirleaf,
	Leymah Gbowee, Tawakkol Karman Share this: The Nobel Peace Prize 2011 Photo: K.
	Opprann Tawakkol Karman Prize share: 1/3 The Nobel Peace Prize 2011 was awarded
	jointly to Ellen Johnson Sirleaf, Leymah Gbowee and Tawakkol Karman "for their
	non-violent struggle for the safety of women and for women''s rights to full participation
	in peace-building work". Photos: Copyright © The Nobel Foundation Share this:
	To cite this page MLA style: "The Nobel Peace Prize 2011". Nobelprize.org. Nobel
	Media AB 2014. Web. 18 Jan 2017. <http://www.nobelprize.org/nobel_prizes/peace/laureates/2011/>'
	- Daiquiri Cocktail Recipe You must be logged in to post a comment. Adding comment …
	dannynannady2007.2b626c1 posted 10 months ago I usually enjoy the videos with
	Dushan, but FREE POURING a daiquiri???? You gotta be out of your mind... johndixon548gmailcom1017305228
	posted 1 year ago I made this tonight with Pusser's British Navy Rum and Rose's
	Lime Juice. Very citrusy until I added an additional .5 oz (ish) of rum. Refreshing!
	cholo7 posted 2 years ago I thought the classic is supposed to be with light rum??
	CocktailSeb posted 2 years ago Dark Rum?!?!
	- Chemical Elements.com - Iron (Fe) The homepage of the Iron and Steel Society If
	you know of any other links for Iron, please let me know Bentor, Yinon. Chemical
	Element.com - Iron. <http://www.chemicalelements.com/elements/fe.html>. For more
	information about citing online sources, please visit the MLA's Website . This
	page was created by Yinon Bentor. Use of this web site is restricted by this site's
	license agreement . Copyright © 1996-2012 Yinon Bentor. All Rights Reserved.
	model-index:
	- name: SentenceTransformer based on bobox/DeBERTa-small-ST-v1-test-step3
	results:
	- task:
	type: semantic-similarity
	name: Semantic Similarity
	dataset:
	name: sts test
	type: sts-test
	metrics:
	- type: pearson_cosine
	value: 0.8857317653883374
	name: Pearson Cosine
	- type: spearman_cosine
	value: 0.9078527293932209
	name: Spearman Cosine
	- type: pearson_manhattan
	value: 0.9071116458208927
	name: Pearson Manhattan
	- type: spearman_manhattan
	value: 0.903667904686727
	name: Spearman Manhattan
	- type: pearson_euclidean
	value: 0.9064480556463691
	name: Pearson Euclidean
	- type: spearman_euclidean
	value: 0.9030452803869885
	name: Spearman Euclidean
	- type: pearson_dot
	value: 0.8740754819923455
	name: Pearson Dot
	- type: spearman_dot
	value: 0.8744967501974819
	name: Spearman Dot
	- type: pearson_max
	value: 0.9071116458208927
	name: Pearson Max
	- type: spearman_max
	value: 0.9078527293932209
	name: Spearman Max
	- task:
	type: binary-classification
	name: Binary Classification
	dataset:
	name: allNLI dev
	type: allNLI-dev
	metrics:
	- type: cosine_accuracy
	value: 0.7421875
	name: Cosine Accuracy
	- type: cosine_accuracy_threshold
	value: 0.7840416431427002
	name: Cosine Accuracy Threshold
	- type: cosine_f1
	value: 0.6367713004484306
	name: Cosine F1
	- type: cosine_f1_threshold
	value: 0.5954304337501526
	name: Cosine F1 Threshold
	- type: cosine_precision
	value: 0.5201465201465202
	name: Cosine Precision
	- type: cosine_recall
	value: 0.8208092485549133
	name: Cosine Recall
	- type: cosine_ap
	value: 0.6349487371750362
	name: Cosine Ap
	- type: dot_accuracy
	value: 0.724609375
	name: Dot Accuracy
	- type: dot_accuracy_threshold
	value: 291.18670654296875
	name: Dot Accuracy Threshold
	- type: dot_f1
	value: 0.6070588235294118
	name: Dot F1
	- type: dot_f1_threshold
	value: 242.49884033203125
	name: Dot F1 Threshold
	- type: dot_precision
	value: 0.5119047619047619
	name: Dot Precision
	- type: dot_recall
	value: 0.7456647398843931
	name: Dot Recall
	- type: dot_ap
	value: 0.5913303711668598
	name: Dot Ap
	- type: manhattan_accuracy
	value: 0.7421875
	name: Manhattan Accuracy
	- type: manhattan_accuracy_threshold
	value: 307.83563232421875
	name: Manhattan Accuracy Threshold
	- type: manhattan_f1
	value: 0.6533665835411472
	name: Manhattan F1
	- type: manhattan_f1_threshold
	value: 357.5172119140625
	name: Manhattan F1 Threshold
	- type: manhattan_precision
	value: 0.5745614035087719
	name: Manhattan Precision
	- type: manhattan_recall
	value: 0.7572254335260116
	name: Manhattan Recall
	- type: manhattan_ap
	value: 0.6414745653251337
	name: Manhattan Ap
	- type: euclidean_accuracy
	value: 0.744140625
	name: Euclidean Accuracy
	- type: euclidean_accuracy_threshold
	value: 13.559854507446289
	name: Euclidean Accuracy Threshold
	- type: euclidean_f1
	value: 0.6450116009280741
	name: Euclidean F1
	- type: euclidean_f1_threshold
	value: 17.65105628967285
	name: Euclidean F1 Threshold
	- type: euclidean_precision
	value: 0.5387596899224806
	name: Euclidean Precision
	- type: euclidean_recall
	value: 0.8034682080924855
	name: Euclidean Recall
	- type: euclidean_ap
	value: 0.6408865358113234
	name: Euclidean Ap
	- type: max_accuracy
	value: 0.744140625
	name: Max Accuracy
	- type: max_accuracy_threshold
	value: 307.83563232421875
	name: Max Accuracy Threshold
	- type: max_f1
	value: 0.6533665835411472
	name: Max F1
	- type: max_f1_threshold
	value: 357.5172119140625
	name: Max F1 Threshold
	- type: max_precision
	value: 0.5745614035087719
	name: Max Precision
	- type: max_recall
	value: 0.8208092485549133
	name: Max Recall
	- type: max_ap
	value: 0.6414745653251337
	name: Max Ap
	- task:
	type: binary-classification
	name: Binary Classification
	dataset:
	name: Qnli dev
	type: Qnli-dev
	metrics:
	- type: cosine_accuracy
	value: 0.6953125
	name: Cosine Accuracy
	- type: cosine_accuracy_threshold
	value: 0.6684989929199219
	name: Cosine Accuracy Threshold
	- type: cosine_f1
	value: 0.6764705882352942
	name: Cosine F1
	- type: cosine_f1_threshold
	value: 0.49889329075813293
	name: Cosine F1 Threshold
	- type: cosine_precision
	value: 0.550531914893617
	name: Cosine Precision
	- type: cosine_recall
	value: 0.8771186440677966
	name: Cosine Recall
	- type: cosine_ap
	value: 0.7189160511154118
	name: Cosine Ap
	- type: dot_accuracy
	value: 0.66796875
	name: Dot Accuracy
	- type: dot_accuracy_threshold
	value: 303.81427001953125
	name: Dot Accuracy Threshold
	- type: dot_f1
	value: 0.6697247706422018
	name: Dot F1
	- type: dot_f1_threshold
	value: 184.08914184570312
	name: Dot F1 Threshold
	- type: dot_precision
	value: 0.5239234449760766
	name: Dot Precision
	- type: dot_recall
	value: 0.9279661016949152
	name: Dot Recall
	- type: dot_ap
	value: 0.6791091476516111
	name: Dot Ap
	- type: manhattan_accuracy
	value: 0.712890625
	name: Manhattan Accuracy
	- type: manhattan_accuracy_threshold
	value: 367.2698974609375
	name: Manhattan Accuracy Threshold
	- type: manhattan_f1
	value: 0.6824034334763949
	name: Manhattan F1
	- type: manhattan_f1_threshold
	value: 368.1672058105469
	name: Manhattan F1 Threshold
	- type: manhattan_precision
	value: 0.691304347826087
	name: Manhattan Precision
	- type: manhattan_recall
	value: 0.673728813559322
	name: Manhattan Recall
	- type: manhattan_ap
	value: 0.7327153618467641
	name: Manhattan Ap
	- type: euclidean_accuracy
	value: 0.7109375
	name: Euclidean Accuracy
	- type: euclidean_accuracy_threshold
	value: 17.21223258972168
	name: Euclidean Accuracy Threshold
	- type: euclidean_f1
	value: 0.6848739495798319
	name: Euclidean F1
	- type: euclidean_f1_threshold
	value: 17.62983512878418
	name: Euclidean F1 Threshold
	- type: euclidean_precision
	value: 0.6791666666666667
	name: Euclidean Precision
	- type: euclidean_recall
	value: 0.690677966101695
	name: Euclidean Recall
	- type: euclidean_ap
	value: 0.7305713530476123
	name: Euclidean Ap
	- type: max_accuracy
	value: 0.712890625
	name: Max Accuracy
	- type: max_accuracy_threshold
	value: 367.2698974609375
	name: Max Accuracy Threshold
	- type: max_f1
	value: 0.6848739495798319
	name: Max F1
	- type: max_f1_threshold
	value: 368.1672058105469
	name: Max F1 Threshold
	- type: max_precision
	value: 0.691304347826087
	name: Max Precision
	- type: max_recall
	value: 0.9279661016949152
	name: Max Recall
	- type: max_ap
	value: 0.7327153618467641
	name: Max Ap
	---

	# SentenceTransformer based on bobox/DeBERTa-small-ST-v1-test-step3

	This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [bobox/DeBERTa-small-ST-v1-test-step3](https://huggingface.co/bobox/DeBERTa-small-ST-v1-test-step3) on the negation-triplets, [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc), [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail), [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail), xsum-pairs, [sciq_pairs](https://huggingface.co/datasets/allenai/sciq), [qasc_pairs](https://huggingface.co/datasets/allenai/qasc), openbookqa_pairs, [msmarco_pairs](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3), [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions), [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa), [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq), [paws-pos](https://huggingface.co/datasets/google-research-datasets/paws) and global_dataset datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

	## Model Details

	### Model Description
	- Model Type: Sentence Transformer
	- Base model: [bobox/DeBERTa-small-ST-v1-test-step3](https://huggingface.co/bobox/DeBERTa-small-ST-v1-test-step3) <!-- at revision df9aaa75fe0c2791e5ed35ff33de1689d9a5f5ff -->
	- Maximum Sequence Length: 512 tokens
	- Output Dimensionality: 768 dimensions
	- Similarity Function: Cosine Similarity
	- Training Datasets:
	    - negation-triplets
	    - [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc)
	    - [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail)
	    - [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail)
	    - xsum-pairs
	    - [sciq_pairs](https://huggingface.co/datasets/allenai/sciq)
	    - [qasc_pairs](https://huggingface.co/datasets/allenai/qasc)
	    - openbookqa_pairs
	    - [msmarco_pairs](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3)
	    - [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions)
	    - [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa)
	    - [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq)
	    - [paws-pos](https://huggingface.co/datasets/google-research-datasets/paws)
	    - global_dataset
	- Language: en
	<!-- - License: Unknown -->

	### Model Sources

	- Documentation: [Sentence Transformers Documentation](https://sbert.net)
	- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
	- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

	### Full Model Architecture

	```
	SentenceTransformer(
	  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DebertaV2Model
	  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	)
	```
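
The `Pooling` module above has only `pooling_mode_mean_tokens` enabled, so each sentence embedding is the mean of its token embeddings. The following is a minimal NumPy sketch of mean pooling, not the library's code. It assumes padding positions are excluded via an attention mask, and the function name `mean_pool` and the 2-dimensional toy vectors are illustrative only (the model itself produces 768-dimensional vectors).

```python
import numpy as np


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors per sentence, ignoring padded positions.

    token_embeddings: array of shape (batch, seq_len, dim)
    attention_mask:   array of shape (batch, seq_len), 1 = real token, 0 = padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    # Guard against division by zero for an all-padding row.
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts


# Toy input: one sentence, three positions, the last one is padding.
tokens = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])
pooled = mean_pool(tokens, mask)
print(pooled)  # [[2. 3.]]
```

The padded vector `[100, 100]` does not affect the result because its mask entry is 0.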

	## Usage

	### Direct Usage (Sentence Transformers)

	First install the Sentence Transformers library:

	```bash
	pip install -U sentence-transformers
	```

	Then you can load this model and run inference.
	```python
	from sentence_transformers import SentenceTransformer

	# Download from the 🤗 Hub
	model = SentenceTransformer("bobox/DeBERTa2-0.9B-ST-v2-checkpoints-tmp")
	# Run inference
	sentences = [
	    'Which 2011 Nobel Prize was jointly awarded to Ellen Johnson Sirleaf, Leymah Gbowee and Tawakkol Karman?',
	    'The Nobel Peace Prize 2011 The Nobel Peace Prize 2011 Ellen Johnson Sirleaf, Leymah Gbowee, Tawakkol Karman Share this: The Nobel Peace Prize 2011 Photo: K. Opprann Tawakkol Karman Prize share: 1/3 The Nobel Peace Prize 2011 was awarded jointly to Ellen Johnson Sirleaf, Leymah Gbowee and Tawakkol Karman "for their non-violent struggle for the safety of women and for women\'s rights to full participation in peace-building work". Photos: Copyright © The Nobel Foundation Share this: To cite this page MLA style: "The Nobel Peace Prize 2011". Nobelprize.org. Nobel Media AB 2014. Web. 18 Jan 2017. <http://www.nobelprize.org/nobel_prizes/peace/laureates/2011/>',
	    "Daiquiri Cocktail Recipe You must be logged in to post a comment. Adding comment\xa0…\xa0 dannynannady2007.2b626c1 posted 10 months ago I usually enjoy the videos with Dushan, but FREE POURING a daiquiri???? You gotta be out of your mind... johndixon548gmailcom1017305228 posted 1 year ago I made this tonight with Pusser's British Navy Rum and Rose's Lime Juice. Very citrusy until I added an additional .5 oz (ish) of rum. Refreshing! cholo7 posted 2 years ago I thought the classic is supposed to be with light rum?? CocktailSeb posted 2 years ago Dark Rum?!?!",
	]
	embeddings = model.encode(sentences)
	print(embeddings.shape)
	# [3, 768]

	# Get the similarity scores for the embeddings
	similarities = model.similarity(embeddings, embeddings)
	print(similarities.shape)
	# [3, 3]
	```
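
Since the card lists cosine similarity as the similarity function, the `[3, 3]` matrix above should hold pairwise cosine similarities. The sketch below shows that computation in plain NumPy and uses it to rank candidates against the first sentence. The random vectors are stand-ins for real model output, and `cosine_similarity_matrix` is a hypothetical helper, not a library function.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


# Stand-in for model.encode(sentences): 3 sentences, 768 dimensions each.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(3, 768))

sims = cosine_similarity_matrix(embeddings, embeddings)
print(sims.shape)  # (3, 3)

# Rank sentences 1 and 2 by similarity to sentence 0 (the query), best first.
ranking = np.argsort(-sims[0, 1:]) + 1
print(ranking)
```

With real embeddings, the Nobel Peace Prize passage would be expected to rank above the cocktail recipe for the Nobel Prize question.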

	<!--
	### Direct Usage (Transformers)

	<details><summary>Click to see the direct usage in Transformers</summary>

	</details>
	-->

	<!--
	### Downstream Usage (Sentence Transformers)

	You can finetune this model on your own dataset.

	<details><summary>Click to expand</summary>

	</details>
	-->

	<!--
	### Out-of-Scope Use

	List how the model may foreseeably be misused and address what users ought not to do with the model.
	-->

	## Evaluation

	### Metrics

	#### Semantic Similarity
	* Dataset: `sts-test`
	* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

	| Metric             | Value  |
	|:-------------------|:-------|
	| pearson_cosine     | 0.8857 |
	| spearman_cosine    | 0.9079 |
	| pearson_manhattan  | 0.9071 |
	| spearman_manhattan | 0.9037 |
	| pearson_euclidean  | 0.9064 |
	| spearman_euclidean | 0.903  |
	| pearson_dot        | 0.8741 |
	| spearman_dot       | 0.8745 |
	| pearson_max        | 0.9071 |
	| spearman_max       | 0.9079 |
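
Each row correlates the model's similarity scores with gold similarity labels: Pearson measures linear agreement, Spearman measures agreement of rank order. The `*_max` rows match the largest value among the four similarity functions (0.9071 is the Manhattan Pearson score, 0.9079 the cosine Spearman score). The sketch below computes both correlations on made-up numbers; the helper names and the toy data are assumptions for illustration, and the rank helper assumes there are no tied values.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient between two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def ranks(values):
    """Rank positions 0..n-1 of each value (assumes no ties)."""
    order = np.argsort(values)
    result = np.empty(len(values), dtype=float)
    result[order] = np.arange(len(values))
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy data: gold similarity labels and hypothetical model cosine scores.
gold = [0.0, 1.0, 2.5, 4.0, 5.0]
predicted = [0.10, 0.20, 0.50, 0.70, 0.99]

p = pearson(gold, predicted)
s = spearman(gold, predicted)
print(round(p, 4), round(s, 4))
```

In this toy case the predicted scores preserve the gold ordering exactly, so the Spearman value is 1 while the Pearson value is slightly lower because the relationship is not perfectly linear.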

	#### Binary Classification
	* Dataset: `allNLI-dev`
	* Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)

	| Metric                       | Value    |
	|:-----------------------------|:---------|
	| cosine_accuracy              | 0.7422   |
	| cosine_accuracy_threshold    | 0.784    |
	| cosine_f1                    | 0.6368   |
	| cosine_f1_threshold          | 0.5954   |
	| cosine_precision             | 0.5201   |
	| cosine_recall                | 0.8208   |
	| cosine_ap                    | 0.6349   |
	| dot_accuracy                 | 0.7246   |
	| dot_accuracy_threshold       | 291.1867 |
	| dot_f1                       | 0.6071   |
	| dot_f1_threshold             | 242.4988 |
	| dot_precision                | 0.5119   |
	| dot_recall                   | 0.7457   |
	| dot_ap                       | 0.5913   |
	| manhattan_accuracy           | 0.7422   |
	| manhattan_accuracy_threshold | 307.8356 |
	| manhattan_f1                 | 0.6534   |
	| manhattan_f1_threshold       | 357.5172 |
	| manhattan_precision          | 0.5746   |
	| manhattan_recall             | 0.7572   |
	| manhattan_ap                 | 0.6415   |
	| euclidean_accuracy           | 0.7441   |
	| euclidean_accuracy_threshold | 13.5599  |
	| euclidean_f1                 | 0.645    |
	| euclidean_f1_threshold       | 17.6511  |
	| euclidean_precision          | 0.5388   |
	| euclidean_recall             | 0.8035   |
	| euclidean_ap                 | 0.6409   |
	| max_accuracy                 | 0.7441   |
	| max_accuracy_threshold       | 307.8356 |
	| max_f1                       | 0.6534   |
	| max_f1_threshold             | 357.5172 |
	| max_precision                | 0.5746   |
	| max_recall                   | 0.8208   |
	| max_ap                       | 0.6415   |
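
The accuracy and F1 rows each come with their own threshold, which is why, for example, `cosine_accuracy_threshold` (0.784) differs from `cosine_f1_threshold` (0.5954): each threshold is the cut-off that maximizes its own metric. The sketch below illustrates the idea for a "higher score means more similar" metric such as cosine. It is a simplified stand-in rather than the evaluator's code: the helper names and toy data are assumptions, and the candidate thresholds here are simply the observed scores, which may differ from how the library picks them.

```python
import numpy as np


def best_accuracy_threshold(scores, labels):
    """Return (accuracy, threshold) for the cut-off with the highest accuracy.

    A pair is predicted positive when its score is >= threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_acc, best_thr = -1.0, None
    for thr in np.unique(scores):  # unique() returns the values sorted ascending
        predictions = (scores >= thr).astype(int)
        acc = float((predictions == labels).mean())
        if acc > best_acc:
            best_acc, best_thr = acc, float(thr)
    return best_acc, best_thr


def precision_recall_f1(scores, labels, threshold):
    """Precision, recall and F1 of the positive class at a fixed threshold."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    predicted = scores >= threshold
    tp = int(np.sum(predicted & (labels == 1)))
    fp = int(np.sum(predicted & (labels == 0)))
    fn = int(np.sum(~predicted & (labels == 1)))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy data: hypothetical cosine scores and binary labels (1 = matching pair).
toy_scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0]

acc, thr = best_accuracy_threshold(toy_scores, toy_labels)
prec, rec, f1 = precision_recall_f1(toy_scores, toy_labels, thr)
print(acc, thr, prec, rec, f1)
```

For the Manhattan and Euclidean rows the values are distances, so smaller means more similar and the comparison direction would be reversed.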

	#### Binary Classification
	* Dataset: `Qnli-dev`
	* Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)

	| Metric                       | Value    |
	|:-----------------------------|:---------|
	| cosine_accuracy              | 0.6953   |
	| cosine_accuracy_threshold    | 0.6685   |
	| cosine_f1                    | 0.6765   |
	| cosine_f1_threshold          | 0.4989   |
	| cosine_precision             | 0.5505   |
	| cosine_recall                | 0.8771   |
	| cosine_ap                    | 0.7189   |
	| dot_accuracy                 | 0.668    |
	| dot_accuracy_threshold       | 303.8143 |
	| dot_f1                       | 0.6697   |
	| dot_f1_threshold             | 184.0891 |
	| dot_precision                | 0.5239   |
	| dot_recall                   | 0.928    |
	| dot_ap                       | 0.6791   |
	| manhattan_accuracy           | 0.7129   |
	| manhattan_accuracy_threshold | 367.2699 |
	| manhattan_f1                 | 0.6824   |
	| manhattan_f1_threshold       | 368.1672 |
	| manhattan_precision          | 0.6913   |
	| manhattan_recall             | 0.6737   |
	| manhattan_ap                 | 0.7327   |
	| euclidean_accuracy           | 0.7109   |
	| euclidean_accuracy_threshold | 17.2122  |
	| euclidean_f1                 | 0.6849   |
	| euclidean_f1_threshold       | 17.6298  |
	| euclidean_precision          | 0.6792   |
	| euclidean_recall             | 0.6907   |
	| euclidean_ap                 | 0.7306   |
	| max_accuracy                 | 0.7129   |
	| max_accuracy_threshold       | 367.2699 |
	| max_f1                       | 0.6849   |
	| max_f1_threshold             | 368.1672 |
	| max_precision                | 0.6913   |
	| max_recall                   | 0.928    |
	| max_ap                       | 0.7327   |
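
Unlike accuracy and F1, the `*_ap` (average precision) rows need no threshold: they summarize how well positives are ranked above negatives across all possible cut-offs. The sketch below computes average precision for a score where higher means more similar. The function name and toy data are illustrative assumptions, and the sketch assumes there are no tied scores.

```python
import numpy as np


def average_precision(scores, labels):
    """Mean of precision@k taken at the rank k of every positive item."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores, kind="stable")  # best score first
    sorted_labels = labels[order]
    hits = np.cumsum(sorted_labels)
    precision_at_k = hits / np.arange(1, len(sorted_labels) + 1)
    return float((precision_at_k * sorted_labels).sum() / sorted_labels.sum())


# Toy data: hypothetical cosine scores and binary labels (1 = matching pair).
toy_scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0]

ap = average_precision(toy_scores, toy_labels)
print(round(ap, 4))  # 0.9167
```

Here the positives sit at ranks 1, 2 and 4, giving precisions of 1, 1 and 0.75, whose mean is about 0.9167.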

	<!--
	## Bias, Risks and Limitations

	What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.
	-->

	<!--
	### Recommendations

	What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.
	-->

	## Training Details

	### Training Datasets

	#### negation-triplets

	* Dataset: negation-triplets
	* Size: 750 training samples
	* Columns: <code>anchor</code>, <code>entailment</code>, and <code>negative</code>
	* Approximate statistics based on the first 1000 samples:
	|         | anchor | entailment | negative |
	|:--------|:-------|:-----------|:---------|
	| type    | string | string | string |
	| details | <ul><li>min: 6 tokens</li><li>mean: 22.07 tokens</li><li>max: 154 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 14.13 tokens</li><li>max: 43 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 14.41 tokens</li><li>max: 42 tokens</li></ul> |
	* Samples:
	| anchor | entailment | negative |
	|:-------|:-----------|:---------|
	| <code>Judge Bisbee took the stand on his own behalf for several hours today.</code> | <code>Bisbee takes the stand</code> | <code>Bisbee refuses to take the stand</code> |
	| <code>Half of Germans want to ditch the euro and return to the deutschmark, a major survey showed just a day after Angela Merkel drove through a vote in favour of a Greek bailout.</code> | <code>Half of all Germans want to ditch euro</code> | <code>Half of all Belgians want to uphold euro</code> |
	| <code>216 "At first, the thing seemed utterly impossible.</code> | <code>It seemed impossible at first. </code> | <code>It didn`t seem impossible at first.</code> |
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	  (2): Normalize()
	), 'temperature': 0.025}
	```
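
The parameters above describe a guided contrastive loss: a separate guide model (here a BERT encoder with CLS pooling and normalization) scores the batch, and a temperature of 0.025 sharpens the softmax. The sketch below illustrates the core idea of guided in-batch negatives in NumPy under simplifying assumptions. It is not the library's implementation: it only contrasts anchors with positives, and the real loss does more than this (for instance, it handles explicit negatives and uses gradient caching to reduce memory use). The function name and toy vectors are hypothetical.

```python
import numpy as np


def guided_in_batch_loss(anchor, positive, guide_anchor, guide_positive,
                         temperature=0.025):
    """Cross-entropy over in-batch candidates, with guide-based masking.

    Row i of `anchor` is paired with row i of `positive`. Any other candidate j
    that the guide model rates as MORE similar to anchor i than the true
    positive is treated as a likely false negative and removed from the softmax.
    """
    def normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    sim = normalize(anchor) @ normalize(positive).T                    # trained model
    guide_sim = normalize(guide_anchor) @ normalize(guide_positive).T  # guide model

    guide_pos = np.diag(guide_sim)[:, None]  # guide score of each true pair
    mask = guide_sim > guide_pos             # the diagonal is never masked

    logits = np.where(mask, -np.inf, sim / temperature)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


# Toy batch of two pairs in which candidate 1 is closer to anchor 0
# than anchor 0's own positive (a likely false negative).
student_anchor = np.array([[1.0, 0.0], [1.0, 0.0]])
student_positive = np.array([[0.8, 0.6], [1.0, 0.0]])

# A guide that sees the same geometry masks the false negative ...
masked_loss = guided_in_batch_loss(student_anchor, student_positive,
                                   student_anchor, student_positive)
# ... while a guide that sees nothing suspicious (identity vectors) masks nothing.
identity = np.eye(2)
unmasked_loss = guided_in_batch_loss(student_anchor, student_positive,
                                     identity, identity)
print(masked_loss < unmasked_loss)  # True
```

Masking removes the penalty for ranking the suspicious candidate above the labelled positive, which is why the masked loss is much smaller in this toy batch.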

	#### vitaminc-pairs

	* Dataset: [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc) at [be6febb](https://huggingface.co/datasets/tals/vitaminc/tree/be6febb761b0b2807687e61e0b5282e459df2fa0)
	* Size: 750 training samples
	* Columns: <code>claim</code> and <code>evidence</code>
	* Approximate statistics based on the first 1000 samples:
	|         | claim | evidence |
	|:--------|:------|:---------|
	| type    | string | string |
	| details | <ul><li>min: 6 tokens</li><li>mean: 15.23 tokens</li><li>max: 47 tokens</li></ul> | <ul><li>min: 9 tokens</li><li>mean: 37.68 tokens</li><li>max: 185 tokens</li></ul> |
	* Samples:
	| claim | evidence |
	|:------|:---------|
	| <code>Based on less than 8 critics , the film scored more than 42 out of 100 .</code> | <code>`` On Metacritic the film has a score of 43 out of 100 , based on 7 critics , indicating `` '' mixed or average reviews '' '' . ''</code> |
	| <code>Haim resumed activities in 2012 .</code> | <code>This led to Haim resuming as a full-time operation in 2012.The group 's first release , Forever ( an EP released on a limited-time download ) , combined with positive reception at the South by Southwest festival , led to a deal with Polydor Records and a management deal with Jay-Z 's Roc Nation group in mid-2012 .</code> |
	| <code>Iman 's second husband was Spencer Haywood .</code> | <code>He also has a stepsister , Zulekha Haywood ( born 1978 ) , who is the daughter of Iman and former NBA basketball player Spencer Haywood , Iman 's second husband.</code> |
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	  (2): Normalize()
	), 'temperature': 0.025}
	```

	#### scitail-pairs-qa

	* Dataset: [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
	* Size: 750 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	|         | sentence1 | sentence2 |
	|:--------|:----------|:----------|
	| type    | string | string |
	| details | <ul><li>min: 7 tokens</li><li>mean: 17.41 tokens</li><li>max: 41 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 15.12 tokens</li><li>max: 34 tokens</li></ul> |
	* Samples:
	| sentence1 | sentence2 |
	|:----------|:----------|
	| <code>A binary molecular compound is made up of two elements.</code> | <code>A binary molecular compound is made up of two of what?</code> |
	| <code>Ash that enters the air naturally as a result of a volcano eruption is classified as a primary pollutant.</code> | <code>Ash that enters the air naturally as a result of a volcano eruption is classified as what kind of pollutant?</code> |
	| <code>Coal is nonrenewable, and the sun is renewable is how coal and the sun compare as sources of energy.</code> | <code>How do coal and the sun compare as sources of energy?</code> |
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	  (2): Normalize()
	), 'temperature': 0.025}
	```

	#### scitail-pairs-pos

	* Dataset: [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
	* Size: 750 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	|         | sentence1 | sentence2 |
	|:--------|:----------|:----------|
	| type    | string | string |
	| details | <ul><li>min: 8 tokens</li><li>mean: 24.72 tokens</li><li>max: 71 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 16.21 tokens</li><li>max: 41 tokens</li></ul> |
	* Samples:
	| sentence1 | sentence2 |
	|:----------|:----------|
	| <code>The Earth rotates once a day.</code> | <code>Earth rotates on its axis once times in one day.</code> |
	| <code>When the plates crash into each other, geologists call this type of plate boundary a convergent boundary.</code> | <code>A convergent plate boundary is created when two plates come toward each other.</code> |
	| <code>In biochemistry, amino acids having both the amine and the carboxylic acid groups attached to the first (alpha-) carbon atom have particular importance.</code> | <code>Amino acids contain both a carboxylic acid group and a(n) amine group.</code> |
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	  (2): Normalize()
	), 'temperature': 0.025}
	```

	#### xsum-pairs

	* Dataset: xsum-pairs
	* Size: 750 training samples
	* Columns: <code>summary</code> and <code>document</code>
	* Approximate statistics based on the first 1000 samples:
	|         | summary | document |
	|:--------|:--------|:---------|
	| type    | string | string |
	| details | <ul><li>min: 8 tokens</li><li>mean: 25.57 tokens</li><li>max: 43 tokens</li></ul> | <ul><li>min: 57 tokens</li><li>mean: 218.28 tokens</li><li>max: 408 tokens</li></ul> |
	* Samples:
	| summary | document |
	|:--------|:---------|
	| <code>Defender Neil Taylor believes Swansea City "turned a corner" with the manner of their performance in Sunday's 1-0 Premier League defeat at Liverpool.</code> | <code>A handball decision against Taylor gave James Milner the chance to earn victory from the penalty spot for the Reds.<br>Taylor believes referee Anthony Taylor was wrong, but felt his side played well despite falling to their ninth league loss in 10 starts.<br>"We played well and turned the corner a bit as a performance," said Taylor.<br>"But we obviously haven't got the result which is disappointing and hard to swallow.<br>"The performance was good, the boys played well and I think overall we can be proud of what we did and we frustrated Liverpool."<br>But defeat meant dropping a place to 15th in the table, one place below Chelsea and four points off the relegation zone.<br>Swansea's plight has led to speculation over manager Garry Monk's future, but Taylor believes the display at Anfield was an improvement.<br>"We got back to how we wanted to play and how we have been playing so we're happy with that," said the Wales international.<br>Taylor felt he was legitimately protecting his face when Jordan Ibe's cross struck his arm, leading to the penalty award.<br>"I don't think the ref was going to give it. Time stood still for about 10 seconds, the crowd shouted loud enough and it's a pen," lamented Taylor.</code> |
	\| <code>Photographs showing the Duke of Windsor's visit to Nazi Germany in 1937 are expected to fetch up to £1,000 at auction.</code> \| <code>The album was compiled by the former King Edward VIII's equerry, Sir Dudley Forwood, and has been in his family ever since.<br>It features 60 photos, some previously unseen, of the duke meeting Nazis, including Adolf Hitler.<br>The album is due to be sold at Duke's of Dorchester on 10 March.<br>It details the visit the duke took with his new wife Wallis Simpson.<br>Sir Dudley said years later that the trip was made, not in order to support the Nazi regime, as many thought, but so that the Duchess could experience a state visit.<br>The equerry's invitation to the funeral of the Duchess of Windsor in 1986 is also being sold.<br>Timothy Medhurst, of the auction house, said: "It shows the couple in a relaxed environment being shown around by Nazis who are clearly proud of their nation.<br>"It is a unique piece of history compiled at a time when the Nazi war machine was preparing for European conquest and the systematic slaughter of millions of people."<br>The photographs show the duke and his wife visiting many places, including a mine, a winter relief headquarters, a lightbulb factory and a school.<br>Source: BBC History</code> \|
	\| <code>A man has appeared in court charged as part of an investigation into the death of a man who was fatally shot in his car outside his London home.</code> \| <code>Redwan El-Ghaidouni was shot on his driveway by a man who approached him in Vine Lane, Hillingdon, west London, on 3 February.<br>Marwais Kakar, 34, appeared at Uxbridge Magistrates' Court earlier charged with perverting the course of justice.<br>He was bailed to attend Isleworth Crown Court on 6 July.</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```
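
Each dataset section reports approximate min, mean, and max token counts over the first 1000 samples. The sketch below shows one way such a summary could be computed. It is an assumption-laden illustration: `token_stats` is a hypothetical helper, and whitespace splitting stands in for the model's real tokenizer, so its numbers would not match the tables in this card.

```python
from statistics import mean


def token_stats(texts, tokenize=str.split, limit=1000):
    """Return (min, mean, max) token counts over the first `limit` texts.

    `tokenize` defaults to whitespace splitting, which is only a stand-in
    for a real subword tokenizer.
    """
    counts = [len(tokenize(text)) for text in texts[:limit]]
    return min(counts), round(mean(counts), 2), max(counts)


samples = ["The Earth rotates once a day.", "what are archegonia", "Where are aerofoils found?"]
stats = token_stats(samples)
```

Passing a different callable as `tokenize` (for example, a subword tokenizer's tokenize method) would give counts closer to what the model actually sees.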

	#### sciq_pairs

	* Dataset: [sciq_pairs](https://huggingface.co/datasets/allenai/sciq) at [2c94ad3](https://huggingface.co/datasets/allenai/sciq/tree/2c94ad3e1aafab77146f384e23536f97a4849815)
	* Size: 750 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 7 tokens</li><li>mean: 17.29 tokens</li><li>max: 60 tokens</li></ul> \| <ul><li>min: 2 tokens</li><li>mean: 86.8 tokens</li><li>max: 512 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>What is the only continent without amphibians?</code> \| <code>Amphibians can be found in freshwater and moist terrestrial habitats throughout the world. The only continent without amphibians is Antarctica. Amphibians are especially numerous in temperate lakes and ponds and in tropical rainforests.</code> \|
	\| <code>Where are aerofoils found?</code> \| <code>Birds also have wings that function as an aerofoil . The surface of the aerofoil is curved to help the bird control and use the air currents to fly. Aerofoils are also found on the wings of airplanes.</code> \|
	\| <code>A black solid by itself, this element is incredibly important because of what it makes when it combines with many other elements, including oxygen?</code> \| <code>Carbon is an element. By itself, it’s a black solid. You can see a lump of carbon in Figure below . Carbon is incredibly important because of what it makes when it combines with many other elements. Carbon can form a wide variety of substances. For example, in the air, carbon combines with oxygen to form the gas carbon dioxide.</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	#### qasc_pairs

	* Dataset: [qasc_pairs](https://huggingface.co/datasets/allenai/qasc) at [a34ba20](https://huggingface.co/datasets/allenai/qasc/tree/a34ba204eb9a33b919c10cc08f4f1c8dae5ec070)
	* Size: 750 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 4 tokens</li><li>mean: 11.52 tokens</li><li>max: 25 tokens</li></ul> \| <ul><li>min: 13 tokens</li><li>mean: 34.06 tokens</li><li>max: 65 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>what enters the egg?</code> \| <code>Fertilization occurs when sperm swim to an egg inside an archegonium.. Sperm nuclei are released into the archegonium. <br> sperm nuclei enter the egg</code> \|
	\| <code>When does water freeze into ice?</code> \| <code>when water freezes , that water expands. Upon further cooling to 32 degrees Fahrenheit, water expands as it turns to ice. <br> Water freezes into ice under 32 degrees fahrenheit</code> \|
	\| <code>What responds to daily and seasonal cycles and to disease?</code> \| <code>Plants respond to daily and seasonal cycles and to disease.. Bamboos are vigorous, rugged plants. <br> bamboo respond to daily and seasonal cycles and to disease</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	#### openbookqa_pairs

	* Dataset: openbookqa_pairs
	* Size: 750 training samples
	* Columns: <code>question</code> and <code>fact</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| question \| fact \|
	\|:--------\|:----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 3 tokens</li><li>mean: 13.95 tokens</li><li>max: 52 tokens</li></ul> \| <ul><li>min: 5 tokens</li><li>mean: 11.67 tokens</li><li>max: 31 tokens</li></ul> \|
	* Samples:
	\| question \| fact \|
	\|:----------------------------------------------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------------------\|
	\| <code>A boy has a scar from an oven on his leg, so his leg was likely</code> \| <code>if a body part was burned then that body part was exposed to a lot of heat energy</code> \|
	\| <code>The only stage of the water cycle process that is nonexistent is</code> \| <code>evaporation is a stage in the water cycle process</code> \|
	\| <code>over a period of three years, a farmer cultivates, corn, millet and potatoes. What is this exemplary of?</code> \| <code>crop rotation is when different crops are planted on a field in different years</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	#### msmarco_pairs

	* Dataset: [msmarco_pairs](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3) at [28ff31e](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3/tree/28ff31e4c97cddd53d298497f766e653f1e666f9)
	* Size: 750 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:---------------------------------------------------------------------------------\|:------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 4 tokens</li><li>mean: 8.76 tokens</li><li>max: 27 tokens</li></ul> \| <ul><li>min: 18 tokens</li><li>mean: 76.27 tokens</li><li>max: 232 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:------------------------------------------------------------\|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>what are archegonia</code> \| <code>An archegonium (pl: archegonia), from the ancient Greek ἀρχή (beginning) and γόνος (offspring), is a multicellular structure or organ of the gametophyte phase of certain plants producing and containing the ovum or female gamete.</code> \|
	\| <code>what does it mean to deduct commission</code> \| <code>What is Commission Pay? Commission is a sum of money that is paid to an employee upon completion of a task, usually selling a certain amount of goods or services. It can be paid as a percentage of the sale or as a flat dollar amount based on sales volume. Employers often use sales commissions as incentives to increase worker productivity.</code> \|
	\| <code>how many grams of fiber should you eat per day</code> \| <code>So just how much fiber do you need? The national fiber recommendations are 30 to 38 grams a day for men and 25 grams a day for women between 18 and 50 years old, and 21 grams a day if a woman is 51 and older. Another general guideline is to get 14 grams of fiber for every 1,000 calories in your diet.</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	#### nq_pairs

	* Dataset: [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions) at [f9e894e](https://huggingface.co/datasets/sentence-transformers/natural-questions/tree/f9e894e1081e206e577b4eaa9ee6de2b06ae6f17)
	* Size: 750 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:-----------------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 10 tokens</li><li>mean: 11.79 tokens</li><li>max: 22 tokens</li></ul> \| <ul><li>min: 24 tokens</li><li>mean: 130.64 tokens</li><li>max: 512 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:-------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>who was the viceroy of india when delhi became capital</code> \| <code>New Delhi The foundation stone of the city was laid by George V, Emperor of India during the Delhi Durbar of 1911.[6] It was designed by British architects, Sir Edwin Lutyens and Sir Herbert Baker. The new capital was inaugurated on 13 February 1931,[7] by Viceroy and Governor-General of India Lord Irwin.</code> \|
	\| <code>who is the singer for alice in chains</code> \| <code>Layne Staley Layne Staley (born Layne Rutherford Staley,[1] August 22, 1967 – April 5, 2002)[7][8][9] was an American musician known for being the lead vocalist, occasional rhythm guitarist and co-songwriter of the rock band Alice in Chains from 1987 until 1998. The band rose to international fame in the early 1990s during Seattle's grunge movement, and became known for Staley's distinct vocal style, as well as the harmonized vocals between him and guitarist/vocalist Jerry Cantrell.[10][11]</code> \|
	\| <code>explain the salient features of banking regulation act 1949</code> \| <code>Banking Regulation Act, 1949 The Act gives the Reserve Bank of India (RBI) the power to license banks, have regulation over shareholding and voting rights of shareholders; supervise the appointment of the boards and management; regulate the operations of banks; lay down instructions for audits; control moratorium, mergers and liquidation; issue directives in the interests of public good and on banking policy, and impose penalties.[2]</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	#### trivia_pairs

	* Dataset: [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa) at [a7c36e3](https://huggingface.co/datasets/sentence-transformers/trivia-qa/tree/a7c36e3c8c8c01526bc094d79bf80d4c848b0ad0)
	* Size: 750 training samples
	* Columns: <code>query</code> and <code>answer</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| query \| answer \|
	\|:--------\|:----------------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 8 tokens</li><li>mean: 16.06 tokens</li><li>max: 41 tokens</li></ul> \| <ul><li>min: 19 tokens</li><li>mean: 207.41 tokens</li><li>max: 512 tokens</li></ul> \|
	* Samples:
	\| query \| answer \|
	\|:----------------------------------------------------------------------------------------------------------------------------\|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>English musician Ian Curtis, who died on 18th May 1980, was best known as lead singer of which post-punk band?</code> \| <code>Ian Curtis ( of Touching from a distance) Ian Kevin Curtis was an English musician and singer-songwriter. He is best known as the lead singer and lyricist of the post-punk band Joy Division. Joy Division released its debut album, Unknown Pleasures, in 1979 and recorded its follow-up, Closer, in 1980. Curtis, who suffered from epilepsy and depression, committed suicide on 18 May 1980, on the eve of Joy Division's first North American tour, resulting in the band's dissolution and the subsequent formation of New Order. Curtis was known for his baritone voice, dance style, and songwriting filled with imagery of desolation, emptiness and alienation. In 1995, Curtis's widow Deborah published Touching from a Distance: Ian Curtis and Joy Division, a biography of the singer. His life and death Ian Kevin Curtis was an English musician and singer-songwriter. He is best known as the lead singer and lyricist of the post-punk band Joy Division. Joy Division released its debut album, Unknown Pleasures, in 1979 and recorded its follow-up, Closer, in 1980. Curtis, who suffered from epilepsy and depression, committed suicide on 18 May 1980, on the eve of Joy Division's first North American tour, resulting in the band's dissolution and the subsequent formation of New Order. Curtis was known for his baritone voice, dance style, and songwriting filled with imagery of desolation, emptiness and alienation. In 1995, Curtis's widow Deborah published Touching from a Distance: Ian Curtis and Joy Division, a biography of the singer. His life and death have been dramatised in the films 24 Hour Party People (2002) and Control (2007). ...more</code> \|
	\| <code>From which film comes the line 'Mrs. Robinson, you're trying to seduce me... aren't you?'</code> \| <code>The Graduate (1967) - "Mrs. Robinson, you're trying to seduce me. Aren't you?" - YouTube The Graduate (1967) - "Mrs. Robinson, you're trying to seduce me. Aren't you?" Want to watch this again later? Sign in to add this video to a playlist. Need to report the video? Sign in to report inappropriate content. The interactive transcript could not be loaded. Loading... Rating is available when the video has been rented. This feature is not available right now. Please try again later. Uploaded on May 5, 2009 The famous scene from The Graduate. Category</code> \|
	\| <code>The Amazon flows through how many countries?</code> \| <code>Learn About the Six Amazon River Basin Countries Share By Amanda Briney The Amazon River is the second longest river (it is just shorter than the Nile River in Egypt ) in the world and it has the largest watershed or drainage basin as well as the most tributaries of any river in the world. For reference, a watershed is defined as the area of land that releases its water into a river. This entire area is often referred to as the Amazon Basin. The Amazon River begins with streams in the Andes Mountains in Peru and flows into the Atlantic Ocean about 4,000 miles (6,437 km) away. The Amazon River and its watershed encompass an area of 2,720,000 square miles (7,050,000 sq km). This area includes the largest tropical rainforest in the world - the Amazon Rainforest . In addition parts of the Amazon Basin also include grassland and savannah landscapes. As a result, this area is some of the least developed and most biodiverse in the world. Along its length, the Amazon River flows through three countries and its basin includes three more. continue reading below our video Test Your General Science Knowledge The following is a list of these six countries that have claims to the Amazon region arranged by their area. For reference, their capitals and populations have also been included. All information was obtained from the CIA World Factbook . • Area: 3,287,612 square miles (8,514,877 sq km) • Capital: Brasilia</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	#### gooaq_pairs

	* Dataset: [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
	* Size: 750 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 8 tokens</li><li>mean: 11.51 tokens</li><li>max: 20 tokens</li></ul> \| <ul><li>min: 13 tokens</li><li>mean: 57.44 tokens</li><li>max: 155 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:--------------------------------------------------------\|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>has roku ultra been discontinued?</code> \| <code>No, its not been discontinued. It may depend on where you live what shows up for sale on roku pages. The newest model of Ultra is the 4670R, try amazon.</code> \|
	\| <code>how much percent is malibu?</code> \| <code>Malibu is a coconut flavored liqueur, made with Caribbean rum, and possessing an alcohol content by volume of 21.0 % (42 proof).</code> \|
	\| <code>how to change a pdf document back to word?</code> \| <code>['Open a PDF file in Acrobat DC.', 'Click on the “Export PDF” tool in the right pane.', 'Choose Microsoft Word as your export format, and then choose “Word Document.”', 'Click “Export.” If your PDF contains scanned text, the Acrobat Word converter will run text recognition automatically.']</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	#### paws-pos

	* Dataset: [paws-pos](https://huggingface.co/datasets/google-research-datasets/paws) at [161ece9](https://huggingface.co/datasets/google-research-datasets/paws/tree/161ece9501cf0a11f3e48bd356eaa82de46d6a09)
	* Size: 750 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 9 tokens</li><li>mean: 25.72 tokens</li><li>max: 51 tokens</li></ul> \| <ul><li>min: 9 tokens</li><li>mean: 25.62 tokens</li><li>max: 49 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>Tampa became less important when several digger projects made the port of Tampa accessible to all shipping .</code> \| <code>Tampa became less important when several dredging projects made the Port of Port Tampa accessible to all shipping .</code> \|
	\| <code>Yarde sometimes improvised , often while listening to jazz .</code> \| <code>Yarde sometimes improvised , often while listening to the jazz .</code> \|
	\| <code>Major airports near Seymour include : Austin Straubel International Airport ( public ) , in Ashwaubenon ; Appleton International Airport ( public ) , in Greenville .</code> \| <code>Airports near Seymour : Austin Straubel International Airport ( public ) in Ashwaubenon and International Airport Appleton ( public ) in Greenville .</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```
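The parameters above name a guide model and a temperature of 0.025. As a rough illustration of how those two settings can interact in a guided contrastive loss, here is a minimal pure-Python sketch. It is not the library's implementation: the function names are hypothetical, the caching and the anchor-anchor and positive-positive comparisons are left out, and the masking rule (drop an in-batch candidate when the guide rates it above the true pair) is a simplifying assumption.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def guided_contrastive_loss(anchors, positives, guide_anchors, guide_positives,
                            temperature=0.025):
    """Hypothetical sketch: in-batch contrastive loss with guide-based masking.

    anchors/positives are embeddings from the model being trained;
    guide_anchors/guide_positives are embeddings of the same texts from the guide.
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        # Guide similarity of the true (anchor, positive) pair.
        guide_threshold = cosine(guide_anchors[i], guide_positives[i])
        logits = []
        for j in range(n):
            guide_sim = cosine(guide_anchors[i], guide_positives[j])
            if j != i and guide_sim > guide_threshold:
                # Assumption: the guide rates this candidate above the true
                # pair, so treat it as a likely false negative and skip it.
                continue
            logits.append((j, cosine(anchors[i], positives[j]) / temperature))
        # Cross-entropy against the true pair, via a stable log-sum-exp.
        max_logit = max(value for _, value in logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(value - max_logit) for _, value in logits)
        )
        target = next(value for j, value in logits if j == i)
        total += log_denominator - target
    return total / n
```

A small temperature such as 0.025 divides every similarity by a small number, so modest differences in cosine similarity become large differences in the logits.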

	#### global_dataset

	* Dataset: global_dataset
	* Size: 9,750 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:-----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 4 tokens</li><li>mean: 32.84 tokens</li><li>max: 354 tokens</li></ul> \| <ul><li>min: 2 tokens</li><li>mean: 56.79 tokens</li><li>max: 512 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:------------------------------------------------------------------------------------------------\|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>Binary compounds of carbon with less electronegative elements are called carbides.</code> \| <code>Binary compounds of carbon with less electronegative elements are called what?</code> \|
	\| <code>who played ramses in exodus gods and kings</code> \| <code>Exodus: Gods and Kings On March 15, 2013, Deadline.com reported Scott wanted Christian Bale to star in the film;[16] in August he confirmed the role to be Moses himself.[17] On the same day, Joel Edgerton joined the cast to play Ramses and production was set to begin in September.[18] The studio announced the casting calls in Spain's Almería and Pechina for 3,000 to 4,000 extras and with another 1,000 to 2,000 extras on the island of Fuerteventura.[19] On August 27, Aaron Paul joined the film to play Joshua.[20] Sigourney Weaver, Ben Kingsley and John Turturro were then still in talks about joining the cast.[21]</code> \|
	\| <code>From 1996 to September 2010 he was Ambassador of Belgium to Liechtenstein .</code> \| <code>From 1996 to September 2010 he was ambassador of Belgium near Liechtenstein .</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	### Evaluation Datasets

	#### vitaminc-pairs

	* Dataset: [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc) at [be6febb](https://huggingface.co/datasets/tals/vitaminc/tree/be6febb761b0b2807687e61e0b5282e459df2fa0)
	* Size: 128 evaluation samples
	* Columns: <code>claim</code> and <code>evidence</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| claim \| evidence \|
	\|:--------\|:----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 9 tokens</li><li>mean: 21.42 tokens</li><li>max: 41 tokens</li></ul> \| <ul><li>min: 11 tokens</li><li>mean: 35.55 tokens</li><li>max: 79 tokens</li></ul> \|
	* Samples:
	\| claim \| evidence \|
	\|:------------------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>Dragon Con had over 5000 guests .</code> \| <code>Among the more than 6000 guests and musical performers at the 2009 convention were such notables as Patrick Stewart , William Shatner , Leonard Nimoy , Terry Gilliam , Bruce Boxleitner , James Marsters , and Mary McDonnell .</code> \|
	\| <code>COVID-19 has reached more than 185 countries .</code> \| <code>As of , more than cases of COVID-19 have been reported in more than 190 countries and 200 territories , resulting in more than deaths .</code> \|
	\| <code>In March , Italy had 3.6x times more cases of coronavirus than China .</code> \| <code>As of 12 March , among nations with at least one million citizens , Italy has the world 's highest per capita rate of positive coronavirus cases at 206.1 cases per million people ( 3.6x times the rate of China ) and is the country with the second-highest number of positive cases as well as of deaths in the world , after China .</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```
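The statistics above are described as based on the first 1000 samples, although this split holds only 128. A minimal sketch of how such min/mean/max figures might be produced, assuming the cap simply limits how many samples are read: the helper name `token_stats` is hypothetical, and whitespace splitting stands in for the model's subword tokenizer, so the numbers will not match the table.

```python
def token_stats(texts, tokenize=str.split, limit=1000):
    """Return min/mean/max token counts over at most `limit` texts."""
    # Slicing past the end is harmless: a 128-sample split is read in full.
    lengths = [len(tokenize(text)) for text in texts[:limit]]
    return {
        "min": min(lengths),
        "mean": round(sum(lengths) / len(lengths), 2),
        "max": max(lengths),
    }


# Two claims taken from the samples table above.
claims = [
    "Dragon Con had over 5000 guests .",
    "COVID-19 has reached more than 185 countries .",
]
stats = token_stats(claims)
```

Passing a real tokenizer's tokenize function as `tokenize` would bring the counts closer to those reported in the card.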

	#### negation-triplets

	* Dataset: negation-triplets
	* Size: 128 evaluation samples
	* Columns: <code>anchor</code>, <code>entailment</code>, and <code>negative</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| anchor \| entailment \| negative \|
	\|:--------\|:----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \| string \|
	\| details \| <ul><li>min: 9 tokens</li><li>mean: 14.26 tokens</li><li>max: 35 tokens</li></ul> \| <ul><li>min: 6 tokens</li><li>mean: 12.25 tokens</li><li>max: 21 tokens</li></ul> \| <ul><li>min: 6 tokens</li><li>mean: 12.59 tokens</li><li>max: 22 tokens</li></ul> \|
	* Samples:
	\| anchor \| entailment \| negative \|
	\|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|:-------------------------------------------\|:-----------------------------------------------\|
	\| <code>A guy riding a motorcycle near junk cars</code> \| <code>A man is riding a motorcycle.</code> \| <code>A man is not riding a motorcycle.</code> \|
	\| <code>People jump over a mountain crevasse on a rope.</code> \| <code>People are jumping outside.</code> \| <code>People are not jumping outside.</code> \|
	\| <code>Three men, one holding pipes, another holding a large object above his head, and one resting against the pipe bed on the truck, are looking at the camera.</code> \| <code>three men look at the camera</code> \| <code>three men ignore the camera</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```
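Triplets like the ones above (anchor, entailment, negative) are often scored by checking whether the anchor lies closer to the entailed sentence than to the negated one. The sketch below shows that check under stated assumptions: the vectors are made-up toy embeddings, `triplet_cosine_accuracy` is a hypothetical helper, and nothing here is confirmed to be the evaluator used for this card.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_cosine_accuracy(anchors, entailments, negatives):
    """Fraction of triplets where the anchor is closer to the entailment."""
    correct = sum(
        1
        for a, e, n in zip(anchors, entailments, negatives)
        if cosine(a, e) > cosine(a, n)
    )
    return correct / len(anchors)


# Toy 2-d embeddings (hypothetical values, not model output).
anchor_vecs = [[1.0, 0.0], [0.0, 1.0]]
entail_vecs = [[0.9, 0.1], [1.0, 0.0]]
negative_vecs = [[0.0, 1.0], [0.1, 0.9]]
accuracy = triplet_cosine_accuracy(anchor_vecs, entail_vecs, negative_vecs)
```

In this toy data the first triplet is ranked correctly and the second is not, which is the kind of failure negation pairs are meant to expose.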

	#### scitail-pairs-pos

	* Dataset: [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
	* Size: 128 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 9 tokens</li><li>mean: 20.28 tokens</li><li>max: 56 tokens</li></ul> \| <ul><li>min: 8 tokens</li><li>mean: 15.48 tokens</li><li>max: 23 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|:---------------------------------------------------------------------------------------\|
	\| <code>humans normally have 23 pairs of chromosomes.</code> \| <code>Humans typically have 23 pairs pairs of chromosomes.</code> \|
	\| <code>A solution is a homogenous mixture of two or more substances that exist in a single phase.</code> \| <code>Solution is the term for a homogeneous mixture of two or more substances.</code> \|
	\| <code>Upwelling The physical process in near-shore ocean systems of rising of nutrients and colder bottom waters to the surface because of constant wind patterns along the shoreline.</code> \| <code>Upwelling is the term for when deep ocean water rises to the surface.</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	#### scitail-pairs-qa

	* Dataset: [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
	* Size: 128 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 7 tokens</li><li>mean: 16.16 tokens</li><li>max: 29 tokens</li></ul> \| <ul><li>min: 8 tokens</li><li>mean: 14.59 tokens</li><li>max: 24 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:------------------------------------------------------------------------------------------------------------\|:---------------------------------------------------------------------------------------------------------\|
	\| <code>Hail during a storm describes water in a solid state.</code> \| <code>Which of these describes water in a solid state?</code> \|
	\| <code>A role of mushrooms in ecosystems is breaking down dead plant material.</code> \| <code>Which of the following best describes a role of mushrooms in ecosystems?</code> \|
	\| <code>Because trees add water vapor to air, cutting down forests leads to longer periods of drought.</code> \| <code>Because trees add water vapor to air, cutting down forests leads to longer periods of what?</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	#### xsum-pairs

	* Dataset: xsum-pairs
	* Size: 128 evaluation samples
	* Columns: <code>summary</code> and <code>document</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| summary \| document \|
	\|:--------\|:-----------------------------------------------------------------------------------\|:------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 15 tokens</li><li>mean: 26.25 tokens</li><li>max: 46 tokens</li></ul> \| <ul><li>min: 64 tokens</li><li>mean: 212.7 tokens</li><li>max: 366 tokens</li></ul> \|
	* Samples:
	\| summary \| document \|
	\|:--------------------------------------------------------------------------------------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>A fifth senior Kenyan Olympic official has been arrested as part of the investigation into missing money and equipment following Rio 2016.</code> \| <code>Police went to the Nairobi home of Ben Ekumbo, who is vice-president of the country's Olympic committee and head of its swimming federation.<br>He was reportedly hiding under a bed.<br>Officers found boxes of new Nike running shoes and unused Kenyan team uniforms that were meant to be given to athletes competing at the Games.<br>The arrest came as it was alleged in a forthcoming report by investigators that high-ranking sports officials stole more than £6.4m in athletes' expenses and equipment at the Olympics.<br>Four other officials were arrested in September.<br>Kenya's team leader at the Rio Olympics, Stephen Arap Soi, was charged with stealing over $250,000 (£200,400) that was meant to be used for athletes' travel, accommodation and other expenses in Rio. Another vice president, Pius Ochieng, and secretary general Francis Kinyili Paul were charged with stealing Nike kit.<br>They have all denied the charges and are out on bail.<br>The other official, committee treasurer Fridah Shiroya, had charges against her dropped and she is expected to be a state witness and testify against the others.</code> \|
	\| <code>Manchester City have extended the contract of midfielder Frank Lampard until the end of the season.</code> \| <code>The 36-year-old's deal from New York City was due to expire at midnight on 31 December but he will now stay until the summer and miss the start of the MLS season.<br>Former England international Lampard is available to face Sunderland in the Premier League on Thursday.<br>Lampard has scored six goals in 17 appearances for City this season.<br>He signed for New York after being released by Chelsea last June and his new starting date in the US will be confirmed "as the Premier League and MLS seasons unfold", New York City said in a statement.<br>New York City FC director of football Claudio Reyna said: "Frank is a star and it is no surprise Manchester City is rewarded by his contributions on the field every single day.<br>"He is eager to get to New York once his commitment ends in England and will be available to play on arrival as a permanent member of the squad, given he will come to us having played at the highest level."</code> \|
	\| <code>A Texas man has died two months after contracting a flesh-eating bacterium through a new tattoo on his leg, medical officials say.</code> \| <code>The man, who has not been identified, had received a tattoo with the words "Jesus is my life" five days before going for a swim in the Gulf of Mexico.<br>The man was then admitted to a Dallas hospital complaining of severe pains nearby to the tattoo on his calf.<br>Doctors advise that new tattoos be kept clean in order to prevent infection.<br>The man had a history of alcohol cirrhosis of the liver, and reportedly told doctors that he drank six beers daily.<br>Doctors at the Parkland Memorial Hospital put the man on life support 24 hours after he was admitted, as he went into septic shock.<br>Doctors say he tested positive for the flesh-eating bacterium Vibrio vulnificus.<br>The British Medical Journal reports that the pathogen is common in the Gulf of Mexico's coastal waters, and the risks of infection rises during warmer months.<br>According to the US Centers for Disease Control and Prevention, V. vulnificus causes 80,000 illnesses and 100 deaths each year in the US with most infections being attributed to eating raw shellfish.<br>Medical professionals advise that new tattoos be covered during bathing, and that people avoid swimming after getting new ink.</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	#### sciq_pairs

	* Dataset: [sciq_pairs](https://huggingface.co/datasets/allenai/sciq) at [2c94ad3](https://huggingface.co/datasets/allenai/sciq/tree/2c94ad3e1aafab77146f384e23536f97a4849815)
	* Size: 87 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 7 tokens</li><li>mean: 17.85 tokens</li><li>max: 41 tokens</li></ul> \| <ul><li>min: 2 tokens</li><li>mean: 74.61 tokens</li><li>max: 382 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:------------------------------------------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>Saturn is made mostly of helium and what else?</code> \| <code>Saturn’s composition is similar to Jupiter's. The planet is made mostly of hydrogen and helium. These elements are gases in the outer layers and liquids in the deeper layers. Saturn may also have a small solid core. Saturn's upper atmosphere has clouds in bands of different colors. These clouds rotate rapidly around the planet. But Saturn has fewer storms than Jupiter. Thunder and lightning have been seen in the storms on Saturn ( Figure below ).</code> \|
	\| <code>What is a device that changes kinetic energy to electrical energy through electromagnetic induction?</code> \| <code>An electric generator is a device that changes kinetic energy to electrical energy through electromagnetic induction. A simple diagram of an electric generator is shown in Figure below . In a generator, some form of energy is applied to turn a shaft. This causes a coil of wire to rotate between opposite poles of a magnet. Because the coil is rotating in a magnetic field, electric current is generated in the wire. If the diagram in Figure below looks familiar to you, that’s because a generator is an electric motor in reverse. Look back at the electric motor in Figure above . If you were to mechanically turn the shaft of the motor (instead of using electromagnetism to turn it), the motor would generate electricity just like an electric generator. You can learn how to make a very simple electric generator by watching the video at the URL below. Making your own generator will help you understand how a generator works.</code> \|
	\| <code>What results when the water vapor from a hot shower contacts the cooler surface of a mirror?</code> \| <code>If you take a hot shower in a closed bathroom, the mirror is likely to "fog" up. The "fog" consists of tiny droplets of water that form on the cool surface of the mirror. Why does this happen? Some of the hot water from the shower evaporates, so the air in the bathroom contains a lot of water vapor. When the water vapor contacts cooler surfaces, such as the mirror, it cools and loses energy. The cooler water particles no longer have enough energy to overcome the forces of attraction between them. They come together and form droplets of liquid water.</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	#### qasc_pairs

	* Dataset: [qasc_pairs](https://huggingface.co/datasets/allenai/qasc) at [a34ba20](https://huggingface.co/datasets/allenai/qasc/tree/a34ba204eb9a33b919c10cc08f4f1c8dae5ec070)
	* Size: 87 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 5 tokens</li><li>mean: 11.48 tokens</li><li>max: 20 tokens</li></ul> \| <ul><li>min: 20 tokens</li><li>mean: 34.82 tokens</li><li>max: 53 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:-------------------------------------------------------------------------------\|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>What is it called when an animal needs to replenish bodily water?</code> \| <code>thirst is used to make an animal realize that it needs to replenish its bodily water by the body. Dehydration causes increased thirst and water consumption. <br> dehydration is when an animal needs to replenish bodily water</code> \|
	\| <code>What can enhance a plant's growth?</code> \| <code>the looseness of soil has a positive impact on a plant 's roots' growth in that soil. Air exchange in the root zone is essential for root growth. <br> Loose soil has increased air exchange, which is essential for plants.</code> \|
	\| <code>What can a flashlight use to produce light?</code> \| <code>a flashlight requires a source of electricity to produce light. Electricity is usually provided by batteries or AC current. <br> a flashlight can use batteries to produce light</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	#### openbookqa_pairs

	* Dataset: openbookqa_pairs
	* Size: 128 evaluation samples
	* Columns: <code>question</code> and <code>fact</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| question \| fact \|
	\|:--------\|:----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 3 tokens</li><li>mean: 13.98 tokens</li><li>max: 47 tokens</li></ul> \| <ul><li>min: 4 tokens</li><li>mean: 11.78 tokens</li><li>max: 28 tokens</li></ul> \|
	* Samples:
	\| question \| fact \|
	\|:-----------------------------------------------------------------------\|:-----------------------------------------------------------------------------\|
	\| <code>The thermal production of a stove is generically used for</code> \| <code>a stove generates heat for cooking usually</code> \|
	\| <code>What creates a valley?</code> \| <code>a valley is formed by a river flowing</code> \|
	\| <code>when it turns day and night on a planet, what cause this?</code> \| <code>a planet rotating causes cycles of day and night on that planet</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	#### msmarco_pairs

	* Dataset: [msmarco_pairs](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3) at [28ff31e](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3/tree/28ff31e4c97cddd53d298497f766e653f1e666f9)
	* Size: 87 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:---------------------------------------------------------------------------------\|:------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 4 tokens</li><li>mean: 8.55 tokens</li><li>max: 15 tokens</li></ul> \| <ul><li>min: 27 tokens</li><li>mean: 73.72 tokens</li><li>max: 210 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:--------------------------------------------------------\|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>what is sales tax rate in lower burrell pa</code> \| <code>The unemployment rate in Lower Burrell, Pennsylvania, is 6.10%, with job growth of 0.81%. Future job growth over the next ten years is predicted to be 35.51%. Lower Burrell, Pennsylvania Taxes. Lower Burrell, Pennsylvania,sales tax rate is 6.00%.</code> \|
	\| <code>what is vertebral column</code> \| <code>spinal column. n. 1. (Anatomy) a series of contiguous or interconnecting bony or cartilaginous segments that surround and protect the spinal cord. Also called: spine, vertebral column or rachis Nontechnical name: backbone.</code> \|
	\| <code>what does carbon anhydrase form</code> \| <code>From Wikipedia, the free encyclopedia. The carbonic anhydrases (or carbonate dehydratases) form a family of enzymes that catalyze the rapid interconversion of carbon dioxide and water to bicarbonate and protons (or vice versa), a reversible reaction that occurs relatively slowly in the absence of a catalyst.</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```
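
The configuration above pairs a guide model with a temperature of 0.025. As a rough illustration of guided in-batch negative selection, the sketch below computes a temperature-scaled contrastive loss in plain Python and drops any in-batch candidate that the guide scores higher than the true pair. It is a simplified sketch, not the Sentence Transformers implementation: the names `cosine` and `guided_contrastive_loss` and the toy 2-dimensional vectors are assumptions for illustration, and only anchor-to-positive similarities are considered.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def guided_contrastive_loss(anchors, positives, guide_anchors, guide_positives,
                            temperature=0.025):
    """Mean cross-entropy over in-batch candidates, skipping candidates
    that the guide rates above the true pair."""
    losses = []
    for i in range(len(anchors)):
        # The guide's score for the true pair is the threshold for row i.
        threshold = cosine(guide_anchors[i], guide_positives[i])
        logits = []
        for j in range(len(positives)):
            guide_sim = cosine(guide_anchors[i], guide_positives[j])
            if j != i and guide_sim > threshold:
                continue  # likely false negative: leave it out of the softmax
            logits.append((j, cosine(anchors[i], positives[j]) / temperature))
        # Numerically stable log-sum-exp over the kept candidates.
        peak = max(score for _, score in logits)
        log_norm = peak + math.log(sum(math.exp(score - peak) for _, score in logits))
        true_score = next(score for j, score in logits if j == i)
        losses.append(log_norm - true_score)
    return sum(losses) / len(losses)

# Toy batch (hypothetical): the trained model prefers the wrong candidate in each row.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.0, 1.0], [1.0, 0.0]]
guide_a = [[1.0, 0.0], [0.0, 1.0]]
# This guide agrees with the labels, so nothing is masked and the loss is large.
strict_guide_p = [[1.0, 0.0], [0.0, 1.0]]
# This guide rates the off-diagonal candidates above the true pairs, so they are masked.
lenient_guide_p = [[0.6, 0.8], [1.0, 0.0]]

loss_unmasked = guided_contrastive_loss(anchors, positives, guide_a, strict_guide_p)
loss_masked = guided_contrastive_loss(anchors, positives, guide_a, lenient_guide_p)
```

With a temperature as low as 0.025, a cosine gap of 1.0 becomes a logit gap of 40, which is why a single unmasked false negative can dominate the loss in this toy batch.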

	#### nq_pairs

	* Dataset: [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions) at [f9e894e](https://huggingface.co/datasets/sentence-transformers/natural-questions/tree/f9e894e1081e206e577b4eaa9ee6de2b06ae6f17)
	* Size: 87 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:-----------------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 10 tokens</li><li>mean: 11.32 tokens</li><li>max: 16 tokens</li></ul> \| <ul><li>min: 29 tokens</li><li>mean: 140.38 tokens</li><li>max: 421 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:----------------------------------------------------------------\|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>how many california state senate members are there</code> \| <code>California State Senate The California State Senate is the upper house of the California State Legislature. Due to the state's large population and relatively small legislature, the State Senate has the largest population per representative ratio of any state legislative house. In the United States House of Representatives, California is apportioned 53 representatives, each representing approximately 704,566 people,[1] while in the State Senate, each of the 40 Senators represents approximately 931,349 people,[2] with the result that California state senators each actually represent more voters than California's representatives to the United States Congress do. Each member represents a population roughly equivalent to the state of Delaware. As a result of Proposition 140 in 1990 and Proposition 28 in 2012, members elected to the legislature prior to 2012 are restricted by term limits to two four-year terms (eight years), while those elected in or after 2012 are allowed to serve 12 years in the legislature in any combination of four-year state senate or two-year state assembly terms.[3]</code> \|
	\| <code>who is known as the father of statistics in india</code> \| <code>Prasanta Chandra Mahalanobis Prasanta Chandra Mahalanobis OBE, FRS[1] (29 June 1893 – 28 June 1972) was an Indian scientist and applied statistician. He is best remembered for the Mahalanobis distance, a statistical measure and for being one of the members of the first Planning commission of free India. He made pioneering studies in anthropometry in India. He founded the Indian Statistical Institute, and contributed to the design of large-scale sample surveys.[1][4][5][6]</code> \|
	\| <code>what was the purpose of the kirby-bauer method</code> \| <code>Agar diffusion test The agar diffusion test (Kirby–Bauer antibiotic testing, KB testing, or disc diffusion antibiotic sensitivity testing) is a test of the antibiotic sensitivity of bacteria. It uses antibiotic discs to test the extent to which bacteria are affected by those antibiotics. In this test, wafers containing antibiotics are placed on an agar plate where bacteria have been placed, and the plate is left to incubate. If an antibiotic stops the bacteria from growing or kills the bacteria, there will be an area around the wafer where the bacteria have not grown enough to be visible. This is called a zone of inhibition.</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	#### trivia_pairs

	* Dataset: [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa) at [a7c36e3](https://huggingface.co/datasets/sentence-transformers/trivia-qa/tree/a7c36e3c8c8c01526bc094d79bf80d4c848b0ad0)
	* Size: 100 evaluation samples
	* Columns: <code>query</code> and <code>answer</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| query \| answer \|
	\|:--------\|:----------------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 9 tokens</li><li>mean: 15.73 tokens</li><li>max: 33 tokens</li></ul> \| <ul><li>min: 29 tokens</li><li>mean: 206.38 tokens</li><li>max: 377 tokens</li></ul> \|
	* Samples:
	\| query \| answer \|
	\|:------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>Anderlecht Football Club is based in which European country?</code> \| <code>Belgium Football - Football.com \| Europe Europe Belgium Are You Visible? Register for a Player Page and get your talent in front of the people who matter to your career.</code> \|
	\| <code>Roman Numerals CD represent which number?</code> \| <code>CD in Roman Numerals \| InRomanNumerals.com CD in Roman Numerals The Roman Numerals CD represents 400 Decimal Number People find this page searching for: what does CD mean in roman numerals</code> \|
	\| <code>The phrase ‘Sweets to the sweet’ is from which Shakespeare play?</code> \| <code>Sweets to the sweet - eNotes Shakespeare Quotes Sweets to the sweet [Scattering flowers] Sweets to the sweet, farewell! I hop'd thou shouldst have been my Hamlet's wife: I thought thy bride-bed to have deck'd, sweet maid, And not have strew'd thy grave. Read on Owl Eyes This eText is now on Owl Eyes. Clicking this link will open a new window. When Hamlet's mother, the queen, delivers "Sweets to the sweet," she's not bearing a hostess gift or offering candy to her date. The queen's "sweets" are funeral bouquets scattered in the grave of Ophelia, Hamlet's former flame. The prince, who has just finished addressing the skull of Yorick [see ALAS, POOR YORICK ], stumbles upon the funeral, ignorant that Ophelia has likely committed suicide. The murder of her father had driven Ophelia mad; Hamlet was the murderer, and the queen a witness. This is all bad enough. But the queen's elegiac nostalgia for her son's courtship of this deceased "sweet" is all the more disturbing in light of Hamlet's somewhat over-arduous attachment to his mother. It's therefore ironic that "sweets to the sweet" has become a corny quotation for those special romantic moments. How effective the line proves depends on how vividly one's "sweet" is likely to recall the graveyard scene in Hamlet . You might, however, find these bons mots most winning when offered with a willow branch and a whiff of charm to a soon-to-be-insignificant other.</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	#### gooaq_pairs

	* Dataset: [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
	* Size: 87 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 8 tokens</li><li>mean: 11.64 tokens</li><li>max: 18 tokens</li></ul> \| <ul><li>min: 17 tokens</li><li>mean: 57.25 tokens</li><li>max: 101 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:---------------------------------------------------------------------------\|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>how can i see if someone is connected to my wifi?</code> \| <code>Use Your Router's Web Interface The best way to find this information will be to check your router's web interface. Your router hosts your Wi-Fi network, so it has the most accurate data about which devices are connected to it. Most routers offer a way to view a list of connected devices, although some may not.</code> \|
	\| <code>are heart palpitations normal during perimenopause?</code> \| <code>Women and men can have heart palpitations. In healthy people, they are most common in perimenopausal and menopausal women as a result of fluctuating hormones such as estrogen and progesterone. Some perimenopausal and menopausal women suggest their palpitations occur during or after a hot flash.</code> \|
	\| <code>do klaus and caroline end up together in the vampire diaries?</code> \| <code>together for the last few seasons of The Vampire Diaries and continued to keep Klaus in check in the final season of The Originals. ... When all is said and done, Caroline remains the last love of both Stefan Salvatore and Klaus Mikaelson. However, in the end, they both picked someone else over her.</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	#### paws-pos

	* Dataset: [paws-pos](https://huggingface.co/datasets/google-research-datasets/paws) at [161ece9](https://huggingface.co/datasets/google-research-datasets/paws/tree/161ece9501cf0a11f3e48bd356eaa82de46d6a09)
	* Size: 128 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:-----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 10 tokens</li><li>mean: 25.72 tokens</li><li>max: 42 tokens</li></ul> \| <ul><li>min: 10 tokens</li><li>mean: 25.55 tokens</li><li>max: 41 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:---------------------------------------------------------------------------------------------------------------------------------------------------------------\|:---------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>They were there to enjoy us and they were there to pray for us .</code> \| <code>They were there for us to enjoy and they were there for us to pray .</code> \|
	\| <code>After the end of the war in June 1902 , Higgins left Southampton in the `` SSBavarian '' in August , returning to Cape Town the following month .</code> \| <code>In August , after the end of the war in June 1902 , Higgins Southampton left the `` SSBavarian '' and returned to Cape Town the following month .</code> \|
	\| <code>From the merger of the Four Rivers Council and the Audubon Council , the Shawnee Trails Council was born .</code> \| <code>Shawnee Trails Council was formed from the merger of the Four Rivers Council and the Audubon Council .</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```

	#### global_dataset

	* Dataset: global_dataset
	* Size: 663 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:-----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 5 tokens</li><li>mean: 30.18 tokens</li><li>max: 351 tokens</li></ul> \| <ul><li>min: 2 tokens</li><li>mean: 56.34 tokens</li><li>max: 421 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>Seeds can be dispersed through all sorts of creative ways, such as</code> \| <code>seed dispersal is when the seeds of a plant are moved from the plant to a new environment</code> \|
	\| <code>It was part of the Hanover Township , then Chatham Township , before being recorded in 1899 as Florham Park .</code> \| <code>It was part of Hanover Township , then Chatham Township before being incorporated as Florham Park in 1899 .</code> \|
	\| <code>Darren Jones, standing for Bristol North West, felt unwell and left his podium about 45 seconds before the programme on Made in Bristol TV began.<br>Tory Charlotte Leslie rushed to assist Mr Jones as he lay on the studio floor.<br>Mr Jones said he was "perfectly fine now" and was "raring to get back on the campaign trail".<br>He was taking part in a live debate, on Wednesday, with other candidates standing in the constituency, including Liberal Democrat Clare Campion-Smith and UKIP's Michael Frost.<br>An ambulance was called but cancelled after Mr Jones said it was "not needed".<br>Mr Jones said his fainting was not due to a medical condition but "the fact that I had a cold, it was a hot room, I'd had a busy day".<br>He said he had been feeling "quite poorly all day" but he had wanted to ensure Labour was represented in the debate.<br>Mr Jones said he was very grateful to Ms Leslie, the TV production team and the audience but added he was "amazed that having a cold is so newsworthy".<br>The candidates for the constituency are:<br>Clare Campion-Smith, Liberal Democrat<br>Michael Frost. UKIP<br>Darren Jones, Labour<br>Anne Lemon, TUSC<br>Charlotte Leslie, Conservative<br>Justin Quinnell, Green</code> \| <code>A Labour candidate who fainted during a live TV election debate in Bristol and was helped by his Conservative rival has blamed his collapse on a cold.</code> \|
	* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
	```json
	{'guide': SentenceTransformer(
	(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
	(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	(2): Normalize()
	), 'temperature': 0.025}
	```
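
Every loss configuration in this section lists the same guide pipeline: a Transformer, then pooling with `pooling_mode_cls_token` set to True, then `Normalize()`. The sketch below shows what those last two stages do to a matrix of token embeddings. It is a minimal plain-Python illustration with made-up 2-dimensional token vectors (the guide reports 768 dimensions); `cls_pool` and `l2_normalize` are hypothetical helper names, not library functions.

```python
import math

def cls_pool(token_embeddings):
    """CLS pooling: keep only the first token's vector."""
    return list(token_embeddings[0])

def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]

# Three token vectors for one sentence; the first stands in for the CLS token.
tokens = [[3.0, 4.0], [10.0, 10.0], [-1.0, 2.0]]
sentence_embedding = l2_normalize(cls_pool(tokens))
embedding_norm = math.sqrt(sum(x * x for x in sentence_embedding))
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity.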

	### Training Hyperparameters
	#### Non-Default Hyperparameters

	- `eval_strategy`: steps
	- `per_device_train_batch_size`: 96
	- `per_device_eval_batch_size`: 128
	- `gradient_accumulation_steps`: 2
	- `learning_rate`: 4.5e-05
	- `weight_decay`: 0.001
	- `lr_scheduler_type`: cosine_with_min_lr
	- `lr_scheduler_kwargs`: {'num_cycles': 0.5, 'min_lr': 9e-06}
	- `warmup_ratio`: 0.33
	- `save_safetensors`: False
	- `fp16`: True
	- `push_to_hub`: True
	- `hub_model_id`: bobox/DeBERTa2-0.9B-ST-v2-checkpoints-tmp
	- `hub_strategy`: all_checkpoints
	- `batch_sampler`: no_duplicates
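
The `no_duplicates` batch sampler matters for a loss that draws negatives from the same batch, since a text that appears twice in a batch would be treated as a negative for its own copy. The sketch below shows one greedy way to build such batches. It is an illustration under stated assumptions, not the library's sampler: `no_duplicate_batches` and the sample pairs are hypothetical, and details such as shuffling or dropping incomplete batches are left out.

```python
def no_duplicate_batches(pairs, batch_size):
    """Greedily build batches of row indices in which no text appears twice."""
    remaining = list(range(len(pairs)))
    batches = []
    while remaining:
        batch, seen, leftover = [], set(), []
        for index in remaining:
            texts = set(pairs[index])
            if len(batch) < batch_size and not (texts & seen):
                batch.append(index)   # no overlap with texts already in the batch
                seen |= texts
            else:
                leftover.append(index)  # defer to a later batch
        batches.append(batch)
        remaining = leftover
    return batches

# Hypothetical (query, answer) rows: rows 0 and 1 share an answer, rows 0 and 3 share a query.
pairs = [("q1", "a"), ("q2", "a"), ("q3", "b"), ("q1", "c")]
batches = no_duplicate_batches(pairs, batch_size=2)
```

In this toy input, rows that share a text end up in different batches, so neither can serve as an in-batch negative for the other.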

	#### All Hyperparameters
	<details><summary>Click to expand</summary>

	- `overwrite_output_dir`: False
	- `do_predict`: False
	- `eval_strategy`: steps
	- `prediction_loss_only`: True
	- `per_device_train_batch_size`: 96
	- `per_device_eval_batch_size`: 128
	- `per_gpu_train_batch_size`: None
	- `per_gpu_eval_batch_size`: None
	- `gradient_accumulation_steps`: 2
	- `eval_accumulation_steps`: None
	- `torch_empty_cache_steps`: None
	- `learning_rate`: 4.5e-05
	- `weight_decay`: 0.001
	- `adam_beta1`: 0.9
	- `adam_beta2`: 0.999
	- `adam_epsilon`: 1e-08
	- `max_grad_norm`: 1.0
	- `num_train_epochs`: 3
	- `max_steps`: -1
	- `lr_scheduler_type`: cosine_with_min_lr
	- `lr_scheduler_kwargs`: {'num_cycles': 0.5, 'min_lr': 9e-06}
	- `warmup_ratio`: 0.33
	- `warmup_steps`: 0
	- `log_level`: passive
	- `log_level_replica`: warning
	- `log_on_each_node`: True
	- `logging_nan_inf_filter`: True
	- `save_safetensors`: False
	- `save_on_each_node`: False
	- `save_only_model`: False
	- `restore_callback_states_from_checkpoint`: False
	- `no_cuda`: False
	- `use_cpu`: False
	- `use_mps_device`: False
	- `seed`: 42
	- `data_seed`: None
	- `jit_mode_eval`: False
	- `use_ipex`: False
	- `bf16`: False
	- `fp16`: True
	- `fp16_opt_level`: O1
	- `half_precision_backend`: auto
	- `bf16_full_eval`: False
	- `fp16_full_eval`: False
	- `tf32`: None
	- `local_rank`: 0
	- `ddp_backend`: None
	- `tpu_num_cores`: None
	- `tpu_metrics_debug`: False
	- `debug`: []
	- `dataloader_drop_last`: False
	- `dataloader_num_workers`: 0
	- `dataloader_prefetch_factor`: None
	- `past_index`: -1
	- `disable_tqdm`: False
	- `remove_unused_columns`: True
	- `label_names`: None
	- `load_best_model_at_end`: False
	- `ignore_data_skip`: False
	- `fsdp`: []
	- `fsdp_min_num_params`: 0
	- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
	- `fsdp_transformer_layer_cls_to_wrap`: None
	- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
	- `deepspeed`: None
	- `label_smoothing_factor`: 0.0
	- `optim`: adamw_torch
	- `optim_args`: None
	- `adafactor`: False
	- `group_by_length`: False
	- `length_column_name`: length
	- `ddp_find_unused_parameters`: None
	- `ddp_bucket_cap_mb`: None
	- `ddp_broadcast_buffers`: False
	- `dataloader_pin_memory`: True
	- `dataloader_persistent_workers`: False
	- `skip_memory_metrics`: True
	- `use_legacy_prediction_loop`: False
	- `push_to_hub`: True
	- `resume_from_checkpoint`: None
	- `hub_model_id`: bobox/DeBERTa2-0.9B-ST-v2-checkpoints-tmp
	- `hub_strategy`: all_checkpoints
	- `hub_private_repo`: False
	- `hub_always_push`: False
	- `gradient_checkpointing`: False
	- `gradient_checkpointing_kwargs`: None
	- `include_inputs_for_metrics`: False
	- `eval_do_concat_batches`: True
	- `fp16_backend`: auto
	- `push_to_hub_model_id`: None
	- `push_to_hub_organization`: None
	- `mp_parameters`:
	- `auto_find_batch_size`: False
	- `full_determinism`: False
	- `torchdynamo`: None
	- `ray_scope`: last
	- `ddp_timeout`: 1800
	- `torch_compile`: False
	- `torch_compile_backend`: None
	- `torch_compile_mode`: None
	- `dispatch_batches`: None
	- `split_batches`: None
	- `include_tokens_per_second`: False
	- `include_num_input_tokens_seen`: False
	- `neftune_noise_alpha`: None
	- `optim_target_modules`: None
	- `batch_eval_metrics`: False
	- `eval_on_start`: False
	- `eval_use_gather_object`: False
	- `batch_sampler`: no_duplicates
	- `multi_dataset_batch_sampler`: proportional

	</details>

	### Training Logs
	\| Epoch \| Step \| Training Loss \| xsum-pairs loss \| nq pairs loss \| trivia pairs loss \| sciq pairs loss \| paws-pos loss \| openbookqa pairs loss \| global dataset loss \| vitaminc-pairs loss \| qasc pairs loss \| scitail-pairs-qa loss \| scitail-pairs-pos loss \| msmarco pairs loss \| gooaq pairs loss \| negation-triplets loss \| Qnli-dev_max_ap \| allNLI-dev_max_ap \| sts-test_spearman_cosine \|
	\|:------:\|:----:\|:-------------:\|:---------------:\|:-------------:\|:-----------------:\|:---------------:\|:-------------:\|:---------------------:\|:-------------------:\|:-------------------:\|:---------------:\|:---------------------:\|:----------------------:\|:------------------:\|:----------------:\|:----------------------:\|:---------------:\|:-----------------:\|:------------------------:\|
	\| 0.0097 \| 1 \| 0.1607 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.0194 \| 2 \| 0.1664 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.0291 \| 3 \| 0.2686 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.0388 \| 4 \| 0.1656 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.0485 \| 5 \| 0.1269 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.0583 \| 6 \| 0.1066 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.0680 \| 7 \| 0.1936 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.0777 \| 8 \| 0.087 \| 0.0002 \| 0.0051 \| 0.2676 \| 0.0582 \| 0.0459 \| 1.3787 \| 0.4902 \| 2.8019 \| 0.0736 \| 0.0000 \| 0.0607 \| 0.1278 \| 0.0576 \| 1.3557 \| 0.7325 \| 0.6417 \| 0.9075 \|
	\| 0.0874 \| 9 \| 0.1952 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.0971 \| 10 \| 0.4167 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.1068 \| 11 \| 0.7876 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.1165 \| 12 \| 0.3714 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.1262 \| 13 \| 0.1852 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.1359 \| 14 \| 0.1144 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.1456 \| 15 \| 0.1234 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.1553 \| 16 \| 0.0569 \| 0.0002 \| 0.0057 \| 0.2679 \| 0.0570 \| 0.0460 \| 1.3709 \| 0.4804 \| 2.7995 \| 0.0738 \| 0.0000 \| 0.0608 \| 0.1316 \| 0.0578 \| 1.3501 \| 0.7327 \| 0.6415 \| 0.9079 \|
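
The Epoch column, read together with the scheduler settings listed under Training Hyperparameters, allows a rough estimate of the learning rate at each logged step. The sketch below does this in plain Python: linear warmup followed by cosine decay to a floor. The figure of 103 steps per epoch is an assumption inferred from the log (step 16 at epoch 0.1553), not a value stated in this card, and `learning_rate` and `epoch_of` are hypothetical helpers rather than Transformers functions.

```python
import math

BASE_LR = 4.5e-05       # learning_rate
MIN_LR = 9e-06          # lr_scheduler_kwargs['min_lr']
WARMUP_RATIO = 0.33     # warmup_ratio
NUM_CYCLES = 0.5        # lr_scheduler_kwargs['num_cycles']
STEPS_PER_EPOCH = 103   # assumption: inferred from the Epoch column
TOTAL_STEPS = STEPS_PER_EPOCH * 3  # num_train_epochs = 3

def epoch_of(step, steps_per_epoch=STEPS_PER_EPOCH):
    """Epoch fraction as it would appear in the log, rounded to 4 places."""
    return round(step / steps_per_epoch, 4)

def learning_rate(step, total_steps=TOTAL_STEPS):
    """Linear warmup to BASE_LR, then cosine decay that bottoms out at MIN_LR."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    factor = 0.5 * (1.0 + math.cos(math.pi * NUM_CYCLES * 2.0 * progress))
    min_rate = MIN_LR / BASE_LR
    return BASE_LR * (factor * (1.0 - min_rate) + min_rate)

for step in (8, 16):
    print(f"step {step}: epoch {epoch_of(step)}, lr {learning_rate(step):.2e}")
```

Under these assumptions warmup would last 102 steps, so both evaluations logged above (steps 8 and 16) would fall early in the warmup phase, well before the peak learning rate is reached.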


	### Framework Versions
	- Python: 3.10.14
	- Sentence Transformers: 3.0.1
	- Transformers: 4.44.0
	- PyTorch: 2.4.0
	- Accelerate: 0.33.0
	- Datasets: 2.21.0
	- Tokenizers: 0.19.1

	## Citation

	### BibTeX

	#### Sentence Transformers
	```bibtex
	@inproceedings{reimers-2019-sentence-bert,
	title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
	author = "Reimers, Nils and Gurevych, Iryna",
	booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
	month = "11",
	year = "2019",
	publisher = "Association for Computational Linguistics",
	url = "https://arxiv.org/abs/1908.10084",
	}
	```
