Training in progress, step 1540, checkpoint


	---
	base_model: microsoft/deberta-v3-small
	datasets:
	- sentence-transformers/all-nli
	- jinaai/negation-dataset-v2
	- tals/vitaminc
	- nyu-mll/glue
	- allenai/scitail
	- sentence-transformers/xsum
	- sentence-transformers/sentence-compression
	- allenai/sciq
	- allenai/qasc
	- sentence-transformers/msmarco-msmarco-distilbert-base-v3
	- sentence-transformers/natural-questions
	- sentence-transformers/trivia-qa
	- sentence-transformers/quora-duplicates
	- sentence-transformers/gooaq
	- sentence-transformers/simple-wiki
	language:
	- en
	library_name: sentence-transformers
	metrics:
	- pearson_cosine
	- spearman_cosine
	- pearson_manhattan
	- spearman_manhattan
	- pearson_euclidean
	- spearman_euclidean
	- pearson_dot
	- spearman_dot
	- pearson_max
	- spearman_max
	- cosine_accuracy
	- dot_accuracy
	- manhattan_accuracy
	- euclidean_accuracy
	- max_accuracy
	- cosine_accuracy_threshold
	- cosine_f1
	- cosine_f1_threshold
	- cosine_precision
	- cosine_recall
	- cosine_ap
	- dot_accuracy_threshold
	- dot_f1
	- dot_f1_threshold
	- dot_precision
	- dot_recall
	- dot_ap
	- manhattan_accuracy_threshold
	- manhattan_f1
	- manhattan_f1_threshold
	- manhattan_precision
	- manhattan_recall
	- manhattan_ap
	- euclidean_accuracy_threshold
	- euclidean_f1
	- euclidean_f1_threshold
	- euclidean_precision
	- euclidean_recall
	- euclidean_ap
	- max_accuracy_threshold
	- max_f1
	- max_f1_threshold
	- max_precision
	- max_recall
	- max_ap
	pipeline_tag: sentence-similarity
	tags:
	- sentence-transformers
	- sentence-similarity
	- feature-extraction
	- generated_from_trainer
	- dataset_size:322768
	- loss:AdaptiveLayerLoss
	- loss:GISTEmbedLoss
	- loss:TripletLoss
	- loss:OnlineContrastiveLoss
	- loss:MultipleNegativesSymmetricRankingLoss
	- loss:MultipleNegativesRankingLoss
	widget:
	- source_sentence: A magnet will stick to which of the following?
	sentences:
	- if a magnet is attracted to a metal then that magnet will stick to that metal
	- the circulatory system brings oxygen from the lungs to the rest of the body
	- humans eat crops
	- source_sentence: What does the activity of an organism depend on the totality of?
	sentences:
	- The digestive system consists of organs that break down food, absorb nutrients,
	and eliminate waste.
	- The fundamental and overtones can be present simultaneously in a variety of combinations.
	For example, middle C on a trumpet has a sound distinctively different from middle
	C on a clarinet, both instruments being modified versions of a tube closed at
	one end. The fundamental frequency is the same (and usually the most intense),
	but the overtones and their mix of intensities are different and subject to shading
	by the musician. This mix is what gives various musical instruments (and human
	voices) their distinctive characteristics, whether they have air columns, strings,
	sounding boxes, or drumheads. In fact, much of our speech is determined by shaping
	the cavity formed by the throat and mouth and positioning the tongue to adjust
	the fundamental and combination of overtones. Simple resonant cavities can be
	made to resonate with the sound of the vowels, for example. (See Figure 17.30.
	) In boys, at puberty, the larynx grows and the shape of the resonant cavity changes
	giving rise to the difference in predominant frequencies in speech between men
	and women.
	- the activity of an organism depends on the total activity of independent cells,.
	- source_sentence: The Ministry insisted that it was not the Sports Code, but IOA's
	constitution, which was not aligned with the Olympic Charter.
	sentences:
	- Three more sketches of those involved in serial Jaipur blasts released
	- 'IOA''s constitution not aligned with Olympic Charter:'
	- Cincinnati Reds defeat Florida Marlins
	- source_sentence: Two female workers sit on some steps during work.
	sentences:
	- A woman is brushing her hair.
	- a guy is waxing
	- Two women sitting on steps at their job.
	- source_sentence: A volcanic arc is formed when a subducting plate flows under another
	tectonic plate.
	sentences:
	- Where do streams often start?
	- What is formed when a subducting plate flows under another tectonic plate?
	- Where is multiple fission more often observed?
	model-index:
	- name: SentenceTransformer based on microsoft/deberta-v3-small
	results:
	- task:
	type: semantic-similarity
	name: Semantic Similarity
	dataset:
	name: sts test
	type: sts-test
	metrics:
	- type: pearson_cosine
	value: 0.731642468323062
	name: Pearson Cosine
	- type: spearman_cosine
	value: 0.7388342851809555
	name: Spearman Cosine
	- type: pearson_manhattan
	value: 0.7306876393922224
	name: Pearson Manhattan
	- type: spearman_manhattan
	value: 0.7326322535482364
	name: Spearman Manhattan
	- type: pearson_euclidean
	value: 0.7213340791831213
	name: Pearson Euclidean
	- type: spearman_euclidean
	value: 0.7248067929450137
	name: Spearman Euclidean
	- type: pearson_dot
	value: 0.7060174899825389
	name: Pearson Dot
	- type: spearman_dot
	value: 0.7163801725525887
	name: Spearman Dot
	- type: pearson_max
	value: 0.731642468323062
	name: Pearson Max
	- type: spearman_max
	value: 0.7388342851809555
	name: Spearman Max
	- task:
	type: triplet
	name: Triplet
	dataset:
	name: negation
	type: negation
	metrics:
	- type: cosine_accuracy
	value: 1.0
	name: Cosine Accuracy
	- type: dot_accuracy
	value: 0.0
	name: Dot Accuracy
	- type: manhattan_accuracy
	value: 1.0
	name: Manhattan Accuracy
	- type: euclidean_accuracy
	value: 1.0
	name: Euclidean Accuracy
	- type: max_accuracy
	value: 1.0
	name: Max Accuracy
	- task:
	type: binary-classification
	name: Binary Classification
	dataset:
	name: mrpc
	type: mrpc
	metrics:
	- type: cosine_accuracy
	value: 0.70703125
	name: Cosine Accuracy
	- type: cosine_accuracy_threshold
	value: 0.7692825198173523
	name: Cosine Accuracy Threshold
	- type: cosine_f1
	value: 0.8009708737864077
	name: Cosine F1
	- type: cosine_f1_threshold
	value: 0.6043864488601685
	name: Cosine F1 Threshold
	- type: cosine_precision
	value: 0.6762295081967213
	name: Cosine Precision
	- type: cosine_recall
	value: 0.9821428571428571
	name: Cosine Recall
	- type: cosine_ap
	value: 0.7930850238332522
	name: Cosine Ap
	- type: dot_accuracy
	value: 0.68359375
	name: Dot Accuracy
	- type: dot_accuracy_threshold
	value: 111.09579467773438
	name: Dot Accuracy Threshold
	- type: dot_f1
	value: 0.8028846153846153
	name: Dot F1
	- type: dot_f1_threshold
	value: 100.36712646484375
	name: Dot F1 Threshold
	- type: dot_precision
	value: 0.6733870967741935
	name: Dot Precision
	- type: dot_recall
	value: 0.9940476190476191
	name: Dot Recall
	- type: dot_ap
	value: 0.685668349677386
	name: Dot Ap
	- type: manhattan_accuracy
	value: 0.6953125
	name: Manhattan Accuracy
	- type: manhattan_accuracy_threshold
	value: 166.01010131835938
	name: Manhattan Accuracy Threshold
	- type: manhattan_f1
	value: 0.7970660146699267
	name: Manhattan F1
	- type: manhattan_f1_threshold
	value: 243.34291076660156
	name: Manhattan F1 Threshold
	- type: manhattan_precision
	value: 0.6763485477178424
	name: Manhattan Precision
	- type: manhattan_recall
	value: 0.9702380952380952
	name: Manhattan Recall
	- type: manhattan_ap
	value: 0.8185487109494757
	name: Manhattan Ap
	- type: euclidean_accuracy
	value: 0.6953125
	name: Euclidean Accuracy
	- type: euclidean_accuracy_threshold
	value: 8.249982833862305
	name: Euclidean Accuracy Threshold
	- type: euclidean_f1
	value: 0.7999999999999999
	name: Euclidean F1
	- type: euclidean_f1_threshold
	value: 10.622720718383789
	name: Euclidean F1 Threshold
	- type: euclidean_precision
	value: 0.6960352422907489
	name: Euclidean Precision
	- type: euclidean_recall
	value: 0.9404761904761905
	name: Euclidean Recall
	- type: euclidean_ap
	value: 0.8099812581176395
	name: Euclidean Ap
	- type: max_accuracy
	value: 0.70703125
	name: Max Accuracy
	- type: max_accuracy_threshold
	value: 166.01010131835938
	name: Max Accuracy Threshold
	- type: max_f1
	value: 0.8028846153846153
	name: Max F1
	- type: max_f1_threshold
	value: 243.34291076660156
	name: Max F1 Threshold
	- type: max_precision
	value: 0.6960352422907489
	name: Max Precision
	- type: max_recall
	value: 0.9940476190476191
	name: Max Recall
	- type: max_ap
	value: 0.8185487109494757
	name: Max Ap
	- task:
	type: binary-classification
	name: Binary Classification
	dataset:
	name: Vitaminc
	type: Vitaminc
	metrics:
	- type: cosine_accuracy
	value: 0.55859375
	name: Cosine Accuracy
	- type: cosine_accuracy_threshold
	value: 0.7416476011276245
	name: Cosine Accuracy Threshold
	- type: cosine_f1
	value: 0.6542553191489362
	name: Cosine F1
	- type: cosine_f1_threshold
	value: 0.39137744903564453
	name: Cosine F1 Threshold
	- type: cosine_precision
	value: 0.48616600790513836
	name: Cosine Precision
	- type: cosine_recall
	value: 1.0
	name: Cosine Recall
	- type: cosine_ap
	value: 0.5386900578010883
	name: Cosine Ap
	- type: dot_accuracy
	value: 0.55859375
	name: Dot Accuracy
	- type: dot_accuracy_threshold
	value: 168.04440307617188
	name: Dot Accuracy Threshold
	- type: dot_f1
	value: 0.6542553191489362
	name: Dot F1
	- type: dot_f1_threshold
	value: 82.83587646484375
	name: Dot F1 Threshold
	- type: dot_precision
	value: 0.48616600790513836
	name: Dot Precision
	- type: dot_recall
	value: 1.0
	name: Dot Recall
	- type: dot_ap
	value: 0.5406067293020295
	name: Dot Ap
	- type: manhattan_accuracy
	value: 0.5625
	name: Manhattan Accuracy
	- type: manhattan_accuracy_threshold
	value: 209.30810546875
	name: Manhattan Accuracy Threshold
	- type: manhattan_f1
	value: 0.6594005449591281
	name: Manhattan F1
	- type: manhattan_f1_threshold
	value: 300.77703857421875
	name: Manhattan F1 Threshold
	- type: manhattan_precision
	value: 0.4959016393442623
	name: Manhattan Precision
	- type: manhattan_recall
	value: 0.983739837398374
	name: Manhattan Recall
	- type: manhattan_ap
	value: 0.5310137997739139
	name: Manhattan Ap
	- type: euclidean_accuracy
	value: 0.57421875
	name: Euclidean Accuracy
	- type: euclidean_accuracy_threshold
	value: 10.546405792236328
	name: Euclidean Accuracy Threshold
	- type: euclidean_f1
	value: 0.6556473829201103
	name: Euclidean F1
	- type: euclidean_f1_threshold
	value: 14.913542747497559
	name: Euclidean F1 Threshold
	- type: euclidean_precision
	value: 0.49583333333333335
	name: Euclidean Precision
	- type: euclidean_recall
	value: 0.967479674796748
	name: Euclidean Recall
	- type: euclidean_ap
	value: 0.5319997521875279
	name: Euclidean Ap
	- type: max_accuracy
	value: 0.57421875
	name: Max Accuracy
	- type: max_accuracy_threshold
	value: 209.30810546875
	name: Max Accuracy Threshold
	- type: max_f1
	value: 0.6594005449591281
	name: Max F1
	- type: max_f1_threshold
	value: 300.77703857421875
	name: Max F1 Threshold
	- type: max_precision
	value: 0.4959016393442623
	name: Max Precision
	- type: max_recall
	value: 1.0
	name: Max Recall
	- type: max_ap
	value: 0.5406067293020295
	name: Max Ap
	---

	# SentenceTransformer based on microsoft/deberta-v3-small

	This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [microsoft/deberta-v3-small](https://huggingface.co/microsoft/deberta-v3-small) on the [nli-pairs](https://huggingface.co/datasets/sentence-transformers/all-nli), [nli-pairs2](https://huggingface.co/datasets/sentence-transformers/all-nli), [negation-triplets](https://huggingface.co/datasets/jinaai/negation-dataset-v2), [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc), [qnli-contrastive](https://huggingface.co/datasets/nyu-mll/glue), [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail), [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail), [xsum-pairs](https://huggingface.co/datasets/sentence-transformers/xsum), [xsum-pairs2](https://huggingface.co/datasets/sentence-transformers/xsum), [compression-pairs](https://huggingface.co/datasets/sentence-transformers/sentence-compression), [compression-pairs2](https://huggingface.co/datasets/sentence-transformers/sentence-compression), [compression-pairs3](https://huggingface.co/datasets/sentence-transformers/sentence-compression), [sciq_pairs](https://huggingface.co/datasets/allenai/sciq), [qasc_pairs](https://huggingface.co/datasets/allenai/qasc), [qasc_facts_sym](https://huggingface.co/datasets/allenai/qasc), openbookqa_pairs, [msmarco_pairs](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3), [msmarco_pairs2](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3), [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions), [nq_pairs2](https://huggingface.co/datasets/sentence-transformers/natural-questions), [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa), [quora_pairs](https://huggingface.co/datasets/sentence-transformers/quora-duplicates), [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq), 
[gooaq_pairs2](https://huggingface.co/datasets/sentence-transformers/gooaq), [mrpc_pairs](https://huggingface.co/datasets/nyu-mll/glue), [simple_wiki_pairs](https://huggingface.co/datasets/sentence-transformers/simple-wiki) and [simple_wiki_pairs2](https://huggingface.co/datasets/sentence-transformers/simple-wiki) datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

	## Model Details

	### Model Description
	- Model Type: Sentence Transformer
	- Base model: [microsoft/deberta-v3-small](https://huggingface.co/microsoft/deberta-v3-small) <!-- at revision a36c739020e01763fe789b4b85e2df55d6180012 -->
	- Maximum Sequence Length: 512 tokens
	- Output Dimensionality: 768 dimensions
	- Similarity Function: Cosine Similarity
	- Training Datasets:
	    - [nli-pairs](https://huggingface.co/datasets/sentence-transformers/all-nli)
	    - [nli-pairs2](https://huggingface.co/datasets/sentence-transformers/all-nli)
	    - [negation-triplets](https://huggingface.co/datasets/jinaai/negation-dataset-v2)
	    - [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc)
	    - [qnli-contrastive](https://huggingface.co/datasets/nyu-mll/glue)
	    - [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail)
	    - [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail)
	    - [xsum-pairs](https://huggingface.co/datasets/sentence-transformers/xsum)
	    - [xsum-pairs2](https://huggingface.co/datasets/sentence-transformers/xsum)
	    - [compression-pairs](https://huggingface.co/datasets/sentence-transformers/sentence-compression)
	    - [compression-pairs2](https://huggingface.co/datasets/sentence-transformers/sentence-compression)
	    - [compression-pairs3](https://huggingface.co/datasets/sentence-transformers/sentence-compression)
	    - [sciq_pairs](https://huggingface.co/datasets/allenai/sciq)
	    - [qasc_pairs](https://huggingface.co/datasets/allenai/qasc)
	    - [qasc_facts_sym](https://huggingface.co/datasets/allenai/qasc)
	    - openbookqa_pairs
	    - [msmarco_pairs](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3)
	    - [msmarco_pairs2](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3)
	    - [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions)
	    - [nq_pairs2](https://huggingface.co/datasets/sentence-transformers/natural-questions)
	    - [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa)
	    - [quora_pairs](https://huggingface.co/datasets/sentence-transformers/quora-duplicates)
	    - [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq)
	    - [gooaq_pairs2](https://huggingface.co/datasets/sentence-transformers/gooaq)
	    - [mrpc_pairs](https://huggingface.co/datasets/nyu-mll/glue)
	    - [simple_wiki_pairs](https://huggingface.co/datasets/sentence-transformers/simple-wiki)
	    - [simple_wiki_pairs2](https://huggingface.co/datasets/sentence-transformers/simple-wiki)
	- Language: en
	<!-- - License: Unknown -->

	### Model Sources

	- Documentation: [Sentence Transformers Documentation](https://sbert.net)
	- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
	- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

	### Full Model Architecture

	```
	SentenceTransformer(
	  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DebertaV2Model
	  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
	)
	```
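The `Pooling` module above has only `pooling_mode_mean_tokens` enabled, so the sentence embedding is the average of the token embeddings. The following is a minimal pure-Python sketch of masked mean pooling, not the library's implementation. The helper name `mean_pool` and the toy 4-dimensional vectors are assumptions for illustration (the real model produces 768-dimensional token vectors).

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose attention-mask entry is 1 (hypothetical helper)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding sequence to avoid division by zero.
    count = max(count, 1)
    return [s / count for s in summed]


# Toy input: three tokens of dimension 4; the last token is padding and is ignored.
tokens = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 2.0, 1.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],
]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)  # [2.0, 2.0, 2.0, 2.0]
```

Masking matters because padded positions would otherwise pull the average toward whatever vector the padding token happens to produce.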

	## Usage

	### Direct Usage (Sentence Transformers)

	First install the Sentence Transformers library:

	```bash
	pip install -U sentence-transformers
	```

	Then you can load this model and run inference.
	```python
	from sentence_transformers import SentenceTransformer

	# Download from the 🤗 Hub
	model = SentenceTransformer("bobox/DeBERTa-ST-AllLayers-v3.5-checkpoints-tmp")
	# Run inference
	sentences = [
	    'A volcanic arc is formed when a subducting plate flows under another tectonic plate.',
	    'What is formed when a subducting plate flows under another tectonic plate?',
	    'Where is multiple fission more often observed?',
	]
	embeddings = model.encode(sentences)
	print(embeddings.shape)
	# [3, 768]

	# Get the similarity scores for the embeddings
	similarities = model.similarity(embeddings, embeddings)
	print(similarities.shape)
	# [3, 3]
	```
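Because this card lists cosine similarity as the similarity function, the 3×3 matrix above is expected to hold pairwise cosine similarities. The sketch below shows that computation on stand-in vectors so it runs without downloading the model. The helper `cosine_similarity` and the 2-dimensional vectors are assumptions for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (hypothetical helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Stand-in "embeddings": the first two point in the same direction, the third is orthogonal.
vectors = [[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]]

# Pairwise similarity matrix, analogous in shape to the [3, 3] result above.
matrix = [[cosine_similarity(u, v) for v in vectors] for u in vectors]
for row in matrix:
    print(row)
```

Cosine similarity ignores vector length, which is why the first two stand-in vectors score 1.0 against each other despite having different norms.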

	<!--
	### Direct Usage (Transformers)

	<details><summary>Click to see the direct usage in Transformers</summary>

	</details>
	-->

	<!--
	### Downstream Usage (Sentence Transformers)

	You can finetune this model on your own dataset.

	<details><summary>Click to expand</summary>

	</details>
	-->

	<!--
	### Out-of-Scope Use

	List how the model may foreseeably be misused and address what users ought not to do with the model.
	-->

	## Evaluation

	### Metrics

	#### Semantic Similarity
	* Dataset: `sts-test`
	* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

	| Metric | Value |
	|:--------------------|:-----------|
	| pearson_cosine | 0.7316 |
	| spearman_cosine | 0.7388 |
	| pearson_manhattan | 0.7307 |
	| spearman_manhattan | 0.7326 |
	| pearson_euclidean | 0.7213 |
	| spearman_euclidean | 0.7248 |
	| pearson_dot | 0.706 |
	| spearman_dot | 0.7164 |
	| pearson_max | 0.7316 |
	| spearman_max | 0.7388 |
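The Pearson and Spearman rows report how well the model's similarity scores correlate with gold similarity labels. The sketch below computes both in plain Python. The function names and the toy scores are assumptions, and the rank helper assumes there are no tied values.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Rank values from 1 to n (assumes no ties, for simplicity)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = float(rank)
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


gold = [0.0, 1.0, 2.0, 3.0, 4.0]            # hypothetical human similarity labels
predicted = [0.10, 0.20, 0.90, 0.50, 0.95]  # hypothetical cosine similarities

print(round(spearman(gold, predicted), 4))  # 0.9
```

Spearman only depends on the ordering of the scores, so it is insensitive to any monotonic rescaling of the similarity values. The `*_max` rows in the table match the cosine rows, which is consistent with them being the maximum over the four similarity functions.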

	#### Triplet
	* Dataset: `negation`
	* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)

	| Metric | Value |
	|:-------------------|:--------|
	| cosine_accuracy | 1.0 |
	| dot_accuracy | 0.0 |
	| manhattan_accuracy | 1.0 |
	| euclidean_accuracy | 1.0 |
	| max_accuracy | 1.0 |
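Triplet accuracy can be read as the fraction of (anchor, positive, negative) triplets in which the positive scores closer to the anchor than the negative does. The sketch below uses that definition as an assumption, with toy vectors chosen so that cosine and dot-product scoring disagree. It does not claim to explain the values reported in the table.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def triplet_accuracy(triplets, score):
    """Share of triplets where score(anchor, positive) > score(anchor, negative)."""
    correct = sum(1 for a, p, n in triplets if score(a, p) > score(a, n))
    return correct / len(triplets)


# Toy triplets: positives are well aligned with the anchor but short,
# negatives are less aligned but have a much larger norm.
triplets = [
    ([1.0, 0.0], [1.0, 0.1], [3.0, 3.0]),
    ([0.0, 1.0], [0.2, 1.0], [4.0, 4.0]),
]

cosine_acc = triplet_accuracy(triplets, cosine)
dot_acc = triplet_accuracy(triplets, dot)
print(cosine_acc, dot_acc)  # 1.0 0.0
```

The dot product depends on vector norms, so with unnormalized embeddings it can rank triplets differently from cosine similarity.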

	#### Binary Classification
	* Dataset: `mrpc`
	* Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)

	| Metric | Value |
	|:-----------------------------|:-----------|
	| cosine_accuracy | 0.707 |
	| cosine_accuracy_threshold | 0.7693 |
	| cosine_f1 | 0.801 |
	| cosine_f1_threshold | 0.6044 |
	| cosine_precision | 0.6762 |
	| cosine_recall | 0.9821 |
	| cosine_ap | 0.7931 |
	| dot_accuracy | 0.6836 |
	| dot_accuracy_threshold | 111.0958 |
	| dot_f1 | 0.8029 |
	| dot_f1_threshold | 100.3671 |
	| dot_precision | 0.6734 |
	| dot_recall | 0.994 |
	| dot_ap | 0.6857 |
	| manhattan_accuracy | 0.6953 |
	| manhattan_accuracy_threshold | 166.0101 |
	| manhattan_f1 | 0.7971 |
	| manhattan_f1_threshold | 243.3429 |
	| manhattan_precision | 0.6763 |
	| manhattan_recall | 0.9702 |
	| manhattan_ap | 0.8185 |
	| euclidean_accuracy | 0.6953 |
	| euclidean_accuracy_threshold | 8.25 |
	| euclidean_f1 | 0.8 |
	| euclidean_f1_threshold | 10.6227 |
	| euclidean_precision | 0.696 |
	| euclidean_recall | 0.9405 |
	| euclidean_ap | 0.81 |
	| max_accuracy | 0.707 |
	| max_accuracy_threshold | 166.0101 |
	| max_f1 | 0.8029 |
	| max_f1_threshold | 243.3429 |
	| max_precision | 0.696 |
	| max_recall | 0.994 |
	| max_ap | 0.8185 |
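The `*_accuracy_threshold` rows give the similarity cut-off at which accuracy peaks on the evaluation pairs. One simple way to find such a cut-off is to try a threshold between every pair of adjacent scores and keep the best one. The sketch below does this on hypothetical scores and labels. It illustrates the idea and is not the evaluator's actual code.

```python
def accuracy_at(threshold, scores, labels):
    """Accuracy when pairs scoring at or above the threshold are predicted positive."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == l for p, l in zip(predictions, labels)) / len(labels)


def best_accuracy_threshold(scores, labels):
    """Try the midpoint between each pair of adjacent scores; return (threshold, accuracy)."""
    ordered = sorted(set(scores))
    candidates = [(lo + hi) / 2 for lo, hi in zip(ordered, ordered[1:])]
    best = max(candidates, key=lambda t: accuracy_at(t, scores, labels))
    return best, accuracy_at(best, scores, labels)


# Hypothetical cosine similarities and paraphrase labels (1 = paraphrase).
scores = [0.92, 0.81, 0.77, 0.60, 0.35]
labels = [1, 1, 1, 0, 1]

threshold, accuracy = best_accuracy_threshold(scores, labels)
print(round(threshold, 3), accuracy)  # 0.685 0.8
```

The same sweep can optimize F1 instead of accuracy, which is consistent with the table reporting different accuracy and F1 thresholds for each similarity function.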

	#### Binary Classification
	* Dataset: `Vitaminc`
	* Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)

	| Metric | Value |
	|:-----------------------------|:-----------|
	| cosine_accuracy | 0.5586 |
	| cosine_accuracy_threshold | 0.7416 |
	| cosine_f1 | 0.6543 |
	| cosine_f1_threshold | 0.3914 |
	| cosine_precision | 0.4862 |
	| cosine_recall | 1.0 |
	| cosine_ap | 0.5387 |
	| dot_accuracy | 0.5586 |
	| dot_accuracy_threshold | 168.0444 |
	| dot_f1 | 0.6543 |
	| dot_f1_threshold | 82.8359 |
	| dot_precision | 0.4862 |
	| dot_recall | 1.0 |
	| dot_ap | 0.5406 |
	| manhattan_accuracy | 0.5625 |
	| manhattan_accuracy_threshold | 209.3081 |
	| manhattan_f1 | 0.6594 |
	| manhattan_f1_threshold | 300.777 |
	| manhattan_precision | 0.4959 |
	| manhattan_recall | 0.9837 |
	| manhattan_ap | 0.531 |
	| euclidean_accuracy | 0.5742 |
	| euclidean_accuracy_threshold | 10.5464 |
	| euclidean_f1 | 0.6556 |
	| euclidean_f1_threshold | 14.9135 |
	| euclidean_precision | 0.4958 |
	| euclidean_recall | 0.9675 |
	| euclidean_ap | 0.532 |
	| max_accuracy | 0.5742 |
	| max_accuracy_threshold | 209.3081 |
	| max_f1 | 0.6594 |
	| max_f1_threshold | 300.777 |
	| max_precision | 0.4959 |
	| max_recall | 1.0 |
	| max_ap | 0.5406 |
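The F1 values in the two binary classification tables can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The short check below uses the unrounded cosine precision and recall from this card's metadata. The function name `f1_score` is just an illustrative helper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Cosine precision and recall as reported for the mrpc and Vitaminc evaluations.
mrpc_f1 = f1_score(0.6762295081967213, 0.9821428571428571)
vitaminc_f1 = f1_score(0.48616600790513836, 1.0)

print(round(mrpc_f1, 4), round(vitaminc_f1, 4))  # 0.801 0.6543
```

Both results agree with the `cosine_f1` rows. The Vitaminc case is worth noting: a recall of 1.0 with a precision near 0.49 means that, at the F1-optimal threshold, nearly every pair is predicted positive.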

	<!--
	## Bias, Risks and Limitations

	What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.
	-->

	<!--
	### Recommendations

	What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.
	-->

	## Training Details

	### Training Datasets

	#### nli-pairs

	* Dataset: [nli-pairs](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
	* Size: 14,000 training samples
	* Columns: <code>anchor</code> and <code>positive</code>
	* Approximate statistics based on the first 1000 samples:
	| | anchor | positive |
	|:--------|:-------|:---------|
	| type | string | string |
	| details | <ul><li>min: 5 tokens</li><li>mean: 16.29 tokens</li><li>max: 48 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 9.62 tokens</li><li>max: 39 tokens</li></ul> |
	* Samples:
	| anchor | positive |
	|:-------|:---------|
	| <code>A young child jumping on the bed as a man looks the other way.</code> | <code>A child plays on the bed.</code> |
	| <code>A running dog and a standing man on a dry field of grass.</code> | <code>A dog is running.</code> |
	| <code>Children play with large hoops.</code> | <code>The hoops being played with are large.</code> |
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### nli-pairs2

	* Dataset: [nli-pairs2](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
	* Size: 6,000 training samples
	* Columns: <code>anchor</code> and <code>positive</code>
	* Approximate statistics based on the first 1000 samples:
	| | anchor | positive |
	|:--------|:-------|:---------|
	| type | string | string |
	| details | <ul><li>min: 6 tokens</li><li>mean: 16.03 tokens</li><li>max: 39 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 9.31 tokens</li><li>max: 24 tokens</li></ul> |
	* Samples:
	| anchor | positive |
	|:-------|:---------|
	| <code>Man holding crying baby in chair near window.</code> | <code>A man with a baby.</code> |
	| <code>A man with a white camp is standing on a platform in front of a large black cylinder device, and fabric on a clothesline.</code> | <code>A man stands outdoors in front of several things.</code> |
	| <code>A man with glasses and a mustache plays an electric guitar while standing behind a microphone.</code> | <code>The man is wearing glasses.</code> |
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 0.5,
	"prior_layers_weight": 1.5,
	"kl_div_weight": 1.25,
	"kl_temperature": 0.75
	}
	```

	#### negation-triplets

	* Dataset: [negation-triplets](https://huggingface.co/datasets/jinaai/negation-dataset-v2)
	* Size: 25,000 training samples
	* Columns: <code>anchor</code>, <code>entailment</code>, and <code>negative</code>
	* Approximate statistics based on the first 1000 samples:
	| | anchor | entailment | negative |
	|:--------|:-------|:-----------|:---------|
	| type | string | string | string |
	| details | <ul><li>min: 4 tokens</li><li>mean: 22.12 tokens</li><li>max: 112 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 14.15 tokens</li><li>max: 43 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 14.45 tokens</li><li>max: 42 tokens</li></ul> |
	* Samples:
	| anchor | entailment | negative |
	|:-------|:-----------|:---------|
	| <code>Cute little boy in an army shirt blowing a kiss at a small lizard.</code> | <code>There was a kiss blown towards the small lizard by a little boy.</code> | <code>There was no kiss blown towards the small lizard by a little boy.</code> |
	| <code>A woman wearing gold jewelry weaves a beautiful piece.</code> | <code>A woman is wearing gold jewelry.</code> | <code>A woman is not wearing gold jewelry.</code> |
	| <code>A dog runs uphill towards the camera in near a fenced area.</code> | <code>a dog is running</code> | <code>a dog is not running</code> |
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "TripletLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 1,
	"prior_layers_weight": 1,
	"kl_div_weight": 0.5,
	"kl_temperature": 1.1
	}
	```
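The inner `TripletLoss` pushes each anchor closer to its entailment than to its negation by at least a margin. The sketch below shows the standard per-triplet formula, `max(d(a, p) - d(a, n) + margin, 0)`. Euclidean distance and a margin of 5 are assumptions based on the library defaults, since this card does not state which values were used, and the vectors are toy data.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=5.0):
    """Hinge on the gap between anchor-positive and anchor-negative distances.

    The margin of 5.0 is an assumed default, not a value taken from this card.
    """
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


anchor = [0.0, 0.0]
positive = [3.0, 4.0]       # distance 5 from the anchor
far_negative = [6.0, 8.0]   # distance 10 from the anchor
near_negative = [0.0, 0.0]  # distance 0 from the anchor

print(triplet_loss(anchor, positive, far_negative))   # 0.0
print(triplet_loss(anchor, positive, near_negative))  # 10.0
```

The loss is zero once the negative is at least `margin` farther from the anchor than the positive, so only triplets that violate the margin contribute gradient.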

	#### vitaminc-pairs

	* Dataset: [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc) at [be6febb](https://huggingface.co/datasets/tals/vitaminc/tree/be6febb761b0b2807687e61e0b5282e459df2fa0)
	* Size: 23,000 training samples
	* Columns: <code>claim</code> and <code>evidence</code>
	* Approximate statistics based on the first 1000 samples:
	| | claim | evidence |
	|:--------|:------|:---------|
	| type | string | string |
	| details | <ul><li>min: 6 tokens</li><li>mean: 16.16 tokens</li><li>max: 50 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 36.84 tokens</li><li>max: 333 tokens</li></ul> |
	* Samples:
	| claim | evidence |
	|:------|:---------|
	| <code>Before March 9 , more than five people who attended a dance class in Indonesia tested positive for COVID-19 due to the pandemic .</code> | <code>By 8 March , a total of 6 people who had attended the dance class were infected by the coronavirus , including one case of repatriated Indonesian from the Diamond Princess .</code> |
	| <code>Benin is a tropical fruit .</code> | <code>Benin is a juicy stone fruit produced from numerous species of tropical trees belonging to the flowering plant genus Mangifera , cultivated mostly for their edible fruit .</code> |
	| <code>Susan Calman is famous for her Glaswegian humor .</code> | <code>The long-running TV drama Taggart and the comedies Empty , Chewin ' the Fat , Rab C. Nesbitt , Still Game , Limmy 's Show and Dear Green Place depict the Glaswegian patois , while Kevin Bridges , Frankie Boyle , Craig Ferguson , Susan Calman and Billy Connolly have made Glaswegian humour known to the rest of the world .</code> |
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### qnli-contrastive

	* Dataset: [qnli-contrastive](https://huggingface.co/datasets/nyu-mll/glue) at [bcdcba7](https://huggingface.co/datasets/nyu-mll/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
	* Size: 25,000 training samples
	* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
	* Approximate statistics based on the first 1000 samples:
	| | sentence1 | sentence2 | label |
	|:--------|:----------|:----------|:------|
	| type | string | string | int |
	| details | <ul><li>min: 6 tokens</li><li>mean: 13.78 tokens</li><li>max: 39 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 34.93 tokens</li><li>max: 174 tokens</li></ul> | <ul><li>0: 100.00%</li></ul> |
	* Samples:
	| sentence1 | sentence2 | label |
	|:----------|:----------|:------|
	| <code>What is the most popular sport in Germany?</code> | <code>Football is by far the most popular sport, and the German Football Federation (Deutscher Fußballbund) with more than 6.3 million members is the largest athletic organisation in the country.</code> | <code>0</code> |
	| <code>How many people did Carlton have per km2 between 2012 and 2013?</code> | <code>Surrounding inner city suburbs experienced an increase in population density between 2012 and 2013; Carlton (9,000 people per km2) and Fitzroy (7,900).</code> | <code>0</code> |
	| <code>What does this Gospel show that Athanasius also believed?</code> | <code>This Gospel in itself is the greatest support of Athanasius's stand.</code> | <code>0</code> |
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "OnlineContrastiveLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 1,
	"prior_layers_weight": 1,
	"kl_div_weight": 0.5,
	"kl_temperature": 1.1
	}
	```
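
The `n_layers_per_step` value differs between datasets (`-1` here, `1` or `3` elsewhere). The sentence-transformers documentation describes `-1` as using all layers and a positive value as sampling that many layers per step. The sketch below illustrates that selection rule under two assumptions: that the final layer is always trained and only earlier layers are sampled, and that the base model has 6 transformer layers. `select_prior_layers` is a hypothetical name, not a library function.

```python
import random


def select_prior_layers(num_layers, n_layers_per_step, rng=random):
    """Pick which non-final layers contribute to the loss in one training step."""
    prior = list(range(num_layers - 1))  # every layer except the last
    if 0 < n_layers_per_step < len(prior):
        # Positive value: train a random subset of the earlier layers.
        return sorted(rng.sample(prior, n_layers_per_step))
    # -1 (or a value covering all earlier layers): use every earlier layer.
    return prior


NUM_LAYERS = 6  # assumed depth of the base model

all_prior = select_prior_layers(NUM_LAYERS, -1)
one_prior = select_prior_layers(NUM_LAYERS, 1, random.Random(0))
print(all_prior, one_prior)
```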

	#### scitail-pairs-qa

	* Dataset: [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
	* Size: 14,687 training samples
	* Columns: <code>sentence2</code> and <code>sentence1</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence2 \| sentence1 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 7 tokens</li><li>mean: 15.86 tokens</li><li>max: 37 tokens</li></ul> \| <ul><li>min: 6 tokens</li><li>mean: 14.94 tokens</li><li>max: 34 tokens</li></ul> \|
	* Samples:
	\| sentence2 \| sentence1 \|
	\|:--------------------------------------------------------------------------------------------------------------\|:--------------------------------------------------------------------------------------\|
	\| <code>Archaeans produce methane gas as a waste product.</code> \| <code>Archaeans produce what kind of gas as a waste product?</code> \|
	\| <code>The term elastic potential energy is used to describe potential energy due to an object’s shape.</code> \| <code>What term is used to describe potential energy due to an object’s shape?</code> \|
	\| <code>Over 90% of the energy we use comes originally from the sun.</code> \| <code>Over 90% of the energy we use comes originally from what?</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### scitail-pairs-pos

	* Dataset: [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
	* Size: 8,600 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 8 tokens</li><li>mean: 23.99 tokens</li><li>max: 65 tokens</li></ul> \| <ul><li>min: 7 tokens</li><li>mean: 15.62 tokens</li><li>max: 41 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:--------------------------------------------------------------------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>coagulation in water treatment, the use of chemicals to make suspended solids gather or group together into small flocs.</code> \| <code>When drinking water is treated, the term for when chemicals cause solids in the water to clump together is coagulation.</code> \|
	\| <code>air pressure is the force exerted on a surface by the weight of the air above it.</code> \| <code>At the earth’s surface, the air pressure exerted on you is a result of the weight of air above you.</code> \|
	\| <code>The heart is a pump with four chambers.</code> \| <code>There are four chambers in the heart.</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```
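
The "approximate statistics" rows report the minimum, mean, and maximum token count over the first 1000 samples of each column. A minimal sketch of that calculation follows. It uses whitespace splitting as a stand-in tokenizer, which is an assumption: the card's figures presumably come from the model's own tokenizer, so the numbers below will not match the table. `token_stats` is a hypothetical helper.

```python
from statistics import mean


def token_stats(texts, tokenize=str.split, limit=1000):
    """Return min / mean / max token counts over the first `limit` texts."""
    lengths = [len(tokenize(text)) for text in texts[:limit]]
    return {"min": min(lengths), "mean": round(mean(lengths), 2), "max": max(lengths)}


# Two sample sentences taken from the scitail-pairs-pos table.
samples = [
    "The heart is a pump with four chambers.",
    "There are four chambers in the heart.",
]

stats = token_stats(samples)
print(stats)
```

Passing a different `tokenize` callable (for example, one wrapping a subword tokenizer) would bring the counts closer to those reported in the card.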

	#### xsum-pairs

	* Dataset: [xsum-pairs](https://huggingface.co/datasets/sentence-transformers/xsum) at [788ddaf](https://huggingface.co/datasets/sentence-transformers/xsum/tree/788ddafe04e539956d56b567bc32a036ee7b9206)
	* Size: 15,500 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:------------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 7 tokens</li><li>mean: 222.93 tokens</li><li>max: 512 tokens</li></ul> \| <ul><li>min: 5 tokens</li><li>mean: 25.83 tokens</li><li>max: 51 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>The 20-year-old impressed in Uruguay's Copa America win in July and has signed a long-term contract at Anfield.<br>A 6ft 6ins centre-back, Coates passed a medical and secured a work permit and is Liverpool's sixth summer signing.<br>Coates was last seen on the pitch in Buenos Aires celebrating victory in the Copa America, and accepting an award for the best young player of the tournament. Impressive credentials for a player who is not 21 until October<br>Read more from Tim's blog<br>He joins compatriot Luis Suarez, who tweeted: "I want to welcome Coates, partner in the Uruguay team, friend and great player," at Anfield.<br>Liverpool boss Kenny Dalglish was keen to strengthen his defensive resources after Greek defender Sotirios Kyrgiakos left for Wolfsburg on a free transfer.<br>Manchester City were strongly linked with Coates but he has now joined a Liverpool squad recently bolstered by the arrivals of midfielders Jordan Henderson, Charlie Adam and Stewart Downing, goalkeeper Doni and defender Jose Enrique.</code> \| <code>Liverpool have completed the signing of Uruguay international defender Sebastian Coates from Nacional.</code> \|
	\| <code>The court made the decision after a case was brought by protester Samira Ibrahim.<br>She accused the Egyptian army of forcing her to undergo a virginity test after she was arrested during a protest in Tahrir Square in March.<br>Human rights organisations say the Egyptian military has used the practice widely as a punishment.<br>"The court orders that the execution of the procedure of virginity tests on girls inside military prisons be stopped," judge Aly Fekry, head of Cairo administrative court said, according to Reuters.<br>The ruling was greeted by cheers from hundreds of activists inside the courtroom.<br>Activists had demanded that the authorities prosecute anyone responsible for subjecting protesters to such tests.<br>Earlier this year, an Egyptian general was quoted as acknowledging that the military had conducted such tests, saying that they were used so women would not later claim they had been raped by authorities.<br>Human rights groups say such tests are a degrading form of abuse and the general's justification a legal absurdity.</code> \| <code>A Cairo court has ordered forced virginity tests on female detainees in military prisons to be stopped.</code> \|
	\| <code>The claim: Completing the single market in services will create 700 to 800,000 new jobs over the coming years.<br>Reality Check verdict: This is an estimate of the impact of a whole range of extensions planned for the single market by 2030.<br>He is referring to this report from the Centre for Economics and Business Research, commissioned by Britain Stronger in Europe.<br>The report said that the boost to the economy could deliver 300,000 new jobs by 2020, rising to 790,000 by 2030.<br>It has taken a report from the European Parliament called The Cost of Non-Europe, and worked out how much of the pan-European benefits it identified from completing the single market could be allocated to the UK.<br>The report does not suggest that all the extra jobs would come from the single market in services - it includes areas such as further reform of energy markets and the introduction of the Transatlantic Trade and Investment Partnership (TTIP) to increase trade with the USA.<br>Working out how many jobs would be created by economic benefits to the economy is not an easy thing to do and certainly not a precise science.<br>So, for example, one of the benefits cited was having cheaper mortgages as a result of increased competition in financial services, which the report decided would create a benefit of £4bn a year in 2020.<br>But would cheaper mortgages create any jobs? Not if the people who have cheaper mortgages decide to save that extra money. You have to assume that it means more mortgages are sold and that the money saved is spent elsewhere.<br>Read more: The facts behind claims in the EU debate</code> \| <code>"Most analysis says if we complete the single market in services… it will create something like 700 to 800,000 new jobs over the coming years," Remain campaigner Alan Johnson said in last week's televised EU Referendum debate.</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "MultipleNegativesSymmetricRankingLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 0.75,
	"prior_layers_weight": 1.75,
	"kl_div_weight": 0.75,
	"kl_temperature": 0.9
	}
	```
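
For readers unfamiliar with `MultipleNegativesSymmetricRankingLoss`, the idea can be sketched in plain Python: each anchor is scored against every positive in the batch, the matching pair is treated as the correct class, and the cross-entropy is averaged over both directions (anchor to positive and positive to anchor). The toy 2-D vectors and the `scale=20.0` default are illustrative assumptions; this is a simplified sketch, not the library's implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cross_entropy_diagonal(scores):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(scores):
        top = max(row)  # subtract the max for numerical stability
        log_z = top + math.log(sum(math.exp(s - top) for s in row))
        total += log_z - row[i]
    return total / len(scores)


def symmetric_in_batch_loss(anchors, positives, scale=20.0):
    """Average of the anchor->positive and positive->anchor ranking losses."""
    scores = [[scale * cosine(a, p) for p in positives] for a in anchors]
    transposed = [list(column) for column in zip(*scores)]
    return 0.5 * (cross_entropy_diagonal(scores) + cross_entropy_diagonal(transposed))


anchors = [[1.0, 0.0], [0.0, 1.0]]
matched = [[0.9, 0.1], [0.1, 0.9]]  # each positive is close to its own anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]  # positives paired with the wrong anchor

loss_matched = symmetric_in_batch_loss(anchors, matched)
loss_swapped = symmetric_in_batch_loss(anchors, swapped)
print(loss_matched, loss_swapped)
```

The loss is near zero when every anchor is closest to its own positive and grows large when the pairs are mismatched, which is the behaviour the in-batch negatives objective relies on.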

	#### xsum-pairs2

	* Dataset: [xsum-pairs2](https://huggingface.co/datasets/sentence-transformers/xsum) at [788ddaf](https://huggingface.co/datasets/sentence-transformers/xsum/tree/788ddafe04e539956d56b567bc32a036ee7b9206)
	* Size: 4,500 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:------------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 8 tokens</li><li>mean: 222.42 tokens</li><li>max: 512 tokens</li></ul> \| <ul><li>min: 6 tokens</li><li>mean: 25.77 tokens</li><li>max: 70 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>Media playback is unsupported on your device<br>11 February 2014 Last updated at 12:37 GMT<br>The woman had been driving east towards Newmarket on the westbound carriageway before being stopped at junction 27.<br>She stopped in front of the police car with inches to spare.<br>Simon Newton reports.</code> \| <code>An 81-year-old motorist driving the wrong way along the A14 at 50mph was stopped by police who put their car in the way of her vehicle.</code> \|
	\| <code>The 33-year-old all-rounder helped the Steelbacks win the T20 Blast competition in August, and has spent two seasons with the side.<br>In that time he has taken 81 County Championship wickets and scored 782 runs, with 30 wickets in T20 cricket.<br>"Rory Kleinveldt is a Steelback to the core, we love having him here," said Northants head coach David Ripley.</code> \| <code>Former South Africa international Rory Kleinveldt has signed a new one-year contract with Northants.</code> \|
	\| <code>There have been security concerns across Europe since the attacks in Paris, which killed 130 people.<br>However, France's sports minister Patrick Kanner said fan zones would "tell the French people and foreigners that everything is under control".<br>Elsewhere, Russia says security at the 2018 World Cup will be strengthened.<br>"We discussed the problems of security in detail," said Russia's sports minister Vitaly Mutko after a meeting with Fifa inspectors.<br>The minister said that security at the World Cup construction sites had already been toughened up, and the fan zones at the 2017 Confederation Cup and the World Cup will be at the centre of the security forces' attention.<br>Euro 2016 will be played from 10 June to 10 July in 10 French cities.<br>In Paris, the fan zone is expected to have a 120,000 capacity and will be situated on the Champs de Mars, below the Eiffel Tower.</code> \| <code>Euro 2016 organisers are hopeful supporters will be allowed to watch matches on big screens in public areas of France next summer.</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 0.5,
	"prior_layers_weight": 1.5,
	"kl_div_weight": 1.25,
	"kl_temperature": 0.75
	}
	```

	#### compression-pairs

	* Dataset: [compression-pairs](https://huggingface.co/datasets/sentence-transformers/sentence-compression) at [605bc91](https://huggingface.co/datasets/sentence-transformers/sentence-compression/tree/605bc91d95631895ba25b6eda51a3cb596976c90)
	* Size: 13,750 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:------------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 10 tokens</li><li>mean: 31.04 tokens</li><li>max: 105 tokens</li></ul> \| <ul><li>min: 5 tokens</li><li>mean: 10.11 tokens</li><li>max: 24 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|:-----------------------------------------------------------\|
	\| <code>Former basketball coach Ralph Klein, 77, has been hospitalized for two days at Sheba Medical Center at Tel Hashomer, in what hospital officials describe as very serious condition.</code> \| <code>Ralph Klein, coaching legend, hospitalized</code> \|
	\| <code>A 31-year-old man has been charged with glassing another man at a Canley Vale hotel last week and will face Bankstown Local Court today, July 11 .</code> \| <code>Man charged over glassing in Canley Vale</code> \|
	\| <code>A 14-year-old girl is in hospital after running into a TTC streetcar in Toronto's west end on Sunday.</code> \| <code>Girl in hospital after running into streetcar</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "MultipleNegativesSymmetricRankingLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 0.75,
	"prior_layers_weight": 1.75,
	"kl_div_weight": 0.75,
	"kl_temperature": 0.9
	}
	```

	#### compression-pairs2

	* Dataset: [compression-pairs2](https://huggingface.co/datasets/sentence-transformers/sentence-compression) at [605bc91](https://huggingface.co/datasets/sentence-transformers/sentence-compression/tree/605bc91d95631895ba25b6eda51a3cb596976c90)
	* Size: 5,624 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:------------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 10 tokens</li><li>mean: 31.25 tokens</li><li>max: 144 tokens</li></ul> \| <ul><li>min: 5 tokens</li><li>mean: 10.24 tokens</li><li>max: 27 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:--------------------------------------------------------------------------------------------------------------------------------\|:----------------------------------------------------\|
	\| <code>Lonmin PLC on Sunday warned miners to either return to work on Monday or face being sacked from the platinum mine.</code> \| <code>Lonmin warns miners;</code> \|
	\| <code>An autopsy couldn't determine a cause of death for an Elliot Lake man who died in police custody Wednesday.</code> \| <code>Autopsy can't determine cause of death</code> \|
	\| <code>A massive asteroid with its own small moon in tow flew by Earth over the weekend.</code> \| <code>Massive asteroid flies by Earth</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 0.5,
	"prior_layers_weight": 1.5,
	"kl_div_weight": 1.25,
	"kl_temperature": 0.75
	}
	```

	#### compression-pairs3

	* Dataset: [compression-pairs3](https://huggingface.co/datasets/sentence-transformers/sentence-compression) at [605bc91](https://huggingface.co/datasets/sentence-transformers/sentence-compression/tree/605bc91d95631895ba25b6eda51a3cb596976c90)
	* Size: 5,624 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:-----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 9 tokens</li><li>mean: 31.39 tokens</li><li>max: 112 tokens</li></ul> \| <ul><li>min: 5 tokens</li><li>mean: 10.12 tokens</li><li>max: 28 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:----------------------------------------------------------------------------------------------------------------------------------------------------------\|:-------------------------------------------------------\|
	\| <code>Ryan O'Neal pleaded guilty to a felony charge of drug possession on Friday afternoon and agreed to enter an 18-month drug deferment program.</code> \| <code>O'Neal pleads guilty to drug possession</code> \|
	\| <code>Australian sprinter 'Ortensia' has drawn well for her international debut in Sunday's Group 1 Hong Kong Sprint.</code> \| <code>Ortensia draws well in Hong Kong Sprint</code> \|
	\| <code>More than 30 cars in Kennewick are vandalized on Halloween night, possibly because the owners weren't handing out any candy.</code> \| <code>More than 30 cars vandalized on Halloween</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "MultipleNegativesRankingLoss",
	"n_layers_per_step": 3,
	"last_layer_weight": 0.25,
	"prior_layers_weight": 2.5,
	"kl_div_weight": 1.5,
	"kl_temperature": 0.5
	}
	```
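
The `kl_temperature` values in these configurations range from 0.5 to 1.5. As a general illustration of what a temperature does to a probability distribution, and not a description of how `AdaptiveLayerLoss` applies it internally, the sketch below compares a softmax at two temperatures and measures the gap between them with a KL divergence. The score vector is made up for the example.

```python
import math


def softmax(scores, temperature=1.0):
    """Temperature-scaled softmax; lower temperatures give sharper distributions."""
    scaled = [s / temperature for s in scores]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


scores = [2.0, 1.0, 0.0]  # illustrative similarity scores
sharp = softmax(scores, temperature=0.5)
soft = softmax(scores, temperature=1.5)
gap = kl_divergence(sharp, soft)
print(sharp, soft, gap)
```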

	#### sciq_pairs

	* Dataset: [sciq_pairs](https://huggingface.co/datasets/allenai/sciq) at [2c94ad3](https://huggingface.co/datasets/allenai/sciq/tree/2c94ad3e1aafab77146f384e23536f97a4849815)
	* Size: 11,445 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 7 tokens</li><li>mean: 16.86 tokens</li><li>max: 57 tokens</li></ul> \| <ul><li>min: 2 tokens</li><li>mean: 89.64 tokens</li><li>max: 512 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:--------------------------------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>The water cycle involves movement of water between air and what?</code> \| <code></code> \|
	\| <code>What do wind turbines turn wind energy into?</code> \| <code>Wind energy is energy provided by the blowing wind. Wind turbines, like those in Figure above , can turn wind energy into electricity. The wind blows because of differences in heating of Earth’s atmosphere by the sun. There will never be a shortage of wind.</code> \|
	\| <code>Lacking blood vessels, nerve endings, or glands, the epidermis is the outer layer of what?</code> \| <code>The epidermis is the outer layer of skin. It consists almost entirely of epithelial cells. There are no blood vessels, nerve endings, or glands in this skin layer. Nonetheless, this layer of skin is very active. It is constantly being renewed. How does this happen?.</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### qasc_pairs

	* Dataset: [qasc_pairs](https://huggingface.co/datasets/allenai/qasc) at [a34ba20](https://huggingface.co/datasets/allenai/qasc/tree/a34ba204eb9a33b919c10cc08f4f1c8dae5ec070)
	* Size: 7,971 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:---------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 4 tokens</li><li>mean: 11.4 tokens</li><li>max: 23 tokens</li></ul> \| <ul><li>min: 17 tokens</li><li>mean: 34.32 tokens</li><li>max: 67 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:----------------------------------------------------------------\|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>What do lobsters lack?</code> \| <code>Invertebrates are animals that lack a vertebral column, or backbone.. Lobsters are invertebrates, animals without a backbone.. Lobsters do not have a backbone</code> \|
	\| <code>What can cause the most damage to thin soil?</code> \| <code>soil erosion means soil loss through wind. Erosion also damages the thin soil.. Wind can cause damage to thin soil.</code> \|
	\| <code>What are encoded in DNA and called genetic traits?</code> \| <code>Characteristics that are encoded in DNA are called genetic traits.. An observable characteristic is called a phenotype .. phenotypes encoded in DNA are called genetic traits</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### qasc_facts_sym

	* Dataset: [qasc_facts_sym](https://huggingface.co/datasets/allenai/qasc) at [a34ba20](https://huggingface.co/datasets/allenai/qasc/tree/a34ba204eb9a33b919c10cc08f4f1c8dae5ec070)
	* Size: 7,971 training samples
	* Columns: <code>combinedfact</code> and <code>facts</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| combinedfact \| facts \|
	\|:--------\|:----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 5 tokens</li><li>mean: 11.62 tokens</li><li>max: 25 tokens</li></ul> \| <ul><li>min: 13 tokens</li><li>mean: 25.19 tokens</li><li>max: 47 tokens</li></ul> \|
	* Samples:
	\| combinedfact \| facts \|
	\|:---------------------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>Biomass is the total mass of organisms at a step in the food chain</code> \| <code>Biomass is the total mass of organisms at a trophic level.. Each of the steps in the food chain is a trophic level ..</code> \|
	\| <code>Nuclear reactions in stars cause them to brighten up the summer sky.</code> \| <code>nuclear reactions in stars causes stars to produce light. June Summer Triangle Three bright stars light up the summer sky..</code> \|
	\| <code>Nephrons are the structural and functional units of an organ that filters blood</code> \| <code>Nephrons are the structural and functional units of the kidneys.. Blood is filtered in the kidney..</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "MultipleNegativesSymmetricRankingLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 0.75,
	"prior_layers_weight": 1.75,
	"kl_div_weight": 0.75,
	"kl_temperature": 0.9
	}
	```

	#### openbookqa_pairs

	* Dataset: openbookqa_pairs
	* Size: 4,505 training samples
	* Columns: <code>question</code> and <code>fact</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| question \| fact \|
	\|:--------\|:----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 3 tokens</li><li>mean: 13.81 tokens</li><li>max: 78 tokens</li></ul> \| <ul><li>min: 4 tokens</li><li>mean: 11.49 tokens</li><li>max: 30 tokens</li></ul> \|
	* Samples:
	\| question \| fact \|
	\|:-----------------------------------------------------------------------------\|:--------------------------------------------------------------------------------------\|
	\| <code>What is animal competition?</code> \| <code>if two animals eat the same prey then those animals compete for that pey</code> \|
	\| <code>If you wanted to make a metal bed frame, where would you start?</code> \| <code>alloys are made of two or more metals</code> \|
	\| <code>Places lacking warmth have few what</code> \| <code>cold environments contain few organisms</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### msmarco_pairs

	* Dataset: [msmarco_pairs](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3) at [28ff31e](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3/tree/28ff31e4c97cddd53d298497f766e653f1e666f9)
	* Size: 15,399 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:---------------------------------------------------------------------------------\|:------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 4 tokens</li><li>mean: 8.66 tokens</li><li>max: 35 tokens</li></ul> \| <ul><li>min: 18 tokens</li><li>mean: 77.26 tokens</li><li>max: 199 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:---------------------------------------------------------\|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>what is the systrust service?</code> \| <code>The SysTrust service is an assurance service that was jointly developed by the American Institute of Certified Public Accountants (AICPA) and the Canadian Institute of Chartered Accountants (CICA) .</code> \|
	\| <code>does jamaica recognize daylight saving time</code> \| <code>Actual current time in Kingston, Jamaica, DST, Daylight Savings Time conversion dates 2015, GMT offset, fall time change 2015 Kingston clock. Copyright © 2005-2015 24TimeZones.com. All rights reserved.</code> \|
	\| <code>what is a caregivers responsibilities</code> \| <code>Caregiver Job Responsibilities. A caregiver is usually responsible for attending to the specific needs of an elderly person, but a caregiver may also attend to the needs of an infant or a disabled person. Caregivers serve a key role in the health care industry. Caregivers ensure that those under their care are clean, fed and safe.</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```
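
As a rough sketch, the parameters listed above could be passed to sentence-transformers as shown below. This card does not name the guide model used by `GISTEmbedLoss`, so `model` and `guide` are left to the caller, and the helper name `build_adaptive_gist_loss` is hypothetical.

```python
# Values copied from the JSON block above; everything else is an assumption.
ADAPTIVE_LAYER_PARAMS = {
    "n_layers_per_step": 1,
    "last_layer_weight": 1.5,
    "prior_layers_weight": 0.5,
    "kl_div_weight": 0.33,
    "kl_temperature": 1.5,
}


def build_adaptive_gist_loss(model, guide, params=ADAPTIVE_LAYER_PARAMS):
    """Wrap GISTEmbedLoss in AdaptiveLayerLoss using the listed parameters.

    `model` and `guide` are SentenceTransformer instances chosen by the caller;
    the guide model is NOT specified in this card.
    """
    # Deferred import so the parameter dict can be inspected without the library.
    from sentence_transformers.losses import AdaptiveLayerLoss, GISTEmbedLoss

    inner_loss = GISTEmbedLoss(model, guide)
    return AdaptiveLayerLoss(model, inner_loss, **params)
```

Calling the helper needs `sentence-transformers` installed and two loaded models; the dictionary alone is enough to compare parameter sets between datasets.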

	#### msmarco_pairs2

	* Dataset: [msmarco_pairs2](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3) at [28ff31e](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3/tree/28ff31e4c97cddd53d298497f766e653f1e666f9)
	* Size: 6,600 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:---------------------------------------------------------------------------------\|:------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 4 tokens</li><li>mean: 8.69 tokens</li><li>max: 26 tokens</li></ul> \| <ul><li>min: 19 tokens</li><li>mean: 75.78 tokens</li><li>max: 238 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:---------------------------------------------------\|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>caillou first episode date</code> \| <code>Caillou is a Canadian educational children's television series that was first shown on Télétoon and Teletoon, with its first episode airing on the former channel on September 15, 1997; the show... fandom</code> \|
	\| <code>how to practice yoga breathinh</code> \| <code>How to Practice Kapalabhati Pranayama in Yoga. Learning yogic breath control exercises is one of the most important parts of developing your yoga practice. Called “pranayama” in Sanskrit, these breathing exercises can help to bring balance and depth to your overall well-being.</code> \|
	\| <code>why identifying bacteria is important</code> \| <code>Bacterial identification is a process which is used to pinpoint the identity of specific bacteria. It is an important part of medical treatment, since many treatments are heavily dependent on the identity of the particular organism causing a medical problem, and it is also an important part of scientific research.acterial identification is a process which is used to pinpoint the identity of specific bacteria. It is an important part of medical treatment, since many treatments are heavily dependent on the identity of the particular organism causing a medical problem, and it is also an important part of scientific research.</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 0.5,
	"prior_layers_weight": 1.5,
	"kl_div_weight": 1.25,
	"kl_temperature": 0.75
	}
	```

	#### nq_pairs

	* Dataset: [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions) at [f9e894e](https://huggingface.co/datasets/sentence-transformers/natural-questions/tree/f9e894e1081e206e577b4eaa9ee6de2b06ae6f17)
	* Size: 15,399 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 9 tokens</li><li>mean: 11.88 tokens</li><li>max: 22 tokens</li></ul> \| <ul><li>min: 21 tokens</li><li>mean: 132.12 tokens</li><li>max: 512 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:----------------------------------------------------------------------------------------\|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>inflation means specific prices are rising and relative prices are falling</code> \| <code>Inflation Conceptually, inflation refers to the general trend of prices, not changes in any specific price. For example, if people choose to buy more cucumbers than tomatoes, cucumbers consequently become more expensive and tomatoes cheaper. These changes are not related to inflation, they reflect a shift in tastes. Inflation is related to the value of currency itself. When currency was linked with gold, if new gold deposits were found, the price of gold and the value of currency would fall, and consequently prices of all other goods would become higher.[36]</code> \|
	\| <code>what is the sum of all angles in a hexagon</code> \| <code>Hexagon In geometry, a hexagon (from Greek ἕξ hex, "six" and γωνία, gonía, "corner, angle") is a six-sided polygon or 6-gon. The total of the internal angles of any simple (non-self-intersecting) hexagon is 720°.</code> \|
	\| <code>when do deer lose their antlers in california</code> \| <code>California mule deer Rutting season occurs in autumn when the does come into estrus for a period lasting only several days. Males exhibit aggressive behavior in competing for mates. Does begin estrus again if they do not become pregnant. The gestation period is about 200 days, with fawns arriving in the spring; the young remain with mothers throughout the summer and are weaned in the autumn. The buck's antlers fall off in the winter, and commence growing once more in spring in anticipation of next autumn's rut.</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```
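
The "approximate statistics" rows in these sections report the minimum, mean and maximum token count over the first 1000 samples of each column. A minimal sketch of that calculation follows. It uses whitespace splitting as a stand-in tokenizer, which is an assumption: the figures in this card presumably come from the model's own tokenizer, so the numbers will differ.

```python
from statistics import mean


def length_stats(texts, tokenize=str.split, limit=1000):
    """Return min/mean/max token counts over the first `limit` texts.

    `tokenize` defaults to whitespace splitting (an assumption for this sketch);
    pass a real tokenizer function to get comparable numbers.
    """
    lengths = [len(tokenize(text)) for text in texts[:limit]]
    return {
        "min": min(lengths),
        "mean": round(mean(lengths), 2),
        "max": max(lengths),
    }


queries = [
    "what is the sum of all angles in a hexagon",
    "when do deer lose their antlers in california",
]
print(length_stats(queries))
```

Passing a different `tokenize` callable (for example one that returns a tokenizer's input IDs) is the only change needed to switch from word counts to subword counts.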

	#### nq_pairs2

	* Dataset: [nq_pairs2](https://huggingface.co/datasets/sentence-transformers/natural-questions) at [f9e894e](https://huggingface.co/datasets/sentence-transformers/natural-questions/tree/f9e894e1081e206e577b4eaa9ee6de2b06ae6f17)
	* Size: 6,600 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 9 tokens</li><li>mean: 11.83 tokens</li><li>max: 22 tokens</li></ul> \| <ul><li>min: 11 tokens</li><li>mean: 132.0 tokens</li><li>max: 512 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:----------------------------------------------------------------------------\|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>tomb of the unknown soldier guard change times</code> \| <code>Tomb of the Unknowns While Arlington National Cemetery is open, during the day in summer months from April 1 to September 30, the guard is changed every half hour. During the winter months, from October 1 to March 31, the guard is changed every hour. After the cemetery closes to the public (7 p.m. to 8 a.m. April through September, and 5 p.m. to 8 a.m. October through March), the guard is changed every 2 hours. The ceremony can be witnessed by the public whenever Arlington National Cemetery is open.[20][21]</code> \|
	\| <code>which european country was the first to industrialize and when</code> \| <code>History of industrialisation In the 18th and 19th centuries, the UK experienced a massive increase in agricultural productivity known as the British Agricultural Revolution, which enabled an unprecedented population growth, freeing a significant percentage of the workforce from farming, and helping to drive the Industrial Revolution.</code> \|
	\| <code>who starred in barefoot in the park on broadway</code> \| <code>Barefoot in the Park Barefoot in the Park is a romantic comedy by Neil Simon. The play premiered on Broadway in 1963, and starred Robert Redford and Elizabeth Ashley. The play was made into a film in 1967, also starring Redford (and Jane Fonda).</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 0.5,
	"prior_layers_weight": 1.5,
	"kl_div_weight": 1.25,
	"kl_temperature": 0.75
	}
	```

	#### trivia_pairs

	* Dataset: [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa) at [a7c36e3](https://huggingface.co/datasets/sentence-transformers/trivia-qa/tree/a7c36e3c8c8c01526bc094d79bf80d4c848b0ad0)
	* Size: 19,600 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 8 tokens</li><li>mean: 17.95 tokens</li><li>max: 57 tokens</li></ul> \| <ul><li>min: 23 tokens</li><li>mean: 447.58 tokens</li><li>max: 512 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:-----------------------------------------------------------------------------------------------------\|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>In Europe which colour denotes runs for expert skiers?</code> \| <code>Ski slope colour codes Written by snowman What do the slope colors and symbols mean? This can be confusing for beginners, or even for experienced skiers who are heading to a different ski region. Europe is much different from the US in these codes. Let’s start with the US markings: North American slope markings Green (circle): Beginner slope. This is for people who have never skied before, or are just starting out. Probably not exciting for intermediate or expert skiers, because these may not even be steep enough to make many turns. Blue (square): Intermediate slope. More varied terrain, and a bit steeper. Could also be good for experts who want to do some nice, carving turns without picking up tons of speed. Black (diamond): Expert slope. This could be a mogul run, or a steep piste. It’s recommended that beginners avoid these entirely, and intermediates only try them once their skills are improving and their confidence is rising. Double black diamond: Experts only. If you have any fear at all on a standard Black Diamond, stay away from the double blacks. They may be very steep, have huge moguls, or have a narrow couloir (a steep and narrow corridor where you have to go straight down, and turning might crash you into rocks or trees). Orange (rectangle): Terrain park. May also have a trail rating showing how difficult it is. Only go in here if you want to do some jumps and tricks. Probably not for beginners ;-). European slope markings Green: Learner slope, hardly any grade at all, may be very wide. Not used in all countries. Blue: Beginner slope. This is usually equivalent to the US “Green Circle” so don’t be confused at an Austrian resort with no “green” slopes. Here, blue is beginner! Red: Intermediate slope. Yep, there’s an extra color in Europe, and red slopes are open for intermediate skiers and boarders to improve their skills. Black: Expert slope. These may range from a normal expert slope like in North America to a super-tough one. But in most places (like the ultra-steep Harikiri at Mayrhofen), there will be an extra sign explaining if the slope is exceptionally hard. Orange: This means extremely difficult, and may be found only in certain countries like Austria and Switzerland. Yellow: Generally a “skiroute,” which may be an uncontrolled or ungroomed off-piste area. Often these trails are marked but wind a long way down the mountain with flat spots which may be a pain for snowboarders. May be marked with orange squares in Austria.</code> \|
	\| <code>With which brand name do you associate the synthetic substance polytetrafluoroethylene?</code> \| <code>What is PTFE? (with pictures) What is PTFE? Last Modified Date: 09 January 2017 Copyright Protected: You won't believe these 10 facts about people Polytetrafluoroethylene (PTFE) is a synthetic material accidentally invented in the late 1930s while a chemist was endeavoring to develop a new type of perfluorethylene-based refrigerant. Rather than achieving a chlorofluorocarbon, the scientist was surprised to find that the perfluorethylene used in the process reacted with the iron content of its container and polymerized under pressure. Less than a decade later, this new material was being distributed on a commercial scale and was eventually patented under the name Teflon®. It would be another 20 years before this material would hit the frying pan and become known as the first non-stick coating for cookware, however. In fact, this material was used for a variety of other purposes at first. During World War II, PTFE was used to prevent the escape of radioactive materials from the facility designated to produce the first atomic bomb in the U.S., an objective dubbed as the Manhattan Project. This facility represented an impressive piece of real estate with more 2,000,000 square feet (609,600 sq. meters) in which to house uranium hexafluoride. Not only is this substance highly toxic and corrosive in its own right, but it also forms a dangerous gas known as hydrogen fluoride in the presence of water or water vapor. For this reason, PTFE was used as a coating for pipefittings to make them leak proof. Ad The exceptional insulating properties of this material made its use in electronic components ideal. For one thing, it is non-conductive, making it resistant to high electric fields. It is also highly resistant to water, heat, and chemical corrosion . In fact, it continues to be used to produce laboratory equipment and accessories that come into contact with hydrofluoric acid, which would otherwise dissolve other materials, even glass. PTFE also possesses very low frictional properties, which is expressed as frictional coefficient. This measurement is relative and differs according to the materials brought into contact to generate or simulate friction. In terms of plastics, friction is usually observed against polished steel. To put the low friction coefficient of PTFE into proper perspective, it is the only known synthetic surface material to which the toe pads of a gecko fail to stick. This quality makes it suitable for manufacturing parts that need to resist friction, such as gears and ball bearings . This material was eventually introduced to American households by Marion Trozzolo, founder of Laboratory Plasticware Fabricators. While Trozzolo had been producing Teflon®-coated scientific tools for a number of years, he became inspired by a French engineer who found it such an effective non-stick coating for his fishing gear that he later treated his wife’s pots and pans with it. While this experiment led to the production of cookware known as Tefal (T-Fal®) in France in the mid-1950s, Trozzolo became the first U.S. producer of Teflon®-coated cookware. In fact, "The Happy Pan," launched in 1961, earned a place of historical significance in the Smithsonian Institute and Trozzolo a name of distinction in the Plastics Hall of Fame. Ad</code> \|
	\| <code>The legend of ‘Lohengrin’ comes from which European country?</code> \| <code>Christmas traditions in Europe Christmas traditions in Europe Christmas The origin and the name given to this celebration are different depending on the country. For exemple, for the French word Noël definitely comes from the Latin word natalis(birth). The masses of Christ, held by English evangelists in December, gave birth to the English word "Christmas". "The Holy Night" is translated in German as Weihnacht...Taking place in the last few days of December, this holiday is not celebrated in the same way in every country. There are many symbols attached to this holiday in Europe, and each country has kept its own identity and traditions, while enriching them with influences form various other sources. This diversity and richness prove the importance given by Europeans to the Christmas holiday. Here are some exemples... Advent, its crown and its calendar... Advent corresponds to the four-week period that precedes "the arrival"(adventus in Latin) of baby Jesus, that is Christmas. In certain parts of Germany, Advent begins on the 11th November, on Saint Martin's Day. Depending on the country, various saints (Saint Martin, Saint Catherine, Saint Eligius, Saint Barbe, Saint Nicholas or Saint Lucia) are honoured in a meaningful way during this period. These celebrations sometimes become more important than Christmas itself. The Advent Crown The Advent Crown, made of woven fir branches and four candles, representing the four seasons of the year, appeared quite late in the Protestant regions of Germany. It reached Scandinavia before spreading to various other countries. The four candles are lit one by one, on each of the four Sundays before Christmas. The Advent Calendar The Advent Calendar is a tradition of German origin aimed to encourage children to be patient until Christmas. Thus, in order to feel that they have less time to wait, children are given an Advent Calendar at the beginning of December, which has twenty four little doors. Every evening, they open one door, the last one being opened on Christmas Eve, just before the arrival of Santa Claus. Originally, the closed doors hid pious images that have been replaced nowadays with sweets. The first Advent Calendar is thought to date back to 1851. The Christmas tree The evergreen Christmas tree, like ivy and holly, is the symbol of eternal life. This tradition is first mentioned in the 16th century, in Alsace; but as early as the 11th century, the houses seem to have been decorated with "greenery taken from trees". Very early on, the Christmas tree was covered with various decorations and candles to light it up when Christmas came. In Hungary for exemple, the tree is decorated with biscuits, sweets and chocolates, which can be eaten from December 24, making sure that the coloured wrappers are not removed, so as not to leave the tree bare. In the 18th century, the Christmas tree reached the whole of Germany, and then spread to many other countries. However, certain countries, such as Italy and Spain, were long reluctant to adopt this tradition. In Greece, the Christmas tree does not exist, but people grow a Christmas rose called Ellebore. The Christmas crib The Christmas crib, which reminds us of the Nativity, first appeared in Italy and underwent considerable development in other southern Catholic countries of Europe (Spain, Portugal) as well as in France and Southern Germany after the 13th century. In the Early Middle Ages, cribs were set up in churches and liturgical games (Nativity games) were organised on Christmas night. Set up in a cave, or more modestly, in a stable, the traditional crib gradually became commonplace in the homes of churchgoers. At that time, it included only the main characters: baby Jesus, Mary, Joseph, the shepherds, the three Wise Men, the Angel Gabriel, not to mention the donkey and the bullock. However, in certain countries, other characters are traditionally included in the Nativity scene. This is particulary the case in Poland, where national heroes, represented by small figurines, are included alongside the crib characters. More</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### quora_pairs

	* Dataset: [quora_pairs](https://huggingface.co/datasets/sentence-transformers/quora-duplicates) at [451a485](https://huggingface.co/datasets/sentence-transformers/quora-duplicates/tree/451a4850bd141edb44ade1b5828c259abd762cdb)
	* Size: 23,520 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 5 tokens</li><li>mean: 13.67 tokens</li><li>max: 43 tokens</li></ul> \| <ul><li>min: 6 tokens</li><li>mean: 13.94 tokens</li><li>max: 46 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:----------------------------------------------------------\|:-------------------------------------------------------------------------------------\|
	\| <code>What is difference between stock and shares?</code> \| <code>What is the difference between a stock and a share?</code> \|
	\| <code>How can I troubleshoot my Belkin router?</code> \| <code>What are some ways I can troubleshoot my belkin router?</code> \|
	\| <code>Why should I vote for Hillary Clinton?</code> \| <code>Something simple, yet important. Why should I vote for Hillary Clinton?</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 0.5,
	"prior_layers_weight": 1.5,
	"kl_div_weight": 1.25,
	"kl_temperature": 0.75
	}
	```
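
Each dataset entry above links to the exact commit it was read at. A hedged sketch of loading a dataset at that pinned commit with the `datasets` library follows. The subset and split are not stated in this card, so they are left as caller-supplied parameters, and `load_pinned` is a hypothetical helper name.

```python
# Repository and commit hash copied from the quora_pairs entry above.
QUORA_REPO = "sentence-transformers/quora-duplicates"
QUORA_REVISION = "451a4850bd141edb44ade1b5828c259abd762cdb"


def load_pinned(repo_id, revision, subset=None, split="train"):
    """Load a Hub dataset at a fixed commit so later pushes cannot change the data.

    `subset` and `split` are assumptions: the card does not say which ones
    were used for training.
    """
    # Deferred import: needs the `datasets` package and network access.
    from datasets import load_dataset

    return load_dataset(repo_id, subset, revision=revision, split=split)
```

For example, `load_pinned(QUORA_REPO, QUORA_REVISION, subset=...)` would fetch that repository at the commit shown, with the subset chosen by the caller.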

	#### gooaq_pairs

	* Dataset: [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
	* Size: 15,399 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 8 tokens</li><li>mean: 11.43 tokens</li><li>max: 22 tokens</li></ul> \| <ul><li>min: 14 tokens</li><li>mean: 57.01 tokens</li><li>max: 146 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:------------------------------------------------------------------\|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>is ginger ale bad for you?</code> \| <code>However, not too many know that drinking ginger ale also has its drawbacks. The University of Maryland Medical Center has said that drinking too much of this beverage could actually lead to gastrointestinal ailments. Heavy drinkers may experience heartburn and even diarrhea.</code> \|
	\| <code>how to do shradh at home?</code> \| <code>Collect all the contents of the puja in a vessel and carry it on the head to the nearest water body (lake, river or sea) and chant “Idam Pindam Gayaar-pitho Asthu. After this, remove the grass ring. Take bath and visit temples. Do puja at home in the regular altar.</code> \|
	\| <code>how long can a woman have chlamydia and not know it?</code> \| <code>Most people who have chlamydia don't notice any symptoms. If you do get symptoms, these usually appear between 1 and 3 weeks after having unprotected sex with an infected person. For some people they don't develop until many months later. Sometimes the symptoms can disappear after a few days.</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### gooaq_pairs2

	* Dataset: [gooaq_pairs2](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
	* Size: 6,600 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 8 tokens</li><li>mean: 11.46 tokens</li><li>max: 19 tokens</li></ul> \| <ul><li>min: 13 tokens</li><li>mean: 57.89 tokens</li><li>max: 132 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:----------------------------------------------------------------------------\|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>is yucca good for cholesterol?</code> \| <code>In addition to lowering cholesterol, the regular consumption of yucca can help fight heart disease by reducing that oxidative stress (the imbalance between free radicals and antioxidants) placed on the cardiovascular system.</code> \|
	\| <code>what is the maximum amount you can collect in social security?</code> \| <code>The maximum monthly Social Security benefit that an individual can receive per month in 2020 is $3,790 for someone who files at age 70. For someone at full retirement age, the maximum amount is $3,011, and for someone aged 62, the maximum amount is $2,265.</code> \|
	\| <code>how can you tell the difference between real and fake airpods?</code> \| <code>The shapes vary on authentic AirPods as well, but they should always come in an oval shape. Fakes, on the other hand, seem to get this diffuser shape closer to a circle, rather than an oval. Besides, highlighted in a square you will notice the difference in the two white strips that can be noticed at the bottom.</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 0.5,
	"prior_layers_weight": 1.5,
	"kl_div_weight": 1.25,
	"kl_temperature": 0.75
	}
	```

	#### mrpc_pairs

	* Dataset: [mrpc_pairs](https://huggingface.co/datasets/nyu-mll/glue) at [bcdcba7](https://huggingface.co/datasets/nyu-mll/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
	* Size: 2,474 training samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:-----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 11 tokens</li><li>mean: 26.56 tokens</li><li>max: 51 tokens</li></ul> \| <ul><li>min: 12 tokens</li><li>mean: 26.59 tokens</li><li>max: 52 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:-------------------------------------------------------------------------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>And if estimates hold , it will mark the first time in history that five films grossed more than $ 20 million each in one weekend .</code> \| <code>It is also the first time in history that five films grossed more than $ 20 million each in one weekend .</code> \|
	\| <code>General Huweirini , who has worked closely with the US , was wounded in the shooting on December 4 , US officials said .</code> \| <code>Huweirini , who has worked closely with U.S. officials , was wounded in an attack Dec. 4 , the U.S. officials said .</code> \|
	\| <code>While robbery appeared to be the motive , the suspects drove off before taking anything .</code> \| <code>While robbery appeared to be the motive , the suspects fled before they could take anything , he said .</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "MultipleNegativesSymmetricRankingLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 0.75,
	"prior_layers_weight": 1.75,
	"kl_div_weight": 0.75,
	"kl_temperature": 0.9
	}
	```

	#### simple_wiki_pairs

	* Dataset: [simple_wiki_pairs](https://huggingface.co/datasets/sentence-transformers/simple-wiki) at [60fd9b4](https://huggingface.co/datasets/sentence-transformers/simple-wiki/tree/60fd9b4680642ace0e2604cc2de44d376df419a7)
	* Size: 12,600 training samples
	* Columns: <code>text</code> and <code>simplified</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| text \| simplified \|
	\|:--------\|:-----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 6 tokens</li><li>mean: 32.92 tokens</li><li>max: 174 tokens</li></ul> \| <ul><li>min: 7 tokens</li><li>mean: 28.68 tokens</li><li>max: 114 tokens</li></ul> \|
	* Samples:
	\| text \| simplified \|
	\|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>Most nesting sites are on islands in the northern and southern regions of the Great Barrier Reef , with 1.4 – 1.7 million birds using the sites to breed .</code> \| <code>Most nesting sites are on islands in the northern and southern regions of the Great Barrier Reef . About 1.7 million birds use the sites to breed .</code> \|
	\| <code>This cemetery and one in Winchester , Virginia were both dedicated on the same day , with each group thinking that they were the first confederate cemetery .</code> \| <code>This cemetery and one in Winchester , Virginia were both dedicated on the same day . Each group thinking that they were the first confederate cemetery .</code> \|
	\| <code>The second half was conducted over the internet on WWF 's official website , WWF.com .</code> \| <code>The second half , or the supplemental draft , was conducted over the internet on WWF 's official website , WWF.com .</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "MultipleNegativesSymmetricRankingLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 0.75,
	"prior_layers_weight": 1.75,
	"kl_div_weight": 0.75,
	"kl_temperature": 0.9
	}
	```

	#### simple_wiki_pairs2

	* Dataset: [simple_wiki_pairs2](https://huggingface.co/datasets/sentence-transformers/simple-wiki) at [60fd9b4](https://huggingface.co/datasets/sentence-transformers/simple-wiki/tree/60fd9b4680642ace0e2604cc2de44d376df419a7)
	* Size: 5,400 training samples
	* Columns: <code>text</code> and <code>simplified</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| text \| simplified \|
	\|:--------\|:----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 7 tokens</li><li>mean: 33.7 tokens</li><li>max: 308 tokens</li></ul> \| <ul><li>min: 7 tokens</li><li>mean: 28.27 tokens</li><li>max: 108 tokens</li></ul> \|
	* Samples:
	\| text \| simplified \|
	\|:-----------------------------------------------------------------------------------------------------------------------------------------------------------\|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>Safeway would close Food Barn or rebrand stores as Safeway before the decade was over .</code> \| <code>Safeway stores also have a Safeway ATM Network .</code> \|
	\| <code>He succeeded Dick Cheney . Biden is the first United States Vice President from Delaware and the first Roman Catholic to attain that office .</code> \| <code>When Biden became Vice President , he said he would do things differently from Dick Cheney , who had been Vice President before him .</code> \|
	\| <code>It was called Lake Rezaiyeh in the early 1930s after Reza Shah Pahlavi , but the lake was renamed ` Urmia ' in the late 1970s .</code> \| <code>In the early years of the 20th century , it was named Rezaiyeh Lake after the name of Reza Pahlavi , the king of Iran . After the Islamic Revolution , its name was changed back to Urmia Lake .</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "MultipleNegativesRankingLoss",
	"n_layers_per_step": 3,
	"last_layer_weight": 0.25,
	"prior_layers_weight": 2.5,
	"kl_div_weight": 1.5,
	"kl_temperature": 0.5
	}
	```
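
Several of the pair datasets above wrap `MultipleNegativesRankingLoss` or its symmetric variant. Both treat every other positive in the batch as a negative for a given anchor. The following is a dependency-free sketch of that idea on toy vectors, not the library's implementation. The scale of 20 and the use of cosine similarity are assumptions based on the library defaults as I understand them, since this card does not report them.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy where anchor i should rank positive i above all
    other positives in the batch (the in-batch negatives)."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        scores = [scale * cosine(anchor, p) for p in positives]
        top = max(scores)
        # Numerically stable log-sum-exp over the batch.
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        total += log_z - scores[i]
    return total / len(anchors)


def symmetric_ranking_loss(anchors, positives, scale=20.0):
    """Average of the anchor-to-positive and positive-to-anchor losses."""
    forward = in_batch_ranking_loss(anchors, positives, scale)
    backward = in_batch_ranking_loss(positives, anchors, scale)
    return 0.5 * (forward + backward)


aligned = in_batch_ranking_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = in_batch_ranking_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

The symmetric form suits paraphrase-style pairs such as `mrpc_pairs`, where either sentence could serve as the query.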

	### Evaluation Datasets

	#### nli-pairs

	* Dataset: [nli-pairs](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
	* Size: 640 evaluation samples
	* Columns: <code>anchor</code> and <code>positive</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| anchor \| positive \|
	\|:--------\|:---------------------------------------------------------------------------------\|:---------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 5 tokens</li><li>mean: 17.8 tokens</li><li>max: 51 tokens</li></ul> \| <ul><li>min: 4 tokens</li><li>mean: 9.64 tokens</li><li>max: 25 tokens</li></ul> \|
	* Samples:
	\| anchor \| positive \|
	\|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|:------------------------------------------------------------\|
	\| <code>Two women are embracing while holding to go packages.</code> \| <code>Two woman are holding packages.</code> \|
	\| <code>Two young children in blue jerseys, one with the number 9 and one with the number 2 are standing on wooden steps in a bathroom and washing their hands in a sink.</code> \| <code>Two kids in numbered jerseys wash their hands.</code> \|
	\| <code>A man selling donuts to a customer during a world exhibition event held in the city of Angeles</code> \| <code>A man selling donuts to a customer.</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### vitaminc-pairs

	* Dataset: [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc) at [be6febb](https://huggingface.co/datasets/tals/vitaminc/tree/be6febb761b0b2807687e61e0b5282e459df2fa0)
	* Size: 108 evaluation samples
	* Columns: <code>claim</code> and <code>evidence</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| claim \| evidence \|
	\|:--------\|:----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 9 tokens</li><li>mean: 21.36 tokens</li><li>max: 41 tokens</li></ul> \| <ul><li>min: 11 tokens</li><li>mean: 36.11 tokens</li><li>max: 79 tokens</li></ul> \|
	* Samples:
	\| claim \| evidence \|
	\|:------------------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>Dragon Con had over 5000 guests .</code> \| <code>Among the more than 6000 guests and musical performers at the 2009 convention were such notables as Patrick Stewart , William Shatner , Leonard Nimoy , Terry Gilliam , Bruce Boxleitner , James Marsters , and Mary McDonnell .</code> \|
	\| <code>COVID-19 has reached more than 185 countries .</code> \| <code>As of , more than cases of COVID-19 have been reported in more than 190 countries and 200 territories , resulting in more than deaths .</code> \|
	\| <code>In March , Italy had 3.6x times more cases of coronavirus than China .</code> \| <code>As of 12 March , among nations with at least one million citizens , Italy has the world 's highest per capita rate of positive coronavirus cases at 206.1 cases per million people ( 3.6x times the rate of China ) and is the country with the second-highest number of positive cases as well as of deaths in the world , after China .</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### negation-triplets

	* Dataset: [negation-triplets](https://huggingface.co/datasets/jinaai/negation-dataset-v2)
	* Size: 89 evaluation samples
	* Columns: <code>anchor</code>, <code>entailment</code>, and <code>negative</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| anchor \| entailment \| negative \|
	\|:--------\|:-----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \| string \|
	\| details \| <ul><li>min: 10 tokens</li><li>mean: 13.76 tokens</li><li>max: 19 tokens</li></ul> \| <ul><li>min: 9 tokens</li><li>mean: 13.24 tokens</li><li>max: 21 tokens</li></ul> \| <ul><li>min: 10 tokens</li><li>mean: 13.52 tokens</li><li>max: 22 tokens</li></ul> \|
	* Samples:
	\| anchor \| entailment \| negative \|
	\|:----------------------------------------------------------\|:---------------------------------------------------------------------\|:------------------------------------------------------------------\|
	\| <code>a man with a bike at a marina</code> \| <code>A man stands next to his bike at the marina.</code> \| <code>A man stands away from his bike at the marina.</code> \|
	\| <code>A black cat is inside a white toilet.</code> \| <code>A black cat drinking water from a toilet</code> \| <code>A black cat drinking water from a dog bowl</code> \|
	\| <code>A children play area absent of any children.</code> \| <code>A craft supply room with craft supplies and containers.</code> \| <code>An empty room with no craft supplies and containers.</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "TripletLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 1,
	"prior_layers_weight": 1,
	"kl_div_weight": 0.5,
	"kl_temperature": 1.1
	}
	```
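
For the triplet columns above, `anchor` plays the usual anchor role, `entailment` the positive, and `negative` the negative. The sketch below shows the standard triplet objective on toy vectors. The Euclidean distance and the margin of 5 are assumptions mirroring what I understand to be the library defaults; the card itself does not report them.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=5.0):
    """Zero once the negative is at least `margin` farther from the
    anchor than the positive is; positive otherwise.

    Assumption: margin and distance metric are illustrative defaults.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)


anchor = [0.0, 0.0]
entailment = [3.0, 4.0]      # distance 5 from the anchor
far_negative = [6.0, 8.0]    # distance 10 from the anchor
near_negative = [0.0, 1.0]   # distance 1 from the anchor

print(triplet_loss(anchor, entailment, far_negative))
print(triplet_loss(anchor, entailment, near_negative))
```

A negation that stays closer to the anchor than the entailment does, as `near_negative` does here, is the case the loss penalises.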

	#### qnli-contrastive

	* Dataset: [qnli-contrastive](https://huggingface.co/datasets/nyu-mll/glue) at [bcdcba7](https://huggingface.co/datasets/nyu-mll/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
	* Size: 128 evaluation samples
	* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \| label \|
	\|:--------\|:----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|:-----------------------------\|
	\| type \| string \| string \| int \|
	\| details \| <ul><li>min: 7 tokens</li><li>mean: 14.41 tokens</li><li>max: 26 tokens</li></ul> \| <ul><li>min: 4 tokens</li><li>mean: 37.94 tokens</li><li>max: 115 tokens</li></ul> \| <ul><li>0: 100.00%</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \| label \|
	\|:--------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------------------------------------------------------------------\|:---------------\|
	\| <code>What came into force after the new constitution was herald?</code> \| <code>As of that day, the new constitution heralding the Second Republic came into force.</code> \| <code>0</code> \|
	\| <code>What is the first major city in the stream of the Rhine?</code> \| <code>The most important tributaries in this area are the Ill below of Strasbourg, the Neckar in Mannheim and the Main across from Mainz.</code> \| <code>0</code> \|
	\| <code>What is the minimum required if you want to teach in Canada?</code> \| <code>In most provinces a second Bachelor's Degree such as a Bachelor of Education is required to become a qualified teacher.</code> \| <code>0</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "OnlineContrastiveLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 1,
	"prior_layers_weight": 1,
	"kl_div_weight": 0.5,
	"kl_temperature": 1.1
	}
	```

	#### scitail-pairs-qa

	* Dataset: [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
	* Size: 128 evaluation samples
	* Columns: <code>sentence2</code> and <code>sentence1</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence2 \| sentence1 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 8 tokens</li><li>mean: 16.59 tokens</li><li>max: 29 tokens</li></ul> \| <ul><li>min: 8 tokens</li><li>mean: 15.62 tokens</li><li>max: 30 tokens</li></ul> \|
	* Samples:
	\| sentence2 \| sentence1 \|
	\|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------\|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>The core is the sun's innermost layer.</code> \| <code>What is the suns innermost layer called?</code> \|
	\| <code>Heirloom seeds come from plants that were traditionally grown in human populations, as opposed to the seeds used for large-scale agricultural production.</code> \| <code>What type of seeds come from plants that were traditionally grown in human populations, as opposed to the seeds used for large-scale agricultural production?</code> \|
	\| <code>Multiple fission is more often observed among protists.</code> \| <code>Where is multiple fission more often observed?</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### scitail-pairs-pos

	* Dataset: [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
	* Size: 128 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 7 tokens</li><li>mean: 22.98 tokens</li><li>max: 61 tokens</li></ul> \| <ul><li>min: 8 tokens</li><li>mean: 15.52 tokens</li><li>max: 36 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:----------------------------------------------------------------------------------------------------------------------------------\|:---------------------------------------------------------------------------------------------------\|
	\| <code>An introduction to atoms and elements, compounds, atomic structure and bonding, the molecule and chemical reactions.</code> \| <code>Replace another in a molecule happens to atoms during a substitution reaction.</code> \|
	\| <code>Wavelength The distance between two consecutive points on a sinusoidal wave that are in phase;</code> \| <code>Wavelength is the distance between two corresponding points of adjacent waves called.</code> \|
	\| <code>humans normally have 23 pairs of chromosomes.</code> \| <code>Humans typically have 23 pairs pairs of chromosomes.</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### xsum-pairs

	* Dataset: [xsum-pairs](https://huggingface.co/datasets/sentence-transformers/xsum) at [788ddaf](https://huggingface.co/datasets/sentence-transformers/xsum/tree/788ddafe04e539956d56b567bc32a036ee7b9206)
	* Size: 128 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:-------------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 26 tokens</li><li>mean: 233.24 tokens</li><li>max: 512 tokens</li></ul> \| <ul><li>min: 7 tokens</li><li>mean: 26.44 tokens</li><li>max: 55 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|:---------------------------------------------------------------------------------------------------------------------------\|
	\| <code>The former ABC cinema in the centre of Tunbridge Wells has been derelict for nearly 14 years despite a number of demolition and redevelopment plans.<br>Work was due to start on Monday but work to make safe the electricity and gas supplies has delayed demolition.<br>A former dental surgery on the site will pulled down first, with the project due to take 12 weeks.<br>Liberal Democrat councillor Ben Chapelard, who has campaigned for the redevelopment of the site, has previously called it the town's "number one grot spot".<br>In February, the Conservative-led borough council issued a demolition notice on the owners.</code> \| <code>Demolition of a derelict town centre cinema in Kent, branded as a "grot spot" is due to start this week.</code> \|
	\| <code>International ballroom champion Shirley Ballas has been named as the new head judge on Strictly Come Dancing.<br>Nicknamed the Queen of Latin, the 56-year-old will replace Len Goodman when the BBC show returns this autumn.<br>Highly regarded in the world of ballroom, she has numerous titles to her name and is also the mother to Dancing with the Stars professional champion Mark Ballas.<br>Strictly professional Anton Du Beke and former judge Arlene Phillips were among the favourites to take on the head judge role.<br>Although not widely known to UK audiences, Ballas has been frequently seen on Dancing with the Stars, the US version of Strictly - on which Mark appeared for 10 years - giving masterclasses and commentary.<br>"I am so excited and over the moon to have been given this wonderful opportunity," she said.<br>"I can't wait to get in to the ballroom and be part of the incredible and respected judging panel. Strictly is so loved by the British public, I have always been a massive fan. I just can't wait!"<br>Read more:<br>Follow us on Facebook, on Twitter @BBCNewsEnts, or on Instagram at bbcnewsents. If you have a story suggestion email [email protected].<br>Get news from the BBC in your inbox, each weekday morning</code> \| <code>It's the announcement everyone has been waiting a year for.</code> \|
	\| <code>The Prime Minister told BBC Radio Kent that he "wasn't particularly happy" about what he had heard about her.<br>"I don't think she's making a very good fist of her job... the people of Kent elected her, they can un-elect her at the next available opportunity."<br>The BBC has been unable to contact Mrs Barnes for her reaction to his comments, as she is away on holiday.<br>Mr Cameron's remarks follow a Channel 4 documentary last month, in which Mrs Barnes admitted she should not have taken part.<br>The "fly-on-the-wall" TV programme Meet the Police Commissioner, saw Mrs Barnes talk about her £85,000-a-year role.<br>At points in the broadcast, she struggled to explain an approach to policing priorities called "the onion", brought her dogs into the office and failed to write her title correctly on a whiteboard.<br>Mrs Barnes was also criticised last year after she appointed teenager Paris Brown as Kent's first youth commissioner.<br>Ms Brown later resigned over comments she had posted on Twitter.<br>However, earlier this month, within days of the Channel 4 programme, it was claimed Ms Brown's replacement had been involved in a relationship with 50-year-old former county councillor and youth leader Robert Burgess.<br>Kerry Boyd, 20, has since not undertaken any public engagements.<br>The BBC has also learned that Mrs Barnes will still be on holiday when a new policing model for Kent is introduced on Tuesday.<br>When asked about this, the Prime Minister said that while she was responsible for her own movements, "that doesn't sound particularly impressive".<br>"As I've said, I don't think she has impressed in this role," Mr Cameron added.</code> \| <code>Ann Barnes has failed to impress in her role as Kent's Police and Crime Commissioner, David Cameron has said.</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "MultipleNegativesSymmetricRankingLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 0.75,
	"prior_layers_weight": 1.75,
	"kl_div_weight": 0.75,
	"kl_temperature": 0.9
	}
	```

	#### compression-pairs

	* Dataset: [compression-pairs](https://huggingface.co/datasets/sentence-transformers/sentence-compression) at [605bc91](https://huggingface.co/datasets/sentence-transformers/sentence-compression/tree/605bc91d95631895ba25b6eda51a3cb596976c90)
	* Size: 128 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:-----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 12 tokens</li><li>mean: 30.67 tokens</li><li>max: 84 tokens</li></ul> \| <ul><li>min: 5 tokens</li><li>mean: 10.03 tokens</li><li>max: 21 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|:---------------------------------------------------------------------------\|
	\| <code>Actor Joshua Jackson has banned his girlfriend Diane Kruger from watching his early work, because he suffered from acne and bloating during his teenage years.</code> \| <code>Joshua Jackson bans Diane Kruger from watching his early work</code> \|
	\| <code>The latest figures from the NCAA show that football and men's basketball players are graduating in record numbers.</code> \| <code>Football, basketball players graduating in record numbers</code> \|
	\| <code>The US economy shrank at a worse-than-expected annualised 6.1% pace at the start of this year as sharp cutbacks by businesses and the biggest drop in US exports in 40 years overwhelmed a rebound in consumer spending.</code> \| <code>US economy shrinks</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "MultipleNegativesSymmetricRankingLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 0.75,
	"prior_layers_weight": 1.75,
	"kl_div_weight": 0.75,
	"kl_temperature": 0.9
	}
	```

	#### sciq_pairs

	* Dataset: [sciq_pairs](https://huggingface.co/datasets/allenai/sciq) at [2c94ad3](https://huggingface.co/datasets/allenai/sciq/tree/2c94ad3e1aafab77146f384e23536f97a4849815)
	* Size: 128 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 7 tokens</li><li>mean: 17.14 tokens</li><li>max: 38 tokens</li></ul> \| <ul><li>min: 2 tokens</li><li>mean: 81.41 tokens</li><li>max: 512 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:-----------------------------------------------------------------------------------------\|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>What do scientists collect to test a hypothesis?</code> \| <code>Scientists collect evidence to test a hypothesis. The evidence may refute the hypothesis. In that case, it will be thrown out. The evidence may support the hypothesis. The scientists will then gather more evidence. The scientists will accept the hypothesis if: (1) There is no significant evidence to refute the hypothesis. (2) There is an enormous amount of evidence to support the hypothesis. The hypothesis may then become a theory.</code> \|
	\| <code>What is a force that opposes motion between any surfaces that are touching?</code> \| <code>Friction is a force that opposes motion between any surfaces that are touching.</code> \|
	\| <code>What is the largest mammal on earth?</code> \| <code>Biological organization exists at all levels in organisms. It can be seen at the smallest level, in the molecules that make up such compounds as DNA and proteins, to the largest level, in an organism such as a blue whale, the largest mammal on Earth. Similarly, single celled prokaryotes and eukaryotes show order in the way their cells are arranged. Single-celled organisms such as an amoeba are free-floating and independent-living. Their single-celled "bodies" are able to carry out all the processes of life such as metabolism and respiration without help from other cells.</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### qasc_pairs

	* Dataset: [qasc_pairs](https://huggingface.co/datasets/allenai/qasc) at [a34ba20](https://huggingface.co/datasets/allenai/qasc/tree/a34ba204eb9a33b919c10cc08f4f1c8dae5ec070)
	* Size: 128 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 6 tokens</li><li>mean: 11.36 tokens</li><li>max: 23 tokens</li></ul> \| <ul><li>min: 20 tokens</li><li>mean: 35.0 tokens</li><li>max: 64 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:----------------------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>What are vascular plants that produce seeds in cones?</code> \| <code>Gymnosperms are vascular plants that produce seeds in cones.. Gymnosperms are the plants that are known as evergreens.. Evergreens are vascular plants that produce seeds in cones</code> \|
	\| <code>What do producers turn sunlight into?</code> \| <code>Producers make food from inorganic molecules.. Autotrophs absorb sunlight energy and transfer inorganic mineral nutrients into organic molecules.. Producers make sunlight energy into food.</code> \|
	\| <code>What kind of animal breathes by converting oxygen in water into oxygen in blood?</code> \| <code>breathing is when a gill converts from oxygen in water into oxygen in blood. All fish have gills .. Fish breathe by converting oxygen in water into oxygen in blood.</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### qasc_facts_sym

	* Dataset: [qasc_facts_sym](https://huggingface.co/datasets/allenai/qasc) at [a34ba20](https://huggingface.co/datasets/allenai/qasc/tree/a34ba204eb9a33b919c10cc08f4f1c8dae5ec070)
	* Size: 128 evaluation samples
	* Columns: <code>combinedfact</code> and <code>facts</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| combinedfact \| facts \|
	\|:--------\|:----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 5 tokens</li><li>mean: 11.28 tokens</li><li>max: 25 tokens</li></ul> \| <ul><li>min: 12 tokens</li><li>mean: 24.46 tokens</li><li>max: 43 tokens</li></ul> \|
	* Samples:
	\| combinedfact \| facts \|
	\|:------------------------------------------------------------------------------------------------------------------------\|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>cells converting oxygen and carbohydrates into carbon dioxide, water, and energy is a requirement for life</code> \| <code>cellular respiration is when a cell converts from oxygen and carbohydrates into carbon dioxide, water, and energy. Cellular respiration is a requirement for life..</code> \|
	\| <code>rocks interacting with wind over long periods of time can form sediment</code> \| <code>rocks interacting with wind over long periods of time causes weathering. When a rock undergoes erosion and weathering, it breaks down to form sediments..</code> \|
	\| <code>aging is natural</code> \| <code>Aging is associated with the death of cells.. Cell death is natural..</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "MultipleNegativesSymmetricRankingLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 0.75,
	"prior_layers_weight": 1.75,
	"kl_div_weight": 0.75,
	"kl_temperature": 0.9
	}
	```

	#### openbookqa_pairs

	* Dataset: openbookqa_pairs
	* Size: 128 evaluation samples
	* Columns: <code>question</code> and <code>fact</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| question \| fact \|
	\|:--------\|:----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 3 tokens</li><li>mean: 13.82 tokens</li><li>max: 45 tokens</li></ul> \| <ul><li>min: 4 tokens</li><li>mean: 11.68 tokens</li><li>max: 28 tokens</li></ul> \|
	* Samples:
	\| question \| fact \|
	\|:-----------------------------------------------------------------------\|:-----------------------------------------------------------------------------\|
	\| <code>The thermal production of a stove is generically used for</code> \| <code>a stove generates heat for cooking usually</code> \|
	\| <code>What creates a valley?</code> \| <code>a valley is formed by a river flowing</code> \|
	\| <code>when it turns day and night on a planet, what cause this?</code> \| <code>a planet rotating causes cycles of day and night on that planet</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### msmarco_pairs

	* Dataset: [msmarco_pairs](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3) at [28ff31e](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3/tree/28ff31e4c97cddd53d298497f766e653f1e666f9)
	* Size: 128 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:---------------------------------------------------------------------------------\|:------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 4 tokens</li><li>mean: 9.05 tokens</li><li>max: 16 tokens</li></ul> \| <ul><li>min: 17 tokens</li><li>mean: 80.83 tokens</li><li>max: 187 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:------------------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>polygenic trait meaning</code> \| <code>A polygenic trait, is a trait that nonallelic genes control. These traits result from one or more genes contributing to the phenotype. An individual's physical appearance is determined by chromosomal inheritance and genotypic ratio.This phenomenon is known as Mendel's Laws of Inheritance.olygenic trait molding is done mostly with the environment and genes. More than one gene determines these traits, with each gene giving a small, yet additive effect. Multifactorial traits do not exhibit Mendelian ratios, and are determined between genes, or a gene, and the environment.</code> \|
	\| <code>name four viruses that can cause diseases that are often fatal quizlet</code> \| <code>Some diseases caused by viruses are poliomyelitis, influenza, measles caused by rubeola virus, chikenguniya caused by chikanguniya virus, chickenpox caused by varicella zoster … virus, AIDS caused by human immunodeficiency virus, and the common cold.</code> \|
	\| <code>how much does a caddie make at the masters</code> \| <code>A survey by Salary Expert shows that, on average, caddie masters earn less than $40,000 a year except for those in San Francisco, Manhattan and Los Angeles. Forbes sports writer Matt Woolsey explains that PGA caddies earn an average of $1,000 each week.</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### nq_pairs

	* Dataset: [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions) at [f9e894e](https://huggingface.co/datasets/sentence-transformers/natural-questions/tree/f9e894e1081e206e577b4eaa9ee6de2b06ae6f17)
	* Size: 128 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:-----------------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 10 tokens</li><li>mean: 11.62 tokens</li><li>max: 21 tokens</li></ul> \| <ul><li>min: 24 tokens</li><li>mean: 131.67 tokens</li><li>max: 325 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:--------------------------------------------------------------\|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>what is the relation between nanometer and meter</code> \| <code>Nanometre The nanometre (International spelling as used by the International Bureau of Weights and Measures; SI symbol: nm) or nanometer (American spelling) is a unit of length in the metric system, equal to one billionth (short scale) of a metre (0.000000001 m). The name combines the SI prefix nano- (from the Ancient Greek νάνος, nanos, "dwarf") with the parent unit name metre (from Greek μέτρον, metrοn, "unit of measurement"). It can be written in scientific notation as 1×10−9 m, in engineering notation as 1 E−9 m, and is simply 1/1000000000 metres. One nanometre equals ten ångströms. When used as a prefix for something other than a unit of measure (as in "nanoscience"), nano refers to nanotechnology, or phenomena typically occurring on a scale of nanometres (see nanoscopic scale).[1]</code> \|
	\| <code>what does the star on the chile flag represent</code> \| <code>Flag of Chile The star may represent a guide to progress and honor while other interpretations refer to its reference to an independent state; blue symbolizes the sky and the Pacific Ocean, white is for the snow-covered Andes, and red stands for the blood spilled to achieve independence.[2]</code> \|
	\| <code>when did the crazy horse memorial project began</code> \| <code>Crazy Horse Memorial Henry Standing Bear ("Mato Naji"), an Oglala Lakota chief, and well-known statesman and elder in the Native American community, recruited and commissioned Polish-American sculptor Korczak Ziolkowski to build the Crazy Horse Memorial in the Black Hills of South Dakota. In October 1931, Luther Standing Bear, Henry's older brother, wrote sculptor Gutzon Borglum, who was carving the heads of four American presidents at Mount Rushmore. Luther suggested that it would be "most fitting to have the face of Crazy Horse sculpted there. Crazy Horse is the real patriot of the Sioux tribe and the only one worthy to place by the side of Washington and Lincoln." Borglum never replied.[7] Thereafter, Henry Standing Bear began a campaign to have Borglum carve an image of Crazy Horse on Mt. Rushmore.[8] In summer of 1935, Standing Bear, frustrated over the stalled Crazy Horse project, wrote to James H. Cook, a long time friend of Chief Red Cloud's "I am struggling hopelessly with this because I am without funds, no employment and no assistance from any Indian or White."[9]</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### trivia_pairs

	* Dataset: [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa) at [a7c36e3](https://huggingface.co/datasets/sentence-transformers/trivia-qa/tree/a7c36e3c8c8c01526bc094d79bf80d4c848b0ad0)
	* Size: 128 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 8 tokens</li><li>mean: 17.51 tokens</li><li>max: 35 tokens</li></ul> \| <ul><li>min: 38 tokens</li><li>mean: 448.58 tokens</li><li>max: 512 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:-----------------------------------------------------------------------------------------------------\|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>The tomb of which English king is in Worcester Cathedral?</code> \| <code>Worcester Cathedral - The Tomb of King John WORCESTER CATHEDRAL WICKED KING JOHN'S FINAL RESTING PLACE Worcester Cathedral, Worcester, Worcestershire King John (1167-1216), the fourth and youngest son of Henry 11, was the archetypal wicked king, whose record of rebellion and intrigue against his brother, Richard 1st, led contemporary historian William of Newburgh (1135-1198), to denounce him as "nature’s enemy." His bullying manner and excessive taxation provoked the powerful English barons to rebel against him, and force him to seal Magna Carta. Later hailed as a declaration of English liberties, it was at the time, little more than a criticism of his style of government and, as such, he had no intention of adhering to its terms. His reign ended with England wracked by civil war. But one place he had reverence for was Worcester and, as he lay dying, he made a codicil to his will ordering that he was to be buried in its cathedral, between the tombs of its two saints, St Oswald and St Wulfstan. Their bones were long ago dispersed, but the tomb of "evil" King John can still exists. The marble top of his tomb is the lid of his original coffin, and thought to be the oldest royal effigy in England. The tomb itself has been opened several times, shedding light upon a legend concerning the Kings final days. It is said that John, realising that the chances of him attaining heaven were limited, gave orders that his corpse was to be dressed in the garb of a monk. Thus attired, he hoped to hoodwink his way into Paradise. When the tomb was opened in 1797, the remnants of an ancient cowl were, supposedly, found wrapped around his skull! HEREFORDSHIRE AND</code> \|
	\| <code>Which Oscar-winning actress was born on exactly the same day as actress Lindsay Wagner?</code> \| <code>#219 Meryl Streep / Alan Osmond / Lindsay Wagner – 22 June 1949 \| Born On The Same Day Born On The Same Day Posted on January 29, 2011 by Born On The Same Day Meryl Streep Mary Louise “Meryl” Streep (born June 22, 1949) is an American actress who has worked in theatre, television and film. She is widely regarded as one of the most talented and respected actors of the modern era. Streep has received 16 Academy Award nominations, winning two, and 25 Golden Globe nominations, winning seven, more nominations than any other actor in the history of either award. Her work has also earned her two Emmy Awards, two Screen Actors Guild Awards, a Cannes Film Festival award, four New York Film Critics Circle Awards, five Grammy Award nominations, a BAFTA award, an Australian Film Institute Award and a Tony Award nomination, amongst others. She was awarded the American Film Institute’s Lifetime Achievement Award in 2004. Alan Osmond Alan Ralph Osmond (born on June 22, 1949 in Ogden, Utah, United States) was a member of the family musical group The Osmonds. He was the oldest of the seven siblings who could sing, as the two oldest brothers, Virl and Tom, are hearing impaired. During much of the Osmonds’ career, Alan was the leader of the group. Today he performs only rarely because he has multiple sclerosis. Lindsay Wagner Lindsay Jean Wagner (born June 22, 1949) is an American actress. She is probably most widely known for her portrayal of Jaime Sommers in the 1970s television series The Bionic Woman (for which she won an Emmy award). Links:</code> \|
	\| <code>Lipshen is the name of the cat in which Roald Dahl book?</code> \| <code>The Witches - Roald Dahl Roald Dahl Published in 1983 Synopsis Roald Dahl's The Witches tells the story of a brave young boy and his Norwegian grandmother as they battle England's witches. Background Background Witches absolutely detest children. To a witch, a child smells like dogs' droppings. And now the Grand High Witch is planning to get rid of every child in England - can anybody stop them? The Witches tells the story of a brave young boy and his Norwegian grandmother as they battle against England's child-hating witches. It continues to feature in lists dedicated to the scariest children's books more than 30 years after it was first published. Especially around Halloween. When he was a child himself, Roald Dahl used to spend every summer holiday with his family in Norway, where he was inspired by bedtime stories of witches and magic. He wrote about these holidays in Boy: Tales of Childhood. It is also said that the grandmother in The Witches was partially inspired by Roald's own mother. Roald dedicated the book to his wife, Liccy. A film version of the story, starring Angelica Huston as the witches' leader The Grand High Witch, was released in 1990. The main difference between the film and the original story is the ending - in the book, there is no spell cast to change the boy's state back to what it was before the witches found him. The film also gives its central character the name Luke, whereas in the book we don't find out the name of either the boy who narrates the story or his grandmother. In 1983, the year it was published, The Witches won three awards: The New York Times Outstanding Books Award, The Federation of Children's Book Groups Award and The Whitbread Award. The Witches helped inspire Boy</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### quora_pairs

	* Dataset: [quora_pairs](https://huggingface.co/datasets/sentence-transformers/quora-duplicates) at [451a485](https://huggingface.co/datasets/sentence-transformers/quora-duplicates/tree/451a4850bd141edb44ade1b5828c259abd762cdb)
	* Size: 1,600 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 6 tokens</li><li>mean: 13.34 tokens</li><li>max: 40 tokens</li></ul> \| <ul><li>min: 6 tokens</li><li>mean: 13.52 tokens</li><li>max: 47 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:-------------------------------------------------------------\|:--------------------------------------------------------------------\|
	\| <code>Why is "Sazae San" anime underrated in America?</code> \| <code>Why is "Sazae-San" anime underrated in America?</code> \|
	\| <code>Why's watching and playing snooker different?</code> \| <code>Why would watching snooker be different to playing it?</code> \|
	\| <code>Who is wrong in the Israel-Palestine conflict?</code> \| <code>Which side is right in the Israel-Palestine conflict?</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 0.5,
	"prior_layers_weight": 1.5,
	"kl_div_weight": 1.25,
	"kl_temperature": 0.75
	}
	```

	#### gooaq_pairs

	* Dataset: [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
	* Size: 128 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 8 tokens</li><li>mean: 11.47 tokens</li><li>max: 19 tokens</li></ul> \| <ul><li>min: 24 tokens</li><li>mean: 59.1 tokens</li><li>max: 112 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:------------------------------------------------------------\|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>what are the three types of symbiosis quizlet?</code> \| <code>['parasitism. A relationship between two organisms where one benefits and the other is harmed.', 'commensalism. A relationship between two organisms where one benefits and the other is unharmed.', 'mutualism. A relationship between two organisms where both benefit.']</code> \|
	\| <code>are beet greens safe to eat?</code> \| <code>Here's a tip: when you're washing and peeling the beets, and you trim off the green leafy tops, don't toss them away! The greens and the stems are edible, and make a great substitute for any green such as spinach, swiss chard, and bok choy. They can be steamed, sauteed, braised, added to soups, and eaten raw.</code> \|
	\| <code>is it true vampire diaries is leaving netflix?</code> \| <code>According to The Verge, Netflix has a deal with The CW's parent companies, which secures that series such as (Riverdale, The Flash, and more) end up on Netflix after their season finales. ... Overall, while The Vampire Diaries will remain part of the Netflix queue, its future on the streaming service may not be permanent.</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "GISTEmbedLoss",
	"n_layers_per_step": 1,
	"last_layer_weight": 1.5,
	"prior_layers_weight": 0.5,
	"kl_div_weight": 0.33,
	"kl_temperature": 1.5
	}
	```

	#### mrpc_pairs

	* Dataset: [mrpc_pairs](https://huggingface.co/datasets/nyu-mll/glue) at [bcdcba7](https://huggingface.co/datasets/nyu-mll/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
	* Size: 128 evaluation samples
	* Columns: <code>sentence1</code> and <code>sentence2</code>
	* Approximate statistics based on the first 1000 samples:
	\| \| sentence1 \| sentence2 \|
	\|:--------\|:-----------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------\|
	\| type \| string \| string \|
	\| details \| <ul><li>min: 14 tokens</li><li>mean: 26.86 tokens</li><li>max: 40 tokens</li></ul> \| <ul><li>min: 14 tokens</li><li>mean: 26.91 tokens</li><li>max: 41 tokens</li></ul> \|
	* Samples:
	\| sentence1 \| sentence2 \|
	\|:--------------------------------------------------------------------------------------------------------------------------\|:-----------------------------------------------------------------------------------------------------------------------------------\|
	\| <code>He said the foodservice pie business doesn 't fit the company 's long-term growth strategy .</code> \| <code>" The foodservice pie business does not fit our long-term growth strategy .</code> \|
	\| <code>The AFL-CIO is waiting until October to decide if it will endorse a candidate .</code> \| <code>The AFL-CIO announced Wednesday that it will decide in October whether to endorse a candidate before the primaries .</code> \|
	\| <code>Wal-Mart said it would check all of its million-plus domestic workers to ensure they were legally employed .</code> \| <code>It has also said it would review all of its domestic employees more than 1 million to ensure they have legal status .</code> \|
	* Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
	```json
	{
	"loss": "MultipleNegativesSymmetricRankingLoss",
	"n_layers_per_step": -1,
	"last_layer_weight": 0.75,
	"prior_layers_weight": 1.75,
	"kl_div_weight": 0.75,
	"kl_temperature": 0.9
	}
	```
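
The `kl_div_weight` and `kl_temperature` entries in the loss parameter blocks above control a KL-divergence term that pulls earlier-layer embeddings towards the final-layer embedding. The following is a minimal pure-Python sketch of such a temperature-scaled term. The helper names `softmax` and `kl_term` are hypothetical, and the exact scaling (multiplying by both the temperature and the weight) is an assumption based on a reading of `AdaptiveLayerLoss`, not something this card states.

```python
import math


def softmax(values, temperature):
    """Softmax over embedding dimensions after dividing by `temperature`."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_term(layer_emb, final_emb, kl_div_weight, kl_temperature):
    """Weighted KL(final || layer) between temperature-softened embeddings.

    Assumption: the term is scaled by temperature and weight, as sketched here.
    """
    target = softmax(final_emb, kl_temperature)   # final-layer distribution
    approx = softmax(layer_emb, kl_temperature)   # earlier-layer distribution
    kl = sum(p * (math.log(p) - math.log(q)) for p, q in zip(target, approx))
    return kl * kl_temperature * kl_div_weight


if __name__ == "__main__":
    final_emb = [0.1, 0.2, 0.3]   # toy vectors, not real embeddings
    layer_emb = [0.3, 0.2, 0.1]
    # (kl_div_weight, kl_temperature) pairs that appear in this section
    for weight, temperature in ((0.33, 1.5), (0.75, 0.9), (1.25, 0.75)):
        value = kl_term(layer_emb, final_emb, weight, temperature)
        print(f"weight={weight}, temperature={temperature}: {value:.6f}")
```

Under this sketch, a lower temperature sharpens both distributions, so the same embedding mismatch gives a larger raw divergence, while `kl_div_weight` scales the term linearly.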

	### Training Hyperparameters
	#### Non-Default Hyperparameters

	- `eval_strategy`: steps
	- `per_device_train_batch_size`: 42
	- `per_device_eval_batch_size`: 42
	- `learning_rate`: 3.5e-05
	- `weight_decay`: 5e-05
	- `num_train_epochs`: 4
	- `lr_scheduler_type`: cosine_with_restarts
	- `lr_scheduler_kwargs`: {'num_cycles': 3}
	- `warmup_ratio`: 0.15
	- `save_safetensors`: False
	- `fp16`: True
	- `push_to_hub`: True
	- `hub_model_id`: bobox/DeBERTa-ST-AllLayers-v3.5-checkpoints-tmp
	- `hub_strategy`: all_checkpoints
	- `batch_sampler`: no_duplicates
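
To make the learning-rate settings concrete, here is a pure-Python sketch of the multiplier that a `cosine_with_restarts` schedule with linear warmup applies to the base learning rate. It is written to follow the hard-restart cosine formula used by Transformers, but it is an illustration rather than the library code. The function name `lr_multiplier` and the value of `TOTAL_STEPS` are assumptions: the total is only a rough estimate from the training log (step 1540 at epoch 0.2001, with 4 epochs configured).

```python
import math


def lr_multiplier(step, total_steps, warmup_ratio=0.15, num_cycles=3):
    """Factor applied to the base learning rate at a given optimizer step."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 to 1.
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    # Cosine decay that restarts from 1.0 at the start of each cycle.
    cycle_position = (num_cycles * progress) % 1.0
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_position)))


BASE_LR = 3.5e-05
TOTAL_STEPS = 30_800  # assumption: rough estimate, not stated in the card

if __name__ == "__main__":
    for step in (0, 1540, 4620, 10_000, 20_000, 30_799):
        print(step, BASE_LR * lr_multiplier(step, TOTAL_STEPS))
```

Under these assumptions the checkpoint at step 1540 would still be inside the warmup phase, so its learning rate would be well below the configured peak of 3.5e-05.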

	#### All Hyperparameters
	<details><summary>Click to expand</summary>

	- `overwrite_output_dir`: False
	- `do_predict`: False
	- `eval_strategy`: steps
	- `prediction_loss_only`: True
	- `per_device_train_batch_size`: 42
	- `per_device_eval_batch_size`: 42
	- `per_gpu_train_batch_size`: None
	- `per_gpu_eval_batch_size`: None
	- `gradient_accumulation_steps`: 1
	- `eval_accumulation_steps`: None
	- `learning_rate`: 3.5e-05
	- `weight_decay`: 5e-05
	- `adam_beta1`: 0.9
	- `adam_beta2`: 0.999
	- `adam_epsilon`: 1e-08
	- `max_grad_norm`: 1.0
	- `num_train_epochs`: 4
	- `max_steps`: -1
	- `lr_scheduler_type`: cosine_with_restarts
	- `lr_scheduler_kwargs`: {'num_cycles': 3}
	- `warmup_ratio`: 0.15
	- `warmup_steps`: 0
	- `log_level`: passive
	- `log_level_replica`: warning
	- `log_on_each_node`: True
	- `logging_nan_inf_filter`: True
	- `save_safetensors`: False
	- `save_on_each_node`: False
	- `save_only_model`: False
	- `restore_callback_states_from_checkpoint`: False
	- `no_cuda`: False
	- `use_cpu`: False
	- `use_mps_device`: False
	- `seed`: 42
	- `data_seed`: None
	- `jit_mode_eval`: False
	- `use_ipex`: False
	- `bf16`: False
	- `fp16`: True
	- `fp16_opt_level`: O1
	- `half_precision_backend`: auto
	- `bf16_full_eval`: False
	- `fp16_full_eval`: False
	- `tf32`: None
	- `local_rank`: 0
	- `ddp_backend`: None
	- `tpu_num_cores`: None
	- `tpu_metrics_debug`: False
	- `debug`: []
	- `dataloader_drop_last`: False
	- `dataloader_num_workers`: 0
	- `dataloader_prefetch_factor`: None
	- `past_index`: -1
	- `disable_tqdm`: False
	- `remove_unused_columns`: True
	- `label_names`: None
	- `load_best_model_at_end`: False
	- `ignore_data_skip`: False
	- `fsdp`: []
	- `fsdp_min_num_params`: 0
	- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
	- `fsdp_transformer_layer_cls_to_wrap`: None
	- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
	- `deepspeed`: None
	- `label_smoothing_factor`: 0.0
	- `optim`: adamw_torch
	- `optim_args`: None
	- `adafactor`: False
	- `group_by_length`: False
	- `length_column_name`: length
	- `ddp_find_unused_parameters`: None
	- `ddp_bucket_cap_mb`: None
	- `ddp_broadcast_buffers`: False
	- `dataloader_pin_memory`: True
	- `dataloader_persistent_workers`: False
	- `skip_memory_metrics`: True
	- `use_legacy_prediction_loop`: False
	- `push_to_hub`: True
	- `resume_from_checkpoint`: None
	- `hub_model_id`: bobox/DeBERTa-ST-AllLayers-v3.5-checkpoints-tmp
	- `hub_strategy`: all_checkpoints
	- `hub_private_repo`: False
	- `hub_always_push`: False
	- `gradient_checkpointing`: False
	- `gradient_checkpointing_kwargs`: None
	- `include_inputs_for_metrics`: False
	- `eval_do_concat_batches`: True
	- `fp16_backend`: auto
	- `push_to_hub_model_id`: None
	- `push_to_hub_organization`: None
	- `mp_parameters`:
	- `auto_find_batch_size`: False
	- `full_determinism`: False
	- `torchdynamo`: None
	- `ray_scope`: last
	- `ddp_timeout`: 1800
	- `torch_compile`: False
	- `torch_compile_backend`: None
	- `torch_compile_mode`: None
	- `dispatch_batches`: None
	- `split_batches`: None
	- `include_tokens_per_second`: False
	- `include_num_input_tokens_seen`: False
	- `neftune_noise_alpha`: None
	- `optim_target_modules`: None
	- `batch_eval_metrics`: False
	- `eval_on_start`: False
	- `batch_sampler`: no_duplicates
	- `multi_dataset_batch_sampler`: proportional

	</details>

	### Training Logs
	\| Epoch \| Step \| Training Loss \| vitaminc-pairs loss \| quora pairs loss \| negation-triplets loss \| openbookqa pairs loss \| mrpc pairs loss \| msmarco pairs loss \| xsum-pairs loss \| nli-pairs loss \| compression-pairs loss \| scitail-pairs-qa loss \| sciq pairs loss \| scitail-pairs-pos loss \| gooaq pairs loss \| qnli-contrastive loss \| nq pairs loss \| qasc pairs loss \| qasc facts sym loss \| trivia pairs loss \| Vitaminc_max_ap \| mrpc_max_ap \| negation_max_accuracy \| sts-test_spearman_cosine \|
	\|:------:\|:----:\|:-------------:\|:-------------------:\|:----------------:\|:----------------------:\|:---------------------:\|:---------------:\|:------------------:\|:---------------:\|:--------------:\|:----------------------:\|:---------------------:\|:---------------:\|:----------------------:\|:----------------:\|:---------------------:\|:-------------:\|:---------------:\|:-------------------:\|:-----------------:\|:---------------:\|:-----------:\|:---------------------:\|:------------------------:\|
	\| None \| 0 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| 0.5180 \| 0.8169 \| 1.0 \| 0.0710 \|
	\| 0.0200 \| 154 \| 10.0783 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.0400 \| 308 \| 7.9365 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.0600 \| 462 \| 7.0986 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.0800 \| 616 \| 6.0384 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.1000 \| 770 \| 5.2434 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.1200 \| 924 \| 4.4737 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.1401 \| 1078 \| 3.9530 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.1601 \| 1232 \| 3.7847 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.1801 \| 1386 \| 3.3807 \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \| - \|
	\| 0.2001 \| 1540 \| 3.3067 \| 5.9961 \| 1.3539 \| 4.7618 \| 3.7795 \| 0.4998 \| 3.2895 \| 1.7732 \| 2.8770 \| 0.9280 \| 0.2923 \| 0.7724 \| 1.1874 \| 3.3780 \| 3.1273 \| 3.1534 \| 2.3403 \| 1.3450 \| 3.8939 \| 0.5406 \| 0.8185 \| 1.0 \| 0.7388 \|
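
The Epoch and Step columns allow a rough estimate of the run length, which the card does not state directly. The sketch below derives steps per epoch from a few logged rows and then scales by `num_train_epochs` (4) and `warmup_ratio` (0.15) from the hyperparameters. The names are hypothetical, and because the Epoch column is rounded to four decimals the results are approximations only.

```python
# (epoch, step) pairs copied from the training log table above.
LOGGED_ROWS = [(0.0200, 154), (0.1000, 770), (0.2001, 1540)]

NUM_TRAIN_EPOCHS = 4   # from the hyperparameters
WARMUP_RATIO = 0.15    # from the hyperparameters


def estimate_steps_per_epoch(rows):
    """Estimate steps per epoch from the last logged (epoch, step) pair."""
    # The last row has the largest values, so rounding in the epoch
    # column distorts the ratio the least.
    epoch, step = rows[-1]
    return step / epoch


steps_per_epoch = estimate_steps_per_epoch(LOGGED_ROWS)
total_steps = round(steps_per_epoch * NUM_TRAIN_EPOCHS)
warmup_steps = round(total_steps * WARMUP_RATIO)

if __name__ == "__main__":
    print(f"steps per epoch ~ {steps_per_epoch:.0f}")
    print(f"total steps     ~ {total_steps}")
    print(f"warmup steps    ~ {warmup_steps}")
```

By this estimate the checkpoint at step 1540 covers about 5% of the planned run, which is worth keeping in mind when reading the evaluation columns in the last row.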


	### Framework Versions
	- Python: 3.10.12
	- Sentence Transformers: 3.0.1
	- Transformers: 4.42.4
	- PyTorch: 2.3.1+cu121
	- Accelerate: 0.32.1
	- Datasets: 2.20.0
	- Tokenizers: 0.19.1
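
When trying to reproduce results, it can help to compare the local environment against the versions listed above. The following is a small sketch that uses only the standard library. The helper name `compare_versions` is hypothetical, and the keys are the PyPI distribution names assumed to correspond to the packages in the list.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above, keyed by PyPI distribution name.
EXPECTED_VERSIONS = {
    "sentence-transformers": "3.0.1",
    "transformers": "4.42.4",
    "torch": "2.3.1+cu121",
    "accelerate": "0.32.1",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def compare_versions(expected, lookup=version):
    """Return {name: (expected, installed, matches)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None  # package is not installed
        report[name] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions(EXPECTED_VERSIONS).items():
        status = "ok" if ok else "differs"
        print(f"{name}: expected {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load. The comparison is exact, so a different CUDA build suffix on `torch` is reported as a difference.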

	## Citation

	### BibTeX

	#### Sentence Transformers
	```bibtex
	@inproceedings{reimers-2019-sentence-bert,
	title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
	author = "Reimers, Nils and Gurevych, Iryna",
	booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
	month = "11",
	year = "2019",
	publisher = "Association for Computational Linguistics",
	url = "https://arxiv.org/abs/1908.10084",
	}
	```

	#### AdaptiveLayerLoss
	```bibtex
	@misc{li20242d,
	title={2D Matryoshka Sentence Embeddings},
	author={Xianming Li and Zongxi Li and Jing Li and Haoran Xie and Qing Li},
	year={2024},
	eprint={2402.14776},
	archivePrefix={arXiv},
	primaryClass={cs.CL}
	}
	```

	#### GISTEmbedLoss
	```bibtex
	@misc{solatorio2024gistembed,
	title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
	author={Aivin V. Solatorio},
	year={2024},
	eprint={2402.16829},
	archivePrefix={arXiv},
	primaryClass={cs.LG}
	}
	```

	#### TripletLoss
	```bibtex
	@misc{hermans2017defense,
	title={In Defense of the Triplet Loss for Person Re-Identification},
	author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
	year={2017},
	eprint={1703.07737},
	archivePrefix={arXiv},
	primaryClass={cs.CV}
	}
	```

	#### MultipleNegativesRankingLoss
	```bibtex
	@misc{henderson2017efficient,
	title={Efficient Natural Language Response Suggestion for Smart Reply},
	author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
	year={2017},
	eprint={1705.00652},
	archivePrefix={arXiv},
	primaryClass={cs.CL}
	}
	```
