metadata
base_model: microsoft/deberta-v3-small
datasets:
- tals/vitaminc
language:
- en
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
- pearson_manhattan
- spearman_manhattan
- pearson_euclidean
- spearman_euclidean
- pearson_dot
- spearman_dot
- pearson_max
- spearman_max
- cosine_accuracy
- cosine_accuracy_threshold
- cosine_f1
- cosine_f1_threshold
- cosine_precision
- cosine_recall
- cosine_ap
- dot_accuracy
- dot_accuracy_threshold
- dot_f1
- dot_f1_threshold
- dot_precision
- dot_recall
- dot_ap
- manhattan_accuracy
- manhattan_accuracy_threshold
- manhattan_f1
- manhattan_f1_threshold
- manhattan_precision
- manhattan_recall
- manhattan_ap
- euclidean_accuracy
- euclidean_accuracy_threshold
- euclidean_f1
- euclidean_f1_threshold
- euclidean_precision
- euclidean_recall
- euclidean_ap
- max_accuracy
- max_accuracy_threshold
- max_f1
- max_f1_threshold
- max_precision
- max_recall
- max_ap
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:225247
- loss:CachedGISTEmbedLoss
widget:
- source_sentence: what is exfo toolbox
sentences:
- >-
Eye dilation from eye drops used for examination of the eye usually
lasts from 4 to 24 hours, depending upon the strength of the drop and
upon the individual patient.
- >-
Garden Grove is a city in northern Orange County in the U.S. state of
California, 34 miles (55 km) south of Los Angeles. The population was
170,883 at the 2010 United States Census. State Route 22, also known as
the Garden Grove Freeway, passes through the city in an east-west
direction.
- >-
EXFO ToolBox Office is a product that offers you a collection of viewers
and analyzers. It enables you to manage and analyze results acquired
from fiber optic test modules and instruments.
- source_sentence: >-
More than 273 people have died from the 2019-20 coronavirus outside
mainland China .
sentences:
- >-
More than 3,700 people have died : around 3,100 in mainland China and
around 550 in all other countries combined .
- >-
More than 3,200 people have died : almost 3,000 in mainland China and
around 275 in other countries .
- more than 4,900 deaths have been attributed to COVID-19 .
- source_sentence: >-
Ultrasound, a diagnostic technology, uses high-frequency vibrations
transmitted into any tissue in contact with the transducer.
sentences:
- >-
What diagnostic technology uses high-frequency vibrations transmitted
into any tissue in contact with the transducer?
- The abnormal cells cannot carry oxygen properly and can get stuck where?
- What type of organism is a bacteria?
- source_sentence: >-
When you add moles of gas to a baloon by blowing it up, the volume
increases.
sentences:
- What shape is the lens of the eye?
- >-
What happens to the volume of a balloon when you add moles of gas to it
by blowing up?
- >-
Most turtle bodies are covered by a special bony or cartilaginous shell
developed from their what?
- source_sentence: >-
What was the name of eleven rulers of the 19th and 20th Egyptian
dynasties?
sentences:
- >-
Airlines Yugoslavia 1968 - 1968 Renamed ^ Comments : Aviogenex was
formed on 21May1968 as Genex Airlines. Restarted under current name on
30Apr1969 & liquidated in Feb2015 ^ Genealogy : Genex Airlines
>Aviogenex 1968 - 1986 Renamed ^ Comments : Adria Airways was formed on
14Mar1961 & operations started on 30Jun1961 as Adria Airways, renamed to
Inex in 1968 and back to Adria again in 1986. National airline of
Slovenia ^ Genealogy : Adria Airways >Inex Adria Airways >Adria Airways
JAT (Jugoslovenski Aerotransport) 1947 - 2003 Renamed ^ Comments : Air
Serbia was founded as Aeroput on 17Jun1927, renamed to JAT on 01Apr1947.
Started ops on 15Apr1947, Renamed again on 08Aug2003 to JAT Airways &
reformed as Air Serbia on 26Oct2013 ^ Genealogy : Aeroput >JAT
(Jugoslovenski Aerotransport) >JAT Airways >Air Serbia Jugoslovenski
Aerotransport
- >-
List of Rulers of Ancient Egypt and Nubia | Lists of Rulers | Heilbrunn
Timeline of Art History | The Metropolitan Museum of Art The
Metropolitan Museum of Art List of Rulers of Ancient Egypt and Nubia See
works of art 30.8.234 52.127.4 Our knowledge of the succession of
Egyptian kings is based on kinglists kept by the ancient Egyptians
themselves. The most famous are the Palermo Stone, which covers the
period from the earliest dynasties to the middle of Dynasty 5; the
Abydos Kinglist, which Seti I had carved on his temple at Abydos; and
the Turin Canon, a papyrus that covers the period from the earliest
dynasties to the reign of Ramesses II. All are incomplete or
fragmentary. We also rely on the History of Egypt written by Manetho in
the third century B.C. A priest in the temple at Heliopolis, Manetho had
access to many original sources and it was he who divided the kings into
the thirty dynasties we use today. It is to this structure of dynasties
and listed kings that we now attempt to link an absolute chronology of
dates in terms of our own calendrical system. The process is made
difficult by the fragmentary condition of the kinglists and by
differences in the calendrical years used at various times. Some
astronomical observations from the ancient Egyptians have survived,
allowing us to calculate absolute dates within a margin of error.
Synchronisms with the other civilizations of the ancient world are also
of limited use.
- >-
What is the "Jack Sprat" nursery rhyme? | Reference.com What is the
"Jack Sprat" nursery rhyme? A: Quick Answer "Jack Sprat" is a
traditional English nursery rhyme whose main verse says, "Jack Sprat
could eat no fat. His wife could eat no lean. And so between them both,
you see, they licked the platter clean." Though it was likely sung by
children long before, "Jack Sprat" was first published around 1765 in
the compilation "Mother Goose's Melody." Full Answer According to
Rhymes.org, a U.K. website devoted to nursery rhyme lyrics and origins,
the "Jack Sprat" nursery rhyme has its origins in British history. In
one interpretation, Jack Sprat was King Charles I, who ruled England in
the early part of the 17th century, and his wife was Queen Henrietta
Maria. Parliament refused to finance the king's war with Spain, which
made him lean. However, the queen fattened the coffers by levying an
illegal war tax. In an alternative version, the "Jack Sprat" nursery
rhyme is linked to King Richard and his brother John of the Robin Hood
legend. Jack Sprat was King John, the usurper who tried to take over the
crown when King Richard went off to fight in the Crusades in the 12th
century. When King Richard was captured, John had to raise a ransom to
rescue him, leaving the country lean. The wife was Joan, daughter of the
Earl of Gloucester, the greedy wife of King John. However, after King
Richard died and John became king, he had his marriage with Joan
annulled.
model-index:
- name: SentenceTransformer based on microsoft/deberta-v3-small
results:
- task:
type: semantic-similarity
name: Semantic Similarity
dataset:
name: sts test
type: sts-test
metrics:
- type: pearson_cosine
value: 0.7673854808079448
name: Pearson Cosine
- type: spearman_cosine
value: 0.7776198286738142
name: Spearman Cosine
- type: pearson_manhattan
value: 0.782368447545155
name: Pearson Manhattan
- type: spearman_manhattan
value: 0.7720687033298573
name: Spearman Manhattan
- type: pearson_euclidean
value: 0.7882638792170585
name: Pearson Euclidean
- type: spearman_euclidean
value: 0.7775073687564514
name: Spearman Euclidean
- type: pearson_dot
value: 0.7669147371310585
name: Pearson Dot
- type: spearman_dot
value: 0.7762894632049069
name: Spearman Dot
- type: pearson_max
value: 0.7882638792170585
name: Pearson Max
- type: spearman_max
value: 0.7776198286738142
name: Spearman Max
- task:
type: binary-classification
name: Binary Classification
dataset:
name: allNLI dev
type: allNLI-dev
metrics:
- type: cosine_accuracy
value: 0.708984375
name: Cosine Accuracy
- type: cosine_accuracy_threshold
value: 0.8714957237243652
name: Cosine Accuracy Threshold
- type: cosine_f1
value: 0.5913043478260869
name: Cosine F1
- type: cosine_f1_threshold
value: 0.7768557071685791
name: Cosine F1 Threshold
- type: cosine_precision
value: 0.4738675958188153
name: Cosine Precision
- type: cosine_recall
value: 0.7861271676300579
name: Cosine Recall
- type: cosine_ap
value: 0.5644305887001508
name: Cosine Ap
- type: dot_accuracy
value: 0.7109375
name: Dot Accuracy
- type: dot_accuracy_threshold
value: 674.426025390625
name: Dot Accuracy Threshold
- type: dot_f1
value: 0.5913043478260869
name: Dot F1
- type: dot_f1_threshold
value: 603.435302734375
name: Dot F1 Threshold
- type: dot_precision
value: 0.4738675958188153
name: Dot Precision
- type: dot_recall
value: 0.7861271676300579
name: Dot Recall
- type: dot_ap
value: 0.5664868031504724
name: Dot Ap
- type: manhattan_accuracy
value: 0.7109375
name: Manhattan Accuracy
- type: manhattan_accuracy_threshold
value: 294.4728088378906
name: Manhattan Accuracy Threshold
- type: manhattan_f1
value: 0.5935483870967742
name: Manhattan F1
- type: manhattan_f1_threshold
value: 401.1482849121094
name: Manhattan F1 Threshold
- type: manhattan_precision
value: 0.4726027397260274
name: Manhattan Precision
- type: manhattan_recall
value: 0.7976878612716763
name: Manhattan Recall
- type: manhattan_ap
value: 0.5642688421649988
name: Manhattan Ap
- type: euclidean_accuracy
value: 0.7109375
name: Euclidean Accuracy
- type: euclidean_accuracy_threshold
value: 14.565500259399414
name: Euclidean Accuracy Threshold
- type: euclidean_f1
value: 0.5913043478260869
name: Euclidean F1
- type: euclidean_f1_threshold
value: 18.60409164428711
name: Euclidean F1 Threshold
- type: euclidean_precision
value: 0.4738675958188153
name: Euclidean Precision
- type: euclidean_recall
value: 0.7861271676300579
name: Euclidean Recall
- type: euclidean_ap
value: 0.5645557227019772
name: Euclidean Ap
- type: max_accuracy
value: 0.7109375
name: Max Accuracy
- type: max_accuracy_threshold
value: 674.426025390625
name: Max Accuracy Threshold
- type: max_f1
value: 0.5935483870967742
name: Max F1
- type: max_f1_threshold
value: 603.435302734375
name: Max F1 Threshold
- type: max_precision
value: 0.4738675958188153
name: Max Precision
- type: max_recall
value: 0.7976878612716763
name: Max Recall
- type: max_ap
value: 0.5664868031504724
name: Max Ap
- task:
type: binary-classification
name: Binary Classification
dataset:
name: Qnli dev
type: Qnli-dev
metrics:
- type: cosine_accuracy
value: 0.6796875
name: Cosine Accuracy
- type: cosine_accuracy_threshold
value: 0.7726649045944214
name: Cosine Accuracy Threshold
- type: cosine_f1
value: 0.6925675675675677
name: Cosine F1
- type: cosine_f1_threshold
value: 0.7317887544631958
name: Cosine F1 Threshold
- type: cosine_precision
value: 0.5758426966292135
name: Cosine Precision
- type: cosine_recall
value: 0.8686440677966102
name: Cosine Recall
- type: cosine_ap
value: 0.7302564198016936
name: Cosine Ap
- type: dot_accuracy
value: 0.67578125
name: Dot Accuracy
- type: dot_accuracy_threshold
value: 598.0419921875
name: Dot Accuracy Threshold
- type: dot_f1
value: 0.6912751677852348
name: Dot F1
- type: dot_f1_threshold
value: 565.4718017578125
name: Dot F1 Threshold
- type: dot_precision
value: 0.5722222222222222
name: Dot Precision
- type: dot_recall
value: 0.8728813559322034
name: Dot Recall
- type: dot_ap
value: 0.7300462025003271
name: Dot Ap
- type: manhattan_accuracy
value: 0.6796875
name: Manhattan Accuracy
- type: manhattan_accuracy_threshold
value: 404.8309020996094
name: Manhattan Accuracy Threshold
- type: manhattan_f1
value: 0.6933333333333332
name: Manhattan F1
- type: manhattan_f1_threshold
value: 444.99224853515625
name: Manhattan F1 Threshold
- type: manhattan_precision
value: 0.5714285714285714
name: Manhattan Precision
- type: manhattan_recall
value: 0.8813559322033898
name: Manhattan Recall
- type: manhattan_ap
value: 0.7369214156436785
name: Manhattan Ap
- type: euclidean_accuracy
value: 0.6796875
name: Euclidean Accuracy
- type: euclidean_accuracy_threshold
value: 18.790739059448242
name: Euclidean Accuracy Threshold
- type: euclidean_f1
value: 0.6934306569343065
name: Euclidean F1
- type: euclidean_f1_threshold
value: 19.35132598876953
name: Euclidean F1 Threshold
- type: euclidean_precision
value: 0.6089743589743589
name: Euclidean Precision
- type: euclidean_recall
value: 0.8050847457627118
name: Euclidean Recall
- type: euclidean_ap
value: 0.7307381840067684
name: Euclidean Ap
- type: max_accuracy
value: 0.6796875
name: Max Accuracy
- type: max_accuracy_threshold
value: 598.0419921875
name: Max Accuracy Threshold
- type: max_f1
value: 0.6934306569343065
name: Max F1
- type: max_f1_threshold
value: 565.4718017578125
name: Max F1 Threshold
- type: max_precision
value: 0.6089743589743589
name: Max Precision
- type: max_recall
value: 0.8813559322033898
name: Max Recall
- type: max_ap
value: 0.7369214156436785
name: Max Ap
SentenceTransformer based on microsoft/deberta-v3-small
This is a sentence-transformers model fine-tuned from microsoft/deberta-v3-small. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: microsoft/deberta-v3-small
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Language: en
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DebertaV2Model
(1): AdvancedWeightedPooling(
(linear_cls): Linear(in_features=768, out_features=768, bias=True)
(linear_mean): Linear(in_features=768, out_features=768, bias=True)
(mha): MultiheadAttention(
(out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
)
(layernorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
(layernorm2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
(layernorm_cls): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
(layernorm_mean): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
)
)
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("bobox/DeBERTa3-s-CustomPoolin-v3-step1")
# Run inference
sentences = [
'What was the name of eleven rulers of the 19th and 20th Egyptian dynasties?',
'List of Rulers of Ancient Egypt and Nubia | Lists of Rulers | Heilbrunn Timeline of Art History | The Metropolitan Museum of Art The Metropolitan Museum of Art List of Rulers of Ancient Egypt and Nubia See works of art 30.8.234 52.127.4 Our knowledge of the succession of Egyptian kings is based on kinglists kept by the ancient Egyptians themselves. The most famous are the Palermo Stone, which covers the period from the earliest dynasties to the middle of Dynasty 5; the Abydos Kinglist, which Seti I had carved on his temple at Abydos; and the Turin Canon, a papyrus that covers the period from the earliest dynasties to the reign of Ramesses II. All are incomplete or fragmentary. We also rely on the History of Egypt written by Manetho in the third century B.C. A priest in the temple at Heliopolis, Manetho had access to many original sources and it was he who divided the kings into the thirty dynasties we use today. It is to this structure of dynasties and listed kings that we now attempt to link an absolute chronology of dates in terms of our own calendrical system. The process is made difficult by the fragmentary condition of the kinglists and by differences in the calendrical years used at various times. Some astronomical observations from the ancient Egyptians have survived, allowing us to calculate absolute dates within a margin of error. Synchronisms with the other civilizations of the ancient world are also of limited use.',
'What is the "Jack Sprat" nursery rhyme? | Reference.com What is the "Jack Sprat" nursery rhyme? A: Quick Answer "Jack Sprat" is a traditional English nursery rhyme whose main verse says, "Jack Sprat could eat no fat. His wife could eat no lean. And so between them both, you see, they licked the platter clean." Though it was likely sung by children long before, "Jack Sprat" was first published around 1765 in the compilation "Mother Goose\'s Melody." Full Answer According to Rhymes.org, a U.K. website devoted to nursery rhyme lyrics and origins, the "Jack Sprat" nursery rhyme has its origins in British history. In one interpretation, Jack Sprat was King Charles I, who ruled England in the early part of the 17th century, and his wife was Queen Henrietta Maria. Parliament refused to finance the king\'s war with Spain, which made him lean. However, the queen fattened the coffers by levying an illegal war tax. In an alternative version, the "Jack Sprat" nursery rhyme is linked to King Richard and his brother John of the Robin Hood legend. Jack Sprat was King John, the usurper who tried to take over the crown when King Richard went off to fight in the Crusades in the 12th century. When King Richard was captured, John had to raise a ransom to rescue him, leaving the country lean. The wife was Joan, daughter of the Earl of Gloucester, the greedy wife of King John. However, after King Richard died and John became king, he had his marriage with Joan annulled.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
Evaluation
Metrics
Semantic Similarity
- Dataset: sts-test
- Evaluated with EmbeddingSimilarityEvaluator
Metric | Value |
---|---|
pearson_cosine | 0.7674 |
spearman_cosine | 0.7776 |
pearson_manhattan | 0.7824 |
spearman_manhattan | 0.7721 |
pearson_euclidean | 0.7883 |
spearman_euclidean | 0.7775 |
pearson_dot | 0.7669 |
spearman_dot | 0.7763 |
pearson_max | 0.7883 |
spearman_max | 0.7776 |
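The correlation metrics in this table compare model similarity scores against human-annotated gold scores. The following is a minimal pure-Python sketch of how `pearson_cosine` and `spearman_cosine` can be computed from paired embeddings. It is an illustration, not the evaluator's actual implementation, and the helper names (`cosine`, `pearson`, `ranks`, `spearman`, `sts_scores`) are hypothetical. The `*_max` rows appear to be the maximum over the four similarity functions, which is consistent with `pearson_max` equaling `pearson_euclidean` above.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


def sts_scores(emb_a, emb_b, gold):
    """Score sentence pairs by cosine similarity and correlate with gold labels."""
    sims = [cosine(u, v) for u, v in zip(emb_a, emb_b)]
    return {
        "pearson_cosine": pearson(sims, gold),
        "spearman_cosine": spearman(sims, gold),
    }
```

In practice `emb_a` and `emb_b` would be the outputs of `model.encode` for the two sides of each sentence pair. Spearman depends only on the ranking of the pairs, so it is the less sensitive of the two to the absolute scale of the similarity scores.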
Binary Classification
- Dataset: allNLI-dev
- Evaluated with BinaryClassificationEvaluator
Metric | Value |
---|---|
cosine_accuracy | 0.709 |
cosine_accuracy_threshold | 0.8715 |
cosine_f1 | 0.5913 |
cosine_f1_threshold | 0.7769 |
cosine_precision | 0.4739 |
cosine_recall | 0.7861 |
cosine_ap | 0.5644 |
dot_accuracy | 0.7109 |
dot_accuracy_threshold | 674.426 |
dot_f1 | 0.5913 |
dot_f1_threshold | 603.4353 |
dot_precision | 0.4739 |
dot_recall | 0.7861 |
dot_ap | 0.5665 |
manhattan_accuracy | 0.7109 |
manhattan_accuracy_threshold | 294.4728 |
manhattan_f1 | 0.5935 |
manhattan_f1_threshold | 401.1483 |
manhattan_precision | 0.4726 |
manhattan_recall | 0.7977 |
manhattan_ap | 0.5643 |
euclidean_accuracy | 0.7109 |
euclidean_accuracy_threshold | 14.5655 |
euclidean_f1 | 0.5913 |
euclidean_f1_threshold | 18.6041 |
euclidean_precision | 0.4739 |
euclidean_recall | 0.7861 |
euclidean_ap | 0.5646 |
max_accuracy | 0.7109 |
max_accuracy_threshold | 674.426 |
max_f1 | 0.5935 |
max_f1_threshold | 603.4353 |
max_precision | 0.4739 |
max_recall | 0.7977 |
max_ap | 0.5665 |
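Each `*_accuracy_threshold` and `*_f1_threshold` row reports the decision threshold that maximizes the corresponding metric, which is why the two thresholds differ for the same similarity function. The sketch below shows one way to find such thresholds by sweeping over the sorted scores. It assumes higher scores mean "more similar" (as with cosine and dot product; distance metrics would be swept in the opposite direction), and the function names are hypothetical rather than taken from the library.

```python
def best_accuracy_threshold(scores, labels):
    """Return (accuracy, threshold) for the cut that maximizes accuracy.

    Pairs scoring above the threshold are predicted positive (label 1).
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    total_neg = sum(1 for _, y in pairs if y == 0)
    best_acc, best_thr = 0.0, None
    pos_seen = neg_seen = 0
    for i in range(len(pairs) - 1):
        if pairs[i][1] == 1:
            pos_seen += 1
        else:
            neg_seen += 1
        # Everything up to index i is predicted positive, the rest negative.
        correct = pos_seen + (total_neg - neg_seen)
        acc = correct / len(pairs)
        if acc > best_acc:
            best_acc = acc
            best_thr = (pairs[i][0] + pairs[i + 1][0]) / 2
    return best_acc, best_thr


def best_f1_threshold(scores, labels):
    """Return (f1, precision, recall, threshold) for the cut that maximizes F1."""
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    total_pos = sum(1 for _, y in pairs if y == 1)
    best = (0.0, 0.0, 0.0, None)
    true_pos = 0
    for i in range(len(pairs) - 1):
        if pairs[i][1] == 1:
            true_pos += 1
        if true_pos == 0:
            continue
        precision = true_pos / (i + 1)
        recall = true_pos / total_pos
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best[0]:
            best = (f1, precision, recall, (pairs[i][0] + pairs[i + 1][0]) / 2)
    return best
```

On a toy input, the F1-optimal threshold lands lower than the accuracy-optimal one, trading precision for recall. The table above shows the same pattern: the cosine F1 threshold (0.7769) is below the cosine accuracy threshold (0.8715), with recall (0.7861) well above precision (0.4739).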
Binary Classification
- Dataset: Qnli-dev
- Evaluated with BinaryClassificationEvaluator
Metric | Value |
---|---|
cosine_accuracy | 0.6797 |
cosine_accuracy_threshold | 0.7727 |
cosine_f1 | 0.6926 |
cosine_f1_threshold | 0.7318 |
cosine_precision | 0.5758 |
cosine_recall | 0.8686 |
cosine_ap | 0.7303 |
dot_accuracy | 0.6758 |
dot_accuracy_threshold | 598.042 |
dot_f1 | 0.6913 |
dot_f1_threshold | 565.4718 |
dot_precision | 0.5722 |
dot_recall | 0.8729 |
dot_ap | 0.73 |
manhattan_accuracy | 0.6797 |
manhattan_accuracy_threshold | 404.8309 |
manhattan_f1 | 0.6933 |
manhattan_f1_threshold | 444.9922 |
manhattan_precision | 0.5714 |
manhattan_recall | 0.8814 |
manhattan_ap | 0.7369 |
euclidean_accuracy | 0.6797 |
euclidean_accuracy_threshold | 18.7907 |
euclidean_f1 | 0.6934 |
euclidean_f1_threshold | 19.3513 |
euclidean_precision | 0.609 |
euclidean_recall | 0.8051 |
euclidean_ap | 0.7307 |
max_accuracy | 0.6797 |
max_accuracy_threshold | 598.042 |
max_f1 | 0.6934 |
max_f1_threshold | 565.4718 |
max_precision | 0.609 |
max_recall | 0.8814 |
max_ap | 0.7369 |
Training Details
Evaluation Dataset
vitaminc-pairs
- Dataset: vitaminc-pairs at be6febb
- Size: 128 evaluation samples
- Columns: claim and evidence
- Approximate statistics based on the first 128 samples:
Property | claim | evidence |
---|---|---|
type | string | string |
details | min: 9 tokens, mean: 21.42 tokens, max: 41 tokens | min: 11 tokens, mean: 35.55 tokens, max: 79 tokens |
- Samples:
claim | evidence |
---|---|
Dragon Con had over 5000 guests . | Among the more than 6000 guests and musical performers at the 2009 convention were such notables as Patrick Stewart , William Shatner , Leonard Nimoy , Terry Gilliam , Bruce Boxleitner , James Marsters , and Mary McDonnell . |
COVID-19 has reached more than 185 countries . | As of , more than cases of COVID-19 have been reported in more than 190 countries and 200 territories , resulting in more than deaths . |
In March , Italy had 3.6x times more cases of coronavirus than China . | As of 12 March , among nations with at least one million citizens , Italy has the world 's highest per capita rate of positive coronavirus cases at 206.1 cases per million people ( 3.6x times the rate of China ) and is the country with the second-highest number of positive cases as well as of deaths in the world , after China . |
- Loss: CachedGISTEmbedLoss with these parameters:
{'guide': SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
), 'temperature': 0.025}
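The loss configuration above pairs the trained model with a frozen guide model and a temperature of 0.025. The following is a simplified pure-Python sketch of the guided in-batch negative idea behind GISTEmbed-style losses, as I understand it: a candidate negative is dropped when the guide rates it as more similar to the anchor than the labelled positive, since it is then likely a false negative. The function name and the restriction to anchor-positive similarities are simplifying assumptions. The library implementation also compares anchor-anchor and positive-positive pairs, and the "Cached" variant additionally processes the batch in smaller chunks to save memory.

```python
import math


def guided_in_batch_loss(student_sim, guide_sim, temperature=0.025):
    """Mean cross-entropy over anchors with guide-based negative filtering.

    student_sim[i][j] and guide_sim[i][j] hold the cosine similarity between
    anchor i and positive j, from the trained model and the guide respectively.
    The labelled positive for anchor i is candidate i.
    """
    n = len(student_sim)
    losses = []
    for i in range(n):
        logits = []
        for j in range(n):
            # Skip in-batch negatives the guide scores above the true positive.
            if j != i and guide_sim[i][j] > guide_sim[i][i]:
                continue
            logits.append((j == i, student_sim[i][j] / temperature))
        # Numerically stable log-sum-exp over the remaining candidates.
        peak = max(value for _, value in logits)
        log_z = peak + math.log(sum(math.exp(value - peak) for _, value in logits))
        positive_logit = next(value for is_pos, value in logits if is_pos)
        losses.append(log_z - positive_logit)
    return sum(losses) / n
```

A small temperature such as 0.025 sharpens the softmax, so even modest similarity gaps between the positive and the remaining negatives produce strong gradients.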
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 100
- per_device_eval_batch_size: 256
- gradient_accumulation_steps: 2
- lr_scheduler_type: cosine_with_min_lr
- lr_scheduler_kwargs: {'num_cycles': 0.5, 'min_lr': 1.6666666666666667e-05}
- warmup_ratio: 0.33
- save_safetensors: False
- fp16: True
- push_to_hub: True
- hub_model_id: bobox/DeBERTa3-s-CustomPoolin-v3-step1-checkpoints-tmp
- hub_strategy: all_checkpoints
- batch_sampler: no_duplicates
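Taken together, these settings give an effective batch of 100 × 2 = 200 samples per device per optimizer step, and a learning rate that warms up linearly over the first 33% of training before following a half-cosine decay to a floor of one third of the peak (1.6667e-05 against a peak of 5e-05). The sketch below reproduces that shape in plain Python. It is an approximation of the `cosine_with_min_lr` schedule written from its documented behaviour, not the library code, and `total_steps` is a hypothetical input that depends on the dataset size and epoch count.

```python
import math


def lr_at_step(step, total_steps, base_lr=5e-05, min_lr=1.6666666666666667e-05,
               warmup_ratio=0.33, num_cycles=0.5):
    """Learning rate at an optimizer step: linear warmup, then cosine decay to min_lr."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # With num_cycles=0.5 the cosine factor falls from 1 to 0 exactly once.
    cosine_factor = 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress))
    min_ratio = min_lr / base_lr
    return base_lr * (cosine_factor * (1.0 - min_ratio) + min_ratio)


if __name__ == "__main__":
    total = 1000  # hypothetical run length, for illustration only
    for step in (0, 165, 330, 665, 1000):
        print(step, f"{lr_at_step(step, total):.3e}")
```

The long warmup means the model spends roughly the first third of training below the peak learning rate.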
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 100
- per_device_eval_batch_size: 256
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 2
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: cosine_with_min_lr
- lr_scheduler_kwargs: {'num_cycles': 0.5, 'min_lr': 1.6666666666666667e-05}
- warmup_ratio: 0.33
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: False
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: True
- resume_from_checkpoint: None
- hub_model_id: bobox/DeBERTa3-s-CustomPoolin-v3-step1-checkpoints-tmp
- hub_strategy: all_checkpoints
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | vitaminc-pairs loss | negation-triplets loss | scitail-pairs-pos loss | scitail-pairs-qa loss | xsum-pairs loss | sciq pairs loss | qasc pairs loss | openbookqa pairs loss | msmarco pairs loss | nq pairs loss | trivia pairs loss | gooaq pairs loss | paws-pos loss | global dataset loss | sts-test_spearman_cosine | allNLI-dev_max_ap | Qnli-dev_max_ap |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0.0168 | 8 | 10.2928 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.0336 | 16 | 9.2166 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.0504 | 24 | 9.4858 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.0672 | 32 | 10.6143 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.0840 | 40 | 8.7553 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.1008 | 48 | 10.9939 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.1176 | 56 | 7.6039 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.1345 | 64 | 5.9498 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.1513 | 72 | 7.3051 | 3.2988 | 3.9604 | 1.9818 | 2.1997 | 6.0515 | 0.6095 | 6.3199 | 4.8391 | 6.4886 | 6.6406 | 6.4894 | 6.1527 | 2.0082 | 4.9577 | 0.3066 | 0.3444 | 0.5627 |
0.1681 | 80 | 8.3034 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.1849 | 88 | 7.6669 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.2017 | 96 | 6.6415 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.2185 | 104 | 5.7797 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.2353 | 112 | 5.8361 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.2521 | 120 | 5.3339 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.2689 | 128 | 5.5908 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.2857 | 136 | 5.3209 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.3025 | 144 | 5.5359 | 3.3310 | 3.8580 | 1.4769 | 1.6994 | 5.4819 | 0.5385 | 5.2021 | 4.4410 | 5.3419 | 5.5506 | 5.6972 | 5.3376 | 1.4170 | 3.9169 | 0.2954 | 0.3795 | 0.6317 |
0.3193 | 152 | 5.4713 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.3361 | 160 | 4.9368 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.3529 | 168 | 4.6594 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.3697 | 176 | 4.8392 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.3866 | 184 | 4.414 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.4034 | 192 | 4.891 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.4202 | 200 | 4.4553 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.4370 | 208 | 3.9729 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.4538 | 216 | 3.7705 | 3.2468 | 3.6435 | 0.7890 | 0.7356 | 3.9327 | 0.4082 | 3.7175 | 3.5404 | 3.5351 | 4.0506 | 3.9953 | 3.6074 | 0.4195 | 2.4726 | 0.3791 | 0.4133 | 0.6779 |
0.4706 | 224 | 3.8409 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.4874 | 232 | 3.7894 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.5042 | 240 | 3.3523 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.5210 | 248 | 3.2407 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.5378 | 256 | 3.3203 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.5546 | 264 | 2.8457 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.5714 | 272 | 2.4181 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.5882 | 280 | 3.4589 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.6050 | 288 | 2.8203 | 3.1119 | 3.1485 | 0.4531 | 0.2652 | 2.6895 | 0.2656 | 2.5542 | 2.7523 | 2.6600 | 3.1773 | 3.2099 | 2.7316 | 0.2006 | 1.6342 | 0.5257 | 0.4717 | 0.7078 |
0.6218 | 296 | 2.4697 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.6387 | 304 | 2.4654 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.6555 | 312 | 2.4236 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.6723 | 320 | 2.2879 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.6891 | 328 | 2.2145 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.7059 | 336 | 1.8464 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.7227 | 344 | 2.0086 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.7395 | 352 | 2.0635 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.7563 | 360 | 1.8584 | 3.3202 | 2.5793 | 0.3434 | 0.1618 | 1.6759 | 0.1834 | 1.6454 | 2.1257 | 2.1938 | 2.5316 | 2.4558 | 2.0596 | 0.0984 | 1.2206 | 0.6610 | 0.5199 | 0.7119 |
0.7731 | 368 | 2.0286 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.7899 | 376 | 1.9389 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.8067 | 384 | 1.7453 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.8235 | 392 | 1.6629 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.8403 | 400 | 1.2724 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.8571 | 408 | 1.7824 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.8739 | 416 | 1.5826 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.8908 | 424 | 1.1971 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.9076 | 432 | 1.5228 | 3.3624 | 2.1952 | 0.3006 | 0.1223 | 1.1091 | 0.1582 | 1.2383 | 1.8664 | 1.7434 | 2.3959 | 2.0697 | 1.7563 | 0.0766 | 1.0193 | 0.7292 | 0.5194 | 0.7126 |
0.9244 | 440 | 1.3323 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.9412 | 448 | 1.5124 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.9580 | 456 | 1.5565 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.9748 | 464 | 1.3672 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
0.9916 | 472 | 1.0382 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.0084 | 480 | 1.0626 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.0252 | 488 | 1.3539 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.0420 | 496 | 1.1723 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.0588 | 504 | 1.4235 | 3.4031 | 1.9759 | 0.2554 | 0.0814 | 0.9034 | 0.1378 | 1.1603 | 1.7589 | 1.5608 | 2.1230 | 1.7719 | 1.6633 | 0.0720 | 0.9380 | 0.7523 | 0.5297 | 0.7129 |
1.0756 | 512 | 1.2283 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.0924 | 520 | 1.2455 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.1092 | 528 | 1.4265 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.1261 | 536 | 1.296 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.1429 | 544 | 0.8763 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.1597 | 552 | 1.5678 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.1765 | 560 | 1.2548 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.1933 | 568 | 1.3731 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.2101 | 576 | 1.3023 | 3.3815 | 1.8740 | 0.2373 | 0.0769 | 0.7711 | 0.1237 | 0.9432 | 1.6871 | 1.5070 | 1.9947 | 1.6041 | 1.5579 | 0.0721 | 0.8661 | 0.7642 | 0.5412 | 0.7159 |
1.2269 | 584 | 0.8135 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.2437 | 592 | 1.0259 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.2605 | 600 | 1.1896 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.2773 | 608 | 1.0532 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.2941 | 616 | 1.3221 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.3109 | 624 | 1.3136 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.3277 | 632 | 1.2238 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.3445 | 640 | 1.2407 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.3613 | 648 | 1.2245 | 3.4717 | 1.7962 | 0.2242 | 0.0488 | 0.7472 | 0.1108 | 0.9272 | 1.6692 | 1.3845 | 1.9117 | 1.3410 | 1.4387 | 0.0701 | 0.8505 | 0.7680 | 0.5471 | 0.7227 |
1.3782 | 656 | 1.0428 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.3950 | 664 | 1.1391 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.4118 | 672 | 1.2632 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.4286 | 680 | 0.9403 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.4454 | 688 | 0.7571 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.4622 | 696 | 0.9436 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.4790 | 704 | 1.1239 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.4958 | 712 | 0.9499 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.5126 | 720 | 1.0945 | 3.6495 | 1.6693 | 0.2157 | 0.0492 | 0.6830 | 0.1049 | 0.9140 | 1.5967 | 1.4397 | 1.7394 | 1.3303 | 1.4334 | 0.0603 | 0.8185 | 0.7815 | 0.5606 | 0.7098 |
1.5294 | 728 | 1.1161 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.5462 | 736 | 1.0056 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.5630 | 744 | 1.1743 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.5798 | 752 | 0.9153 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.5966 | 760 | 1.1589 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.6134 | 768 | 0.9187 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.6303 | 776 | 0.6937 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.6471 | 784 | 0.9704 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.6639 | 792 | 0.7343 | 3.5442 | 1.6493 | 0.2208 | 0.0249 | 0.6152 | 0.0969 | 0.7111 | 1.5369 | 1.4058 | 1.7066 | 1.2784 | 1.3419 | 0.0585 | 0.7827 | 0.7749 | 0.5627 | 0.7284 |
1.6807 | 800 | 1.2878 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.6975 | 808 | 0.9898 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.7143 | 816 | 0.7613 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.7311 | 824 | 0.9612 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.7479 | 832 | 1.1524 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.7647 | 840 | 0.827 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.7815 | 848 | 1.1898 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.7983 | 856 | 1.0117 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.8151 | 864 | 0.7019 | 3.4544 | 1.6149 | 0.2035 | 0.0181 | 0.5525 | 0.0999 | 0.6641 | 1.5456 | 1.3911 | 1.7188 | 1.2547 | 1.3517 | 0.0562 | 0.7473 | 0.7684 | 0.5697 | 0.7329 |
1.8319 | 872 | 0.8352 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.8487 | 880 | 0.7836 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.8655 | 888 | 1.0187 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.8824 | 896 | 0.74 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.8992 | 904 | 0.7263 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.9160 | 912 | 0.8073 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.9328 | 920 | 0.8185 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.9496 | 928 | 1.0992 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
1.9664 | 936 | 0.9973 | 3.5110 | 1.5776 | 0.2035 | 0.0250 | 0.5881 | 0.0934 | 0.6719 | 1.5059 | 1.2970 | 1.6186 | 1.1815 | 1.2714 | 0.0564 | 0.7213 | 0.7799 | 0.5544 | 0.7341 |
1.9832 | 944 | 0.6662 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.0 | 952 | 0.533 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.0168 | 960 | 0.7712 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.0336 | 968 | 0.6879 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.0504 | 976 | 0.7975 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.0672 | 984 | 0.873 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.0840 | 992 | 0.7995 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.1008 | 1000 | 1.0119 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.1176 | 1008 | 0.6317 | 3.6778 | 1.5845 | 0.2102 | 0.0228 | 0.5851 | 0.0977 | 0.6411 | 1.4752 | 1.2992 | 1.6314 | 1.1260 | 1.2683 | 0.0556 | 0.7329 | 0.7693 | 0.5614 | 0.7274 |
2.1345 | 1016 | 0.72 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.1513 | 1024 | 0.9418 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.1681 | 1032 | 0.7848 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.1849 | 1040 | 0.6965 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.2017 | 1048 | 1.0447 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.2185 | 1056 | 0.6361 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.2353 | 1064 | 0.6837 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.2521 | 1072 | 0.5713 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.2689 | 1080 | 0.8193 | 3.6399 | 1.5565 | 0.2069 | 0.0213 | 0.5440 | 0.0904 | 0.6057 | 1.4815 | 1.2856 | 1.6441 | 1.1469 | 1.2540 | 0.0543 | 0.7216 | 0.7765 | 0.5599 | 0.7322 |
2.2857 | 1088 | 0.9754 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.3025 | 1096 | 0.8932 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.3193 | 1104 | 0.8716 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.3361 | 1112 | 0.8787 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.3529 | 1120 | 0.9529 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.3697 | 1128 | 0.775 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.3866 | 1136 | 0.6178 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.4034 | 1144 | 0.8384 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.4202 | 1152 | 0.9425 | 3.5672 | 1.5244 | 0.2111 | 0.0162 | 0.5593 | 0.0893 | 0.5759 | 1.4933 | 1.2703 | 1.5815 | 1.1202 | 1.2132 | 0.0531 | 0.7058 | 0.7730 | 0.5635 | 0.7350 |
2.4370 | 1160 | 0.4551 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.4538 | 1168 | 0.6392 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.4706 | 1176 | 0.8341 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.4874 | 1184 | 0.7392 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.5042 | 1192 | 0.7646 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.5210 | 1200 | 0.8613 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.5378 | 1208 | 0.7585 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.5546 | 1216 | 1.0611 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.5714 | 1224 | 0.6506 | 3.6439 | 1.5040 | 0.2125 | 0.0162 | 0.5282 | 0.0863 | 0.5858 | 1.5073 | 1.2444 | 1.5493 | 1.1014 | 1.2073 | 0.0532 | 0.7022 | 0.7774 | 0.5647 | 0.7328 |
2.5882 | 1232 | 0.8525 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.6050 | 1240 | 0.6304 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.6218 | 1248 | 0.6354 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.6387 | 1256 | 0.6583 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.6555 | 1264 | 0.5964 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.6723 | 1272 | 0.818 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.6891 | 1280 | 0.8635 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.7059 | 1288 | 0.6389 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.7227 | 1296 | 0.6819 | 3.6131 | 1.5104 | 0.2084 | 0.0148 | 0.5229 | 0.0854 | 0.5588 | 1.4963 | 1.2766 | 1.5679 | 1.0982 | 1.2203 | 0.0529 | 0.7059 | 0.7762 | 0.5659 | 0.7355 |
2.7395 | 1304 | 0.7878 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.7563 | 1312 | 0.7638 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.7731 | 1320 | 0.8885 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.7899 | 1328 | 0.8184 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.8067 | 1336 | 0.7472 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.8235 | 1344 | 0.7012 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.8403 | 1352 | 0.4622 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.8571 | 1360 | 0.846 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.8739 | 1368 | 0.8308 | 3.6224 | 1.5088 | 0.2084 | 0.0148 | 0.5118 | 0.0858 | 0.5523 | 1.4941 | 1.2756 | 1.5808 | 1.0925 | 1.2114 | 0.0521 | 0.7022 | 0.7765 | 0.5662 | 0.7366 |
2.8908 | 1376 | 0.5334 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.9076 | 1384 | 0.7893 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.9244 | 1392 | 0.6897 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.9412 | 1400 | 0.7803 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.9580 | 1408 | 0.841 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.9748 | 1416 | 0.787 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
2.9916 | 1424 | 0.5861 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
3.0 | 1428 | - | 3.6139 | 1.5071 | 0.2084 | 0.0150 | 0.5124 | 0.0862 | 0.5532 | 1.4924 | 1.2700 | 1.5806 | 1.0905 | 1.2081 | 0.0519 | 0.6997 | 0.7776 | 0.5665 | 0.7369 |
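Because evaluation runs only every 72 steps, most rows above carry `-` placeholders. A minimal sketch for turning the rows into numbers is given here. It assumes the first three columns are epoch, step, and training loss (the usual Sentence Transformers log layout) and treats every remaining column as an unnamed evaluation value. The function name `parse_log_row` is hypothetical and not part of any library.

```python
from typing import Optional


def _to_float(cell: str) -> Optional[float]:
    """Convert a table cell to float; '-' or empty means 'not logged'."""
    return None if cell in ("-", "") else float(cell)


def parse_log_row(row: str) -> dict:
    """Parse one pipe-delimited training-log row.

    Assumption: columns are epoch, step, training loss, then evaluation values.
    """
    # Drop optional leading/trailing pipes, then split into trimmed cells.
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    return {
        "epoch": float(cells[0]),
        "step": int(cells[1]),
        "training_loss": _to_float(cells[2]),
        "eval": [_to_float(c) for c in cells[3:]],
    }


def eval_rows(rows: list[str]) -> list[dict]:
    """Keep only the rows where at least one evaluation value was logged."""
    parsed = [parse_log_row(r) for r in rows]
    return [p for p in parsed if any(v is not None for v in p["eval"])]
```

Passing the table rows as a list of strings to `eval_rows` keeps only the evaluation checkpoints (steps 360, 432, 504, and so on), which makes it easier to compare validation values across training.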
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.2.0
- Transformers: 4.44.2
- PyTorch: 2.4.1+cu121
- Accelerate: 0.34.2
- Datasets: 3.0.1
- Tokenizers: 0.19.1
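To check a local environment against the versions listed above, something like the following sketch may help. It uses the standard-library `importlib.metadata` and assumes the PyPI distribution names shown in `EXPECTED` (for example `torch` for PyTorch). The helper name `compare_versions` is hypothetical, and an exact match is not necessarily required to load the model.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported in this model card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "sentence-transformers": "3.2.0",
    "transformers": "4.44.2",
    "torch": "2.4.1+cu121",
    "accelerate": "0.34.2",
    "datasets": "3.0.1",
    "tokenizers": "0.19.1",
}


def compare_versions(expected: dict) -> dict:
    """Return {package: (expected, installed_or_None, exact_match)}."""
    report = {}
    for name, want in expected.items():
        try:
            have = version(name)
        except PackageNotFoundError:
            have = None  # package is not installed in this environment
        report[name] = (want, have, have == want)
    return report


if __name__ == "__main__":
    for name, (want, have, ok) in compare_versions(EXPECTED).items():
        status = "ok" if ok else "differs"
        print(f"{name}: expected {want}, installed {have} ({status})")
```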
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}