bobox committed on Aug 25, 2024

Commit

6b3d1b1

verified ·

1 Parent(s): fe62935

Training in progress, step 110, checkpoint


Files changed (17)

checkpoint-110/1_Pooling/config.json +10 -0
checkpoint-110/README.md +964 -0
checkpoint-110/added_tokens.json +3 -0
checkpoint-110/config.json +35 -0
checkpoint-110/config_sentence_transformers.json +10 -0
checkpoint-110/modules.json +14 -0
checkpoint-110/optimizer.pt +3 -0
checkpoint-110/pytorch_model.bin +3 -0
checkpoint-110/rng_state.pth +3 -0
checkpoint-110/scheduler.pt +3 -0
checkpoint-110/sentence_bert_config.json +4 -0
checkpoint-110/special_tokens_map.json +15 -0
checkpoint-110/spm.model +3 -0
checkpoint-110/tokenizer.json +0 -0
checkpoint-110/tokenizer_config.json +58 -0
checkpoint-110/trainer_state.json +0 -0
checkpoint-110/training_args.bin +3 -0

checkpoint-110/1_Pooling/config.json ADDED

	@@ -0,0 +1,10 @@

+{
+  "word_embedding_dimension": 768,
+  "pooling_mode_cls_token": false,
+  "pooling_mode_mean_tokens": true,
+  "pooling_mode_max_tokens": false,
+  "pooling_mode_mean_sqrt_len_tokens": false,
+  "pooling_mode_weightedmean_tokens": false,
+  "pooling_mode_lasttoken": false,
+  "include_prompt": true
+}
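
As an illustration of the pooling settings above (not part of the committed file): with only `pooling_mode_mean_tokens` enabled, the sentence embedding is the average of the token vectors at non-padding positions. The sketch below shows that idea in plain Python. The function name `mean_pool` and the toy inputs are hypothetical, and the library's own implementation operates on tensors rather than lists.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors over positions where the mask is 1.

    token_embeddings: list of sentences, each a list of token vectors
    attention_mask:   list of sentences, each a list of 0/1 flags
    (Hypothetical helper for illustration, not the library's code.)
    """
    pooled = []
    for vectors, mask in zip(token_embeddings, attention_mask):
        dim = len(vectors[0])
        # Keep only real tokens; padding positions carry mask value 0.
        kept = [v for v, m in zip(vectors, mask) if m]
        count = max(len(kept), 1)  # guard against an all-padding row
        pooled.append([sum(v[d] for v in kept) / count for d in range(dim)])
    return pooled


# Toy batch: 2 sentences, 3 token slots, 2 dimensions (768 in the real model).
tokens = [
    [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],  # third slot is padding
    [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]],
]
mask = [[1, 1, 0], [1, 1, 1]]
pooled = mean_pool(tokens, mask)
print(pooled)
```

The padded slot in the first sentence is ignored, so its large values do not leak into the sentence embedding.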

checkpoint-110/README.md ADDED

	@@ -0,0 +1,964 @@

+---
+base_model: microsoft/deberta-v3-small
+datasets: []
+language: []
+library_name: sentence-transformers
+metrics:
+- pearson_cosine
+- spearman_cosine
+- pearson_manhattan
+- spearman_manhattan
+- pearson_euclidean
+- spearman_euclidean
+- pearson_dot
+- spearman_dot
+- pearson_max
+- spearman_max
+- cosine_accuracy
+- cosine_accuracy_threshold
+- cosine_f1
+- cosine_f1_threshold
+- cosine_precision
+- cosine_recall
+- cosine_ap
+- dot_accuracy
+- dot_accuracy_threshold
+- dot_f1
+- dot_f1_threshold
+- dot_precision
+- dot_recall
+- dot_ap
+- manhattan_accuracy
+- manhattan_accuracy_threshold
+- manhattan_f1
+- manhattan_f1_threshold
+- manhattan_precision
+- manhattan_recall
+- manhattan_ap
+- euclidean_accuracy
+- euclidean_accuracy_threshold
+- euclidean_f1
+- euclidean_f1_threshold
+- euclidean_precision
+- euclidean_recall
+- euclidean_ap
+- max_accuracy
+- max_accuracy_threshold
+- max_f1
+- max_f1_threshold
+- max_precision
+- max_recall
+- max_ap
+pipeline_tag: sentence-similarity
+tags:
+- sentence-transformers
+- sentence-similarity
+- feature-extraction
+- generated_from_trainer
+- dataset_size:116445
+- loss:CachedGISTEmbedLoss
+widget:
+- source_sentence: what is the main purpose of the brain
+  sentences:
+  - Brain Physiologically, the function of the brain is to exert centralized control
+    over the other organs of the body. The brain acts on the rest of the body both
+    by generating patterns of muscle activity and by driving the secretion of chemicals
+    called hormones. This centralized control allows rapid and coordinated responses
+    to changes in the environment. Some basic types of responsiveness such as reflexes
+    can be mediated by the spinal cord or peripheral ganglia, but sophisticated purposeful
+    control of behavior based on complex sensory input requires the information integrating
+    capabilities of a centralized brain.
+  - How do scientists know that some mountains were once at the bottom of an ocean?
+  - The Smiths Wiki | Fandom powered by Wikia Share Ad blocker interference detected!
+    Wikia is a free-to-use site that makes money from advertising. We have a modified
+    experience for viewers using ad blockers Wikia is not accessible if you’ve made
+    further modifications. Remove the custom ad blocker rule(s) and the page will
+    load as expected. The Smiths were an English rock band formed in Manchester in
+    1982. Based on the songwriting partnership of Morrissey (vocals) and Johnny Marr
+    (guitar), the band also included Andy Rourke (bass), Mike Joyce (drums) and for
+    a brief time Craig Gannon (rhythm guitar). Critics have called them one of the
+    most important alternative rock bands to emerge from the British independent music
+    scene of the 1980s,and the group has had major influence on subsequent artists.
+    Morrissey's lovelorn tales of alienation found an audience amongst youth culture
+    bored by the ubiquitous synthesiser-pop bands of the early 1980s, while Marr's
+    complex melodies helped return guitar-based music to popularity. The group were
+    signed to the independent record label Rough Trade Records , for whom they released
+    four studio albums and several compilations, as well as numerous non-LP singles.
+    Although they had limited commercial success outside the UK while they were still
+    together, and never released a single that charted higher than number 10 in their
+    home country, The Smiths won a growing following, and they remain cult and commercial
+    favourites. The band broke up in 1987 amid disagreements between Morrissey and
+    Marr and has turned down several offers to reform. Welcome to The Smiths Wiki
+- source_sentence: There were 29 Muslims fatalities in the Cave of the Patriarchs
+    massacre .
+  sentences:
+  - In August , after the end of the war in June 1902 , Higgins Southampton left the
+    `` SSBavarian '' and returned to Cape Town the following month .
+  - Between 29 and 52 Muslims were killed and more than 100 others wounded . [   Settlers
+    remember gunman Goldstein ; Hebron riots continue ] .
+  - 29 Muslims were killed and more than 100 others wounded . [   Settlers remember
+    gunman Goldstein ; Hebron riots continue ] .
+- source_sentence: are tabby cats all male?
+  sentences:
+  - Did you know orange tabby cats are typically male? In fact, up to 80 percent of
+    orange tabbies are male, making orange female cats a bit of a rarity. According
+    to the BBC's Focus Magazine, the ginger gene in cats works a little differently
+    compared to humans; it is on the X chromosome.
+  - Shawnee Trails Council was formed from the merger of the Four Rivers Council and
+    the Audubon Council .
+  - 'A picture of a modern looking kitchen area
+    '
+- source_sentence: Aamir Khan agreed to act immediately after reading Mehra 's screenplay
+    in `` Rang De Basanti '' .
+  sentences:
+  - Chris Rea —   Free listening, videos, concerts, stats and photos at Last.fm singer-songwriter
+    Christopher Anton Rea (pronounced Ree-ah), born 4 March 1951, is a singer, songwriter,
+    and guitarist from Middlesbrough, England. Rea's recording career began in 1978.
+    Although he almost immediately had a US hit single with "Fool (If You Think It's
+    Over)", Rea's initial focus was on continental Europe, releasing eight albums
+    in the 1980s. It wasn't until 1985's Shamrock Diaries and the songs "Stainsby
+    Girls" and "Josephine," that UK audiences began to take notice of him. Follow
+    up albums… read more
+  - "Healthy Fast Food Meal No. 1. Grilled Chicken Sandwich and Fruit Cup (Chick-fil-A)\
+    \ Several fast food chains offer a grilled chicken sandwich. The trick is ordering\
+    \ it without mayo or creamy sauce, and making sure it's served with a\
+    \ whole grain bun."
+  - Aamir Khan agreed to act in `` Rang De Basanti '' immediately after reading Mehra
+    's script .
+- source_sentence: 'A man wearing a blue bow tie and a fedora hat in a car. '
+  sentences:
+  - A man takes a photo of himself wearing a bowtie and hat
+  - Scientists explain the world based on what?
+  - 'County of Angus - definition of County of Angus by The Free Dictionary County
+    of Angus - definition of County of Angus by The Free Dictionary http://www.thefreedictionary.com/County+of+Angus
+     (ăng′gəs) n. Any of a breed of hornless beef cattle that originated in Scotland
+    and are usually black but also occur in a red variety. Also called Black Angus.
+    [After Angus, former county of Scotland.] Angus (ˈæŋɡəs) n (Placename) a council
+    area of E Scotland on the North Sea: the historical county of Angus became part
+    of Tayside region in 1975; reinstated as a unitary authority (excluding City of
+    Dundee) in 1996. Administrative centre: Forfar. Pop: 107 520 (2003 est). Area:
+    2181 sq km (842 sq miles) An•gus'
+model-index:
+- name: SentenceTransformer based on microsoft/deberta-v3-small
+  results:
+  - task:
+      type: semantic-similarity
+      name: Semantic Similarity
+    dataset:
+      name: sts test
+      type: sts-test
+    metrics:
+    - type: pearson_cosine
+      value: 0.7489263204555723
+      name: Pearson Cosine
+    - type: spearman_cosine
+      value: 0.7626005619606424
+      name: Spearman Cosine
+    - type: pearson_manhattan
+      value: 0.7591990025704353
+      name: Pearson Manhattan
+    - type: spearman_manhattan
+      value: 0.7477882076989188
+      name: Spearman Manhattan
+    - type: pearson_euclidean
+      value: 0.7622787611500085
+      name: Pearson Euclidean
+    - type: spearman_euclidean
+      value: 0.7539243664071233
+      name: Spearman Euclidean
+    - type: pearson_dot
+      value: 0.6493790443582248
+      name: Pearson Dot
+    - type: spearman_dot
+      value: 0.6306412644605037
+      name: Spearman Dot
+    - type: pearson_max
+      value: 0.7622787611500085
+      name: Pearson Max
+    - type: spearman_max
+      value: 0.7626005619606424
+      name: Spearman Max
+  - task:
+      type: binary-classification
+      name: Binary Classification
+    dataset:
+      name: allNLI dev
+      type: allNLI-dev
+    metrics:
+    - type: cosine_accuracy
+      value: 0.7109375
+      name: Cosine Accuracy
+    - type: cosine_accuracy_threshold
+      value: 0.916961669921875
+      name: Cosine Accuracy Threshold
+    - type: cosine_f1
+      value: 0.5853658536585366
+      name: Cosine F1
+    - type: cosine_f1_threshold
+      value: 0.8279993534088135
+      name: Cosine F1 Threshold
+    - type: cosine_precision
+      value: 0.4748201438848921
+      name: Cosine Precision
+    - type: cosine_recall
+      value: 0.7630057803468208
+      name: Cosine Recall
+    - type: cosine_ap
+      value: 0.5495769497490841
+      name: Cosine Ap
+    - type: dot_accuracy
+      value: 0.671875
+      name: Dot Accuracy
+    - type: dot_accuracy_threshold
+      value: 481.2850646972656
+      name: Dot Accuracy Threshold
+    - type: dot_f1
+      value: 0.549165120593692
+      name: Dot F1
+    - type: dot_f1_threshold
+      value: 381.15167236328125
+      name: Dot F1 Threshold
+    - type: dot_precision
+      value: 0.40437158469945356
+      name: Dot Precision
+    - type: dot_recall
+      value: 0.8554913294797688
+      name: Dot Recall
+    - type: dot_ap
+      value: 0.45293867777170244
+      name: Dot Ap
+    - type: manhattan_accuracy
+      value: 0.71484375
+      name: Manhattan Accuracy
+    - type: manhattan_accuracy_threshold
+      value: 186.7671356201172
+      name: Manhattan Accuracy Threshold
+    - type: manhattan_f1
+      value: 0.5696465696465696
+      name: Manhattan F1
+    - type: manhattan_f1_threshold
+      value: 268.783935546875
+      name: Manhattan F1 Threshold
+    - type: manhattan_precision
+      value: 0.4448051948051948
+      name: Manhattan Precision
+    - type: manhattan_recall
+      value: 0.791907514450867
+      name: Manhattan Recall
+    - type: manhattan_ap
+      value: 0.5511647333663136
+      name: Manhattan Ap
+    - type: euclidean_accuracy
+      value: 0.71484375
+      name: Euclidean Accuracy
+    - type: euclidean_accuracy_threshold
+      value: 8.915003776550293
+      name: Euclidean Accuracy Threshold
+    - type: euclidean_f1
+      value: 0.574074074074074
+      name: Euclidean F1
+    - type: euclidean_f1_threshold
+      value: 12.812746047973633
+      name: Euclidean F1 Threshold
+    - type: euclidean_precision
+      value: 0.47876447876447875
+      name: Euclidean Precision
+    - type: euclidean_recall
+      value: 0.7167630057803468
+      name: Euclidean Recall
+    - type: euclidean_ap
+      value: 0.5535962824434967
+      name: Euclidean Ap
+    - type: max_accuracy
+      value: 0.71484375
+      name: Max Accuracy
+    - type: max_accuracy_threshold
+      value: 481.2850646972656
+      name: Max Accuracy Threshold
+    - type: max_f1
+      value: 0.5853658536585366
+      name: Max F1
+    - type: max_f1_threshold
+      value: 381.15167236328125
+      name: Max F1 Threshold
+    - type: max_precision
+      value: 0.47876447876447875
+      name: Max Precision
+    - type: max_recall
+      value: 0.8554913294797688
+      name: Max Recall
+    - type: max_ap
+      value: 0.5535962824434967
+      name: Max Ap
+  - task:
+      type: binary-classification
+      name: Binary Classification
+    dataset:
+      name: Qnli dev
+      type: Qnli-dev
+    metrics:
+    - type: cosine_accuracy
+      value: 0.681640625
+      name: Cosine Accuracy
+    - type: cosine_accuracy_threshold
+      value: 0.8160840272903442
+      name: Cosine Accuracy Threshold
+    - type: cosine_f1
+      value: 0.6917562724014337
+      name: Cosine F1
+    - type: cosine_f1_threshold
+      value: 0.7854001522064209
+      name: Cosine F1 Threshold
+    - type: cosine_precision
+      value: 0.5993788819875776
+      name: Cosine Precision
+    - type: cosine_recall
+      value: 0.8177966101694916
+      name: Cosine Recall
+    - type: cosine_ap
+      value: 0.7109982147608755
+      name: Cosine Ap
+    - type: dot_accuracy
+      value: 0.6484375
+      name: Dot Accuracy
+    - type: dot_accuracy_threshold
+      value: 392.5464782714844
+      name: Dot Accuracy Threshold
+    - type: dot_f1
+      value: 0.6688311688311689
+      name: Dot F1
+    - type: dot_f1_threshold
+      value: 368.7878723144531
+      name: Dot F1 Threshold
+    - type: dot_precision
+      value: 0.5421052631578948
+      name: Dot Precision
+    - type: dot_recall
+      value: 0.8728813559322034
+      name: Dot Recall
+    - type: dot_ap
+      value: 0.6053421534358263
+      name: Dot Ap
+    - type: manhattan_accuracy
+      value: 0.685546875
+      name: Manhattan Accuracy
+    - type: manhattan_accuracy_threshold
+      value: 244.63809204101562
+      name: Manhattan Accuracy Threshold
+    - type: manhattan_f1
+      value: 0.6938053097345133
+      name: Manhattan F1
+    - type: manhattan_f1_threshold
+      value: 295.4796142578125
+      name: Manhattan F1 Threshold
+    - type: manhattan_precision
+      value: 0.5957446808510638
+      name: Manhattan Precision
+    - type: manhattan_recall
+      value: 0.8305084745762712
+      name: Manhattan Recall
+    - type: manhattan_ap
+      value: 0.7216536349653324
+      name: Manhattan Ap
+    - type: euclidean_accuracy
+      value: 0.6875
+      name: Euclidean Accuracy
+    - type: euclidean_accuracy_threshold
+      value: 13.026724815368652
+      name: Euclidean Accuracy Threshold
+    - type: euclidean_f1
+      value: 0.689407540394973
+      name: Euclidean F1
+    - type: euclidean_f1_threshold
+      value: 14.538017272949219
+      name: Euclidean F1 Threshold
+    - type: euclidean_precision
+      value: 0.5981308411214953
+      name: Euclidean Precision
+    - type: euclidean_recall
+      value: 0.8135593220338984
+      name: Euclidean Recall
+    - type: euclidean_ap
+      value: 0.7181091181717016
+      name: Euclidean Ap
+    - type: max_accuracy
+      value: 0.6875
+      name: Max Accuracy
+    - type: max_accuracy_threshold
+      value: 392.5464782714844
+      name: Max Accuracy Threshold
+    - type: max_f1
+      value: 0.6938053097345133
+      name: Max F1
+    - type: max_f1_threshold
+      value: 368.7878723144531
+      name: Max F1 Threshold
+    - type: max_precision
+      value: 0.5993788819875776
+      name: Max Precision
+    - type: max_recall
+      value: 0.8728813559322034
+      name: Max Recall
+    - type: max_ap
+      value: 0.7216536349653324
+      name: Max Ap
+---
+# SentenceTransformer based on microsoft/deberta-v3-small
+This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [microsoft/deberta-v3-small](https://huggingface.co/microsoft/deberta-v3-small) on the bobox/enhanced_nli-50_k dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+## Model Details
+### Model Description
+- **Model Type:** Sentence Transformer
+- **Base model:** [microsoft/deberta-v3-small](https://huggingface.co/microsoft/deberta-v3-small) <!-- at revision a36c739020e01763fe789b4b85e2df55d6180012 -->
+- **Maximum Sequence Length:** 512 tokens
+- **Output Dimensionality:** 768 dimensions
+- **Similarity Function:** Cosine Similarity
+- **Training Dataset:**
+    - bobox/enhanced_nli-50_k
+<!-- - **Language:** Unknown -->
+<!-- - **License:** Unknown -->
+### Model Sources
+- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+### Full Model Architecture
+```
+SentenceTransformer(
+  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DebertaV2Model
+  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+)
+```
+## Usage
+### Direct Usage (Sentence Transformers)
+First install the Sentence Transformers library:
+```bash
+pip install -U sentence-transformers
+```
+Then you can load this model and run inference.
+```python
+from sentence_transformers import SentenceTransformer
+# Download from the 🤗 Hub
+model = SentenceTransformer("bobox/DeBERTa-small-ST-UnifiedDatasets-baseline-checkpoints-tmp")
+# Run inference
+sentences = [
+    'A man wearing a blue bow tie and a fedora hat in a car. ',
+    'A man takes a photo of himself wearing a bowtie and hat',
+    'County of Angus - definition of County of Angus by The Free Dictionary County of Angus - definition of County of Angus by The Free Dictionary http://www.thefreedictionary.com/County+of+Angus \xa0(ăng′gəs) n. Any of a breed of hornless beef cattle that originated in Scotland and are usually black but also occur in a red variety. Also called Black Angus. [After Angus, former county of Scotland.] Angus (ˈæŋɡəs) n (Placename) a council area of E Scotland on the North Sea: the historical county of Angus became part of Tayside region in 1975; reinstated as a unitary authority (excluding City of Dundee) in 1996. Administrative centre: Forfar. Pop: 107 520 (2003 est). Area: 2181 sq km (842 sq miles) An•gus',
+]
+embeddings = model.encode(sentences)
+print(embeddings.shape)
+# (3, 768)
+# Get the similarity scores for the embeddings
+similarities = model.similarity(embeddings, embeddings)
+print(similarities.shape)
+# torch.Size([3, 3])
+```
+<!--
+### Direct Usage (Transformers)
+<details><summary>Click to see the direct usage in Transformers</summary>
+</details>
+-->
+<!--
+### Downstream Usage (Sentence Transformers)
+You can finetune this model on your own dataset.
+<details><summary>Click to expand</summary>
+</details>
+-->
+<!--
+### Out-of-Scope Use
+*List how the model may foreseeably be misused and address what users ought not to do with the model.*
+-->
+## Evaluation
+### Metrics
+#### Semantic Similarity
+* Dataset: `sts-test`
+* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
+| Metric              | Value      |
+|:--------------------|:-----------|
+| pearson_cosine      | 0.7489     |
+| **spearman_cosine** | **0.7626** |
+| pearson_manhattan   | 0.7592     |
+| spearman_manhattan  | 0.7478     |
+| pearson_euclidean   | 0.7623     |
+| spearman_euclidean  | 0.7539     |
+| pearson_dot         | 0.6494     |
+| spearman_dot        | 0.6306     |
+| pearson_max         | 0.7623     |
+| spearman_max        | 0.7626     |
+#### Binary Classification
+* Dataset: `allNLI-dev`
+* Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)
+| Metric                       | Value      |
+|:-----------------------------|:-----------|
+| cosine_accuracy              | 0.7109     |
+| cosine_accuracy_threshold    | 0.917      |
+| cosine_f1                    | 0.5854     |
+| cosine_f1_threshold          | 0.828      |
+| cosine_precision             | 0.4748     |
+| cosine_recall                | 0.763      |
+| cosine_ap                    | 0.5496     |
+| dot_accuracy                 | 0.6719     |
+| dot_accuracy_threshold       | 481.2851   |
+| dot_f1                       | 0.5492     |
+| dot_f1_threshold             | 381.1517   |
+| dot_precision                | 0.4044     |
+| dot_recall                   | 0.8555     |
+| dot_ap                       | 0.4529     |
+| manhattan_accuracy           | 0.7148     |
+| manhattan_accuracy_threshold | 186.7671   |
+| manhattan_f1                 | 0.5696     |
+| manhattan_f1_threshold       | 268.7839   |
+| manhattan_precision          | 0.4448     |
+| manhattan_recall             | 0.7919     |
+| manhattan_ap                 | 0.5512     |
+| euclidean_accuracy           | 0.7148     |
+| euclidean_accuracy_threshold | 8.915      |
+| euclidean_f1                 | 0.5741     |
+| euclidean_f1_threshold       | 12.8127    |
+| euclidean_precision          | 0.4788     |
+| euclidean_recall             | 0.7168     |
+| euclidean_ap                 | 0.5536     |
+| max_accuracy                 | 0.7148     |
+| max_accuracy_threshold       | 481.2851   |
+| max_f1                       | 0.5854     |
+| max_f1_threshold             | 381.1517   |
+| max_precision                | 0.4788     |
+| max_recall                   | 0.8555     |
+| **max_ap**                   | **0.5536** |
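
As a quick consistency check on the table above (not part of the committed file): F1 is the harmonic mean of precision and recall, so the reported `cosine_f1` should follow from `cosine_precision` and `cosine_recall`. The sketch below recomputes it from the full-precision values in the model-index metadata. The helper name `f1_score` is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Full-precision allNLI-dev values from the model-index metadata.
cosine_precision = 0.4748201438848921
cosine_recall = 0.7630057803468208

cosine_f1 = f1_score(cosine_precision, cosine_recall)
print(round(cosine_f1, 4))  # 0.5854, matching the cosine_f1 row
```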
+#### Binary Classification
+* Dataset: `Qnli-dev`
+* Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)
+| Metric                       | Value      |
+|:-----------------------------|:-----------|
+| cosine_accuracy              | 0.6816     |
+| cosine_accuracy_threshold    | 0.8161     |
+| cosine_f1                    | 0.6918     |
+| cosine_f1_threshold          | 0.7854     |
+| cosine_precision             | 0.5994     |
+| cosine_recall                | 0.8178     |
+| cosine_ap                    | 0.711      |
+| dot_accuracy                 | 0.6484     |
+| dot_accuracy_threshold       | 392.5465   |
+| dot_f1                       | 0.6688     |
+| dot_f1_threshold             | 368.7879   |
+| dot_precision                | 0.5421     |
+| dot_recall                   | 0.8729     |
+| dot_ap                       | 0.6053     |
+| manhattan_accuracy           | 0.6855     |
+| manhattan_accuracy_threshold | 244.6381   |
+| manhattan_f1                 | 0.6938     |
+| manhattan_f1_threshold       | 295.4796   |
+| manhattan_precision          | 0.5957     |
+| manhattan_recall             | 0.8305     |
+| manhattan_ap                 | 0.7217     |
+| euclidean_accuracy           | 0.6875     |
+| euclidean_accuracy_threshold | 13.0267    |
+| euclidean_f1                 | 0.6894     |
+| euclidean_f1_threshold       | 14.538     |
+| euclidean_precision          | 0.5981     |
+| euclidean_recall             | 0.8136     |
+| euclidean_ap                 | 0.7181     |
+| max_accuracy                 | 0.6875     |
+| max_accuracy_threshold       | 392.5465   |
+| max_f1                       | 0.6938     |
+| max_f1_threshold             | 368.7879   |
+| max_precision                | 0.5994     |
+| max_recall                   | 0.8729     |
+| **max_ap**                   | **0.7217** |
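
To show how a `*_accuracy_threshold` value is meant to be read (not part of the committed file): a pair is predicted positive when its similarity score clears the threshold. The sketch below applies the Qnli-dev cosine accuracy threshold to toy vectors. The helper names and the vectors are hypothetical, and whether the evaluator treats the boundary as inclusive is an assumption here.

```python
import math

# Cosine accuracy threshold reported for Qnli-dev in the table above.
QNLI_COSINE_ACCURACY_THRESHOLD = 0.8161


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def predict_pair(a, b, threshold=QNLI_COSINE_ACCURACY_THRESHOLD):
    """Label a pair positive when its cosine similarity reaches the threshold.

    The inclusive comparison is an assumption made for this sketch.
    """
    return cosine_similarity(a, b) >= threshold


# Toy 2-dimensional "embeddings"; real ones have 768 dimensions.
close_pair = predict_pair([1.0, 0.0], [1.0, 0.5])  # cosine is about 0.894
far_pair = predict_pair([1.0, 0.0], [1.0, 1.0])    # cosine is about 0.707
print(close_pair, far_pair)
```

The dot, Manhattan and Euclidean thresholds work the same way on their own scales, which is why their values are not comparable with the cosine ones. For the two distance measures the comparison is reversed, since a smaller distance means a closer pair.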
+<!--
+## Bias, Risks and Limitations
+*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+-->
+<!--
+### Recommendations
+*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+-->
+## Training Details
+### Training Dataset
+#### bobox/enhanced_nli-50_k
+* Dataset: bobox/enhanced_nli-50_k
+* Size: 116,445 training samples
+* Columns: <code>sentence1</code> and <code>sentence2</code>
+* Approximate statistics based on the first 1000 samples:
+  |         | sentence1                                                                          | sentence2                                                                          |
+  |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
+  | type    | string                                                                             | string                                                                             |
+  | details | <ul><li>min: 4 tokens</li><li>mean: 33.67 tokens</li><li>max: 338 tokens</li></ul> | <ul><li>min: 2 tokens</li><li>mean: 51.48 tokens</li><li>max: 512 tokens</li></ul> |
+* Samples:
+  | sentence1                                                            | sentence2                                                                                                                                                                                                                                  |
+  |:---------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+  | <code>who is darnell from my name is earl</code>                     | <code>Eddie Steeples Eddie Steeples (born November 25, 1973)[1] is an American actor known for his roles as the "Rubberband Man" in an advertising campaign for OfficeMax, and as Darnell Turner on the NBC sitcom My Name Is Earl.</code> |
+  | <code>Ferrell and the Chili Peppers toured together in 2013 .</code> | <code>Ferrell and the Chili Peppers wrapped up I 'm With You World Tour in April 2013 .</code>                                                                                                                                             |
+  | <code>Cells have four cycles.</code>                                 | <code>How many cycles do cells have?</code>                                                                                                                                                                                                |
+* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
+  ```json
+  {'guide': SentenceTransformer(
+    (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+    (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+    (2): Normalize()
+  ), 'temperature': 0.025}
+  ```
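
A simplified sketch of the guided in-batch negative idea behind this loss (not part of the committed file): a separate guide model scores every anchor/candidate pair, and any in-batch candidate that the guide rates higher than the true positive is dropped from the contrastive softmax as a likely false negative. The function name `guided_info_nce` and the toy matrices are hypothetical. The sketch covers only anchor-to-positive similarities and leaves out the gradient caching that the "Cached" variant adds, so it should not be read as the library's implementation.

```python
import math


def guided_info_nce(sim, guide_sim, temperature=0.025):
    """Mean contrastive loss with guide-based filtering of negatives.

    sim[i][j]:       trained-model similarity of anchor i and candidate j
    guide_sim[i][j]: guide-model similarity for the same pair
    The diagonal holds the true (anchor, positive) pairs.
    (Hypothetical helper for illustration only.)
    """
    losses = []
    for i, (row, guide_row) in enumerate(zip(sim, guide_sim)):
        positive_guide = guide_row[i]
        positive_logit = row[i] / temperature
        # Keep the positive, plus negatives the guide does not rank above it.
        kept = [
            row[j] / temperature
            for j in range(len(row))
            if j == i or guide_row[j] <= positive_guide
        ]
        # Numerically stable log-sum-exp over the kept logits.
        top = max(kept)
        log_sum_exp = top + math.log(sum(math.exp(x - top) for x in kept))
        losses.append(log_sum_exp - positive_logit)
    return sum(losses) / len(losses)


sim = [[0.9, 0.8], [0.1, 0.9]]
# Guide rates candidate 1 above anchor 0's own positive, so it is masked out.
strict_guide = [[0.9, 0.95], [0.2, 0.9]]
# Guide rates every off-diagonal pair low, so nothing is masked.
lenient_guide = [[1.0, 0.0], [0.0, 1.0]]

masked_loss = guided_info_nce(sim, strict_guide)
unmasked_loss = guided_info_nce(sim, lenient_guide)
print(masked_loss < unmasked_loss)
```

With the low temperature of 0.025 used here, small similarity gaps turn into large logit gaps, so a single hard false negative can dominate the loss unless it is filtered out.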
+### Evaluation Dataset
+#### bobox/enhanced_nli-50_k
+* Dataset: bobox/enhanced_nli-50_k
+* Size: 1,506 evaluation samples
+* Columns: <code>sentence1</code> and <code>sentence2</code>
+* Approximate statistics based on the first 1000 samples:
+  |         | sentence1                                                                          | sentence2                                                                          |
+  |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
+  | type    | string                                                                             | string                                                                             |
+  | details | <ul><li>min: 3 tokens</li><li>mean: 32.36 tokens</li><li>max: 341 tokens</li></ul> | <ul><li>min: 2 tokens</li><li>mean: 61.99 tokens</li><li>max: 431 tokens</li></ul> |
+* Samples:
+  | sentence1                                                                                                                                     | sentence2                                                                                                                                                                                 |
+  |:----------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+  | <code>Interestingly, snakes use their forked tongues to smell.</code>                                                                         | <code>Snakes use their tongue to smell things.</code>                                                                                                                                     |
+  | <code>Soil is a renewable resource that can take thousand of years to form.</code>                                                            | <code>What is a renewable resource that can take thousand of years to form?</code>                                                                                                        |
+  | <code>As of March 22 , there were more than 321,000 cases with over 13,600 deaths and more than 96,000 recoveries reported worldwide .</code> | <code>As of 22 March , more than 321,000 cases of COVID-19 have been reported in over 180 countries and territories , resulting in more than 13,600 deaths and 96,000 recoveries .</code> |
+* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
+  ```json
+  {'guide': SentenceTransformer(
+    (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+    (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+    (2): Normalize()
+  ), 'temperature': 0.025}
+  ```
+### Training Hyperparameters
+#### Non-Default Hyperparameters
+- `eval_strategy`: steps
+- `per_device_train_batch_size`: 640
+- `per_device_eval_batch_size`: 128
+- `learning_rate`: 3.75e-05
+- `weight_decay`: 0.0005
+- `lr_scheduler_type`: cosine_with_min_lr
+- `lr_scheduler_kwargs`: {'num_cycles': 0.5, 'min_lr': 7.499999999999999e-06}
+- `warmup_ratio`: 0.33
+- `save_safetensors`: False
+- `fp16`: True
+- `push_to_hub`: True
+- `hub_model_id`: bobox/DeBERTa-small-ST-UnifiedDatasets-baseline-checkpoints-tmp
+- `hub_strategy`: all_checkpoints
+- `batch_sampler`: no_duplicates
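
To make the scheduler settings above concrete (not part of the committed file): the learning rate ramps linearly from 0 to `learning_rate` over the warmup fraction, then follows half a cosine cycle down to `min_lr`. The sketch below reproduces that shape in plain Python. The function name is hypothetical, and `TOTAL_STEPS = 546` is an assumption derived from 116,445 samples, a batch size of 640 and 3 epochs. The real step count may differ, for example because of the `no_duplicates` batch sampler.

```python
import math

TOTAL_STEPS = 546  # assumption: ceil(116445 / 640) * 3 epochs


def cosine_with_min_lr(step, total_steps=TOTAL_STEPS, base_lr=3.75e-05,
                       min_lr=7.499999999999999e-06, warmup_ratio=0.33,
                       num_cycles=0.5):
    """Learning rate at `step`: linear warmup, then cosine decay to min_lr.

    (Hypothetical helper mirroring the hyperparameters listed above.)
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress))
    floor = min_lr / base_lr  # decay bottoms out at this fraction of base_lr
    return base_lr * (cosine * (1.0 - floor) + floor)


for step in (0, 110, 181, 546):
    print(step, cosine_with_min_lr(step))
```

Under these assumptions, step 110 (this checkpoint) still falls inside the warmup phase, so the learning rate has not yet reached its peak.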
+#### All Hyperparameters
+<details><summary>Click to expand</summary>
+- `overwrite_output_dir`: False
+- `do_predict`: False
+- `eval_strategy`: steps
+- `prediction_loss_only`: True
+- `per_device_train_batch_size`: 640
+- `per_device_eval_batch_size`: 128
+- `per_gpu_train_batch_size`: None
+- `per_gpu_eval_batch_size`: None
+- `gradient_accumulation_steps`: 1
+- `eval_accumulation_steps`: None
+- `torch_empty_cache_steps`: None
+- `learning_rate`: 3.75e-05
+- `weight_decay`: 0.0005
+- `adam_beta1`: 0.9
+- `adam_beta2`: 0.999
+- `adam_epsilon`: 1e-08
+- `max_grad_norm`: 1.0
+- `num_train_epochs`: 3
+- `max_steps`: -1
+- `lr_scheduler_type`: cosine_with_min_lr
+- `lr_scheduler_kwargs`: {'num_cycles': 0.5, 'min_lr': 7.499999999999999e-06}
+- `warmup_ratio`: 0.33
+- `warmup_steps`: 0
+- `log_level`: passive
+- `log_level_replica`: warning
+- `log_on_each_node`: True
+- `logging_nan_inf_filter`: True
+- `save_safetensors`: False
+- `save_on_each_node`: False
+- `save_only_model`: False
+- `restore_callback_states_from_checkpoint`: False
+- `no_cuda`: False
+- `use_cpu`: False
+- `use_mps_device`: False
+- `seed`: 42
+- `data_seed`: None
+- `jit_mode_eval`: False
+- `use_ipex`: False
+- `bf16`: False
+- `fp16`: True
+- `fp16_opt_level`: O1
+- `half_precision_backend`: auto
+- `bf16_full_eval`: False
+- `fp16_full_eval`: False
+- `tf32`: None
+- `local_rank`: 0
+- `ddp_backend`: None
+- `tpu_num_cores`: None
+- `tpu_metrics_debug`: False
+- `debug`: []
+- `dataloader_drop_last`: False
+- `dataloader_num_workers`: 0
+- `dataloader_prefetch_factor`: None
+- `past_index`: -1
+- `disable_tqdm`: False
+- `remove_unused_columns`: True
+- `label_names`: None
+- `load_best_model_at_end`: False
+- `ignore_data_skip`: False
+- `fsdp`: []
+- `fsdp_min_num_params`: 0
+- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+- `fsdp_transformer_layer_cls_to_wrap`: None
+- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+- `deepspeed`: None
+- `label_smoothing_factor`: 0.0
+- `optim`: adamw_torch
+- `optim_args`: None
+- `adafactor`: False
+- `group_by_length`: False
+- `length_column_name`: length
+- `ddp_find_unused_parameters`: None
+- `ddp_bucket_cap_mb`: None
+- `ddp_broadcast_buffers`: False
+- `dataloader_pin_memory`: True
+- `dataloader_persistent_workers`: False
+- `skip_memory_metrics`: True
+- `use_legacy_prediction_loop`: False
+- `push_to_hub`: True
+- `resume_from_checkpoint`: None
+- `hub_model_id`: bobox/DeBERTa-small-ST-UnifiedDatasets-baseline-checkpoints-tmp
+- `hub_strategy`: all_checkpoints
+- `hub_private_repo`: False
+- `hub_always_push`: False
+- `gradient_checkpointing`: False
+- `gradient_checkpointing_kwargs`: None
+- `include_inputs_for_metrics`: False
+- `eval_do_concat_batches`: True
+- `fp16_backend`: auto
+- `push_to_hub_model_id`: None
+- `push_to_hub_organization`: None
+- `mp_parameters`:
+- `auto_find_batch_size`: False
+- `full_determinism`: False
+- `torchdynamo`: None
+- `ray_scope`: last
+- `ddp_timeout`: 1800
+- `torch_compile`: False
+- `torch_compile_backend`: None
+- `torch_compile_mode`: None
+- `dispatch_batches`: None
+- `split_batches`: None
+- `include_tokens_per_second`: False
+- `include_num_input_tokens_seen`: False
+- `neftune_noise_alpha`: None
+- `optim_target_modules`: None
+- `batch_eval_metrics`: False
+- `eval_on_start`: False
+- `eval_use_gather_object`: False
+- `batch_sampler`: no_duplicates
+- `multi_dataset_batch_sampler`: proportional
+</details>
+### Training Logs
+<details><summary>Click to expand</summary>
+| Epoch  | Step | Training Loss | loss   | Qnli-dev_max_ap | allNLI-dev_max_ap | sts-test_spearman_cosine |
+|:------:|:----:|:-------------:|:------:|:---------------:|:-----------------:|:------------------------:|
+| 0.0055 | 1    | 8.8159        | -      | -               | -                 | -                        |
+| 0.0110 | 2    | 9.1259        | -      | -               | -                 | -                        |
+| 0.0165 | 3    | 8.9017        | -      | -               | -                 | -                        |
+| 0.0220 | 4    | 9.1969        | -      | -               | -                 | -                        |
+| 0.0275 | 5    | 9.3716        | 1.3746 | 0.6067          | 0.3706            | 0.1943                   |
+| 0.0330 | 6    | 9.0425        | -      | -               | -                 | -                        |
+| 0.0385 | 7    | 8.7309        | -      | -               | -                 | -                        |
+| 0.0440 | 8    | 9.0123        | -      | -               | -                 | -                        |
+| 0.0495 | 9    | 8.8095        | -      | -               | -                 | -                        |
+| 0.0549 | 10   | 9.3194        | 1.3227 | 0.6089          | 0.3721            | 0.1976                   |
+| 0.0604 | 11   | 8.9873        | -      | -               | -                 | -                        |
+| 0.0659 | 12   | 8.5575        | -      | -               | -                 | -                        |
+| 0.0714 | 13   | 8.8096        | -      | -               | -                 | -                        |
+| 0.0769 | 14   | 8.0996        | -      | -               | -                 | -                        |
+| 0.0824 | 15   | 8.1942        | 1.2244 | 0.6140          | 0.3743            | 0.2085                   |
+| 0.0879 | 16   | 8.1654        | -      | -               | -                 | -                        |
+| 0.0934 | 17   | 7.7336        | -      | -               | -                 | -                        |
+| 0.0989 | 18   | 7.9535        | -      | -               | -                 | -                        |
+| 0.1044 | 19   | 7.9322        | -      | -               | -                 | -                        |
+| 0.1099 | 20   | 7.6812        | 1.1301 | 0.6199          | 0.3790            | 0.2233                   |
+| 0.1154 | 21   | 7.551         | -      | -               | -                 | -                        |
+| 0.1209 | 22   | 7.3788        | -      | -               | -                 | -                        |
+| 0.1264 | 23   | 7.1746        | -      | -               | -                 | -                        |
+| 0.1319 | 24   | 7.1849        | -      | -               | -                 | -                        |
+| 0.1374 | 25   | 7.1085        | 1.0723 | 0.6195          | 0.3852            | 0.2357                   |
+| 0.1429 | 26   | 7.3926        | -      | -               | -                 | -                        |
+| 0.1484 | 27   | 7.1817        | -      | -               | -                 | -                        |
+| 0.1538 | 28   | 7.239         | -      | -               | -                 | -                        |
+| 0.1593 | 29   | 7.0023        | -      | -               | -                 | -                        |
+| 0.1648 | 30   | 6.9898        | 1.0282 | 0.6215          | 0.3898            | 0.2477                   |
+| 0.1703 | 31   | 6.9776        | -      | -               | -                 | -                        |
+| 0.1758 | 32   | 6.8088        | -      | -               | -                 | -                        |
+| 0.1813 | 33   | 6.8916        | -      | -               | -                 | -                        |
+| 0.1868 | 34   | 6.6931        | -      | -               | -                 | -                        |
+| 0.1923 | 35   | 6.5707        | 0.9846 | 0.6253          | 0.3952            | 0.2608                   |
+| 0.1978 | 36   | 6.6231        | -      | -               | -                 | -                        |
+| 0.2033 | 37   | 6.4951        | -      | -               | -                 | -                        |
+| 0.2088 | 38   | 6.4607        | -      | -               | -                 | -                        |
+| 0.2143 | 39   | 6.4504        | -      | -               | -                 | -                        |
+| 0.2198 | 40   | 6.3649        | 0.9314 | 0.6299          | 0.4041            | 0.2738                   |
+| 0.2253 | 41   | 6.2244        | -      | -               | -                 | -                        |
+| 0.2308 | 42   | 6.007         | -      | -               | -                 | -                        |
+| 0.2363 | 43   | 5.977         | -      | -               | -                 | -                        |
+| 0.2418 | 44   | 6.0748        | -      | -               | -                 | -                        |
+| 0.2473 | 45   | 5.7946        | 0.8549 | 0.6404          | 0.4116            | 0.2847                   |
+| 0.2527 | 46   | 5.8751        | -      | -               | -                 | -                        |
+| 0.2582 | 47   | 5.543         | -      | -               | -                 | -                        |
+| 0.2637 | 48   | 5.5511        | -      | -               | -                 | -                        |
+| 0.2692 | 49   | 5.411         | -      | -               | -                 | -                        |
+| 0.2747 | 50   | 5.378         | 0.7943 | 0.6557          | 0.4159            | 0.2866                   |
+| 0.2802 | 51   | 5.3831        | -      | -               | -                 | -                        |
+| 0.2857 | 52   | 4.9729        | -      | -               | -                 | -                        |
+| 0.2912 | 53   | 5.0425        | -      | -               | -                 | -                        |
+| 0.2967 | 54   | 4.9446        | -      | -               | -                 | -                        |
+| 0.3022 | 55   | 4.9288        | 0.7178 | 0.6679          | 0.4273            | 0.3132                   |
+| 0.3077 | 56   | 4.8434        | -      | -               | -                 | -                        |
+| 0.3132 | 57   | 4.6914        | -      | -               | -                 | -                        |
+| 0.3187 | 58   | 4.5254        | -      | -               | -                 | -                        |
+| 0.3242 | 59   | 4.6734        | -      | -               | -                 | -                        |
+| 0.3297 | 60   | 4.2421        | 0.6202 | 0.6684          | 0.4423            | 0.3580                   |
+| 0.3352 | 61   | 4.2234        | -      | -               | -                 | -                        |
+| 0.3407 | 62   | 4.0225        | -      | -               | -                 | -                        |
+| 0.3462 | 63   | 4.0034        | -      | -               | -                 | -                        |
+| 0.3516 | 64   | 3.994         | -      | -               | -                 | -                        |
+| 0.3571 | 65   | 3.651         | 0.5489 | 0.6750          | 0.4569            | 0.4014                   |
+| 0.3626 | 66   | 3.9308        | -      | -               | -                 | -                        |
+| 0.3681 | 67   | 3.8694        | -      | -               | -                 | -                        |
+| 0.3736 | 68   | 3.7159        | -      | -               | -                 | -                        |
+| 0.3791 | 69   | 3.6499        | -      | -               | -                 | -                        |
+| 0.3846 | 70   | 3.4749        | 0.4923 | 0.6734          | 0.4701            | 0.4465                   |
+| 0.3901 | 71   | 3.3356        | -      | -               | -                 | -                        |
+| 0.3956 | 72   | 3.4768        | -      | -               | -                 | -                        |
+| 0.4011 | 73   | 3.2748        | -      | -               | -                 | -                        |
+| 0.4066 | 74   | 3.2789        | -      | -               | -                 | -                        |
+| 0.4121 | 75   | 2.9815        | 0.4422 | 0.6759          | 0.4747            | 0.4924                   |
+| 0.4176 | 76   | 3.2356        | -      | -               | -                 | -                        |
+| 0.4231 | 77   | 2.946         | -      | -               | -                 | -                        |
+| 0.4286 | 78   | 2.8888        | -      | -               | -                 | -                        |
+| 0.4341 | 79   | 2.8992        | -      | -               | -                 | -                        |
+| 0.4396 | 80   | 2.9901        | 0.4040 | 0.6786          | 0.4781            | 0.5478                   |
+| 0.4451 | 81   | 2.6608        | -      | -               | -                 | -                        |
+| 0.4505 | 82   | 2.831         | -      | -               | -                 | -                        |
+| 0.4560 | 83   | 2.5503        | -      | -               | -                 | -                        |
+| 0.4615 | 84   | 2.8576        | -      | -               | -                 | -                        |
+| 0.4670 | 85   | 2.5726        | 0.3711 | 0.6858          | 0.4898            | 0.6134                   |
+| 0.4725 | 86   | 2.7197        | -      | -               | -                 | -                        |
+| 0.4780 | 87   | 2.5123        | -      | -               | -                 | -                        |
+| 0.4835 | 88   | 2.553         | -      | -               | -                 | -                        |
+| 0.4890 | 89   | 2.4862        | -      | -               | -                 | -                        |
+| 0.4945 | 90   | 2.491         | 0.3450 | 0.6997          | 0.5077            | 0.6668                   |
+| 0.5    | 91   | 2.3648        | -      | -               | -                 | -                        |
+| 0.5055 | 92   | 2.3788        | -      | -               | -                 | -                        |
+| 0.5110 | 93   | 2.3758        | -      | -               | -                 | -                        |
+| 0.5165 | 94   | 2.3319        | -      | -               | -                 | -                        |
+| 0.5220 | 95   | 2.2336        | 0.3238 | 0.7048          | 0.5252            | 0.7018                   |
+| 0.5275 | 96   | 2.3036        | -      | -               | -                 | -                        |
+| 0.5330 | 97   | 2.3034        | -      | -               | -                 | -                        |
+| 0.5385 | 98   | 2.207         | -      | -               | -                 | -                        |
+| 0.5440 | 99   | 2.1732        | -      | -               | -                 | -                        |
+| 0.5495 | 100  | 2.1743        | 0.3036 | 0.7091          | 0.5418            | 0.7272                   |
+| 0.5549 | 101  | 2.086         | -      | -               | -                 | -                        |
+| 0.5604 | 102  | 2.0223        | -      | -               | -                 | -                        |
+| 0.5659 | 103  | 2.0878        | -      | -               | -                 | -                        |
+| 0.5714 | 104  | 1.9475        | -      | -               | -                 | -                        |
+| 0.5769 | 105  | 2.1524        | 0.2853 | 0.7159          | 0.5499            | 0.7489                   |
+| 0.5824 | 106  | 1.9393        | -      | -               | -                 | -                        |
+| 0.5879 | 107  | 2.1308        | -      | -               | -                 | -                        |
+| 0.5934 | 108  | 1.9469        | -      | -               | -                 | -                        |
+| 0.5989 | 109  | 1.8683        | -      | -               | -                 | -                        |
+| 0.6044 | 110  | 1.8167        | 0.2702 | 0.7217          | 0.5536            | 0.7626                   |
+</details>
+### Framework Versions
+- Python: 3.10.14
+- Sentence Transformers: 3.0.1
+- Transformers: 4.44.0
+- PyTorch: 2.4.0
+- Accelerate: 0.33.0
+- Datasets: 2.21.0
+- Tokenizers: 0.19.1
+## Citation
+### BibTeX
+#### Sentence Transformers
+```bibtex
+@inproceedings{reimers-2019-sentence-bert,
+    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+    author = "Reimers, Nils and Gurevych, Iryna",
+    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+    month = "11",
+    year = "2019",
+    publisher = "Association for Computational Linguistics",
+    url = "https://arxiv.org/abs/1908.10084",
+}
+```
+<!--
+## Glossary
+*Clearly define terms in order to be accessible across audiences.*
+-->
+<!--
+## Model Card Authors
+*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+-->
+<!--
+## Model Card Contact
+*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+-->
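
Note on the learning-rate schedule recorded in the README hunk above: with `lr_scheduler_type: cosine_with_min_lr` and `warmup_ratio: 0.33`, step 110 appears to fall inside the warmup phase. The sketch below reproduces the schedule in plain Python under stated assumptions; it is not the trainer's own code. It assumes 182 optimizer steps per epoch (inferred from the training log, where step 91 is epoch 0.5), 3 epochs, and warmup steps equal to the ceiling of `warmup_ratio` times the total step count. The constant and function names are illustrative.

```python
import math

BASE_LR = 3.75e-05               # learning_rate from the README
MIN_LR = 7.499999999999999e-06   # lr_scheduler_kwargs['min_lr']
NUM_CYCLES = 0.5                 # lr_scheduler_kwargs['num_cycles']
WARMUP_RATIO = 0.33              # warmup_ratio from the README
STEPS_PER_EPOCH = 182            # assumption: inferred from the log (step 91 = epoch 0.5)
TOTAL_STEPS = 3 * STEPS_PER_EPOCH                     # num_train_epochs = 3
WARMUP_STEPS = math.ceil(WARMUP_RATIO * TOTAL_STEPS)  # assumption: ceiling rounding


def lr_at(step):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay to MIN_LR."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    cosine = 0.5 * (1.0 + math.cos(math.pi * NUM_CYCLES * 2.0 * progress))
    floor = MIN_LR / BASE_LR  # fraction of the base rate kept at the end
    return BASE_LR * (cosine * (1.0 - floor) + floor)


for step in (1, 110, WARMUP_STEPS, TOTAL_STEPS):
    print(step, lr_at(step))
```

Under these assumptions the peak rate is reached only at step 181, so the loss curve logged up to step 110 was produced entirely with a rising learning rate.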

checkpoint-110/added_tokens.json ADDED

	@@ -0,0 +1,3 @@

+{
+  "[MASK]": 128000
+}

checkpoint-110/config.json ADDED

	@@ -0,0 +1,35 @@

+{
+  "_name_or_path": "microsoft/deberta-v3-small",
+  "architectures": [
+    "DebertaV2Model"
+  ],
+  "attention_probs_dropout_prob": 0.1,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 768,
+  "initializer_range": 0.02,
+  "intermediate_size": 3072,
+  "layer_norm_eps": 1e-07,
+  "max_position_embeddings": 512,
+  "max_relative_positions": -1,
+  "model_type": "deberta-v2",
+  "norm_rel_ebd": "layer_norm",
+  "num_attention_heads": 12,
+  "num_hidden_layers": 6,
+  "pad_token_id": 0,
+  "pooler_dropout": 0,
+  "pooler_hidden_act": "gelu",
+  "pooler_hidden_size": 768,
+  "pos_att_type": [
+    "p2c",
+    "c2p"
+  ],
+  "position_biased_input": false,
+  "position_buckets": 256,
+  "relative_attention": true,
+  "share_att_key": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.44.0",
+  "type_vocab_size": 0,
+  "vocab_size": 128100
+}

checkpoint-110/config_sentence_transformers.json ADDED

	@@ -0,0 +1,10 @@

+{
+  "__version__": {
+    "sentence_transformers": "3.0.1",
+    "transformers": "4.44.0",
+    "pytorch": "2.4.0"
+  },
+  "prompts": {},
+  "default_prompt_name": null,
+  "similarity_fn_name": null
+}

checkpoint-110/modules.json ADDED

	@@ -0,0 +1,14 @@

+[
+  {
+    "idx": 0,
+    "name": "0",
+    "path": "",
+    "type": "sentence_transformers.models.Transformer"
+  },
+  {
+    "idx": 1,
+    "name": "1",
+    "path": "1_Pooling",
+    "type": "sentence_transformers.models.Pooling"
+  }
+]
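
The second module listed above is a `Pooling` layer, and the `1_Pooling/config.json` earlier in this commit enables only `pooling_mode_mean_tokens`. As a rough illustration of what mean pooling computes, here is a dependency-free sketch; the function name and the toy inputs are hypothetical, and the real layer operates on batched tensors rather than Python lists.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors over the positions where attention_mask is 1."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]


# Toy input: three token vectors, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)
```

Since `modules.json` lists no `Normalize` module, the pooled vectors are presumably not L2-normalized by the pipeline itself.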

checkpoint-110/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b8b58baa1d148c2570e59d52cab7516e156bb31762ea2e676cc136a49116b0af
+size 1130520122
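
The three added lines above are a Git LFS pointer rather than the binary itself: a spec version, an object id, and a byte size. A minimal sketch of reading such a pointer follows, assuming the simple `key value` line layout shown in this diff; the helper name `parse_lfs_pointer` is hypothetical.

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its version, hash, and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:b8b58baa1d148c2570e59d52cab7516e156bb31762ea2e676cc136a49116b0af
size 1130520122
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```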

checkpoint-110/pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:738f0a7ea7064dc1fad40f06348ccc1b270737b5df295320877dfeb122ea18a9
+size 565251810
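
The file sizes in these pointers allow a quick sanity check against `config.json`. The arithmetic below is a back-of-the-envelope sketch, assuming the weights are stored as float32 (as `torch_dtype` suggests) and ignoring serialization overhead; the variable names are illustrative.

```python
MODEL_BYTES = 565251810        # size of pytorch_model.bin from its LFS pointer
OPTIMIZER_BYTES = 1130520122   # size of optimizer.pt from its LFS pointer
BYTES_PER_FLOAT32 = 4          # assumption: float32 weights, no overhead
VOCAB_SIZE = 128100            # vocab_size from config.json
HIDDEN_SIZE = 768              # hidden_size from config.json

approx_params = MODEL_BYTES // BYTES_PER_FLOAT32
embedding_params = VOCAB_SIZE * HIDDEN_SIZE
embedding_share = embedding_params / approx_params
optimizer_ratio = OPTIMIZER_BYTES / MODEL_BYTES

print(f"approx. parameters:     {approx_params:,}")
print(f"word-embedding params:  {embedding_params:,} ({embedding_share:.1%})")
print(f"optimizer / model size: {optimizer_ratio:.4f}")
```

The ratio of roughly 2 is consistent with `optim: adamw_torch` keeping two moment estimates per parameter, and the estimate suggests that most of the parameters sit in the word-embedding matrix rather than in the 6 transformer layers.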

checkpoint-110/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:58f811f94539aa733ba4ef861adb95e7c49fb89154fee4002503dcf3153081b7
+size 14244

checkpoint-110/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6c14f6669285b589459a92ce501e1b7bb3e1c10d97d299ec8dab14ebb69f66e0
+size 1064

checkpoint-110/sentence_bert_config.json ADDED

	@@ -0,0 +1,4 @@

+{
+  "max_seq_length": 512,
+  "do_lower_case": false
+}

checkpoint-110/special_tokens_map.json ADDED

	@@ -0,0 +1,15 @@

+{
+  "bos_token": "[CLS]",
+  "cls_token": "[CLS]",
+  "eos_token": "[SEP]",
+  "mask_token": "[MASK]",
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "unk_token": {
+    "content": "[UNK]",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}

checkpoint-110/spm.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c679fbf93643d19aab7ee10c0b99e460bdbc02fedf34b92b05af343b4af586fd
+size 2464616

checkpoint-110/tokenizer.json ADDED

The diff for this file is too large to render.

checkpoint-110/tokenizer_config.json ADDED

	@@ -0,0 +1,58 @@

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "3": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "128000": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "bos_token": "[CLS]",
+  "clean_up_tokenization_spaces": true,
+  "cls_token": "[CLS]",
+  "do_lower_case": false,
+  "eos_token": "[SEP]",
+  "mask_token": "[MASK]",
+  "model_max_length": 1000000000000000019884624838656,
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "sp_model_kwargs": {},
+  "split_by_punct": false,
+  "tokenizer_class": "DebertaV2Tokenizer",
+  "unk_token": "[UNK]",
+  "vocab_type": "spm"
+}
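
The `model_max_length` value above looks alarming but equals `int(1e30)`, which appears to be a placeholder meaning "no tokenizer-level limit" rather than a usable length. The short check below illustrates this and shows the limit that presumably applies in practice, taken from `max_seq_length` in `sentence_bert_config.json`. The variable names are illustrative, and the `min` rule is an assumption about how the two settings interact.

```python
MODEL_MAX_LENGTH = 1000000000000000019884624838656  # from tokenizer_config.json
MAX_SEQ_LENGTH = 512                                 # from sentence_bert_config.json

# int(1e30) is not exactly 10**30 because 1e30 is stored as a binary float.
is_sentinel = MODEL_MAX_LENGTH == int(1e30)

# Assumption: the smaller of the two values is the effective truncation limit.
effective_limit = min(MODEL_MAX_LENGTH, MAX_SEQ_LENGTH)

print(is_sentinel, effective_limit)
```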

checkpoint-110/trainer_state.json ADDED

The diff for this file is too large to render.

checkpoint-110/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b50a4a92b5eb29f5d9b19f9e1060fdd6af0a02268cb16ba6bb85ab82bb7ddd6b
+size 5752