Leo1212 committed on
Commit dce6d43
1 Parent(s): 703142c

Add new SentenceTransformer model.

Files changed (2):
  1. README.md +174 -8
  2. model.safetensors +1 -1
README.md CHANGED
@@ -8,6 +8,22 @@ datasets:
 language:
 - en
 library_name: sentence-transformers
 pipeline_tag: sentence-similarity
 tags:
 - sentence-transformers
@@ -48,6 +64,105 @@ widget:
 - It is meant to stimulate root growth - in particular to stimulate the creation
   of roots.
 - A person folds a piece of paper.
 ---
 
 # SentenceTransformer based on allenai/longformer-base-4096
@@ -144,6 +259,56 @@ You can finetune this model on your own dataset.
 *List how the model may foreseeably be misused and address what users ought not to do with the model.*
 -->
 
 <!--
 ## Bias, Risks and Limitations
 
@@ -425,9 +590,9 @@ You can finetune this model on your own dataset.
 
 - `overwrite_output_dir`: True
 - `eval_strategy`: steps
-- `learning_rate`: 0.00015722057717478097
 - `num_train_epochs`: 10
-- `warmup_steps`: 2
 - `load_best_model_at_end`: True
 
 #### All Hyperparameters
@@ -444,7 +609,7 @@ You can finetune this model on your own dataset.
 - `gradient_accumulation_steps`: 1
 - `eval_accumulation_steps`: None
 - `torch_empty_cache_steps`: None
-- `learning_rate`: 0.00015722057717478097
 - `weight_decay`: 0.0
 - `adam_beta1`: 0.9
 - `adam_beta2`: 0.999
@@ -455,7 +620,7 @@ You can finetune this model on your own dataset.
 - `lr_scheduler_type`: linear
 - `lr_scheduler_kwargs`: {}
 - `warmup_ratio`: 0.0
-- `warmup_steps`: 2
 - `log_level`: passive
 - `log_level_replica`: warning
 - `log_on_each_node`: True
@@ -549,10 +714,11 @@ You can finetune this model on your own dataset.
 </details>
 
 ### Training Logs
-| Epoch  | Step | Training Loss |
-|:------:|:----:|:-------------:|
-| 0.0487 | 200  | 3.0766        |
-| 0.0973 | 400  | 3.3862        |
 
 
 ### Framework Versions
 language:
 - en
 library_name: sentence-transformers
+metrics:
+- pearson_cosine
+- spearman_cosine
+- pearson_manhattan
+- spearman_manhattan
+- pearson_euclidean
+- spearman_euclidean
+- pearson_dot
+- spearman_dot
+- pearson_max
+- spearman_max
+- cosine_accuracy
+- dot_accuracy
+- manhattan_accuracy
+- euclidean_accuracy
+- max_accuracy
 pipeline_tag: sentence-similarity
 tags:
 - sentence-transformers
 
 - It is meant to stimulate root growth - in particular to stimulate the creation
   of roots.
 - A person folds a piece of paper.
+model-index:
+- name: SentenceTransformer based on allenai/longformer-base-4096
+  results:
+  - task:
+      type: semantic-similarity
+      name: Semantic Similarity
+    dataset:
+      name: sts dev
+      type: sts-dev
+    metrics:
+    - type: pearson_cosine
+      value: .nan
+      name: Pearson Cosine
+    - type: spearman_cosine
+      value: .nan
+      name: Spearman Cosine
+    - type: pearson_manhattan
+      value: 0.1953366031192939
+      name: Pearson Manhattan
+    - type: spearman_manhattan
+      value: 0.18628029922412706
+      name: Spearman Manhattan
+    - type: pearson_euclidean
+      value: 0.12038330059026879
+      name: Pearson Euclidean
+    - type: spearman_euclidean
+      value: 0.11701423250889276
+      name: Spearman Euclidean
+    - type: pearson_dot
+      value: -0.020898059060793592
+      name: Pearson Dot
+    - type: spearman_dot
+      value: -0.019267171663208498
+      name: Spearman Dot
+    - type: pearson_max
+      value: .nan
+      name: Pearson Max
+    - type: spearman_max
+      value: .nan
+      name: Spearman Max
+  - task:
+      type: triplet
+      name: Triplet
+    dataset:
+      name: triplet dev
+      type: triplet-dev
+    metrics:
+    - type: cosine_accuracy
+      value: 0.5089611178614823
+      name: Cosine Accuracy
+    - type: dot_accuracy
+      value: 0.24939246658566222
+      name: Dot Accuracy
+    - type: manhattan_accuracy
+      value: 0.511543134872418
+      name: Manhattan Accuracy
+    - type: euclidean_accuracy
+      value: 0.5103280680437424
+      name: Euclidean Accuracy
+    - type: max_accuracy
+      value: 0.511543134872418
+      name: Max Accuracy
+  - task:
+      type: semantic-similarity
+      name: Semantic Similarity
+    dataset:
+      name: label accuracy dev
+      type: label-accuracy-dev
+    metrics:
+    - type: pearson_cosine
+      value: .nan
+      name: Pearson Cosine
+    - type: spearman_cosine
+      value: .nan
+      name: Spearman Cosine
+    - type: pearson_manhattan
+      value: 0.049476403113581605
+      name: Pearson Manhattan
+    - type: spearman_manhattan
+      value: 0.05279290870444774
+      name: Spearman Manhattan
+    - type: pearson_euclidean
+      value: 0.03906753540286213
+      name: Pearson Euclidean
+    - type: spearman_euclidean
+      value: 0.04333503769885663
+      name: Spearman Euclidean
+    - type: pearson_dot
+      value: -0.011658647110881755
+      name: Pearson Dot
+    - type: spearman_dot
+      value: -0.009275521591297707
+      name: Spearman Dot
+    - type: pearson_max
+      value: .nan
+      name: Pearson Max
+    - type: spearman_max
+      value: .nan
+      name: Spearman Max
 ---
 
 # SentenceTransformer based on allenai/longformer-base-4096
 
 *List how the model may foreseeably be misused and address what users ought not to do with the model.*
 -->
 
+## Evaluation
+
+### Metrics
+
+#### Semantic Similarity
+* Dataset: `sts-dev`
+* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
+
+| Metric             | Value   |
+|:-------------------|:--------|
+| pearson_cosine     | nan     |
+| spearman_cosine    | nan     |
+| pearson_manhattan  | 0.1953  |
+| spearman_manhattan | 0.1863  |
+| pearson_euclidean  | 0.1204  |
+| spearman_euclidean | 0.117   |
+| pearson_dot        | -0.0209 |
+| spearman_dot       | -0.0193 |
+| pearson_max        | nan     |
+| **spearman_max**   | **nan** |
+
+#### Triplet
+* Dataset: `triplet-dev`
+* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
+
+| Metric             | Value      |
+|:-------------------|:-----------|
+| cosine_accuracy    | 0.509      |
+| dot_accuracy       | 0.2494     |
+| manhattan_accuracy | 0.5115     |
+| euclidean_accuracy | 0.5103     |
+| **max_accuracy**   | **0.5115** |
+
+#### Semantic Similarity
+* Dataset: `label-accuracy-dev`
+* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
+
+| Metric             | Value   |
+|:-------------------|:--------|
+| pearson_cosine     | nan     |
+| spearman_cosine    | nan     |
+| pearson_manhattan  | 0.0495  |
+| spearman_manhattan | 0.0528  |
+| pearson_euclidean  | 0.0391  |
+| spearman_euclidean | 0.0433  |
+| pearson_dot        | -0.0117 |
+| spearman_dot       | -0.0093 |
+| pearson_max        | nan     |
+| **spearman_max**   | **nan** |
+
 <!--
 ## Bias, Risks and Limitations
 
 
 
 - `overwrite_output_dir`: True
 - `eval_strategy`: steps
+- `learning_rate`: 3.304439853025411e-05
 - `num_train_epochs`: 10
+- `warmup_steps`: 1
 - `load_best_model_at_end`: True
 
 #### All Hyperparameters
 
 - `gradient_accumulation_steps`: 1
 - `eval_accumulation_steps`: None
 - `torch_empty_cache_steps`: None
+- `learning_rate`: 3.304439853025411e-05
 - `weight_decay`: 0.0
 - `adam_beta1`: 0.9
 - `adam_beta2`: 0.999
 - `lr_scheduler_type`: linear
 - `lr_scheduler_kwargs`: {}
 - `warmup_ratio`: 0.0
+- `warmup_steps`: 1
 - `log_level`: passive
 - `log_level_replica`: warning
 - `log_on_each_node`: True
 </details>
 
 ### Training Logs
+| Epoch  | Step | Training Loss | stsb loss | quora loss | all-nli-triplet loss | natural-questions loss | label-accuracy-dev_spearman_max | sts-dev_spearman_max | triplet-dev_max_accuracy |
+|:------:|:----:|:-------------:|:---------:|:----------:|:--------------------:|:----------------------:|:-------------------------------:|:--------------------:|:------------------------:|
+| 0.0487 | 200  | 3.3109        | -         | -          | -                    | -                      | -                               | -                    | -                        |
+| 0.0973 | 400  | 3.5823        | -         | -          | -                    | -                      | -                               | -                    | -                        |
+| 0.1217 | 500  | -             | 4.7553    | 2.7670     | 3.4649               | 2.7670                 | nan                             | nan                  | 0.5115                   |
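The `triplet-dev_max_accuracy` logged above (and the triplet `cosine_accuracy` of roughly 0.51) is the fraction of (anchor, positive, negative) triplets in which the anchor embeds closer to the positive than to the negative; a toy stdlib sketch with made-up 2-d embeddings, not the TripletEvaluator implementation:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def cosine_triplet_accuracy(triplets):
    """Fraction of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    hits = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return hits / len(triplets)

# Toy triplets: (anchor, positive, negative) embeddings
triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),  # positive closer: hit
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),  # hit
    ([1.0, 1.0], [0.0, 1.0], [1.0, 0.9]),  # negative closer: miss
    ([0.5, 0.5], [1.0, 0.0], [0.6, 0.4]),  # negative closer: miss
]
print(cosine_triplet_accuracy(triplets))  # 0.5 (near chance, like the ~0.51 above)
```

An accuracy near 0.5 on binary triplet decisions is close to random guessing, which is consistent with the nan similarity correlations logged at step 500.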
 
 ### Framework Versions
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bbb5bce69afd7c602d494d87b143ad75ae94aac9d62a949bc27448d73486d9d5
+oid sha256:045c131132c7123db999814287f8b2d08c841dacc9bc6aa11413997282d31ac7
 size 594668880