Leo1212 committed on
Commit dce6d43
1 Parent(s): 703142c

Add new SentenceTransformer model.

Files changed (2):
  1. README.md +174 -8
  2. model.safetensors +1 -1
README.md CHANGED
@@ -8,6 +8,22 @@ datasets:
 language:
 - en
 library_name: sentence-transformers
 pipeline_tag: sentence-similarity
 tags:
 - sentence-transformers
@@ -48,6 +64,105 @@ widget:
 - It is meant to stimulate root growth - in particular to stimulate the creation
   of roots.
 - A person folds a piece of paper.
 ---
 
 # SentenceTransformer based on allenai/longformer-base-4096
@@ -144,6 +259,56 @@ You can finetune this model on your own dataset.
 *List how the model may foreseeably be misused and address what users ought not to do with the model.*
 -->
 
 <!--
 ## Bias, Risks and Limitations
 
@@ -425,9 +590,9 @@ You can finetune this model on your own dataset.
 
 - `overwrite_output_dir`: True
 - `eval_strategy`: steps
-- `learning_rate`: 0.00015722057717478097
 - `num_train_epochs`: 10
-- `warmup_steps`: 2
 - `load_best_model_at_end`: True
 
 #### All Hyperparameters
@@ -444,7 +609,7 @@ You can finetune this model on your own dataset.
 - `gradient_accumulation_steps`: 1
 - `eval_accumulation_steps`: None
 - `torch_empty_cache_steps`: None
-- `learning_rate`: 0.00015722057717478097
 - `weight_decay`: 0.0
 - `adam_beta1`: 0.9
 - `adam_beta2`: 0.999
@@ -455,7 +620,7 @@ You can finetune this model on your own dataset.
 - `lr_scheduler_type`: linear
 - `lr_scheduler_kwargs`: {}
 - `warmup_ratio`: 0.0
-- `warmup_steps`: 2
 - `log_level`: passive
 - `log_level_replica`: warning
 - `log_on_each_node`: True
@@ -549,10 +714,11 @@ You can finetune this model on your own dataset.
 </details>
 
 ### Training Logs
-| Epoch  | Step | Training Loss |
-|:------:|:----:|:-------------:|
-| 0.0487 | 200  | 3.0766        |
-| 0.0973 | 400  | 3.3862        |
 
 
 ### Framework Versions
 language:
 - en
 library_name: sentence-transformers
+metrics:
+- pearson_cosine
+- spearman_cosine
+- pearson_manhattan
+- spearman_manhattan
+- pearson_euclidean
+- spearman_euclidean
+- pearson_dot
+- spearman_dot
+- pearson_max
+- spearman_max
+- cosine_accuracy
+- dot_accuracy
+- manhattan_accuracy
+- euclidean_accuracy
+- max_accuracy
 pipeline_tag: sentence-similarity
 tags:
 - sentence-transformers
 
 - It is meant to stimulate root growth - in particular to stimulate the creation
   of roots.
 - A person folds a piece of paper.
+model-index:
+- name: SentenceTransformer based on allenai/longformer-base-4096
+  results:
+  - task:
+      type: semantic-similarity
+      name: Semantic Similarity
+    dataset:
+      name: sts dev
+      type: sts-dev
+    metrics:
+    - type: pearson_cosine
+      value: .nan
+      name: Pearson Cosine
+    - type: spearman_cosine
+      value: .nan
+      name: Spearman Cosine
+    - type: pearson_manhattan
+      value: 0.1953366031192939
+      name: Pearson Manhattan
+    - type: spearman_manhattan
+      value: 0.18628029922412706
+      name: Spearman Manhattan
+    - type: pearson_euclidean
+      value: 0.12038330059026879
+      name: Pearson Euclidean
+    - type: spearman_euclidean
+      value: 0.11701423250889276
+      name: Spearman Euclidean
+    - type: pearson_dot
+      value: -0.020898059060793592
+      name: Pearson Dot
+    - type: spearman_dot
+      value: -0.019267171663208498
+      name: Spearman Dot
+    - type: pearson_max
+      value: .nan
+      name: Pearson Max
+    - type: spearman_max
+      value: .nan
+      name: Spearman Max
+  - task:
+      type: triplet
+      name: Triplet
+    dataset:
+      name: triplet dev
+      type: triplet-dev
+    metrics:
+    - type: cosine_accuracy
+      value: 0.5089611178614823
+      name: Cosine Accuracy
+    - type: dot_accuracy
+      value: 0.24939246658566222
+      name: Dot Accuracy
+    - type: manhattan_accuracy
+      value: 0.511543134872418
+      name: Manhattan Accuracy
+    - type: euclidean_accuracy
+      value: 0.5103280680437424
+      name: Euclidean Accuracy
+    - type: max_accuracy
+      value: 0.511543134872418
+      name: Max Accuracy
+  - task:
+      type: semantic-similarity
+      name: Semantic Similarity
+    dataset:
+      name: label accuracy dev
+      type: label-accuracy-dev
+    metrics:
+    - type: pearson_cosine
+      value: .nan
+      name: Pearson Cosine
+    - type: spearman_cosine
+      value: .nan
+      name: Spearman Cosine
+    - type: pearson_manhattan
+      value: 0.049476403113581605
+      name: Pearson Manhattan
+    - type: spearman_manhattan
+      value: 0.05279290870444774
+      name: Spearman Manhattan
+    - type: pearson_euclidean
+      value: 0.03906753540286213
+      name: Pearson Euclidean
+    - type: spearman_euclidean
+      value: 0.04333503769885663
+      name: Spearman Euclidean
+    - type: pearson_dot
+      value: -0.011658647110881755
+      name: Pearson Dot
+    - type: spearman_dot
+      value: -0.009275521591297707
+      name: Spearman Dot
+    - type: pearson_max
+      value: .nan
+      name: Pearson Max
+    - type: spearman_max
+      value: .nan
+      name: Spearman Max
 ---
 
 # SentenceTransformer based on allenai/longformer-base-4096
 
 *List how the model may foreseeably be misused and address what users ought not to do with the model.*
 -->
 
+## Evaluation
+
+### Metrics
+
+#### Semantic Similarity
+* Dataset: `sts-dev`
+* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
+
+| Metric             | Value   |
+|:-------------------|:--------|
+| pearson_cosine     | nan     |
+| spearman_cosine    | nan     |
+| pearson_manhattan  | 0.1953  |
+| spearman_manhattan | 0.1863  |
+| pearson_euclidean  | 0.1204  |
+| spearman_euclidean | 0.117   |
+| pearson_dot        | -0.0209 |
+| spearman_dot       | -0.0193 |
+| pearson_max        | nan     |
+| **spearman_max**   | **nan** |
+
+#### Triplet
+* Dataset: `triplet-dev`
+* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
+
+| Metric             | Value      |
+|:-------------------|:-----------|
+| cosine_accuracy    | 0.509      |
+| dot_accuracy       | 0.2494     |
+| manhattan_accuracy | 0.5115     |
+| euclidean_accuracy | 0.5103     |
+| **max_accuracy**   | **0.5115** |
+
+#### Semantic Similarity
+* Dataset: `label-accuracy-dev`
+* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
+
+| Metric             | Value   |
+|:-------------------|:--------|
+| pearson_cosine     | nan     |
+| spearman_cosine    | nan     |
+| pearson_manhattan  | 0.0495  |
+| spearman_manhattan | 0.0528  |
+| pearson_euclidean  | 0.0391  |
+| spearman_euclidean | 0.0433  |
+| pearson_dot        | -0.0117 |
+| spearman_dot       | -0.0093 |
+| pearson_max        | nan     |
+| **spearman_max**   | **nan** |
+
 <!--
 ## Bias, Risks and Limitations
 
 
 
 - `overwrite_output_dir`: True
 - `eval_strategy`: steps
+- `learning_rate`: 3.304439853025411e-05
 - `num_train_epochs`: 10
+- `warmup_steps`: 1
 - `load_best_model_at_end`: True
 
 #### All Hyperparameters
 
 - `gradient_accumulation_steps`: 1
 - `eval_accumulation_steps`: None
 - `torch_empty_cache_steps`: None
+- `learning_rate`: 3.304439853025411e-05
 - `weight_decay`: 0.0
 - `adam_beta1`: 0.9
 - `adam_beta2`: 0.999
 - `lr_scheduler_type`: linear
 - `lr_scheduler_kwargs`: {}
 - `warmup_ratio`: 0.0
+- `warmup_steps`: 1
 - `log_level`: passive
 - `log_level_replica`: warning
 - `log_on_each_node`: True
 </details>
 
 ### Training Logs
+| Epoch  | Step | Training Loss | stsb loss | quora loss | all-nli-triplet loss | natural-questions loss | label-accuracy-dev_spearman_max | sts-dev_spearman_max | triplet-dev_max_accuracy |
+|:------:|:----:|:-------------:|:---------:|:----------:|:--------------------:|:----------------------:|:-------------------------------:|:--------------------:|:------------------------:|
+| 0.0487 | 200  | 3.3109        | -         | -          | -                    | -                      | -                               | -                    | -                        |
+| 0.0973 | 400  | 3.5823        | -         | -          | -                    | -                      | -                               | -                    | -                        |
+| 0.1217 | 500  | -             | 4.7553    | 2.7670     | 3.4649               | 2.7670                 | nan                             | nan                  | 0.5115                   |
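The `triplet-dev_max_accuracy` logged above (and the triplet `cosine_accuracy` of roughly 0.51) is the fraction of (anchor, positive, negative) triplets in which the anchor embeds closer to the positive than to the negative; a toy stdlib sketch with made-up 2-d embeddings, not the TripletEvaluator implementation:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def cosine_triplet_accuracy(triplets):
    """Fraction of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    hits = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return hits / len(triplets)

# Toy triplets: (anchor, positive, negative) embeddings
triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),  # positive closer: hit
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),  # hit
    ([1.0, 1.0], [0.0, 1.0], [1.0, 0.9]),  # negative closer: miss
    ([0.5, 0.5], [1.0, 0.0], [0.6, 0.4]),  # negative closer: miss
]
print(cosine_triplet_accuracy(triplets))  # 0.5 (near chance, like the ~0.51 above)
```

An accuracy near 0.5 on binary triplet decisions is close to random guessing, which is consistent with the nan similarity correlations logged at step 500.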
 
 ### Framework Versions
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bbb5bce69afd7c602d494d87b143ad75ae94aac9d62a949bc27448d73486d9d5
+oid sha256:045c131132c7123db999814287f8b2d08c841dacc9bc6aa11413997282d31ac7
 size 594668880