Commit 791076d by tessimago · verified · 1 parent: 4dfea1c

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 1024,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
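
To make the pooling configuration easier to read at a glance, the following minimal sketch parses the JSON added above and lists which pooling mode is switched on. The helper name `active_pooling_modes` is hypothetical and not part of Sentence Transformers; only the standard library is used.

```python
import json

# Contents of 1_Pooling/config.json as added in this commit.
POOLING_CONFIG = """
{
  "word_embedding_dimension": 1024,
  "pooling_mode_cls_token": true,
  "pooling_mode_mean_tokens": false,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
"""


def active_pooling_modes(config):
    """Hypothetical helper: names of all pooling modes set to true."""
    return [
        key
        for key, value in config.items()
        if key.startswith("pooling_mode_") and value is True
    ]


config = json.loads(POOLING_CONFIG)
modes = active_pooling_modes(config)
print(modes)                               # ['pooling_mode_cls_token']
print(config["word_embedding_dimension"])  # 1024
```

Only the CLS-token mode is enabled, so the sentence embedding is taken from the first token's 1024-dimensional vector rather than from an average over all tokens.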
README.md ADDED
@@ -0,0 +1,787 @@
+ ---
+ base_model: BAAI/bge-large-en-v1.5
+ library_name: sentence-transformers
+ metrics:
+ - cosine_accuracy@1
+ - cosine_accuracy@3
+ - cosine_accuracy@5
+ - cosine_accuracy@10
+ - cosine_precision@1
+ - cosine_precision@3
+ - cosine_precision@5
+ - cosine_precision@10
+ - cosine_recall@1
+ - cosine_recall@3
+ - cosine_recall@5
+ - cosine_recall@10
+ - cosine_ndcg@10
+ - cosine_mrr@10
+ - cosine_map@100
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:1024
+ - loss:MultipleNegativesRankingLoss
+ widget:
+ - source_sentence: After rescue, survivors may require hospital treatment. This must
+     be provided as quickly as possible. The SMC should consider having ambulance and
+     hospital facilities ready.
+   sentences:
+   - What should the SMC consider having ready after a rescue?
+   - What is critical for mass rescue operations?
+   - What can computer programs do to relieve the search planner of computational burden?
+ - source_sentence: SMCs conduct communication searches when facts are needed to supplement
+     initially reported information. Efforts are continued to contact the craft, to
+     find out more about a possible distress situation, and to prepare for or to avoid
+     a search effort. Section 3.5 has more information on communication searches.MEDICO
+     Communications
+   sentences:
+   - What is generally produced by dead-reckoning navigation alone for search aircraft?
+   - What should be the widths of rectangular areas to be covered with a PS pattern
+     and the lengths of rectangular areas to be covered with a CS pattern?
+   - What is the purpose of SMCs conducting communication searches?
+ - source_sentence: 'SAR facilities include designated SRUs and other resources which
+     can be used to conduct or support SAR operations. An SRU is a unit composed of
+     trained personnel and provided with equipment suitable for the expeditious and
+     efficient conduct of search and rescue. An SRU can be an air, maritime, or land-based
+     facility. Facilities selected as SRUs should be able to reach the scene of distress
+     quickly and, in particular, be suitable for one or more of the following operations:–
+     providing assistance to prevent or reduce the severity of accidents and the hardship
+     of survivors, e.g., escorting an aircraft, standing by a sinking vessel;– conducting
+     a search;– delivering supplies and survival equipment to the scene;– rescuing
+     survivors;– providing food, medical or other initial needs of survivors; and–
+     delivering the survivors to a place of safety. '
+   sentences:
+   - What are the types of SAR facilities that can be used to conduct or support SAR
+     operations?
+   - What is the scenario in which a simulated communication search is carried out
+     and an air search is planned?
+   - What is discussed in detail in various other places in this Manual?
+ - source_sentence: Support facilities enable the operational response resources (e.g.,
+     the RCC and SRUs) to provide the SAR services. Without the supporting resources,
+     the operational resources cannot sustain effective operations. There is a wide
+     range of support facilities and services, which include the following:Training
+     facilities Facility maintenanceCommunications facilities Management functionsNavigation
+     systems Research and developmentSAR data providers (SDPs) PlanningMedical facilities
+     ExercisesAircraft landing fields Refuelling servicesVoluntary services (e.g.,
+     Red Cross) Critical incident stress counsellors Computer resources
+   sentences:
+   - How many ways are there to train SAR specialists and teams?
+   - What types of support facilities are mentioned in the context?
+   - What is the duration of a prolonged blast?
+ - source_sentence: 'Sound funding decisions arise out of accurate assessments made
+     of the SAR system. To measure the performance or effectiveness of a SAR system
+     usually requires collecting information or statistics and establishing agreed-upon
+     goals. All pertinent information should be collected, including where the system
+     failed to perform as it should have; failures and successes provide valuable information
+     in assessing effectiveness and determining means to improve. '
+   sentences:
+   - What is required to measure the performance or effectiveness of a SAR system?
+   - What is the purpose of having an SRR?
+   - What is the effect of decreasing track spacing on the area that can be searched?
+ model-index:
+ - name: SentenceTransformer based on BAAI/bge-large-en-v1.5
+   results:
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 768
+       type: dim_768
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.7719298245614035
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.9298245614035088
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.956140350877193
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 1.0
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.7719298245614035
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.3099415204678363
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.1912280701754386
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.1
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.7719298245614035
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.9298245614035088
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.956140350877193
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 1.0
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.8884520476480379
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.8524470899470901
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.85244708994709
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 512
+       type: dim_512
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.7543859649122807
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.9122807017543859
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.956140350877193
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.9912280701754386
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.7543859649122807
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.304093567251462
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.1912280701754386
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.09912280701754386
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.7543859649122807
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.9122807017543859
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.956140350877193
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.9912280701754386
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.8791120820747885
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.8425438596491228
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.8431704260651629
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 256
+       type: dim_256
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.7456140350877193
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.8947368421052632
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.9385964912280702
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.9649122807017544
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.7456140350877193
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.2982456140350877
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.18771929824561406
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.09649122807017543
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.7456140350877193
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.8947368421052632
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.9385964912280702
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.9649122807017544
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.8623224236283672
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.8287628794207742
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.8310819942011893
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 128
+       type: dim_128
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.7017543859649122
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.8245614035087719
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.8771929824561403
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.9385964912280702
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.7017543859649122
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.27485380116959063
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.17543859649122803
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.09385964912280703
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.7017543859649122
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.8245614035087719
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.8771929824561403
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.9385964912280702
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.8146917044508328
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.7757031467557786
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.7788889950899075
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 64
+       type: dim_64
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.6228070175438597
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.7543859649122807
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.7894736842105263
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.8596491228070176
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.6228070175438597
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.25146198830409355
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.15789473684210523
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.08596491228070174
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.6228070175438597
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.7543859649122807
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.7894736842105263
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.8596491228070176
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.7406737402395112
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.703104984683932
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.71092932980045
+       name: Cosine Map@100
+ ---
+
+ # SentenceTransformer based on BAAI/bge-large-en-v1.5
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) on the json dataset. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) <!-- at revision d4aa6901d3a41ba39fb536a557fa166f842b0e09 -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 1024 dimensions
+ - **Similarity Function:** Cosine Similarity
+ - **Training Dataset:**
+     - json
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
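
The architecture above chains a transformer, CLS-token pooling, and L2 normalization. As a rough illustration of the last two stages only, here is a pure-Python sketch on a toy input; the function names and the 3-token, 4-dimensional input are made up for the example, and the real modules operate on batched tensors.

```python
import math


def cls_pool(token_embeddings):
    """Return the vector of the first ([CLS]) token of one sequence."""
    return list(token_embeddings[0])


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit Euclidean length (eps guards against a zero norm)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]


# Toy transformer output: 3 tokens x 4 dimensions.
# The real model yields up to 512 tokens x 1024 dimensions.
token_embeddings = [
    [3.0, 4.0, 0.0, 0.0],  # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 2.0, 0.0, 2.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)  # [0.6, 0.8, 0.0, 0.0]
```

Because the output has unit length, the cosine similarity of two embeddings reduces to their dot product.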
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("tessimago/bge-large-repmus-cross_entropy")
+ # Run inference
+ sentences = [
+     'Sound funding decisions arise out of accurate assessments made of the SAR system. To measure the performance or effectiveness of a SAR system usually requires collecting information or statistics and establishing agreed-upon goals. All pertinent information should be collected, including where the system failed to perform as it should have; failures and successes provide valuable information in assessing effectiveness and determining means to improve. ',
+     'What is required to measure the performance or effectiveness of a SAR system?',
+     'What is the effect of decreasing track spacing on the area that can be searched?',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # (3, 1024)
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # torch.Size([3, 3])
+ ```
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Information Retrieval
+ * Dataset: `dim_768`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.7719     |
+ | cosine_accuracy@3   | 0.9298     |
+ | cosine_accuracy@5   | 0.9561     |
+ | cosine_accuracy@10  | 1.0        |
+ | cosine_precision@1  | 0.7719     |
+ | cosine_precision@3  | 0.3099     |
+ | cosine_precision@5  | 0.1912     |
+ | cosine_precision@10 | 0.1        |
+ | cosine_recall@1     | 0.7719     |
+ | cosine_recall@3     | 0.9298     |
+ | cosine_recall@5     | 0.9561     |
+ | cosine_recall@10    | 1.0        |
+ | cosine_ndcg@10      | 0.8885     |
+ | cosine_mrr@10       | 0.8524     |
+ | **cosine_map@100**  | **0.8524** |
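
In the table above, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k (for example 0.9298 / 3 ≈ 0.3099). That pattern is what one would expect if every query has exactly one relevant passage, although the card does not say so explicitly. The values are also consistent with an evaluation set of 114 queries (88 / 114 ≈ 0.7719), which is likewise an inference rather than a stated fact. The sketch below shows how these metrics relate under the one-relevant-passage assumption; `retrieval_metrics` and the toy ranks are hypothetical and unrelated to the evaluator's actual code.

```python
def retrieval_metrics(ranks, k):
    """Compute accuracy@k, precision@k, recall@k and MRR@k.

    `ranks` holds, per query, the 1-based rank of its single relevant
    passage, or None if that passage was not retrieved at all.
    """
    n = len(ranks)
    hits = [r for r in ranks if r is not None and r <= k]
    accuracy = len(hits) / n              # share of queries with a hit in the top k
    precision = len(hits) / (k * n)       # at most 1 of the k results can be relevant
    recall = accuracy                     # only one relevant passage per query
    mrr = sum(1.0 / r for r in hits) / n  # reciprocal rank, 0 for misses
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "mrr": mrr}


toy_ranks = [1, 1, 2, 4]  # hypothetical ranks for four queries
metrics_at_3 = retrieval_metrics(toy_ranks, k=3)
print(metrics_at_3)
# {'accuracy': 0.75, 'precision': 0.25, 'recall': 0.75, 'mrr': 0.625}
```

With a single relevant passage per query, precision@k is bounded by 1/k, which is why precision@10 cannot exceed 0.1 in any of the tables.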
+
+ #### Information Retrieval
+ * Dataset: `dim_512`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.7544     |
+ | cosine_accuracy@3   | 0.9123     |
+ | cosine_accuracy@5   | 0.9561     |
+ | cosine_accuracy@10  | 0.9912     |
+ | cosine_precision@1  | 0.7544     |
+ | cosine_precision@3  | 0.3041     |
+ | cosine_precision@5  | 0.1912     |
+ | cosine_precision@10 | 0.0991     |
+ | cosine_recall@1     | 0.7544     |
+ | cosine_recall@3     | 0.9123     |
+ | cosine_recall@5     | 0.9561     |
+ | cosine_recall@10    | 0.9912     |
+ | cosine_ndcg@10      | 0.8791     |
+ | cosine_mrr@10       | 0.8425     |
+ | **cosine_map@100**  | **0.8432** |
+
+ #### Information Retrieval
+ * Dataset: `dim_256`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.7456     |
+ | cosine_accuracy@3   | 0.8947     |
+ | cosine_accuracy@5   | 0.9386     |
+ | cosine_accuracy@10  | 0.9649     |
+ | cosine_precision@1  | 0.7456     |
+ | cosine_precision@3  | 0.2982     |
+ | cosine_precision@5  | 0.1877     |
+ | cosine_precision@10 | 0.0965     |
+ | cosine_recall@1     | 0.7456     |
+ | cosine_recall@3     | 0.8947     |
+ | cosine_recall@5     | 0.9386     |
+ | cosine_recall@10    | 0.9649     |
+ | cosine_ndcg@10      | 0.8623     |
+ | cosine_mrr@10       | 0.8288     |
+ | **cosine_map@100**  | **0.8311** |
+
+ #### Information Retrieval
+ * Dataset: `dim_128`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.7018     |
+ | cosine_accuracy@3   | 0.8246     |
+ | cosine_accuracy@5   | 0.8772     |
+ | cosine_accuracy@10  | 0.9386     |
+ | cosine_precision@1  | 0.7018     |
+ | cosine_precision@3  | 0.2749     |
+ | cosine_precision@5  | 0.1754     |
+ | cosine_precision@10 | 0.0939     |
+ | cosine_recall@1     | 0.7018     |
+ | cosine_recall@3     | 0.8246     |
+ | cosine_recall@5     | 0.8772     |
+ | cosine_recall@10    | 0.9386     |
+ | cosine_ndcg@10      | 0.8147     |
+ | cosine_mrr@10       | 0.7757     |
+ | **cosine_map@100**  | **0.7789** |
+
+ #### Information Retrieval
+ * Dataset: `dim_64`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.6228     |
+ | cosine_accuracy@3   | 0.7544     |
+ | cosine_accuracy@5   | 0.7895     |
+ | cosine_accuracy@10  | 0.8596     |
+ | cosine_precision@1  | 0.6228     |
+ | cosine_precision@3  | 0.2515     |
+ | cosine_precision@5  | 0.1579     |
+ | cosine_precision@10 | 0.086      |
+ | cosine_recall@1     | 0.6228     |
+ | cosine_recall@3     | 0.7544     |
+ | cosine_recall@5     | 0.7895     |
+ | cosine_recall@10    | 0.8596     |
+ | cosine_ndcg@10      | 0.7407     |
+ | cosine_mrr@10       | 0.7031     |
+ | **cosine_map@100**  | **0.7109** |
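
The dataset names `dim_768` through `dim_64` suggest that each evaluation used only the first N components of the 1024-dimensional embedding. The card does not describe the procedure, so this reading is an assumption. Under that assumption, truncation followed by re-normalization can be sketched as below; the helper names and the 6-dimensional toy vectors are hypothetical.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else list(head)


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


# Toy 6-dimensional stand-ins for the model's 1024-dimensional outputs.
query = [0.5, 0.5, 0.5, 0.5, 0.0, 0.0]
passage = [0.5, 0.5, 0.0, 0.0, 0.5, 0.5]

full_score = cosine(query, passage)
short_query = truncate_and_normalize(query, 2)
short_passage = truncate_and_normalize(passage, 2)
short_score = cosine(short_query, short_passage)
print(full_score, short_score)
```

The two scores differ because the discarded components carried information, which is in line with the gradual drop in the tables from `dim_768` to `dim_64`. Sentence Transformers offers a `truncate_dim` argument on the `SentenceTransformer` constructor that should give a comparable effect at encode time; check the documentation of the installed version before relying on it.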
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### json
+
+ * Dataset: json
+ * Size: 1,024 training samples
+ * Columns: <code>positive</code> and <code>anchor</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | positive | anchor |
+   |:--------|:---------|:-------|
+   | type    | string   | string |
+   | details | <ul><li>min: 10 tokens</li><li>mean: 133.58 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 17.7 tokens</li><li>max: 39 tokens</li></ul> |
+ * Samples:
+   | positive | anchor |
+   |:---------|:-------|
+   | <code>The debriefing helps to ensure that all survivors are rescued, to attend to the physical welfare of each survivor, and to obtain information which may assist and improve SAR services. Proper debriefing techniques include:– due care to avoid worsening a survivor’s condition by excessive debriefing;– careful assessment of the survivor’s statements if the survivor is frightened or excited;– use of a calm voice in questioning;– avoidance of suggesting the answers when obtaining facts; and– explaining that the information requested is important for the success of the SAR operation, and possibly for future SAR operations.</code> | <code>What are some proper debriefing techniques used in SAR services?</code> |
+   | <code>Communicating with passengers is more difficult in remote areas where phone service may be inadequate or lacking. If phones do exist, calling the airline or shipping company may be the best way to check in and find out information. In more populated areas, local agencies may have an emergency evacuation plan or other useful plan that can be implemented.IE961E.indb 21 6/28/2013 10:29:55 AM</code> | <code>What is a good way to check in and find out information in remote areas where phone service may be inadequate or lacking?</code> |
+   | <code>Voice communication is the basis of telemedical advice. It allows free dialogue and contributes to the human relationship, which is crucial to any medical consultation. Text messages are a useful complement to the voice telemedical advice and add the reliability of writing. Facsimile allows the exchange of pictures or diagrams, which help to identify a symptom, describe a lesion or the method of treatment. Digital data transmissions (photographs or electrocardiogram) provide an objective and potentially crucial addition to descriptive and subjective clinical data.</code> | <code>What are the types of communication methods used in telemedical advice?</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
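
With `scale` 20.0 and `cos_sim`, the loss named above can be read as a cross-entropy over scaled cosine similarities: each anchor's own positive is the target class, and the positives of the other pairs in the batch serve as negatives. The following pure-Python sketch illustrates that idea on toy vectors. It is not the library's implementation, and the function names `cos_sim` and `mnr_loss` are chosen here for illustration.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        peak = max(logits)  # subtract the maximum for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]  # each positive matches its anchor
swapped = [[0.0, 1.0], [1.0, 0.0]]  # each positive matches the other anchor

loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

The aligned batch yields a loss close to zero, while the swapped batch yields a loss close to the scale value of 20, which shows how strongly the scale factor penalizes ranking a wrong passage first.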
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: epoch
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 16
+ - `gradient_accumulation_steps`: 16
+ - `learning_rate`: 2e-05
+ - `num_train_epochs`: 4
+ - `lr_scheduler_type`: cosine
+ - `warmup_ratio`: 0.1
+ - `bf16`: True
+ - `tf32`: True
+ - `load_best_model_at_end`: True
+ - `optim`: adamw_torch_fused
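
These settings imply a very small number of optimizer updates. The arithmetic below is a sketch that assumes a single training device (the card does not state the device count) and that warmup steps are rounded up from `warmup_ratio`; the schedule function follows the usual linear-warmup, cosine-decay shape and is written from scratch rather than taken from the training code.

```python
import math

train_samples = 1024   # from the dataset section of this card
per_device_batch = 32
grad_accum = 16
epochs = 4
base_lr = 2e-5
warmup_ratio = 0.1
num_devices = 1        # assumption: not stated in the card

effective_batch = per_device_batch * grad_accum * num_devices
batches_per_epoch = math.ceil(train_samples / (per_device_batch * num_devices))
steps_per_epoch = max(batches_per_epoch // grad_accum, 1)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(warmup_ratio * total_steps)  # assumption: rounded up


def lr_at(step):
    """Learning rate after `step` optimizer updates (linear warmup, cosine decay)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, steps_per_epoch, total_steps, warmup_steps)  # 512 2 8 1
print([lr_at(s) for s in range(total_steps + 1)])
```

Two updates per epoch and eight in total would match the step column of a log that evaluates once per epoch at steps 2, 4, 6, and 8. Note that gradient accumulation enlarges the effective batch for the optimizer but not the pool of in-batch negatives, which stays tied to the per-device batch of 32.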
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: epoch
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 16
+ - `eval_accumulation_steps`: None
+ - `learning_rate`: 2e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 4
+ - `max_steps`: -1
+ - `lr_scheduler_type`: cosine
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: True
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: True
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch_fused
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
722
+
+ ### Training Logs
+ | Epoch | Step | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
+ |:-------:|:-----:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|
+ | 1.0 | 2 | 0.7770 | 0.8173 | 0.8316 | 0.6838 | 0.8448 |
+ | **2.0** | **4** | **0.7858** | **0.8221** | **0.8326** | **0.6993** | **0.8478** |
+ | 3.0 | 6 | 0.7801 | 0.8297 | 0.8412 | 0.7101 | 0.8517 |
+ | 4.0 | 8 | 0.7789 | 0.8311 | 0.8432 | 0.7109 | 0.8524 |
+
+ * The bold row denotes the saved checkpoint.
+
+ ### Framework Versions
+ - Python: 3.10.14
+ - Sentence Transformers: 3.1.0
+ - Transformers: 4.41.2
+ - PyTorch: 2.1.2+cu121
+ - Accelerate: 0.34.2
+ - Datasets: 2.19.1
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
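
The Training Logs table in the README reports cosine MAP@100 at embedding sizes 64, 128, 256, 512 and 768, which suggests the embeddings are evaluated after being cut down to a prefix of their dimensions. The sketch below illustrates that idea only; it assumes prefix truncation followed by L2 re-normalisation, which the commit does not show. `truncate_and_normalize` and `cosine` are hypothetical helpers, and the two vectors are random stand-ins rather than real model outputs.

```python
import math
import random


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Random stand-ins for two 1024-dimensional sentence embeddings (assumption).
random.seed(0)
full_a = [random.gauss(0.0, 1.0) for _ in range(1024)]
full_b = [random.gauss(0.0, 1.0) for _ in range(1024)]

# Dimensions taken from the column names of the Training Logs table.
EVAL_DIMS = [768, 512, 256, 128, 64]

scores = {
    dim: cosine(truncate_and_normalize(full_a, dim),
                truncate_and_normalize(full_b, dim))
    for dim in EVAL_DIMS
}
for dim, score in scores.items():
    print(dim, round(score, 4))
```

Smaller prefixes cost less to store and compare, and the table shows how much retrieval quality is given up at each size.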
config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "_name_or_path": "BAAI/bge-large-en-v1.5",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1024,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.41.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.1.0",
+     "transformers": "4.41.2",
+     "pytorch": "2.1.2+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a2e031c5bdabb4864eeb7e99d55dd3bc37b6359905d73b5d7b2ca763d1c5d1f4
+ size 1340612432
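
As a rough sanity check, the float32 weight size implied by `config.json` can be compared with the `size` recorded in the LFS pointer above. The sketch below assumes the standard `BertModel` layout (embeddings, 24 encoder layers, and a pooler layer whose weights are stored in the checkpoint) and 4 bytes per parameter, as `torch_dtype: float32` suggests. The helper names are hypothetical, and treating the leftover bytes as file header or metadata is an assumption, not something the commit states.

```python
# Values copied from config.json in this commit.
VOCAB_SIZE = 30522
HIDDEN = 1024
INTERMEDIATE = 4096
LAYERS = 24
MAX_POSITIONS = 512
TYPE_VOCAB = 2

# Size in bytes from the model.safetensors LFS pointer.
SAFETENSORS_BYTES = 1340612432
BYTES_PER_FLOAT32 = 4


def linear(n_in, n_out):
    """Parameters of a dense layer: weight matrix plus bias."""
    return n_in * n_out + n_out


def layer_norm(size):
    """Parameters of a LayerNorm: scale plus shift."""
    return 2 * size


def bert_parameter_count():
    """Estimate the parameter count of a standard BERT encoder with pooler."""
    embeddings = (
        VOCAB_SIZE * HIDDEN        # word embeddings
        + MAX_POSITIONS * HIDDEN   # absolute position embeddings
        + TYPE_VOCAB * HIDDEN      # token type embeddings
        + layer_norm(HIDDEN)
    )
    per_layer = (
        4 * linear(HIDDEN, HIDDEN)      # query, key, value, attention output
        + layer_norm(HIDDEN)
        + linear(HIDDEN, INTERMEDIATE)  # feed-forward up-projection
        + linear(INTERMEDIATE, HIDDEN)  # feed-forward down-projection
        + layer_norm(HIDDEN)
    )
    pooler = linear(HIDDEN, HIDDEN)     # assumption: pooler weights are stored
    return embeddings + LAYERS * per_layer + pooler


total_params = bert_parameter_count()
weight_bytes = total_params * BYTES_PER_FLOAT32
overhead_bytes = SAFETENSORS_BYTES - weight_bytes
print(total_params, weight_bytes, overhead_bytes)
```

Under these assumptions the estimate is 335,141,888 parameters (1,340,567,552 bytes), leaving 44,880 bytes of the file unaccounted for by weights.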
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
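
`modules.json` lists three stages applied in order: a Transformer, a Pooling module, and a Normalize module, and `1_Pooling/config.json` earlier in this commit enables `pooling_mode_cls_token`. The following sketch shows what the last two stages amount to in plain Python. It is an illustration under assumptions, not the library's implementation: the token vectors are a toy stand-in for the Transformer output, and `cls_pool`, `l2_normalize` and `encode` are hypothetical names.

```python
import math


def cls_pool(token_embeddings):
    """CLS pooling: use the first token's vector as the sentence embedding."""
    return list(token_embeddings[0])


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def encode(token_embeddings):
    """Apply stage 1 (pooling) and stage 2 (normalisation) to stage-0 output."""
    return l2_normalize(cls_pool(token_embeddings))


# Toy stand-in for Transformer output: 3 tokens with 4 dimensions each.
# The first row plays the role of the [CLS] token.
toy_tokens = [
    [3.0, 4.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 2.0, 0.0, 2.0],
]

sentence_embedding = encode(toy_tokens)
print(sentence_embedding)  # → [0.6, 0.8, 0.0, 0.0]
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity.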
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": true
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
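
The special-token ids in `added_tokens_decoder` determine how a single input sequence is laid out before it reaches the model. The sketch below illustrates that layout using the ids and `model_max_length` from the config above. `wrap_and_pad` is a hypothetical helper, not part of any tokenizer API, and the word-piece ids passed to it are arbitrary placeholders; a real `BertTokenizer` performs these steps internally.

```python
# Special-token ids from `added_tokens_decoder` in tokenizer_config.json.
PAD_ID = 0
UNK_ID = 100
CLS_ID = 101
SEP_ID = 102
MASK_ID = 103
MODEL_MAX_LENGTH = 512


def wrap_and_pad(token_ids, target_length, max_length=MODEL_MAX_LENGTH):
    """Hypothetical helper: add [CLS]/[SEP], truncate, then pad with [PAD].

    Returns (input_ids, attention_mask), both of length `target_length`.
    """
    if not 2 <= target_length <= max_length:
        raise ValueError("target_length must be between 2 and max_length")
    # Reserve two positions for [CLS] and [SEP]; drop tokens that do not fit.
    body = list(token_ids[: target_length - 2])
    input_ids = [CLS_ID] + body + [SEP_ID]
    attention_mask = [1] * len(input_ids)
    padding = target_length - len(input_ids)
    return input_ids + [PAD_ID] * padding, attention_mask + [0] * padding


# Arbitrary placeholder ids standing in for three word pieces.
ids, mask = wrap_and_pad([2000, 2001, 2002], target_length=8)
print(ids)   # → [101, 2000, 2001, 2002, 102, 0, 0, 0]
print(mask)  # → [1, 1, 1, 1, 1, 0, 0, 0]
```

The attention mask marks padded positions with 0 so that the model can ignore them.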
vocab.txt ADDED
The diff for this file is too large to render. See raw diff