Commit e2644f6 · verified · committed by vineet10 · 1 parent: 23b4bb3

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
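Reviewer note: the pooling config above enables only `pooling_mode_cls_token`, and the module list in this commit ends with a `Normalize` step. The following is a minimal sketch of what that combination amounts to, written in plain Python on toy data. The function names `cls_pool` and `l2_normalize` are illustrative assumptions, not the library's API, and the 4-dimensional vectors stand in for the real 768-dimensional ones.

```python
import math


def cls_pool(token_embeddings):
    """Use the first token's vector (the [CLS] position) as the sentence embedding."""
    return list(token_embeddings[0])


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit length, guarding against division by zero."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]


# Toy token embeddings: 3 tokens x 4 dimensions (hypothetical values).
tokens = [
    [3.0, 4.0, 0.0, 0.0],  # [CLS]
    [1.0, 1.0, 1.0, 1.0],  # a word piece
    [0.5, 0.0, 0.5, 0.0],  # [SEP]
]

sentence_embedding = l2_normalize(cls_pool(tokens))
print(sentence_embedding)  # [0.6, 0.8, 0.0, 0.0]
```

Because the output has unit length, a dot product between two such embeddings equals their cosine similarity.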
README.md ADDED
@@ -0,0 +1,767 @@
+ ---
+ base_model: BAAI/bge-base-en-v1.5
+ datasets: []
+ language: []
+ library_name: sentence-transformers
+ metrics:
+ - cosine_accuracy@1
+ - cosine_accuracy@3
+ - cosine_accuracy@5
+ - cosine_accuracy@10
+ - cosine_precision@1
+ - cosine_precision@3
+ - cosine_precision@5
+ - cosine_precision@10
+ - cosine_recall@1
+ - cosine_recall@3
+ - cosine_recall@5
+ - cosine_recall@10
+ - cosine_ndcg@10
+ - cosine_mrr@10
+ - cosine_map@100
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:48
+ - loss:MultipleNegativesRankingLoss
+ widget:
+ - source_sentence: Users are entitled to a refund for excess payments after necessary
+     deductions, provided that payments were not processed to a wrong account due to
+     user error.
+   sentences:
+   - What is the timeline for the delivery of the documentary film as outlined in this
+     contract?
+   - Under what circumstances can a user receive a refund for multiple payments made
+     for a single order?
+   - What are the Payment Terms for the Batteries?
+ - source_sentence: Users can contact Customer Care before confirmation to request
+     a refund for offline services or reschedule for online services, subject to the
+     platform's discretion.
+   sentences:
+   - How does Paratalks handle refund requests made before a service professional confirms
+     a booking?
+   - How should proprietary and confidential information disclosed under the Agreement
+     be treated by the Parties?
+   - When does this Agreement terminate?
+ - source_sentence: If there is any unreasonable delay in the refund process, the User
+     can report it to Customer Care at [email protected] or +91-9116768791.
+   sentences:
+   - What should a User do if there is an unreasonable delay in the refund process?
+   - What are the confidentiality provisions in this contract?
+   - What are the specified payment terms for the photography services under this contract?
+ - source_sentence: The refund (if permitted by the Platform) shall be processed after
+     deductions, which may include transaction charges levied by the bank and/or the
+     payment gateway, as well as any other charges incurred by the Platform for facilitating
+     the payment or refund.
+   sentences:
+   - What are the conditions under which a user is not entitled to a refund according
+     to Paratalks' refund policy?
+   - What is the jurisdiction and governing law applicable to this contract?
+   - How are refunds processed if permitted by the Platform?
+ - source_sentence: This Agreement shall be governed by and construed in accordance
+     with the laws of Indiana. Any dispute arising out of or in connection with this
+     Agreement shall be resolved through good faith negotiations between the Parties
+     and will be subject to the jurisdiction of the courts of Dania.
+   sentences:
+   - Under what condition will the User not be entitled to a refund if the payment
+     is processed to a wrong Account?
+   - What events constitute Force Majeure under this Agreement?
+   - Under which laws is the Battery Supply Agreement governed and how are disputes
+     resolved?
+ model-index:
+ - name: SentenceTransformer based on BAAI/bge-base-en-v1.5
+   results:
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 768
+       type: dim_768
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.8333333333333334
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.8333333333333334
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.8333333333333334
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 1.0
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.8333333333333334
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.27777777777777773
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.16666666666666666
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.09999999999999999
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.8333333333333334
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.8333333333333334
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.8333333333333334
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 1.0
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.892701197851337
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.8611111111111112
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.8611111111111112
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 512
+       type: dim_512
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.8333333333333334
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.8333333333333334
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.8333333333333334
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 1.0
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.8333333333333334
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.27777777777777773
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.16666666666666666
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.09999999999999999
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.8333333333333334
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.8333333333333334
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.8333333333333334
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 1.0
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.892701197851337
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.8611111111111112
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.8611111111111112
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 256
+       type: dim_256
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.8333333333333334
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.8333333333333334
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.8333333333333334
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 1.0
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.8333333333333334
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.27777777777777773
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.16666666666666666
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.09999999999999999
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.8333333333333334
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.8333333333333334
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.8333333333333334
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 1.0
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.892701197851337
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.8611111111111112
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.8611111111111112
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 128
+       type: dim_128
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.8333333333333334
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.8333333333333334
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.8333333333333334
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 1.0
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.8333333333333334
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.27777777777777773
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.16666666666666666
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.09999999999999999
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.8333333333333334
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.8333333333333334
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.8333333333333334
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 1.0
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.8859108127976215
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.8541666666666666
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.8541666666666666
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 64
+       type: dim_64
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.8333333333333334
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.8333333333333334
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.8333333333333334
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 1.0
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.8333333333333334
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.27777777777777773
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.16666666666666666
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.09999999999999999
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.8333333333333334
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.8333333333333334
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.8333333333333334
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 1.0
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.8835049992773302
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.8518518518518517
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.8518518518518517
+       name: Cosine Map@100
+ ---
+
+ # SentenceTransformer based on BAAI/bge-base-en-v1.5
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("vineet10/fm1")
+ # Run inference
+ sentences = [
+     'This Agreement shall be governed by and construed in accordance with the laws of Indiana. Any dispute arising out of or in connection with this Agreement shall be resolved through good faith negotiations between the Parties and will be subject to the jurisdiction of the courts of Dania.',
+     'Under which laws is the Battery Supply Agreement governed and how are disputes resolved?',
+     'What events constitute Force Majeure under this Agreement?',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Information Retrieval
+ * Dataset: `dim_768`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.8333     |
+ | cosine_accuracy@3   | 0.8333     |
+ | cosine_accuracy@5   | 0.8333     |
+ | cosine_accuracy@10  | 1.0        |
+ | cosine_precision@1  | 0.8333     |
+ | cosine_precision@3  | 0.2778     |
+ | cosine_precision@5  | 0.1667     |
+ | cosine_precision@10 | 0.1        |
+ | cosine_recall@1     | 0.8333     |
+ | cosine_recall@3     | 0.8333     |
+ | cosine_recall@5     | 0.8333     |
+ | cosine_recall@10    | 1.0        |
+ | cosine_ndcg@10      | 0.8927     |
+ | cosine_mrr@10       | 0.8611     |
+ | **cosine_map@100**  | **0.8611** |
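Reviewer note: the `dim_768` figures can be reproduced by hand under one assumption that the card does not state, namely that the evaluation set has six queries with exactly one relevant passage each, five of which are ranked first and one ranked sixth. The sketch below is a worked check under that assumption, not the evaluator's implementation. The rank list and function names are hypothetical.

```python
import math


def accuracy_at_k(ranks, k):
    """Fraction of queries whose relevant passage appears in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mrr_at_k(ranks, k):
    """Mean reciprocal rank, counting only hits inside the top k."""
    return sum(1.0 / r for r in ranks if r <= k) / len(ranks)


def ndcg_at_k(ranks, k):
    """NDCG with one relevant passage per query, so the ideal DCG is 1."""
    return sum(1.0 / math.log2(r + 1) for r in ranks if r <= k) / len(ranks)


# Hypothetical 1-based rank of the single relevant passage for each query.
ranks = [1, 1, 1, 1, 1, 6]

print(round(accuracy_at_k(ranks, 1), 4))      # 0.8333
print(round(accuracy_at_k(ranks, 10), 4))     # 1.0
print(round(accuracy_at_k(ranks, 3) / 3, 4))  # precision@3 -> 0.2778
print(round(mrr_at_k(ranks, 10), 4))          # 0.8611
print(round(ndcg_at_k(ranks, 10), 4))         # 0.8927
```

With a single relevant passage per query, MAP@100 coincides with MRR, which matches the identical `cosine_mrr@10` and `cosine_map@100` values in the table. Other rank distributions could produce the same numbers, so this is a consistency check only.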
+
+ #### Information Retrieval
+ * Dataset: `dim_512`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.8333     |
+ | cosine_accuracy@3   | 0.8333     |
+ | cosine_accuracy@5   | 0.8333     |
+ | cosine_accuracy@10  | 1.0        |
+ | cosine_precision@1  | 0.8333     |
+ | cosine_precision@3  | 0.2778     |
+ | cosine_precision@5  | 0.1667     |
+ | cosine_precision@10 | 0.1        |
+ | cosine_recall@1     | 0.8333     |
+ | cosine_recall@3     | 0.8333     |
+ | cosine_recall@5     | 0.8333     |
+ | cosine_recall@10    | 1.0        |
+ | cosine_ndcg@10      | 0.8927     |
+ | cosine_mrr@10       | 0.8611     |
+ | **cosine_map@100**  | **0.8611** |
+
+ #### Information Retrieval
+ * Dataset: `dim_256`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.8333     |
+ | cosine_accuracy@3   | 0.8333     |
+ | cosine_accuracy@5   | 0.8333     |
+ | cosine_accuracy@10  | 1.0        |
+ | cosine_precision@1  | 0.8333     |
+ | cosine_precision@3  | 0.2778     |
+ | cosine_precision@5  | 0.1667     |
+ | cosine_precision@10 | 0.1        |
+ | cosine_recall@1     | 0.8333     |
+ | cosine_recall@3     | 0.8333     |
+ | cosine_recall@5     | 0.8333     |
+ | cosine_recall@10    | 1.0        |
+ | cosine_ndcg@10      | 0.8927     |
+ | cosine_mrr@10       | 0.8611     |
+ | **cosine_map@100**  | **0.8611** |
+
+ #### Information Retrieval
+ * Dataset: `dim_128`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.8333     |
+ | cosine_accuracy@3   | 0.8333     |
+ | cosine_accuracy@5   | 0.8333     |
+ | cosine_accuracy@10  | 1.0        |
+ | cosine_precision@1  | 0.8333     |
+ | cosine_precision@3  | 0.2778     |
+ | cosine_precision@5  | 0.1667     |
+ | cosine_precision@10 | 0.1        |
+ | cosine_recall@1     | 0.8333     |
+ | cosine_recall@3     | 0.8333     |
+ | cosine_recall@5     | 0.8333     |
+ | cosine_recall@10    | 1.0        |
+ | cosine_ndcg@10      | 0.8859     |
+ | cosine_mrr@10       | 0.8542     |
+ | **cosine_map@100**  | **0.8542** |
+
+ #### Information Retrieval
+ * Dataset: `dim_64`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.8333     |
+ | cosine_accuracy@3   | 0.8333     |
+ | cosine_accuracy@5   | 0.8333     |
+ | cosine_accuracy@10  | 1.0        |
+ | cosine_precision@1  | 0.8333     |
+ | cosine_precision@3  | 0.2778     |
+ | cosine_precision@5  | 0.1667     |
+ | cosine_precision@10 | 0.1        |
+ | cosine_recall@1     | 0.8333     |
+ | cosine_recall@3     | 0.8333     |
+ | cosine_recall@5     | 0.8333     |
+ | cosine_recall@10    | 1.0        |
+ | cosine_ndcg@10      | 0.8835     |
+ | cosine_mrr@10       | 0.8519     |
+ | **cosine_map@100**  | **0.8519** |
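Reviewer note: the `dim_512` through `dim_64` tables suggest the evaluator scored shortened embeddings, although the card does not say how they were produced. A common way to do this, assumed here, is to keep the first `dim` components and re-normalize before taking cosine similarity. The sketch below shows that operation on toy vectors. The helper names and values are hypothetical.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 8-dimensional embeddings standing in for 768-dimensional ones.
query = [0.9, 0.1, 0.3, 0.0, 0.2, 0.1, 0.0, 0.1]
passage = [0.8, 0.2, 0.4, 0.1, 0.0, 0.3, 0.1, 0.0]

for dim in (8, 4, 2):
    q = truncate_and_normalize(query, dim)
    p = truncate_and_normalize(passage, dim)
    print(dim, round(cosine(q, p), 4))
```

Since the training loss listed in this card is `MultipleNegativesRankingLoss` alone, there is no indication that the model was trained specifically to keep quality under truncation. The small drop at 128 and 64 dimensions is on a very small evaluation set and should be read with that in mind.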
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+
+ * Size: 48 training samples
+ * Columns: <code>context</code> and <code>question</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | context                                                                            | question                                                                         |
+   |:--------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|
+   | type    | string                                                                             | string                                                                           |
+   | details | <ul><li>min: 18 tokens</li><li>mean: 39.58 tokens</li><li>max: 85 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 17.9 tokens</li><li>max: 32 tokens</li></ul> |
+ * Samples:
+   | context | question |
+   |:--------|:---------|
+   | <code>The Client will pay a flat fee of Rs. 52,000/-, with 50% (Rs. 26,000/-) due upon signing the agreement and the remaining 50% due one week after completion of pre-production. Payment delays will result in proportional delays in data delivery and editing.</code> | <code>What are the specified payment terms for the photography services under this contract?</code> |
+   | <code>Users can report delays to Customer Care and expect an automatic refund within 3-4 business days if services are canceled or rescheduled by the platform.</code> | <code>What actions can a user take if the platform is unable to fulfill a successfully placed order?</code> |
+   | <code>Signed by James Hira, Managing Director of Electric Vehicle Battery Supplier Pvt. Ltd, and Managing Director of Best Car Manufacturer Pvt. Ltd</code> | <code>Who signed the Battery Supply Agreement on behalf of the Supplier and the Manufacturer?</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
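Reviewer note: the loss parameters above (`scale` of 20.0 with cosine similarity) describe an in-batch ranking objective in which each context should score highest against its own question and every other question in the batch acts as a negative. The following is a minimal plain-Python sketch of that idea on toy 2-dimensional vectors. It is an illustration under those assumptions, not the library's implementation, and the function names are hypothetical.

```python
import math


def cos_sim(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where row i's correct 'class' is positive i."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the whole batch.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


# Toy batch: two (context, question) embedding pairs.
contexts = [[1.0, 0.0], [0.0, 1.0]]
questions_aligned = [[1.0, 0.0], [0.0, 1.0]]
questions_swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = in_batch_ranking_loss(contexts, questions_aligned)
loss_swapped = in_batch_ranking_loss(contexts, questions_swapped)
print(loss_aligned < loss_swapped)  # True
```

The scale factor sharpens the softmax: with cosine values bounded in [-1, 1], multiplying by 20 lets a well-separated pair drive the loss close to zero.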
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `num_train_epochs`: 5
+ - `warmup_ratio`: 0.1
+ - `fp16`: True
+ - `batch_sampler`: no_duplicates
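Reviewer note: combined with the 48 training samples, these settings imply a very short run. The sketch below works out the step budget and a linear warmup-then-decay schedule, assuming a single device, no gradient accumulation, full batches from the sampler, the 5e-05 learning rate listed under the full hyperparameters, and warmup steps rounded up from `warmup_ratio`. The function name is hypothetical and the numbers are an estimate, not taken from the training logs.

```python
import math


def linear_schedule_with_warmup(step, total_steps, warmup_steps, base_lr):
    """Learning rate that rises linearly during warmup, then decays linearly to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


num_samples, batch_size, epochs = 48, 16, 5
base_lr, warmup_ratio = 5e-05, 0.1

steps_per_epoch = math.ceil(num_samples / batch_size)  # 3
total_steps = steps_per_epoch * epochs                 # 15
warmup_steps = math.ceil(warmup_ratio * total_steps)   # 2

for step in range(total_steps + 1):
    lr = linear_schedule_with_warmup(step, total_steps, warmup_steps, base_lr)
    print(step, f"{lr:.2e}")
```

Under these assumptions the peak learning rate is reached at step 2 and the whole run is about 15 optimizer steps, which is worth keeping in mind when reading the evaluation numbers.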
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 5
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: True
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
+ ### Training Logs
+ | Epoch | Step | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
+ |:-----:|:----:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|
+ | 0     | 0    | 0.8542                 | 0.8611                 | 0.8611                 | 0.8519                | 0.8611                 |
+
+
+ ### Framework Versions
+ - Python: 3.10.12
+ - Sentence Transformers: 3.0.1
+ - Transformers: 4.42.4
+ - PyTorch: 2.3.1+cu121
+ - Accelerate: 0.32.1
+ - Datasets: 2.20.0
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "_name_or_path": "BAAI/bge-base-en-v1.5",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.42.4",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.0.1",
+     "transformers": "4.42.4",
+     "pytorch": "2.3.1+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7d52035a750e4a6e4637ada8fd9a191afa0b993759df025310f600c1d1855e7a
+ size 437951328
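Reviewer note: the three lines above are a Git LFS pointer, not the weights themselves. The pointer records a SHA-256 digest and a byte size that a downloaded `model.safetensors` can be checked against. The sketch below parses the pointer text and checks a byte string against it. The helper names are hypothetical, and the final check uses a small stand-in payload because the real file is not part of this diff.

```python
import hashlib

POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:7d52035a750e4a6e4637ada8fd9a191afa0b993759df025310f600c1d1855e7a
size 437951328
"""


def parse_lfs_pointer(text):
    """Split 'key value' lines into a dict with the oid broken into algorithm and digest."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """True if the bytes have the recorded size and digest."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algorithm"], data).hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algorithm"], pointer["size"])

# Stand-in payload to exercise the check without the real weights file.
payload = b"abc"
fake_pointer = {
    "algorithm": "sha256",
    "digest": hashlib.sha256(payload).hexdigest(),
    "size": len(payload),
}
print(matches_pointer(payload, fake_pointer))  # True
```

As a rough sanity check, 437951328 bytes at 4 bytes per `float32` value is about 109.5 million values, which is in the range expected for a BERT-base encoder. The file also carries a small header, so the true parameter count is slightly lower.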
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
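Reviewer note: `modules.json` lists the stages of the model in order, each with the sub-folder that holds its configuration. The sketch below reads that structure with the standard `json` module and prints the resulting pipeline. It only illustrates how the file can be interpreted. The function name is hypothetical and this is not how the library itself loads modules.

```python
import json

MODULES_JSON = """
[
  {"idx": 0, "name": "0", "path": "", "type": "sentence_transformers.models.Transformer"},
  {"idx": 1, "name": "1", "path": "1_Pooling", "type": "sentence_transformers.models.Pooling"},
  {"idx": 2, "name": "2", "path": "2_Normalize", "type": "sentence_transformers.models.Normalize"}
]
"""


def pipeline_order(modules_text):
    """Return (folder, class name) pairs sorted by the 'idx' field."""
    modules = sorted(json.loads(modules_text), key=lambda module: module["idx"])
    return [
        (module["path"] or ".", module["type"].rsplit(".", 1)[-1])
        for module in modules
    ]


for folder, class_name in pipeline_order(MODULES_JSON):
    print(f"{class_name:<12} configured in {folder}")
```

An empty `path` means the stage's files sit at the repository root, which is why `config.json` and the tokenizer files in this commit are not inside a sub-folder.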
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": true
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
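Reviewer note: the tokenizer config fixes the special token ids (`[PAD]` = 0, `[CLS]` = 101, `[SEP]` = 102) and a `model_max_length` of 512. The sketch below shows, on already-tokenized ids, how a single sequence is typically framed for a BERT-style encoder: truncate, wrap in `[CLS]` and `[SEP]`, then pad with an attention mask. The function name and the word-piece ids are placeholders for illustration, and the real tokenizer class does considerably more than this.

```python
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0
MODEL_MAX_LENGTH = 512


def build_inputs(token_ids, max_length=MODEL_MAX_LENGTH, pad_to=None):
    """Wrap ids in [CLS]/[SEP], truncating the body so the total fits max_length."""
    body = list(token_ids[: max_length - 2])  # leave room for the two special tokens
    input_ids = [CLS_ID] + body + [SEP_ID]
    attention_mask = [1] * len(input_ids)
    if pad_to is not None and pad_to > len(input_ids):
        padding = pad_to - len(input_ids)
        input_ids += [PAD_ID] * padding
        attention_mask += [0] * padding  # padded positions are masked out
    return input_ids, attention_mask


# Placeholder word-piece ids standing in for a short question.
ids, mask = build_inputs([7592, 2088], pad_to=6)
print(ids)   # [101, 7592, 2088, 102, 0, 0]
print(mask)  # [1, 1, 1, 1, 0, 0]

# A sequence longer than the limit is cut so the framed length is exactly 512.
long_ids, _ = build_inputs(list(range(1000, 2000)))
print(len(long_ids))  # 512
```

Since the pooling config in this commit selects the `[CLS]` position, the vector at index 0 of the framed sequence is the one that becomes the sentence embedding.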
vocab.txt ADDED
The diff for this file is too large to render. See raw diff