arthurbresnu HF Staff committed on
Commit
c6cad00
·
verified ·
1 Parent(s): 915fff8

Add new SparseEncoder model

1_SpladePooling/config.json ADDED
{
  "pooling_strategy": "max",
  "activation_function": "relu",
  "word_embedding_dimension": 30522
}
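
For reference, these three fields map one-to-one onto the constructor of the `SpladePooling` module in sentence-transformers. A minimal sketch, assuming a recent sentence-transformers release with sparse-encoder support:

```python
from sentence_transformers.sparse_encoder.models import SpladePooling

# Instantiate the pooling module this config describes: max-pool the
# log(1 + relu(MLM logits)) activations over all tokens, yielding one
# 30522-dimensional sparse vector (the BERT vocabulary size) per text.
pooling = SpladePooling(
    pooling_strategy="max",
    activation_function="relu",
    word_embedding_dimension=30522,
)
```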
README.md ADDED

---
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sparse-encoder
- sparse
- splade
- generated_from_trainer
- dataset_size:90000
- loss:SpladeLoss
- loss:SparseMarginMSELoss
- loss:FlopsLoss
base_model: Luyu/co-condenser-marco
widget:
- text: how old do you have to be to have lasik
- text: when is house of cards on netflix
- text: Answer by lauryn (194). The length of time it takes a women to get her period
    after giving birth varies from women to women. For many women it can take about
    2 to 3 months before your period returns to normal. If you are nursing than this
    time frame will last even longer.
- text: what are cys residues
- text: "You heard about fastest cars, bikes and plans but today we have world fastest\
    \ bird collection. In our collection we have top 10 fastest birds of the world.\
    \ Birdâ\x80\x99s flight speed is fundamentally changeable; a hunting bird speed\
    \ will increase while diving-to-catch prey as compared to its gliding speeds.\
    \ Here we have the top 10 fastest birds with their flight speed. 10. Teal 109\
    \ km/h (68mph) This bird can fly 109 km/ h (68mph); they are 53 to 59cm long.\
    \ This bird always lives in group. 09."
datasets:
- sentence-transformers/msmarco
pipeline_tag: feature-extraction
library_name: sentence-transformers
metrics:
- dot_accuracy@1
- dot_accuracy@3
- dot_accuracy@5
- dot_accuracy@10
- dot_precision@1
- dot_precision@3
- dot_precision@5
- dot_precision@10
- dot_recall@1
- dot_recall@3
- dot_recall@5
- dot_recall@10
- dot_ndcg@10
- dot_mrr@10
- dot_map@100
- query_active_dims
- query_sparsity_ratio
- corpus_active_dims
- corpus_sparsity_ratio
co2_eq_emissions:
  emissions: 34.21475343773813
  energy_consumed: 0.0926891546467269
  source: codecarbon
  training_type: fine-tuning
  on_cloud: false
  cpu_model: AMD EPYC 7R13 Processor
  ram_total_size: 248.0
  hours_used: 0.305
  hardware_used: 1 x NVIDIA H100 80GB HBM3
model-index:
- name: splade-co-condenser-marco trained on MS MARCO hard negatives with distillation
  results:
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: NanoMSMARCO
      type: NanoMSMARCO
    metrics:
    - type: dot_accuracy@1
      value: 0.4
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.62
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.68
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.84
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.4
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.20666666666666667
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.136
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.08399999999999999
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.4
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.62
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.68
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.84
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.6076647728795561
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.5352777777777777
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.5419469179877314
      name: Dot Map@100
    - type: query_active_dims
      value: 54.119998931884766
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.9982268527969371
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 187.67538452148438
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.993851143944647
      name: Corpus Sparsity Ratio
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: NanoNFCorpus
      type: NanoNFCorpus
    metrics:
    - type: dot_accuracy@1
      value: 0.44
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.6
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.64
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.68
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.44
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.34
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.316
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.27
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.06311467051346893
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.09895898433766803
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.1169352131561954
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.14677603057730104
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.34523070842752446
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.5258333333333334
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.16994217536385264
      name: Dot Map@100
    - type: query_active_dims
      value: 51.70000076293945
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.9983061398085663
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 336.32476806640625
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.9889809066225539
      name: Corpus Sparsity Ratio
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: NanoNQ
      type: NanoNQ
    metrics:
    - type: dot_accuracy@1
      value: 0.52
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.74
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.78
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.84
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.52
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.2533333333333333
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.16
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.08999999999999998
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.48
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.69
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.73
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.8
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.6594960548473345
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.6369365079365078
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.6105143613696246
      name: Dot Map@100
    - type: query_active_dims
      value: 53.34000015258789
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.9982524080940768
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 223.5908660888672
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.9926744359449294
      name: Corpus Sparsity Ratio
  - task:
      type: sparse-nano-beir
      name: Sparse Nano BEIR
    dataset:
      name: NanoBEIR mean
      type: NanoBEIR_mean
    metrics:
    - type: dot_accuracy@1
      value: 0.45333333333333337
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.6533333333333333
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.7000000000000001
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.7866666666666666
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.45333333333333337
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.26666666666666666
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.204
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.148
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.314371556837823
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.4696529947792227
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.5089784043853984
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.5955920101924337
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.5374638453848051
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.566015873015873
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.4408011515737362
      name: Dot Map@100
    - type: query_active_dims
      value: 53.0533332824707
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.9982618002331933
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 235.2385860639544
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.9922928187515905
      name: Corpus Sparsity Ratio
    - type: dot_accuracy@1
      value: 0.5580533751962323
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.7137205651491366
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.7722448979591837
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.8291679748822605
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.5580533751962323
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.3332705389848246
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.26179591836734695
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.179171114599686
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.32499349487208484
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.4721752731683537
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.5337131771857326
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.6042058945750339
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.578707182604652
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.6493701377987092
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.5041070229886567
      name: Dot Map@100
    - type: query_active_dims
      value: 86.67950763908115
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.997160097384212
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 230.5675761418069
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.992445856230201
      name: Corpus Sparsity Ratio
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: NanoClimateFEVER
      type: NanoClimateFEVER
    metrics:
    - type: dot_accuracy@1
      value: 0.32
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.52
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.54
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.62
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.32
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.2
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.14
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.08199999999999999
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.165
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.26
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.28733333333333333
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.32233333333333336
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.30365156381250225
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.4207222222222222
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.25580876542561
      name: Dot Map@100
    - type: query_active_dims
      value: 135.3000030517578
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.99556713180487
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 270.1291198730469
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.9911496913743186
      name: Corpus Sparsity Ratio
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: NanoDBPedia
      type: NanoDBPedia
    metrics:
    - type: dot_accuracy@1
      value: 0.74
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.86
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.9
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.94
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.74
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.6133333333333333
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.588
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.508
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.07635143960629845
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.1800129405239251
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.23739681193828663
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.33976750488378327
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.622759301760137
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.8137142857142856
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.4830025510651395
      name: Dot Map@100
    - type: query_active_dims
      value: 52.2599983215332
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.9982877924670227
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 219.79901123046875
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.9927986694439921
      name: Corpus Sparsity Ratio
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: NanoFEVER
      type: NanoFEVER
    metrics:
    - type: dot_accuracy@1
      value: 0.8
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.92
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.94
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.96
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.8
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.31999999999999995
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.204
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.10599999999999998
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.7566666666666666
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.8866666666666667
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.92
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.95
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.871923100931238
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.8608333333333333
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.8427126216077829
      name: Dot Map@100
    - type: query_active_dims
      value: 79.13999938964844
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.9974071161984913
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 287.1961669921875
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.9905905193961015
      name: Corpus Sparsity Ratio
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: NanoFiQA2018
      type: NanoFiQA2018
    metrics:
    - type: dot_accuracy@1
      value: 0.42
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.52
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.58
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.68
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.42
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.21333333333333332
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.16799999999999998
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.11
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.23607936507936508
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.31813492063492066
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.3794920634920635
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.4829047619047619
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.41245963928815416
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.4934444444444444
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.35636809652397866
      name: Dot Map@100
    - type: query_active_dims
      value: 54.040000915527344
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.9982294737921654
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 213.87989807128906
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.992992598844398
      name: Corpus Sparsity Ratio
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: NanoHotpotQA
      type: NanoHotpotQA
    metrics:
    - type: dot_accuracy@1
      value: 0.88
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.94
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.96
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.96
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.88
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.5133333333333333
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.3399999999999999
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.17199999999999996
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.44
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.77
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.85
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.86
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.8259863564109206
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.9116666666666667
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.772433308579342
      name: Dot Map@100
    - type: query_active_dims
      value: 68.36000061035156
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.9977603040229883
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 223.86521911621094
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.9926654472473556
      name: Corpus Sparsity Ratio
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: NanoQuoraRetrieval
      type: NanoQuoraRetrieval
    metrics:
    - type: dot_accuracy@1
      value: 0.9
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 1.0
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 1.0
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 1.0
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.9
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.38666666666666655
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.24799999999999997
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.12999999999999998
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.8073333333333333
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.938
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.9653333333333333
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.98
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.9411045044022702
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.9466666666666665
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.9183274196019293
      name: Dot Map@100
    - type: query_active_dims
      value: 57.5
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.9981161129676954
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 58.39020919799805
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.9980869468187538
      name: Corpus Sparsity Ratio
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: NanoSCIDOCS
      type: NanoSCIDOCS
    metrics:
    - type: dot_accuracy@1
      value: 0.42
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.56
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.74
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.78
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.42
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.28
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.25199999999999995
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.154
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.08766666666666667
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.17266666666666666
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.25766666666666665
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.31566666666666665
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.3183178982652113
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.5296904761904762
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.24557421391176226
      name: Dot Map@100
    - type: query_active_dims
      value: 73.30000305175781
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.9975984534744854
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 293.607177734375
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.9903804738308638
      name: Corpus Sparsity Ratio
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: NanoArguAna
      type: NanoArguAna
    metrics:
    - type: dot_accuracy@1
      value: 0.14
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.42
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.58
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.7
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.14
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.13999999999999999
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.11600000000000002
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.07
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.14
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.42
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.58
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.7
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.40946212538272647
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.317547619047619
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.3292918677514585
      name: Dot Map@100
    - type: query_active_dims
      value: 281.1600036621094
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.990788283740839
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 268.114990234375
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.991215680812713
      name: Corpus Sparsity Ratio
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: NanoSciFact
      type: NanoSciFact
    metrics:
    - type: dot_accuracy@1
      value: 0.54
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.66
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.74
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.82
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.54
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.24
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.16799999999999998
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.092
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.52
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.65
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.74
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.81
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.668993132237426
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.623968253968254
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.6278823742890459
      name: Dot Map@100
    - type: query_active_dims
      value: 109.4000015258789
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.9964157001007182
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 348.5179748535156
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.9885814175069289
      name: Corpus Sparsity Ratio
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: NanoTouche2020
      type: NanoTouche2020
    metrics:
    - type: dot_accuracy@1
      value: 0.7346938775510204
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.9183673469387755
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.9591836734693877
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.9591836734693877
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.7346938775510204
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.6258503401360545
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.5673469387755103
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.4612244897959184
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.052703291471304
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.1338383723587515
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.19411388149464573
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.30722833210959427
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.5361442152154757
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.8255102040816327
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.3995866253752792
      name: Dot Map@100
    - type: query_active_dims
      value: 56.61224365234375
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.9981451987532814
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 224.8710174560547
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.9926324940221462
      name: Corpus Sparsity Ratio
---

# splade-co-condenser-marco trained on MS MARCO hard negatives with distillation

This is a [SPLADE Sparse Encoder](https://www.sbert.net/docs/sparse_encoder/usage/usage.html) model finetuned from [Luyu/co-condenser-marco](https://huggingface.co/Luyu/co-condenser-marco) on the [msmarco](https://huggingface.co/datasets/sentence-transformers/msmarco) dataset using the [sentence-transformers](https://www.SBERT.net) library. It maps sentences & paragraphs to a 30522-dimensional sparse vector space and can be used for semantic search and sparse retrieval.

## Model Details

### Model Description
- **Model Type:** SPLADE Sparse Encoder
- **Base model:** [Luyu/co-condenser-marco](https://huggingface.co/Luyu/co-condenser-marco) <!-- at revision e0cef0ab2410aae0f0994366ddefb5649a266709 -->
- **Maximum Sequence Length:** 256 tokens
- **Output Dimensionality:** 30522 dimensions
- **Similarity Function:** Dot Product
- **Training Dataset:**
    - [msmarco](https://huggingface.co/datasets/sentence-transformers/msmarco)
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Documentation:** [Sparse Encoder Documentation](https://www.sbert.net/docs/sparse_encoder/usage/usage.html)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sparse Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=sparse-encoder)

### Full Model Architecture

```
SparseEncoder(
  (0): MLMTransformer({'max_seq_length': 256, 'do_lower_case': False}) with MLMTransformer model: BertForMaskedLM
  (1): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
)
```
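
The same two-module stack can also be assembled by hand instead of loaded from the Hub. A minimal sketch, assuming the `MLMTransformer` and `SpladePooling` modules shipped with recent sentence-transformers releases:

```python
from sentence_transformers import SparseEncoder
from sentence_transformers.sparse_encoder.models import MLMTransformer, SpladePooling

# MLMTransformer exposes the masked-language-model logits (one 30522-dim
# vector per token); SpladePooling turns them into a single sparse vector
# by max-pooling log(1 + relu(logits)) over the token axis.
mlm = MLMTransformer("Luyu/co-condenser-marco", max_seq_length=256)
pooling = SpladePooling(pooling_strategy="max", activation_function="relu")
model = SparseEncoder(modules=[mlm, pooling])
```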

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SparseEncoder

# Download from the 🤗 Hub
model = SparseEncoder("arthurbresnu/co-condenser-marco-msmarco-hard-negatives")
# Run inference
queries = [
    "fastest super cars in the world",
]
documents = [
    'The McLaren F1 is amongst the fastest cars in the McLaren series and also the fastest car in the world. The McLaren F1 can clock a maximum speed of 240 miles per hour, or an equivalent of 386 km per hour.',
    'You heard about fastest cars, bikes and plans but today we have world fastest bird collection. In our collection we have top 10 fastest birds of the world. Birdâ\x80\x99s flight speed is fundamentally changeable; a hunting bird speed will increase while diving-to-catch prey as compared to its gliding speeds. Here we have the top 10 fastest birds with their flight speed. 10. Teal 109 km/h (68mph) This bird can fly 109 km/ h (68mph); they are 53 to 59cm long. This bird always lives in group. 09.',
    'Where is Langley, BC? Location of Langley on a map. Langley is a city found in British Columbia, Canada. It is located 49.08 latitude and -122.59 longitude and it is situated at elevation 78 meters above sea level. Langley has a population of 93,726 making it the 13th biggest city in British Columbia.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 30522] [3, 30522]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[35.7080, 24.5349, 3.8619]])
```
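
Because the 30522 dimensions correspond to the BERT vocabulary, a sparse embedding can be read back as weighted tokens. A short sketch, assuming the `decode` helper available on `SparseEncoder` in recent sentence-transformers releases:

```python
# Inspect which vocabulary tokens the query activates most strongly.
# Assumes `model` and `query_embeddings` from the snippet above.
for token, weight in model.decode(query_embeddings[0], top_k=10):
    print(f"{token:>12s}  {weight:.2f}")
```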

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Sparse Information Retrieval

* Datasets: `NanoMSMARCO`, `NanoNFCorpus`, `NanoNQ`, `NanoClimateFEVER`, `NanoDBPedia`, `NanoFEVER`, `NanoFiQA2018`, `NanoHotpotQA`, `NanoQuoraRetrieval`, `NanoSCIDOCS`, `NanoArguAna`, `NanoSciFact` and `NanoTouche2020`
* Evaluated with [<code>SparseInformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sparse_encoder/evaluation.html#sentence_transformers.sparse_encoder.evaluation.SparseInformationRetrievalEvaluator)

| Metric                | NanoMSMARCO | NanoNFCorpus | NanoNQ     | NanoClimateFEVER | NanoDBPedia | NanoFEVER  | NanoFiQA2018 | NanoHotpotQA | NanoQuoraRetrieval | NanoSCIDOCS | NanoArguAna | NanoSciFact | NanoTouche2020 |
|:----------------------|:------------|:-------------|:-----------|:-----------------|:------------|:-----------|:-------------|:-------------|:-------------------|:------------|:------------|:------------|:---------------|
| dot_accuracy@1        | 0.4         | 0.44         | 0.52       | 0.32             | 0.74        | 0.8        | 0.42         | 0.88         | 0.9                | 0.42        | 0.14        | 0.54        | 0.7347         |
| dot_accuracy@3        | 0.62        | 0.6          | 0.74       | 0.52             | 0.86        | 0.92       | 0.52         | 0.94         | 1.0                | 0.56        | 0.42        | 0.66        | 0.9184         |
| dot_accuracy@5        | 0.68        | 0.64         | 0.78       | 0.54             | 0.9         | 0.94       | 0.58         | 0.96         | 1.0                | 0.74        | 0.58        | 0.74        | 0.9592         |
| dot_accuracy@10       | 0.84        | 0.68         | 0.84       | 0.62             | 0.94        | 0.96       | 0.68         | 0.96         | 1.0                | 0.78        | 0.7         | 0.82        | 0.9592         |
| dot_precision@1       | 0.4         | 0.44         | 0.52       | 0.32             | 0.74        | 0.8        | 0.42         | 0.88         | 0.9                | 0.42        | 0.14        | 0.54        | 0.7347         |
| dot_precision@3       | 0.2067      | 0.34         | 0.2533     | 0.2              | 0.6133      | 0.32       | 0.2133       | 0.5133       | 0.3867             | 0.28        | 0.14        | 0.24        | 0.6259         |
| dot_precision@5       | 0.136       | 0.316        | 0.16       | 0.14             | 0.588       | 0.204      | 0.168        | 0.34         | 0.248              | 0.252       | 0.116       | 0.168       | 0.5673         |
| dot_precision@10      | 0.084       | 0.27         | 0.09       | 0.082            | 0.508       | 0.106      | 0.11         | 0.172        | 0.13               | 0.154       | 0.07        | 0.092       | 0.4612         |
| dot_recall@1          | 0.4         | 0.0631       | 0.48       | 0.165            | 0.0764      | 0.7567     | 0.2361       | 0.44         | 0.8073             | 0.0877      | 0.14        | 0.52        | 0.0527         |
| dot_recall@3          | 0.62        | 0.099        | 0.69       | 0.26             | 0.18        | 0.8867     | 0.3181       | 0.77         | 0.938              | 0.1727      | 0.42        | 0.65        | 0.1338         |
| dot_recall@5          | 0.68        | 0.1169       | 0.73       | 0.2873           | 0.2374      | 0.92       | 0.3795       | 0.85         | 0.9653             | 0.2577      | 0.58        | 0.74        | 0.1941         |
| dot_recall@10         | 0.84        | 0.1468       | 0.8        | 0.3223           | 0.3398      | 0.95       | 0.4829       | 0.86         | 0.98               | 0.3157      | 0.7         | 0.81        | 0.3072         |
| **dot_ndcg@10**       | **0.6077**  | **0.3452**   | **0.6595** | **0.3037**       | **0.6228**  | **0.8719** | **0.4125**   | **0.826**    | **0.9411**         | **0.3183**  | **0.4095**  | **0.669**   | **0.5361**     |
| dot_mrr@10            | 0.5353      | 0.5258       | 0.6369     | 0.4207           | 0.8137      | 0.8608     | 0.4934       | 0.9117       | 0.9467             | 0.5297      | 0.3175      | 0.624       | 0.8255         |
| dot_map@100           | 0.5419      | 0.1699       | 0.6105     | 0.2558           | 0.483       | 0.8427     | 0.3564       | 0.7724       | 0.9183             | 0.2456      | 0.3293      | 0.6279      | 0.3996         |
| query_active_dims     | 54.12       | 51.7         | 53.34      | 135.3            | 52.26       | 79.14      | 54.04        | 68.36        | 57.5               | 73.3        | 281.16      | 109.4       | 56.6122        |
| query_sparsity_ratio  | 0.9982      | 0.9983       | 0.9983     | 0.9956           | 0.9983      | 0.9974     | 0.9982       | 0.9978       | 0.9981             | 0.9976      | 0.9908      | 0.9964      | 0.9981         |
| corpus_active_dims    | 187.6754    | 336.3248     | 223.5909   | 270.1291         | 219.799     | 287.1962   | 213.8799     | 223.8652     | 58.3902            | 293.6072    | 268.115     | 348.518     | 224.871        |
| corpus_sparsity_ratio | 0.9939      | 0.989        | 0.9927     | 0.9911           | 0.9928      | 0.9906     | 0.993        | 0.9927       | 0.9981             | 0.9904      | 0.9912      | 0.9886      | 0.9926         |

#### Sparse Nano BEIR

* Dataset: `NanoBEIR_mean`
* Evaluated with [<code>SparseNanoBEIREvaluator</code>](https://sbert.net/docs/package_reference/sparse_encoder/evaluation.html#sentence_transformers.sparse_encoder.evaluation.SparseNanoBEIREvaluator) with these parameters:
  ```json
  {
      "dataset_names": [
          "msmarco",
          "nfcorpus",
          "nq"
      ]
  }
  ```

| Metric                | Value      |
|:----------------------|:-----------|
| dot_accuracy@1        | 0.4533     |
| dot_accuracy@3        | 0.6533     |
| dot_accuracy@5        | 0.7        |
| dot_accuracy@10       | 0.7867     |
| dot_precision@1       | 0.4533     |
| dot_precision@3       | 0.2667     |
| dot_precision@5       | 0.204      |
| dot_precision@10      | 0.148      |
| dot_recall@1          | 0.3144     |
| dot_recall@3          | 0.4697     |
| dot_recall@5          | 0.509      |
| dot_recall@10         | 0.5956     |
| **dot_ndcg@10**       | **0.5375** |
| dot_mrr@10            | 0.566      |
| dot_map@100           | 0.4408     |
| query_active_dims     | 53.0533    |
| query_sparsity_ratio  | 0.9983     |
| corpus_active_dims    | 235.2386   |
| corpus_sparsity_ratio | 0.9923     |

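These NanoBEIR numbers can be reproduced with the evaluator linked above. A minimal sketch, reusing the checkpoint name from the usage snippet:

```python
from sentence_transformers import SparseEncoder
from sentence_transformers.sparse_encoder.evaluation import SparseNanoBEIREvaluator

model = SparseEncoder("arthurbresnu/co-condenser-marco-msmarco-hard-negatives")

# Evaluate on the three NanoBEIR subsets used for the table above.
evaluator = SparseNanoBEIREvaluator(dataset_names=["msmarco", "nfcorpus", "nq"])
results = evaluator(model)
print(results[evaluator.primary_metric])  # mean dot_ndcg@10 over the three datasets
```
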
#### Sparse Nano BEIR

* Dataset: `NanoBEIR_mean`
* Evaluated with [<code>SparseNanoBEIREvaluator</code>](https://sbert.net/docs/package_reference/sparse_encoder/evaluation.html#sentence_transformers.sparse_encoder.evaluation.SparseNanoBEIREvaluator) with these parameters:
  ```json
  {
      "dataset_names": [
          "climatefever",
          "dbpedia",
          "fever",
          "fiqa2018",
          "hotpotqa",
          "msmarco",
          "nfcorpus",
          "nq",
          "quoraretrieval",
          "scidocs",
          "arguana",
          "scifact",
          "touche2020"
      ]
  }
  ```

| Metric                | Value      |
|:----------------------|:-----------|
| dot_accuracy@1        | 0.5581     |
| dot_accuracy@3        | 0.7137     |
| dot_accuracy@5        | 0.7722     |
| dot_accuracy@10       | 0.8292     |
| dot_precision@1       | 0.5581     |
| dot_precision@3       | 0.3333     |
| dot_precision@5       | 0.2618     |
| dot_precision@10      | 0.1792     |
| dot_recall@1          | 0.325      |
| dot_recall@3          | 0.4722     |
| dot_recall@5          | 0.5337     |
| dot_recall@10         | 0.6042     |
| **dot_ndcg@10**       | **0.5787** |
| dot_mrr@10            | 0.6494     |
| dot_map@100           | 0.5041     |
| query_active_dims     | 86.6795    |
| query_sparsity_ratio  | 0.9972     |
| corpus_active_dims    | 230.5676   |
| corpus_sparsity_ratio | 0.9924     |

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### msmarco

* Dataset: [msmarco](https://huggingface.co/datasets/sentence-transformers/msmarco) at [9e329ed](https://huggingface.co/datasets/sentence-transformers/msmarco/tree/9e329ed2e649c9d37b0d91dd6b764ff6fe671d83)
* Size: 90,000 training samples
* Columns: <code>score</code>, <code>query</code>, <code>positive</code>, and <code>negative</code>
* Approximate statistics based on the first 1000 samples:
  |         | score                                                               | query                                                                             | positive                                                                             | negative                                                                             |
  |:--------|:--------------------------------------------------------------------|:----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|
  | type    | float                                                               | string                                                                            | string                                                                               | string                                                                               |
  | details | <ul><li>min: -3.66</li><li>mean: 12.97</li><li>max: 22.48</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 8.89 tokens</li><li>max: 24 tokens</li></ul> | <ul><li>min: 16 tokens</li><li>mean: 80.61 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 18 tokens</li><li>mean: 78.92 tokens</li><li>max: 250 tokens</li></ul> |
* Samples:
  | score                           | query                                                                               | positive                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | negative                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
  |:--------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
  | <code>2.1688317457834883</code> | <code>what is ast test used for</code> | <code>The AST test is commonly used to check for liver diseases. It is usually measured together with alanine aminotransferase (ALT). The AST to ALT ratio can help your doctor diagnose liver disease. Symptoms of liver disease that may cause your doctor to order an AST test include: 1 fatigue. 2 weakness.3 loss of appetite.t is usually measured together with alanine aminotransferase (ALT). The AST to ALT ratio can help your doctor diagnose liver disease. Symptoms of liver disease that may cause your doctor to order an AST test include: 1 fatigue. 2 weakness. 3 loss of appetite.</code> | <code>An aspartate aminotransferase (AST) test measures the amount of this enzyme in the blood. AST is normally found in red blood cells, liver, heart, muscle tissue, pancreas, and kidneys. AST formerly was called serum glutamic oxaloacetic transaminase (SGOT).he amount of AST in the blood is directly related to the extent of the tissue damage. After severe damage, AST levels rise in 6 to 10 hours and remain high for about 4 days. The AST test may be done at the same time as a test for alanine aminotransferase, or ALT.</code> |
  | <code>12.405409197012585</code> | <code>what does the suspensory ligament do when the cillary muscles contract</code> | <code>Suspensory Ligaments of the Ciliary Body: The suspensory ligaments of the ciliary body are ligaments that attach the ciliary body to the lens of the eye. Suspensory ligaments enable the ciliary body to change the shape of the lens as needed to focus light reflected from objects at different distances from the eye.</code> | <code>Ossification of the posterior longitudinal ligament of the spine: Introduction. Ossification of the posterior longitudinal ligament of the spine: Abnormal calcification of a spinal ligament. The progressive calcification can starts within months of birth and affects the ability to move arms and legs.ssification of the posterior longitudinal ligament of the spine: Introduction. Ossification of the posterior longitudinal ligament of the spine: Abnormal calcification of a spinal ligament. The progressive calcification can starts within months of birth and affects the ability to move arms and legs.</code> |
  | <code>19.407212177912392</code> | <code>how many kids does trump have</code> | <code>Donald Trump has 5 children: Donald Jr., Eric, and Ivanka- mother Ivana Trump Tiffany -mother Marla Maples Barron-mother Malania Trump Donald Trump Jr. has 2 children: … Kai Madison Trump and Donald Trump III.</code> | <code>Copyright © 2018, Trump Make America Great Again Committee. Paid for by Trump Make America Great Again Committee, a joint fundraising committee authorized by and composed of Donald J. Trump for President, Inc. and the Republican National Committee. x Close</code> |
* Loss: [<code>SpladeLoss</code>](https://sbert.net/docs/package_reference/sparse_encoder/losses.html#spladeloss) with these parameters:
  ```json
  {
      "loss": "SparseMarginMSELoss",
      "lambda_corpus": 0.08,
      "lambda_query": 0.1
  }
  ```
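
Expressed in code, the loss wraps `SparseMarginMSELoss` (MarginMSE distillation against the `score` column) in `SpladeLoss`, which adds FLOPS sparsity regularization. A minimal sketch; the regularizer keyword names follow this card's config, and newer sentence-transformers releases may name them `query_regularizer_weight` / `document_regularizer_weight` instead:

```python
from sentence_transformers import SparseEncoder
from sentence_transformers.sparse_encoder.losses import SpladeLoss, SparseMarginMSELoss

model = SparseEncoder("Luyu/co-condenser-marco")  # base MLM checkpoint, as in this card
loss = SpladeLoss(
    model=model,
    loss=SparseMarginMSELoss(model),  # distill cross-encoder margins into the sparse model
    lambda_query=0.1,    # FLOPS penalty on query vectors (name per this card's config)
    lambda_corpus=0.08,  # FLOPS penalty on document vectors (name per this card's config)
)
```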
1439
+
1440
+ ### Evaluation Dataset
1441
+
1442
+ #### msmarco
1443
+
1444
+ * Dataset: [msmarco](https://huggingface.co/datasets/sentence-transformers/msmarco) at [9e329ed](https://huggingface.co/datasets/sentence-transformers/msmarco/tree/9e329ed2e649c9d37b0d91dd6b764ff6fe671d83)
1445
+ * Size: 10,000 evaluation samples
1446
+ * Columns: <code>score</code>, <code>query</code>, <code>positive</code>, and <code>negative</code>
1447
+ * Approximate statistics based on the first 1000 samples:
1448
+ | | score | query | positive | negative |
1449
+ |:--------|:--------------------------------------------------------------------|:---------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
1450
+ | type | float | string | string | string |
1451
+ | details | <ul><li>min: -4.07</li><li>mean: 13.12</li><li>max: 22.25</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 8.96 tokens</li><li>max: 33 tokens</li></ul> | <ul><li>min: 13 tokens</li><li>mean: 80.54 tokens</li><li>max: 220 tokens</li></ul> | <ul><li>min: 17 tokens</li><li>mean: 78.41 tokens</li><li>max: 242 tokens</li></ul> |
1452
+ * Samples:
1453
+ | score | query | positive | negative |
1454
+ |:--------------------------------|:-------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
1455
+ | <code>11.227776050567627</code> | <code>tabernacle definition</code> | <code>Wiktionary(0.00 / 0 votes)Rate this definition: tabernacle(Noun) any temporary dwelling, a hut, tent, booth. tabernacle(Noun) (Old Testament) The portable tent used before the construction of the temple, where the shekinah (presence of God) was believed to dwell. 1611 ... So Moses finished the work. Then a cloud covered the tent of the congregation, and the glory of the LORD filled the tabernacle.</code> | <code>Both the Annunciation tabernacle in Santa Croce and the Cantoria (the singer's pulpit) in the Duomo (now in the Museo dell'Opera del Duomo) show a vastly increased repertory of forms derived from ancient art, the harvest of Donatello's long stay in Rome (1430-33).</code> |
+ | <code>12.354041655858357</code> | <code>what scientist discovered radiation</code> | <code>Becquerel used an apparatus similar to that displayed below to show that the radiation he discovered could not be x-rays. X-rays are neutral and cannot be bent in a magnetic field. The new radiation was bent by the magnetic field so that the radiation must be charged and different than x-rays.</code> | <code>5a-Hydroxy Laxogenin. 5a-Hydroxy Laxogenin was discovered by a American scientist in 1996. It was shown to possess an anabolic/androgenic ratio similar to one of the most efficient anabolic substances, in particular Anavar but without the side effects of liver toxicity or testing positive for steroidal therapy.</code> |
+ | <code>11.721514344215393</code> | <code>are horses primates</code> | <code>Primates still do, but many, if not most, mammals do not. Horses, deer, cows and many other mammals have a reduced number of digits on their forelimbs and hindlimbs. Primates also retain other generalized skeletal features like the clavicle or collar bone.</code> | <code>The only primates that live in Canada are humans. The species originated in east Africa and is unrelated to South American primates. Humans first arrived in large numbers to Canada around 15,000 years ago from North Asia, and surged in migration starting 400 years ago from around the world, especially from Europe.</code> |
+ * Loss: [<code>SpladeLoss</code>](https://sbert.net/docs/package_reference/sparse_encoder/losses.html#spladeloss) with these parameters:
+ ```json
+ {
+ "loss": "SparseMarginMSELoss",
+ "lambda_corpus": 0.08,
+ "lambda_query": 0.1
+ }
+ ```
+
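+ The <code>score</code> column above is a teacher margin, i.e. the gap between a cross-encoder's scores for the positive and the negative passage. SparseMarginMSELoss regresses the student's own margin onto it; a minimal sketch of the objective (tensor names are illustrative):
+
+ ```python
+ import torch.nn.functional as F
+
+ def margin_mse(q_emb, pos_emb, neg_emb, teacher_margin):
+     # Student margin: dot(query, positive) - dot(query, negative),
+     # computed on the sparse embeddings.
+     student_margin = (q_emb * pos_emb).sum(-1) - (q_emb * neg_emb).sum(-1)
+     # MSE between the student margin and the teacher margin (the `score` column).
+     return F.mse_loss(student_margin, teacher_margin)
+ ```
+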
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `learning_rate`: 2e-05
+ - `num_train_epochs`: 1
+ - `warmup_ratio`: 0.1
+ - `bf16`: True
+ - `load_best_model_at_end`: True
+
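+ A minimal sketch of how these values map onto the training arguments (the class name follows the sparse-encoder training API of recent releases, and the output directory is a placeholder):
+
+ ```python
+ from sentence_transformers.sparse_encoder import SparseEncoderTrainingArguments
+
+ args = SparseEncoderTrainingArguments(
+     output_dir="outputs/splade-distill",  # placeholder
+     eval_strategy="steps",
+     per_device_train_batch_size=16,
+     per_device_eval_batch_size=16,
+     learning_rate=2e-5,
+     num_train_epochs=1,
+     warmup_ratio=0.1,
+     bf16=True,
+     load_best_model_at_end=True,
+ )
+ ```
+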
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 2e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 1
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: True
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `tp_size`: 0
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`: 
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: proportional
+ - `router_mapping`: {}
+ - `learning_rate_mapping`: {}
+
+ </details>
+
+ ### Training Logs
+ | Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_dot_ndcg@10 | NanoNFCorpus_dot_ndcg@10 | NanoNQ_dot_ndcg@10 | NanoBEIR_mean_dot_ndcg@10 | NanoClimateFEVER_dot_ndcg@10 | NanoDBPedia_dot_ndcg@10 | NanoFEVER_dot_ndcg@10 | NanoFiQA2018_dot_ndcg@10 | NanoHotpotQA_dot_ndcg@10 | NanoQuoraRetrieval_dot_ndcg@10 | NanoSCIDOCS_dot_ndcg@10 | NanoArguAna_dot_ndcg@10 | NanoSciFact_dot_ndcg@10 | NanoTouche2020_dot_ndcg@10 |
+ |:----------:|:--------:|:-------------:|:---------------:|:-----------------------:|:------------------------:|:------------------:|:-------------------------:|:----------------------------:|:-----------------------:|:---------------------:|:------------------------:|:------------------------:|:------------------------------:|:-----------------------:|:-----------------------:|:-----------------------:|:--------------------------:|
+ | 0.0178 | 100 | 664548.88 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.0356 | 200 | 1912.7461 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.0533 | 300 | 89.4823 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.0711 | 400 | 57.4213 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.0889 | 500 | 43.5322 | 37.8169 | 0.5271 | 0.2411 | 0.5761 | 0.4481 | - | - | - | - | - | - | - | - | - | - |
+ | 0.1067 | 600 | 38.8042 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.1244 | 700 | 34.1112 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.1422 | 800 | 30.3487 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.16 | 900 | 30.4368 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.1778 | 1000 | 30.9444 | 27.4550 | 0.5513 | 0.3375 | 0.6122 | 0.5003 | - | - | - | - | - | - | - | - | - | - |
+ | 0.1956 | 1100 | 27.7082 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.2133 | 1200 | 28.6251 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.2311 | 1300 | 27.6298 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.2489 | 1400 | 24.1523 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.2667 | 1500 | 25.3053 | 23.4952 | 0.5898 | 0.3416 | 0.6296 | 0.5203 | - | - | - | - | - | - | - | - | - | - |
+ | 0.2844 | 1600 | 24.8645 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.3022 | 1700 | 25.9037 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.32 | 1800 | 25.255 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.3378 | 1900 | 24.4475 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.3556 | 2000 | 22.8183 | 26.7798 | 0.5579 | 0.3407 | 0.6160 | 0.5049 | - | - | - | - | - | - | - | - | - | - |
+ | 0.3733 | 2100 | 22.0948 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.3911 | 2200 | 22.9483 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.4089 | 2300 | 20.8408 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.4267 | 2400 | 19.5543 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.4444 | 2500 | 20.9379 | 18.6976 | 0.6327 | 0.3216 | 0.6255 | 0.5266 | - | - | - | - | - | - | - | - | - | - |
+ | 0.4622 | 2600 | 20.2078 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.48 | 2700 | 20.6449 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.4978 | 2800 | 19.1764 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.5156 | 2900 | 19.4603 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.5333 | 3000 | 20.3068 | 18.4043 | 0.6081 | 0.3220 | 0.6515 | 0.5272 | - | - | - | - | - | - | - | - | - | - |
+ | 0.5511 | 3100 | 19.1402 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.5689 | 3200 | 18.0542 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.5867 | 3300 | 17.9658 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.6044 | 3400 | 18.4345 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.6222 | 3500 | 19.4609 | 17.0769 | 0.6155 | 0.3219 | 0.6545 | 0.5306 | - | - | - | - | - | - | - | - | - | - |
+ | 0.64 | 3600 | 17.4228 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.6578 | 3700 | 17.8939 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.6756 | 3800 | 16.2358 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.6933 | 3900 | 16.6908 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.7111 | 4000 | 15.9995 | 17.7298 | 0.6022 | 0.3555 | 0.6525 | 0.5367 | - | - | - | - | - | - | - | - | - | - |
+ | 0.7289 | 4100 | 16.3495 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.7467 | 4200 | 15.559 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.7644 | 4300 | 17.4544 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.7822 | 4400 | 15.8666 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.8 | 4500 | 16.3616 | 18.8307 | 0.6036 | 0.3472 | 0.6112 | 0.5207 | - | - | - | - | - | - | - | - | - | - |
+ | 0.8178 | 4600 | 15.276 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.8356 | 4700 | 15.2697 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.8533 | 4800 | 16.6727 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.8711 | 4900 | 15.2223 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.8889 | 5000 | 15.7583 | 16.2949 | 0.6177 | 0.3438 | 0.6505 | 0.5373 | - | - | - | - | - | - | - | - | - | - |
+ | 0.9067 | 5100 | 15.3164 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.9244 | 5200 | 14.9429 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.9422 | 5300 | 15.5992 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | 0.96 | 5400 | 14.8593 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | **0.9778** | **5500** | **14.7565** | **16.423** | **0.6077** | **0.3452** | **0.6595** | **0.5375** | **-** | **-** | **-** | **-** | **-** | **-** | **-** | **-** | **-** | **-** |
+ | 0.9956 | 5600 | 14.5115 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+ | -1 | -1 | - | - | 0.6077 | 0.3452 | 0.6595 | 0.5787 | 0.3037 | 0.6228 | 0.8719 | 0.4125 | 0.8260 | 0.9411 | 0.3183 | 0.4095 | 0.6690 | 0.5361 |
+
+ * The bold row denotes the saved checkpoint.
+
+ ### Environmental Impact
+ Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
+ - **Energy Consumed**: 0.093 kWh
+ - **Carbon Emitted**: 0.034 kg of CO2
+ - **Hours Used**: 0.305 hours
+
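+ For context, these figures imply an average carbon intensity of roughly 0.034 / 0.093 ≈ 0.37 kg of CO2 per kWh, and an average power draw of about 0.093 kWh / 0.305 h ≈ 0.3 kW.
+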
+ ### Training Hardware
+ - **On Cloud**: No
+ - **GPU Model**: 1 x NVIDIA H100 80GB HBM3
+ - **CPU Model**: AMD EPYC 7R13 Processor
+ - **RAM Size**: 248.00 GB
+
+ ### Framework Versions
+ - Python: 3.13.3
+ - Sentence Transformers: 4.2.0.dev0
+ - Transformers: 4.51.3
+ - PyTorch: 2.7.1+cu126
+ - Accelerate: 0.26.0
+ - Datasets: 2.21.0
+ - Tokenizers: 0.21.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+ author = "Reimers, Nils and Gurevych, Iryna",
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+ month = "11",
+ year = "2019",
+ publisher = "Association for Computational Linguistics",
+ url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### SpladeLoss
+ ```bibtex
+ @misc{formal2022distillationhardnegativesampling,
+ title={From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective},
+ author={Thibault Formal and Carlos Lassance and Benjamin Piwowarski and Stéphane Clinchant},
+ year={2022},
+ eprint={2205.04733},
+ archivePrefix={arXiv},
+ primaryClass={cs.IR},
+ url={https://arxiv.org/abs/2205.04733},
+ }
+ ```
+
+ #### SparseMarginMSELoss
+ ```bibtex
+ @misc{hofstätter2021improving,
+ title={Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation},
+ author={Sebastian Hofstätter and Sophia Althammer and Michael Schröder and Mete Sertkan and Allan Hanbury},
+ year={2021},
+ eprint={2010.02666},
+ archivePrefix={arXiv},
+ primaryClass={cs.IR}
+ }
+ ```
+
+ #### FlopsLoss
+ ```bibtex
+ @article{paria2020minimizing,
+ title={Minimizing flops to learn efficient sparse representations},
+ author={Paria, Biswajit and Yeh, Chih-Kuan and Yen, Ian EH and Xu, Ning and Ravikumar, Pradeep and P{\'o}czos, Barnab{\'a}s},
+ journal={arXiv preprint arXiv:2004.05665},
+ year={2020}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,25 @@
+ {
+ "architectures": [
+ "BertForMaskedLM"
+ ],
+ "attention_probs_dropout_prob": 0.1,
+ "classifier_dropout": null,
+ "gradient_checkpointing": false,
+ "hidden_act": "gelu",
+ "hidden_dropout_prob": 0.1,
+ "hidden_size": 768,
+ "initializer_range": 0.02,
+ "intermediate_size": 3072,
+ "layer_norm_eps": 1e-12,
+ "max_position_embeddings": 512,
+ "model_type": "bert",
+ "num_attention_heads": 12,
+ "num_hidden_layers": 12,
+ "pad_token_id": 0,
+ "position_embedding_type": "absolute",
+ "torch_dtype": "float32",
+ "transformers_version": "4.51.3",
+ "type_vocab_size": 2,
+ "use_cache": true,
+ "vocab_size": 30522
+ }
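
The config above is a standard BERT masked-LM backbone; its `vocab_size` of 30522 is what gives the sparse embeddings their dimensionality. As a hedged sketch (function and variable names are illustrative), SPLADE-style term weights are typically derived from the MLM logits like this:

```python
import torch

def splade_weights(mlm_logits, attention_mask):
    # mlm_logits: (batch, seq_len, vocab_size=30522) from BertForMaskedLM.
    # Saturating activation, then max-pooling over token positions,
    # following the standard SPLADE recipe.
    weights = torch.log1p(torch.relu(mlm_logits))
    weights = weights * attention_mask.unsqueeze(-1)  # zero out padding positions
    return weights.max(dim=1).values  # (batch, 30522) sparse term weights
```
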
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
+ {
+ "model_type": "SparseEncoder",
+ "__version__": {
+ "sentence_transformers": "4.2.0.dev0",
+ "transformers": "4.51.3",
+ "pytorch": "2.7.1+cu126"
+ },
+ "prompts": {
+ "query": "",
+ "document": ""
+ },
+ "default_prompt_name": null,
+ "similarity_fn_name": "dot"
+ }
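
With `similarity_fn_name` set to `dot`, relevance is the dot product between sparse query and document vectors. A hedged usage sketch (the model id is a placeholder; method names follow the sparse-encoder API of recent releases):

```python
from sentence_transformers import SparseEncoder

model = SparseEncoder("path/to/this-model")  # placeholder

query_emb = model.encode_query("what scientist discovered radiation")
doc_emb = model.encode_document("Becquerel used an apparatus similar to ...")
print(model.similarity(query_emb, doc_emb))  # dot-product score, per this config
```
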
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ce14d5eea9d846b599a2870823ddf642e6a6243d713037f36e2bf43588824e4e
+ size 438080896
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+ {
+ "idx": 0,
+ "name": "0",
+ "path": "",
+ "type": "sentence_transformers.sparse_encoder.models.MLMTransformer"
+ },
+ {
+ "idx": 1,
+ "name": "1",
+ "path": "1_SpladePooling",
+ "type": "sentence_transformers.sparse_encoder.models.SpladePooling"
+ }
+ ]
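
The two modules above compose into the full model: `MLMTransformer` produces vocabulary-sized logits and `SpladePooling` collapses them into a single sparse vector. A minimal sketch of assembling the same stack manually (paths and the pooling strategy are illustrative):

```python
from sentence_transformers import SparseEncoder
from sentence_transformers.sparse_encoder.models import MLMTransformer, SpladePooling

mlm = MLMTransformer("path/to/this-model")       # module 0, at the repository root
pooling = SpladePooling(pooling_strategy="max")  # module 1, in 1_SpladePooling/
model = SparseEncoder(modules=[mlm, pooling])
```
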
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+ "max_seq_length": 256,
+ "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+ "cls_token": "[CLS]",
+ "mask_token": "[MASK]",
+ "pad_token": "[PAD]",
+ "sep_token": "[SEP]",
+ "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
+ {
+ "added_tokens_decoder": {
+ "0": {
+ "content": "[PAD]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "100": {
+ "content": "[UNK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "101": {
+ "content": "[CLS]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "102": {
+ "content": "[SEP]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "103": {
+ "content": "[MASK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "clean_up_tokenization_spaces": true,
+ "cls_token": "[CLS]",
+ "do_basic_tokenize": true,
+ "do_lower_case": true,
+ "extra_special_tokens": {},
+ "mask_token": "[MASK]",
+ "model_max_length": 512,
+ "never_split": null,
+ "pad_token": "[PAD]",
+ "sep_token": "[SEP]",
+ "strip_accents": null,
+ "tokenize_chinese_chars": true,
+ "tokenizer_class": "BertTokenizer",
+ "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff