ImranzamanML committed
Commit b81a760 · verified · 1 Parent(s): 1447e34

Update README.md

Files changed (1)
  1. README.md +0 -3271
README.md CHANGED
@@ -1,3271 +0,0 @@
1
- ---
2
- tags:
3
- - sentence-transformers
4
- - feature-extraction
5
- - sentence-similarity
6
- - mteb
7
- - transformers
8
- - transformers.js
9
- language:
10
- - de
11
- - en
12
- inference: false
13
- license: apache-2.0
14
- model-index:
15
- - name: jina-embeddings-v2-base-de
16
- results:
17
- - task:
18
- type: Classification
19
- dataset:
20
- type: mteb/amazon_counterfactual
21
- name: MTEB AmazonCounterfactualClassification (en)
22
- config: en
23
- split: test
24
- revision: e8379541af4e31359cca9fbcf4b00f2671dba205
25
- metrics:
26
- - type: accuracy
27
- value: 73.76119402985076
28
- - type: ap
29
- value: 35.99577188521176
30
- - type: f1
31
- value: 67.50397431543269
32
- - task:
33
- type: Classification
34
- dataset:
35
- type: mteb/amazon_counterfactual
36
- name: MTEB AmazonCounterfactualClassification (de)
37
- config: de
38
- split: test
39
- revision: e8379541af4e31359cca9fbcf4b00f2671dba205
40
- metrics:
41
- - type: accuracy
42
- value: 68.9186295503212
43
- - type: ap
44
- value: 79.73307115840507
45
- - type: f1
46
- value: 66.66245744831339
47
- - task:
48
- type: Classification
49
- dataset:
50
- type: mteb/amazon_polarity
51
- name: MTEB AmazonPolarityClassification
52
- config: default
53
- split: test
54
- revision: e2d317d38cd51312af73b3d32a06d1a08b442046
55
- metrics:
56
- - type: accuracy
57
- value: 77.52215
58
- - type: ap
59
- value: 71.85051037177416
60
- - type: f1
61
- value: 77.4171096157774
62
- - task:
63
- type: Classification
64
- dataset:
65
- type: mteb/amazon_reviews_multi
66
- name: MTEB AmazonReviewsClassification (en)
67
- config: en
68
- split: test
69
- revision: 1399c76144fd37290681b995c656ef9b2e06e26d
70
- metrics:
71
- - type: accuracy
72
- value: 38.498
73
- - type: f1
74
- value: 38.058193386555956
75
- - task:
76
- type: Classification
77
- dataset:
78
- type: mteb/amazon_reviews_multi
79
- name: MTEB AmazonReviewsClassification (de)
80
- config: de
81
- split: test
82
- revision: 1399c76144fd37290681b995c656ef9b2e06e26d
83
- metrics:
84
- - type: accuracy
85
- value: 37.717999999999996
86
- - type: f1
87
- value: 37.22674371574757
88
- - task:
89
- type: Retrieval
90
- dataset:
91
- type: arguana
92
- name: MTEB ArguAna
93
- config: default
94
- split: test
95
- revision: None
96
- metrics:
97
- - type: map_at_1
98
- value: 25.319999999999997
99
- - type: map_at_10
100
- value: 40.351
101
- - type: map_at_100
102
- value: 41.435
103
- - type: map_at_1000
104
- value: 41.443000000000005
105
- - type: map_at_3
106
- value: 35.266
107
- - type: map_at_5
108
- value: 37.99
109
- - type: mrr_at_1
110
- value: 25.746999999999996
111
- - type: mrr_at_10
112
- value: 40.515
113
- - type: mrr_at_100
114
- value: 41.606
115
- - type: mrr_at_1000
116
- value: 41.614000000000004
117
- - type: mrr_at_3
118
- value: 35.42
119
- - type: mrr_at_5
120
- value: 38.112
121
- - type: ndcg_at_1
122
- value: 25.319999999999997
123
- - type: ndcg_at_10
124
- value: 49.332
125
- - type: ndcg_at_100
126
- value: 53.909
127
- - type: ndcg_at_1000
128
- value: 54.089
129
- - type: ndcg_at_3
130
- value: 38.705
131
- - type: ndcg_at_5
132
- value: 43.606
133
- - type: precision_at_1
134
- value: 25.319999999999997
135
- - type: precision_at_10
136
- value: 7.831
137
- - type: precision_at_100
138
- value: 0.9820000000000001
139
- - type: precision_at_1000
140
- value: 0.1
141
- - type: precision_at_3
142
- value: 16.24
143
- - type: precision_at_5
144
- value: 12.119
145
- - type: recall_at_1
146
- value: 25.319999999999997
147
- - type: recall_at_10
148
- value: 78.307
149
- - type: recall_at_100
150
- value: 98.222
151
- - type: recall_at_1000
152
- value: 99.57300000000001
153
- - type: recall_at_3
154
- value: 48.72
155
- - type: recall_at_5
156
- value: 60.597
157
- - task:
158
- type: Clustering
159
- dataset:
160
- type: mteb/arxiv-clustering-p2p
161
- name: MTEB ArxivClusteringP2P
162
- config: default
163
- split: test
164
- revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
165
- metrics:
166
- - type: v_measure
167
- value: 41.43100588255654
168
- - task:
169
- type: Clustering
170
- dataset:
171
- type: mteb/arxiv-clustering-s2s
172
- name: MTEB ArxivClusteringS2S
173
- config: default
174
- split: test
175
- revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
176
- metrics:
177
- - type: v_measure
178
- value: 32.08988904593667
179
- - task:
180
- type: Reranking
181
- dataset:
182
- type: mteb/askubuntudupquestions-reranking
183
- name: MTEB AskUbuntuDupQuestions
184
- config: default
185
- split: test
186
- revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
187
- metrics:
188
- - type: map
189
- value: 60.55514765595906
190
- - type: mrr
191
- value: 73.51393835465858
192
- - task:
193
- type: STS
194
- dataset:
195
- type: mteb/biosses-sts
196
- name: MTEB BIOSSES
197
- config: default
198
- split: test
199
- revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
200
- metrics:
201
- - type: cos_sim_pearson
202
- value: 79.6723823121172
203
- - type: cos_sim_spearman
204
- value: 76.90596922214986
205
- - type: euclidean_pearson
206
- value: 77.87910737957918
207
- - type: euclidean_spearman
208
- value: 76.66319260598262
209
- - type: manhattan_pearson
210
- value: 77.37039493457965
211
- - type: manhattan_spearman
212
- value: 76.09872191280964
213
- - task:
214
- type: BitextMining
215
- dataset:
216
- type: mteb/bucc-bitext-mining
217
- name: MTEB BUCC (de-en)
218
- config: de-en
219
- split: test
220
- revision: d51519689f32196a32af33b075a01d0e7c51e252
221
- metrics:
222
- - type: accuracy
223
- value: 98.97703549060543
224
- - type: f1
225
- value: 98.86569241475296
226
- - type: precision
227
- value: 98.81002087682673
228
- - type: recall
229
- value: 98.97703549060543
230
- - task:
231
- type: Classification
232
- dataset:
233
- type: mteb/banking77
234
- name: MTEB Banking77Classification
235
- config: default
236
- split: test
237
- revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
238
- metrics:
239
- - type: accuracy
240
- value: 83.93506493506493
241
- - type: f1
242
- value: 83.91014949949302
243
- - task:
244
- type: Clustering
245
- dataset:
246
- type: mteb/biorxiv-clustering-p2p
247
- name: MTEB BiorxivClusteringP2P
248
- config: default
249
- split: test
250
- revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
251
- metrics:
252
- - type: v_measure
253
- value: 34.970675877585144
254
- - task:
255
- type: Clustering
256
- dataset:
257
- type: mteb/biorxiv-clustering-s2s
258
- name: MTEB BiorxivClusteringS2S
259
- config: default
260
- split: test
261
- revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
262
- metrics:
263
- - type: v_measure
264
- value: 28.779230269190954
265
- - task:
266
- type: Clustering
267
- dataset:
268
- type: slvnwhrl/blurbs-clustering-p2p
269
- name: MTEB BlurbsClusteringP2P
270
- config: default
271
- split: test
272
- revision: a2dd5b02a77de3466a3eaa98ae586b5610314496
273
- metrics:
274
- - type: v_measure
275
- value: 35.490175601567216
276
- - task:
277
- type: Clustering
278
- dataset:
279
- type: slvnwhrl/blurbs-clustering-s2s
280
- name: MTEB BlurbsClusteringS2S
281
- config: default
282
- split: test
283
- revision: 9bfff9a7f8f6dc6ffc9da71c48dd48b68696471d
284
- metrics:
285
- - type: v_measure
286
- value: 16.16638280560168
287
- - task:
288
- type: Retrieval
289
- dataset:
290
- type: BeIR/cqadupstack
291
- name: MTEB CQADupstackAndroidRetrieval
292
- config: default
293
- split: test
294
- revision: None
295
- metrics:
296
- - type: map_at_1
297
- value: 30.830999999999996
298
- - type: map_at_10
299
- value: 41.355
300
- - type: map_at_100
301
- value: 42.791000000000004
302
- - type: map_at_1000
303
- value: 42.918
304
- - type: map_at_3
305
- value: 38.237
306
- - type: map_at_5
307
- value: 40.066
308
- - type: mrr_at_1
309
- value: 38.484
310
- - type: mrr_at_10
311
- value: 47.593
312
- - type: mrr_at_100
313
- value: 48.388
314
- - type: mrr_at_1000
315
- value: 48.439
316
- - type: mrr_at_3
317
- value: 45.279
318
- - type: mrr_at_5
319
- value: 46.724
320
- - type: ndcg_at_1
321
- value: 38.484
322
- - type: ndcg_at_10
323
- value: 47.27
324
- - type: ndcg_at_100
325
- value: 52.568000000000005
326
- - type: ndcg_at_1000
327
- value: 54.729000000000006
328
- - type: ndcg_at_3
329
- value: 43.061
330
- - type: ndcg_at_5
331
- value: 45.083
332
- - type: precision_at_1
333
- value: 38.484
334
- - type: precision_at_10
335
- value: 8.927
336
- - type: precision_at_100
337
- value: 1.425
338
- - type: precision_at_1000
339
- value: 0.19
340
- - type: precision_at_3
341
- value: 20.791999999999998
342
- - type: precision_at_5
343
- value: 14.85
344
- - type: recall_at_1
345
- value: 30.830999999999996
346
- - type: recall_at_10
347
- value: 57.87799999999999
348
- - type: recall_at_100
349
- value: 80.124
350
- - type: recall_at_1000
351
- value: 94.208
352
- - type: recall_at_3
353
- value: 45.083
354
- - type: recall_at_5
355
- value: 51.154999999999994
356
- - task:
357
- type: Retrieval
358
- dataset:
359
- type: BeIR/cqadupstack
360
- name: MTEB CQADupstackEnglishRetrieval
361
- config: default
362
- split: test
363
- revision: None
364
- metrics:
365
- - type: map_at_1
366
- value: 25.782
367
- - type: map_at_10
368
- value: 34.492
369
- - type: map_at_100
370
- value: 35.521
371
- - type: map_at_1000
372
- value: 35.638
373
- - type: map_at_3
374
- value: 31.735999999999997
375
- - type: map_at_5
376
- value: 33.339
377
- - type: mrr_at_1
378
- value: 32.357
379
- - type: mrr_at_10
380
- value: 39.965
381
- - type: mrr_at_100
382
- value: 40.644000000000005
383
- - type: mrr_at_1000
384
- value: 40.695
385
- - type: mrr_at_3
386
- value: 37.739
387
- - type: mrr_at_5
388
- value: 39.061
389
- - type: ndcg_at_1
390
- value: 32.357
391
- - type: ndcg_at_10
392
- value: 39.644
393
- - type: ndcg_at_100
394
- value: 43.851
395
- - type: ndcg_at_1000
396
- value: 46.211999999999996
397
- - type: ndcg_at_3
398
- value: 35.675000000000004
399
- - type: ndcg_at_5
400
- value: 37.564
401
- - type: precision_at_1
402
- value: 32.357
403
- - type: precision_at_10
404
- value: 7.344
405
- - type: precision_at_100
406
- value: 1.201
407
- - type: precision_at_1000
408
- value: 0.168
409
- - type: precision_at_3
410
- value: 17.155
411
- - type: precision_at_5
412
- value: 12.166
413
- - type: recall_at_1
414
- value: 25.782
415
- - type: recall_at_10
416
- value: 49.132999999999996
417
- - type: recall_at_100
418
- value: 67.24
419
- - type: recall_at_1000
420
- value: 83.045
421
- - type: recall_at_3
422
- value: 37.021
423
- - type: recall_at_5
424
- value: 42.548
425
- - task:
426
- type: Retrieval
427
- dataset:
428
- type: BeIR/cqadupstack
429
- name: MTEB CQADupstackGamingRetrieval
430
- config: default
431
- split: test
432
- revision: None
433
- metrics:
434
- - type: map_at_1
435
- value: 35.778999999999996
436
- - type: map_at_10
437
- value: 47.038000000000004
438
- - type: map_at_100
439
- value: 48.064
440
- - type: map_at_1000
441
- value: 48.128
442
- - type: map_at_3
443
- value: 44.186
444
- - type: map_at_5
445
- value: 45.788000000000004
446
- - type: mrr_at_1
447
- value: 41.254000000000005
448
- - type: mrr_at_10
449
- value: 50.556999999999995
450
- - type: mrr_at_100
451
- value: 51.296
452
- - type: mrr_at_1000
453
- value: 51.331
454
- - type: mrr_at_3
455
- value: 48.318
456
- - type: mrr_at_5
457
- value: 49.619
458
- - type: ndcg_at_1
459
- value: 41.254000000000005
460
- - type: ndcg_at_10
461
- value: 52.454
462
- - type: ndcg_at_100
463
- value: 56.776
464
- - type: ndcg_at_1000
465
- value: 58.181000000000004
466
- - type: ndcg_at_3
467
- value: 47.713
468
- - type: ndcg_at_5
469
- value: 49.997
470
- - type: precision_at_1
471
- value: 41.254000000000005
472
- - type: precision_at_10
473
- value: 8.464
474
- - type: precision_at_100
475
- value: 1.157
476
- - type: precision_at_1000
477
- value: 0.133
478
- - type: precision_at_3
479
- value: 21.526
480
- - type: precision_at_5
481
- value: 14.696000000000002
482
- - type: recall_at_1
483
- value: 35.778999999999996
484
- - type: recall_at_10
485
- value: 64.85300000000001
486
- - type: recall_at_100
487
- value: 83.98400000000001
488
- - type: recall_at_1000
489
- value: 94.18299999999999
490
- - type: recall_at_3
491
- value: 51.929
492
- - type: recall_at_5
493
- value: 57.666
494
- - task:
495
- type: Retrieval
496
- dataset:
497
- type: BeIR/cqadupstack
498
- name: MTEB CQADupstackGisRetrieval
499
- config: default
500
- split: test
501
- revision: None
502
- metrics:
503
- - type: map_at_1
504
- value: 21.719
505
- - type: map_at_10
506
- value: 29.326999999999998
507
- - type: map_at_100
508
- value: 30.314000000000004
509
- - type: map_at_1000
510
- value: 30.397000000000002
511
- - type: map_at_3
512
- value: 27.101
513
- - type: map_at_5
514
- value: 28.141
515
- - type: mrr_at_1
516
- value: 23.503
517
- - type: mrr_at_10
518
- value: 31.225
519
- - type: mrr_at_100
520
- value: 32.096000000000004
521
- - type: mrr_at_1000
522
- value: 32.159
523
- - type: mrr_at_3
524
- value: 29.076999999999998
525
- - type: mrr_at_5
526
- value: 30.083
527
- - type: ndcg_at_1
528
- value: 23.503
529
- - type: ndcg_at_10
530
- value: 33.842
531
- - type: ndcg_at_100
532
- value: 39.038000000000004
533
- - type: ndcg_at_1000
534
- value: 41.214
535
- - type: ndcg_at_3
536
- value: 29.347
537
- - type: ndcg_at_5
538
- value: 31.121
539
- - type: precision_at_1
540
- value: 23.503
541
- - type: precision_at_10
542
- value: 5.266
543
- - type: precision_at_100
544
- value: 0.831
545
- - type: precision_at_1000
546
- value: 0.106
547
- - type: precision_at_3
548
- value: 12.504999999999999
549
- - type: precision_at_5
550
- value: 8.565000000000001
551
- - type: recall_at_1
552
- value: 21.719
553
- - type: recall_at_10
554
- value: 46.024
555
- - type: recall_at_100
556
- value: 70.78999999999999
557
- - type: recall_at_1000
558
- value: 87.022
559
- - type: recall_at_3
560
- value: 33.64
561
- - type: recall_at_5
562
- value: 37.992
563
- - task:
564
- type: Retrieval
565
- dataset:
566
- type: BeIR/cqadupstack
567
- name: MTEB CQADupstackMathematicaRetrieval
568
- config: default
569
- split: test
570
- revision: None
571
- metrics:
572
- - type: map_at_1
573
- value: 15.601
574
- - type: map_at_10
575
- value: 22.054000000000002
576
- - type: map_at_100
577
- value: 23.177
578
- - type: map_at_1000
579
- value: 23.308
580
- - type: map_at_3
581
- value: 19.772000000000002
582
- - type: map_at_5
583
- value: 21.055
584
- - type: mrr_at_1
585
- value: 19.403000000000002
586
- - type: mrr_at_10
587
- value: 26.409
588
- - type: mrr_at_100
589
- value: 27.356
590
- - type: mrr_at_1000
591
- value: 27.441
592
- - type: mrr_at_3
593
- value: 24.108999999999998
594
- - type: mrr_at_5
595
- value: 25.427
596
- - type: ndcg_at_1
597
- value: 19.403000000000002
598
- - type: ndcg_at_10
599
- value: 26.474999999999998
600
- - type: ndcg_at_100
601
- value: 32.086
602
- - type: ndcg_at_1000
603
- value: 35.231
604
- - type: ndcg_at_3
605
- value: 22.289
606
- - type: ndcg_at_5
607
- value: 24.271
608
- - type: precision_at_1
609
- value: 19.403000000000002
610
- - type: precision_at_10
611
- value: 4.813
612
- - type: precision_at_100
613
- value: 0.8869999999999999
614
- - type: precision_at_1000
615
- value: 0.13
616
- - type: precision_at_3
617
- value: 10.531
618
- - type: precision_at_5
619
- value: 7.710999999999999
620
- - type: recall_at_1
621
- value: 15.601
622
- - type: recall_at_10
623
- value: 35.916
624
- - type: recall_at_100
625
- value: 60.8
626
- - type: recall_at_1000
627
- value: 83.245
628
- - type: recall_at_3
629
- value: 24.321
630
- - type: recall_at_5
631
- value: 29.372999999999998
632
- - task:
633
- type: Retrieval
634
- dataset:
635
- type: BeIR/cqadupstack
636
- name: MTEB CQADupstackPhysicsRetrieval
637
- config: default
638
- split: test
639
- revision: None
640
- metrics:
641
- - type: map_at_1
642
- value: 25.522
643
- - type: map_at_10
644
- value: 34.854
645
- - type: map_at_100
646
- value: 36.269
647
- - type: map_at_1000
648
- value: 36.387
649
- - type: map_at_3
650
- value: 32.187
651
- - type: map_at_5
652
- value: 33.692
653
- - type: mrr_at_1
654
- value: 31.375999999999998
655
- - type: mrr_at_10
656
- value: 40.471000000000004
657
- - type: mrr_at_100
658
- value: 41.481
659
- - type: mrr_at_1000
660
- value: 41.533
661
- - type: mrr_at_3
662
- value: 38.274
663
- - type: mrr_at_5
664
- value: 39.612
665
- - type: ndcg_at_1
666
- value: 31.375999999999998
667
- - type: ndcg_at_10
668
- value: 40.298
669
- - type: ndcg_at_100
670
- value: 46.255
671
- - type: ndcg_at_1000
672
- value: 48.522
673
- - type: ndcg_at_3
674
- value: 36.049
675
- - type: ndcg_at_5
676
- value: 38.095
677
- - type: precision_at_1
678
- value: 31.375999999999998
679
- - type: precision_at_10
680
- value: 7.305000000000001
681
- - type: precision_at_100
682
- value: 1.201
683
- - type: precision_at_1000
684
- value: 0.157
685
- - type: precision_at_3
686
- value: 17.132
687
- - type: precision_at_5
688
- value: 12.107999999999999
689
- - type: recall_at_1
690
- value: 25.522
691
- - type: recall_at_10
692
- value: 50.988
693
- - type: recall_at_100
694
- value: 76.005
695
- - type: recall_at_1000
696
- value: 91.11200000000001
697
- - type: recall_at_3
698
- value: 38.808
699
- - type: recall_at_5
700
- value: 44.279
701
- - task:
702
- type: Retrieval
703
- dataset:
704
- type: BeIR/cqadupstack
705
- name: MTEB CQADupstackProgrammersRetrieval
706
- config: default
707
- split: test
708
- revision: None
709
- metrics:
710
- - type: map_at_1
711
- value: 24.615000000000002
712
- - type: map_at_10
713
- value: 32.843
714
- - type: map_at_100
715
- value: 34.172999999999995
716
- - type: map_at_1000
717
- value: 34.286
718
- - type: map_at_3
719
- value: 30.125
720
- - type: map_at_5
721
- value: 31.495
722
- - type: mrr_at_1
723
- value: 30.023
724
- - type: mrr_at_10
725
- value: 38.106
726
- - type: mrr_at_100
727
- value: 39.01
728
- - type: mrr_at_1000
729
- value: 39.071
730
- - type: mrr_at_3
731
- value: 35.674
732
- - type: mrr_at_5
733
- value: 36.924
734
- - type: ndcg_at_1
735
- value: 30.023
736
- - type: ndcg_at_10
737
- value: 38.091
738
- - type: ndcg_at_100
739
- value: 43.771
740
- - type: ndcg_at_1000
741
- value: 46.315
742
- - type: ndcg_at_3
743
- value: 33.507
744
- - type: ndcg_at_5
745
- value: 35.304
746
- - type: precision_at_1
747
- value: 30.023
748
- - type: precision_at_10
749
- value: 6.837999999999999
750
- - type: precision_at_100
751
- value: 1.124
752
- - type: precision_at_1000
753
- value: 0.152
754
- - type: precision_at_3
755
- value: 15.562999999999999
756
- - type: precision_at_5
757
- value: 10.936
758
- - type: recall_at_1
759
- value: 24.615000000000002
760
- - type: recall_at_10
761
- value: 48.691
762
- - type: recall_at_100
763
- value: 72.884
764
- - type: recall_at_1000
765
- value: 90.387
766
- - type: recall_at_3
767
- value: 35.659
768
- - type: recall_at_5
769
- value: 40.602
770
- - task:
771
- type: Retrieval
772
- dataset:
773
- type: BeIR/cqadupstack
774
- name: MTEB CQADupstackRetrieval
775
- config: default
776
- split: test
777
- revision: None
778
- metrics:
779
- - type: map_at_1
780
- value: 23.223666666666666
781
- - type: map_at_10
782
- value: 31.338166666666673
783
- - type: map_at_100
784
- value: 32.47358333333333
785
- - type: map_at_1000
786
- value: 32.5955
787
- - type: map_at_3
788
- value: 28.84133333333333
789
- - type: map_at_5
790
- value: 30.20808333333333
791
- - type: mrr_at_1
792
- value: 27.62483333333333
793
- - type: mrr_at_10
794
- value: 35.385916666666674
795
- - type: mrr_at_100
796
- value: 36.23325
797
- - type: mrr_at_1000
798
- value: 36.29966666666667
799
- - type: mrr_at_3
800
- value: 33.16583333333333
801
- - type: mrr_at_5
802
- value: 34.41983333333334
803
- - type: ndcg_at_1
804
- value: 27.62483333333333
805
- - type: ndcg_at_10
806
- value: 36.222
807
- - type: ndcg_at_100
808
- value: 41.29491666666666
809
- - type: ndcg_at_1000
810
- value: 43.85508333333333
811
- - type: ndcg_at_3
812
- value: 31.95116666666667
813
- - type: ndcg_at_5
814
- value: 33.88541666666667
815
- - type: precision_at_1
816
- value: 27.62483333333333
817
- - type: precision_at_10
818
- value: 6.339916666666667
819
- - type: precision_at_100
820
- value: 1.0483333333333333
821
- - type: precision_at_1000
822
- value: 0.14608333333333334
823
- - type: precision_at_3
824
- value: 14.726500000000003
825
- - type: precision_at_5
826
- value: 10.395
827
- - type: recall_at_1
828
- value: 23.223666666666666
829
- - type: recall_at_10
830
- value: 46.778999999999996
831
- - type: recall_at_100
832
- value: 69.27141666666667
833
- - type: recall_at_1000
834
- value: 87.27383333333334
835
- - type: recall_at_3
836
- value: 34.678749999999994
837
- - type: recall_at_5
838
- value: 39.79900000000001
839
- - task:
840
- type: Retrieval
841
- dataset:
842
- type: BeIR/cqadupstack
843
- name: MTEB CQADupstackStatsRetrieval
844
- config: default
845
- split: test
846
- revision: None
847
- metrics:
848
- - type: map_at_1
849
- value: 21.677
850
- - type: map_at_10
851
- value: 27.828000000000003
852
- - type: map_at_100
853
- value: 28.538999999999998
854
- - type: map_at_1000
855
- value: 28.64
856
- - type: map_at_3
857
- value: 26.105
858
- - type: map_at_5
859
- value: 27.009
860
- - type: mrr_at_1
861
- value: 24.387
862
- - type: mrr_at_10
863
- value: 30.209999999999997
864
- - type: mrr_at_100
865
- value: 30.953000000000003
866
- - type: mrr_at_1000
867
- value: 31.029
868
- - type: mrr_at_3
869
- value: 28.707
870
- - type: mrr_at_5
871
- value: 29.610999999999997
872
- - type: ndcg_at_1
873
- value: 24.387
874
- - type: ndcg_at_10
875
- value: 31.378
876
- - type: ndcg_at_100
877
- value: 35.249
878
- - type: ndcg_at_1000
879
- value: 37.923
880
- - type: ndcg_at_3
881
- value: 28.213
882
- - type: ndcg_at_5
883
- value: 29.658
884
- - type: precision_at_1
885
- value: 24.387
886
- - type: precision_at_10
887
- value: 4.8309999999999995
888
- - type: precision_at_100
889
- value: 0.73
890
- - type: precision_at_1000
891
- value: 0.104
892
- - type: precision_at_3
893
- value: 12.168
894
- - type: precision_at_5
895
- value: 8.251999999999999
896
- - type: recall_at_1
897
- value: 21.677
898
- - type: recall_at_10
899
- value: 40.069
900
- - type: recall_at_100
901
- value: 58.077
902
- - type: recall_at_1000
903
- value: 77.97
904
- - type: recall_at_3
905
- value: 31.03
906
- - type: recall_at_5
907
- value: 34.838
908
- - task:
909
- type: Retrieval
910
- dataset:
911
- type: BeIR/cqadupstack
912
- name: MTEB CQADupstackTexRetrieval
913
- config: default
914
- split: test
915
- revision: None
916
- metrics:
917
- - type: map_at_1
918
- value: 14.484
919
- - type: map_at_10
920
- value: 20.355
921
- - type: map_at_100
922
- value: 21.382
923
- - type: map_at_1000
924
- value: 21.511
925
- - type: map_at_3
926
- value: 18.448
927
- - type: map_at_5
928
- value: 19.451999999999998
929
- - type: mrr_at_1
930
- value: 17.584
931
- - type: mrr_at_10
932
- value: 23.825
933
- - type: mrr_at_100
934
- value: 24.704
935
- - type: mrr_at_1000
936
- value: 24.793000000000003
937
- - type: mrr_at_3
938
- value: 21.92
939
- - type: mrr_at_5
940
- value: 22.97
941
- - type: ndcg_at_1
942
- value: 17.584
943
- - type: ndcg_at_10
944
- value: 24.315
945
- - type: ndcg_at_100
946
- value: 29.354999999999997
947
- - type: ndcg_at_1000
948
- value: 32.641999999999996
949
- - type: ndcg_at_3
950
- value: 20.802
951
- - type: ndcg_at_5
952
- value: 22.335
953
- - type: precision_at_1
954
- value: 17.584
955
- - type: precision_at_10
956
- value: 4.443
957
- - type: precision_at_100
958
- value: 0.8160000000000001
959
- - type: precision_at_1000
960
- value: 0.128
961
- - type: precision_at_3
962
- value: 9.807
963
- - type: precision_at_5
964
- value: 7.0889999999999995
965
- - type: recall_at_1
966
- value: 14.484
967
- - type: recall_at_10
968
- value: 32.804
969
- - type: recall_at_100
970
- value: 55.679
971
- - type: recall_at_1000
972
- value: 79.63
973
- - type: recall_at_3
974
- value: 22.976
975
- - type: recall_at_5
976
- value: 26.939
977
- - task:
978
- type: Retrieval
979
- dataset:
980
- type: BeIR/cqadupstack
981
- name: MTEB CQADupstackUnixRetrieval
982
- config: default
983
- split: test
984
- revision: None
985
- metrics:
986
- - type: map_at_1
987
- value: 22.983999999999998
988
- - type: map_at_10
989
- value: 30.812
990
- - type: map_at_100
991
- value: 31.938
992
- - type: map_at_1000
993
- value: 32.056000000000004
994
- - type: map_at_3
995
- value: 28.449999999999996
996
- - type: map_at_5
997
- value: 29.542
998
- - type: mrr_at_1
999
- value: 27.145999999999997
1000
- - type: mrr_at_10
1001
- value: 34.782999999999994
1002
- - type: mrr_at_100
1003
- value: 35.699
1004
- - type: mrr_at_1000
1005
- value: 35.768
1006
- - type: mrr_at_3
1007
- value: 32.572
1008
- - type: mrr_at_5
1009
- value: 33.607
1010
- - type: ndcg_at_1
1011
- value: 27.145999999999997
1012
- - type: ndcg_at_10
1013
- value: 35.722
1014
- - type: ndcg_at_100
1015
- value: 40.964
1016
- - type: ndcg_at_1000
1017
- value: 43.598
1018
- - type: ndcg_at_3
1019
- value: 31.379
1020
- - type: ndcg_at_5
1021
- value: 32.924
1022
- - type: precision_at_1
1023
- value: 27.145999999999997
1024
- - type: precision_at_10
1025
- value: 6.063000000000001
1026
- - type: precision_at_100
1027
- value: 0.9730000000000001
1028
- - type: precision_at_1000
1029
- value: 0.13
1030
- - type: precision_at_3
1031
- value: 14.366000000000001
1032
- - type: precision_at_5
1033
- value: 9.776
1034
- - type: recall_at_1
1035
- value: 22.983999999999998
1036
- - type: recall_at_10
1037
- value: 46.876
1038
- - type: recall_at_100
1039
- value: 69.646
1040
- - type: recall_at_1000
1041
- value: 88.305
1042
- - type: recall_at_3
1043
- value: 34.471000000000004
1044
- - type: recall_at_5
1045
- value: 38.76
1046
- - task:
1047
- type: Retrieval
1048
- dataset:
1049
- type: BeIR/cqadupstack
1050
- name: MTEB CQADupstackWebmastersRetrieval
1051
- config: default
1052
- split: test
1053
- revision: None
1054
- metrics:
1055
- - type: map_at_1
1056
- value: 23.017000000000003
1057
- - type: map_at_10
1058
- value: 31.049
1059
- - type: map_at_100
1060
- value: 32.582
1061
- - type: map_at_1000
1062
- value: 32.817
1063
- - type: map_at_3
1064
- value: 28.303
1065
- - type: map_at_5
1066
- value: 29.854000000000003
1067
- - type: mrr_at_1
1068
- value: 27.866000000000003
1069
- - type: mrr_at_10
1070
- value: 35.56
1071
- - type: mrr_at_100
1072
- value: 36.453
1073
- - type: mrr_at_1000
1074
- value: 36.519
1075
- - type: mrr_at_3
1076
- value: 32.938
1077
- - type: mrr_at_5
1078
- value: 34.391
1079
- - type: ndcg_at_1
1080
- value: 27.866000000000003
1081
- - type: ndcg_at_10
1082
- value: 36.506
1083
- - type: ndcg_at_100
1084
- value: 42.344
1085
- - type: ndcg_at_1000
1086
- value: 45.213
1087
- - type: ndcg_at_3
1088
- value: 31.805
1089
- - type: ndcg_at_5
1090
- value: 33.933
1091
- - type: precision_at_1
1092
- value: 27.866000000000003
1093
- - type: precision_at_10
1094
- value: 7.016
1095
- - type: precision_at_100
1096
- value: 1.468
1097
- - type: precision_at_1000
1098
- value: 0.23900000000000002
1099
- - type: precision_at_3
1100
- value: 14.822
1101
- - type: precision_at_5
1102
- value: 10.791
1103
- - type: recall_at_1
1104
- value: 23.017000000000003
1105
- - type: recall_at_10
1106
- value: 47.053
1107
- - type: recall_at_100
1108
- value: 73.177
1109
- - type: recall_at_1000
1110
- value: 91.47800000000001
1111
- - type: recall_at_3
1112
- value: 33.675
1113
- - type: recall_at_5
1114
- value: 39.36
1115
- - task:
1116
- type: Retrieval
1117
- dataset:
1118
- type: BeIR/cqadupstack
1119
- name: MTEB CQADupstackWordpressRetrieval
1120
- config: default
1121
- split: test
1122
- revision: None
1123
- metrics:
1124
- - type: map_at_1
1125
- value: 16.673
1126
- - type: map_at_10
1127
- value: 24.051000000000002
1128
- - type: map_at_100
1129
- value: 24.933
1130
- - type: map_at_1000
1131
- value: 25.06
1132
- - type: map_at_3
1133
- value: 21.446
1134
- - type: map_at_5
1135
- value: 23.064
1136
- - type: mrr_at_1
1137
- value: 18.115000000000002
1138
- - type: mrr_at_10
1139
- value: 25.927
1140
- - type: mrr_at_100
1141
- value: 26.718999999999998
1142
- - type: mrr_at_1000
1143
- value: 26.817999999999998
1144
- - type: mrr_at_3
1145
- value: 23.383000000000003
1146
- - type: mrr_at_5
1147
- value: 25.008999999999997
1148
- - type: ndcg_at_1
1149
- value: 18.115000000000002
1150
- - type: ndcg_at_10
1151
- value: 28.669
1152
- - type: ndcg_at_100
1153
- value: 33.282000000000004
1154
- - type: ndcg_at_1000
1155
- value: 36.481
1156
- - type: ndcg_at_3
1157
- value: 23.574
1158
- - type: ndcg_at_5
1159
- value: 26.340000000000003
1160
- - type: precision_at_1
1161
- value: 18.115000000000002
1162
- - type: precision_at_10
1163
- value: 4.769
1164
- - type: precision_at_100
1165
- value: 0.767
1166
- - type: precision_at_1000
1167
- value: 0.116
1168
- - type: precision_at_3
1169
- value: 10.351
1170
- - type: precision_at_5
1171
- value: 7.8
1172
- - type: recall_at_1
1173
- value: 16.673
1174
- - type: recall_at_10
1175
- value: 41.063
1176
- - type: recall_at_100
1177
- value: 62.851
1178
- - type: recall_at_1000
1179
- value: 86.701
1180
- - type: recall_at_3
1181
- value: 27.532
1182
- - type: recall_at_5
1183
- value: 34.076
1184
- - task:
1185
- type: Retrieval
1186
- dataset:
1187
- type: climate-fever
1188
- name: MTEB ClimateFEVER
1189
- config: default
1190
- split: test
1191
- revision: None
1192
- metrics:
1193
- - type: map_at_1
1194
- value: 8.752
1195
- - type: map_at_10
1196
- value: 15.120000000000001
1197
- - type: map_at_100
1198
- value: 16.678
1199
- - type: map_at_1000
1200
- value: 16.854
1201
- - type: map_at_3
1202
- value: 12.603
1203
- - type: map_at_5
1204
- value: 13.918
1205
- - type: mrr_at_1
1206
- value: 19.283
1207
- - type: mrr_at_10
1208
- value: 29.145
1209
- - type: mrr_at_100
1210
- value: 30.281000000000002
1211
- - type: mrr_at_1000
1212
- value: 30.339
1213
- - type: mrr_at_3
1214
- value: 26.069
1215
- - type: mrr_at_5
1216
- value: 27.864
1217
- - type: ndcg_at_1
1218
- value: 19.283
1219
- - type: ndcg_at_10
1220
- value: 21.804000000000002
1221
- - type: ndcg_at_100
1222
- value: 28.576
1223
- - type: ndcg_at_1000
1224
- value: 32.063
1225
- - type: ndcg_at_3
1226
- value: 17.511
1227
- - type: ndcg_at_5
1228
- value: 19.112000000000002
1229
- - type: precision_at_1
1230
- value: 19.283
1231
- - type: precision_at_10
1232
- value: 6.873
1233
- - type: precision_at_100
1234
- value: 1.405
1235
- - type: precision_at_1000
1236
- value: 0.20500000000000002
1237
- - type: precision_at_3
1238
- value: 13.16
1239
- - type: precision_at_5
1240
- value: 10.189
1241
- - type: recall_at_1
1242
- value: 8.752
1243
- - type: recall_at_10
1244
- value: 27.004
1245
- - type: recall_at_100
1246
- value: 50.648
1247
- - type: recall_at_1000
1248
- value: 70.458
1249
- - type: recall_at_3
1250
- value: 16.461000000000002
1251
- - type: recall_at_5
1252
- value: 20.973
1253
- - task:
1254
- type: Retrieval
1255
- dataset:
1256
- type: dbpedia-entity
1257
- name: MTEB DBPedia
1258
- config: default
1259
- split: test
1260
- revision: None
1261
- metrics:
1262
- - type: map_at_1
1263
- value: 6.81
1264
- - type: map_at_10
1265
- value: 14.056
1266
- - type: map_at_100
1267
- value: 18.961
1268
- - type: map_at_1000
1269
- value: 20.169
1270
- - type: map_at_3
1271
- value: 10.496
1272
- - type: map_at_5
1273
- value: 11.952
1274
- - type: mrr_at_1
1275
- value: 53.5
1276
- - type: mrr_at_10
1277
- value: 63.479
1278
- - type: mrr_at_100
1279
- value: 63.971999999999994
1280
- - type: mrr_at_1000
1281
- value: 63.993
1282
- - type: mrr_at_3
1283
- value: 61.541999999999994
1284
- - type: mrr_at_5
1285
- value: 62.778999999999996
1286
- - type: ndcg_at_1
1287
- value: 42.25
1288
- - type: ndcg_at_10
1289
- value: 31.471
1290
- - type: ndcg_at_100
1291
- value: 35.115
1292
- - type: ndcg_at_1000
1293
- value: 42.408
1294
- - type: ndcg_at_3
1295
- value: 35.458
1296
- - type: ndcg_at_5
1297
- value: 32.973
1298
- - type: precision_at_1
1299
- value: 53.5
1300
- - type: precision_at_10
1301
- value: 24.85
1302
- - type: precision_at_100
1303
- value: 7.79
1304
- - type: precision_at_1000
1305
- value: 1.599
1306
- - type: precision_at_3
1307
- value: 38.667
1308
- - type: precision_at_5
1309
- value: 31.55
1310
- - type: recall_at_1
1311
- value: 6.81
1312
- - type: recall_at_10
1313
- value: 19.344
1314
- - type: recall_at_100
1315
- value: 40.837
1316
- - type: recall_at_1000
1317
- value: 64.661
1318
- - type: recall_at_3
1319
- value: 11.942
1320
- - type: recall_at_5
1321
- value: 14.646
1322
- - task:
1323
- type: Classification
1324
- dataset:
1325
- type: mteb/emotion
1326
- name: MTEB EmotionClassification
1327
- config: default
1328
- split: test
1329
- revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
1330
- metrics:
1331
- - type: accuracy
1332
- value: 44.64499999999999
1333
- - type: f1
1334
- value: 39.39106911352714
1335
- - task:
1336
- type: Retrieval
1337
- dataset:
1338
- type: fever
1339
- name: MTEB FEVER
1340
- config: default
1341
- split: test
1342
- revision: None
1343
- metrics:
1344
- - type: map_at_1
1345
- value: 48.196
1346
- - type: map_at_10
1347
- value: 61.404
1348
- - type: map_at_100
1349
- value: 61.846000000000004
1350
- - type: map_at_1000
1351
- value: 61.866
1352
- - type: map_at_3
1353
- value: 58.975
1354
- - type: map_at_5
1355
- value: 60.525
1356
- - type: mrr_at_1
1357
- value: 52.025
1358
- - type: mrr_at_10
1359
- value: 65.43299999999999
1360
- - type: mrr_at_100
1361
- value: 65.80799999999999
1362
- - type: mrr_at_1000
1363
- value: 65.818
1364
- - type: mrr_at_3
1365
- value: 63.146
1366
- - type: mrr_at_5
1367
- value: 64.64
1368
- - type: ndcg_at_1
1369
- value: 52.025
1370
- - type: ndcg_at_10
1371
- value: 67.889
1372
- - type: ndcg_at_100
1373
- value: 69.864
1374
- - type: ndcg_at_1000
1375
- value: 70.337
1376
- - type: ndcg_at_3
1377
- value: 63.315
1378
- - type: ndcg_at_5
1379
- value: 65.91799999999999
1380
- - type: precision_at_1
1381
- value: 52.025
1382
- - type: precision_at_10
1383
- value: 9.182
1384
- - type: precision_at_100
1385
- value: 1.027
1386
- - type: precision_at_1000
1387
- value: 0.108
1388
- - type: precision_at_3
1389
- value: 25.968000000000004
1390
- - type: precision_at_5
1391
- value: 17.006
1392
- - type: recall_at_1
1393
- value: 48.196
1394
- - type: recall_at_10
1395
- value: 83.885
1396
- - type: recall_at_100
1397
- value: 92.671
1398
- - type: recall_at_1000
1399
- value: 96.018
1400
- - type: recall_at_3
1401
- value: 71.59
1402
- - type: recall_at_5
1403
- value: 77.946
1404
- - task:
1405
- type: Retrieval
1406
- dataset:
1407
- type: fiqa
1408
- name: MTEB FiQA2018
1409
- config: default
1410
- split: test
1411
- revision: None
1412
- metrics:
1413
- - type: map_at_1
1414
- value: 15.193000000000001
1415
- - type: map_at_10
1416
- value: 25.168000000000003
1417
- - type: map_at_100
1418
- value: 27.017000000000003
1419
- - type: map_at_1000
1420
- value: 27.205000000000002
1421
- - type: map_at_3
1422
- value: 21.746
1423
- - type: map_at_5
1424
- value: 23.579
1425
- - type: mrr_at_1
1426
- value: 31.635999999999996
1427
- - type: mrr_at_10
1428
- value: 40.077
1429
- - type: mrr_at_100
1430
- value: 41.112
1431
- - type: mrr_at_1000
1432
- value: 41.160999999999994
1433
- - type: mrr_at_3
1434
- value: 37.937
1435
- - type: mrr_at_5
1436
- value: 39.18
1437
- - type: ndcg_at_1
1438
- value: 31.635999999999996
1439
- - type: ndcg_at_10
1440
- value: 32.298
1441
- - type: ndcg_at_100
1442
- value: 39.546
1443
- - type: ndcg_at_1000
1444
- value: 42.88
1445
- - type: ndcg_at_3
1446
- value: 29.221999999999998
1447
- - type: ndcg_at_5
1448
- value: 30.069000000000003
1449
- - type: precision_at_1
1450
- value: 31.635999999999996
1451
- - type: precision_at_10
1452
- value: 9.367
1453
- - type: precision_at_100
1454
- value: 1.645
1455
- - type: precision_at_1000
1456
- value: 0.22399999999999998
1457
- - type: precision_at_3
1458
- value: 20.01
1459
- - type: precision_at_5
1460
- value: 14.753
1461
- - type: recall_at_1
1462
- value: 15.193000000000001
1463
- - type: recall_at_10
1464
- value: 38.214999999999996
1465
- - type: recall_at_100
1466
- value: 65.95
1467
- - type: recall_at_1000
1468
- value: 85.85300000000001
1469
- - type: recall_at_3
1470
- value: 26.357000000000003
1471
- - type: recall_at_5
1472
- value: 31.319999999999997
1473
- - task:
1474
- type: Retrieval
1475
- dataset:
1476
- type: jinaai/ger_da_lir
1477
- name: MTEB GerDaLIR
1478
- config: default
1479
- split: test
1480
- revision: None
1481
- metrics:
1482
- - type: map_at_1
1483
- value: 10.363
1484
- - type: map_at_10
1485
- value: 16.222
1486
- - type: map_at_100
1487
- value: 17.28
1488
- - type: map_at_1000
1489
- value: 17.380000000000003
1490
- - type: map_at_3
1491
- value: 14.054
1492
- - type: map_at_5
1493
- value: 15.203
1494
- - type: mrr_at_1
1495
- value: 11.644
1496
- - type: mrr_at_10
1497
- value: 17.625
1498
- - type: mrr_at_100
1499
- value: 18.608
1500
- - type: mrr_at_1000
1501
- value: 18.695999999999998
1502
- - type: mrr_at_3
1503
- value: 15.481
1504
- - type: mrr_at_5
1505
- value: 16.659
1506
- - type: ndcg_at_1
1507
- value: 11.628
1508
- - type: ndcg_at_10
1509
- value: 20.028000000000002
1510
- - type: ndcg_at_100
1511
- value: 25.505
1512
- - type: ndcg_at_1000
1513
- value: 28.288000000000004
1514
- - type: ndcg_at_3
1515
- value: 15.603
1516
- - type: ndcg_at_5
1517
- value: 17.642
1518
- - type: precision_at_1
1519
- value: 11.628
1520
- - type: precision_at_10
1521
- value: 3.5589999999999997
1522
- - type: precision_at_100
1523
- value: 0.664
1524
- - type: precision_at_1000
1525
- value: 0.092
1526
- - type: precision_at_3
1527
- value: 7.109999999999999
1528
- - type: precision_at_5
1529
- value: 5.401
1530
- - type: recall_at_1
1531
- value: 10.363
1532
- - type: recall_at_10
1533
- value: 30.586000000000002
1534
- - type: recall_at_100
1535
- value: 56.43
1536
- - type: recall_at_1000
1537
- value: 78.142
1538
- - type: recall_at_3
1539
- value: 18.651
1540
- - type: recall_at_5
1541
- value: 23.493
1542
- - task:
1543
- type: Retrieval
1544
- dataset:
1545
- type: deepset/germandpr
1546
- name: MTEB GermanDPR
1547
- config: default
1548
- split: test
1549
- revision: 5129d02422a66be600ac89cd3e8531b4f97d347d
1550
- metrics:
1551
- - type: map_at_1
1552
- value: 60.78
1553
- - type: map_at_10
1554
- value: 73.91499999999999
1555
- - type: map_at_100
1556
- value: 74.089
1557
- - type: map_at_1000
1558
- value: 74.09400000000001
1559
- - type: map_at_3
1560
- value: 71.87
1561
- - type: map_at_5
1562
- value: 73.37700000000001
1563
- - type: mrr_at_1
1564
- value: 60.78
1565
- - type: mrr_at_10
1566
- value: 73.91499999999999
1567
- - type: mrr_at_100
1568
- value: 74.089
1569
- - type: mrr_at_1000
1570
- value: 74.09400000000001
1571
- - type: mrr_at_3
1572
- value: 71.87
1573
- - type: mrr_at_5
1574
- value: 73.37700000000001
1575
- - type: ndcg_at_1
1576
- value: 60.78
1577
- - type: ndcg_at_10
1578
- value: 79.35600000000001
1579
- - type: ndcg_at_100
1580
- value: 80.077
1581
- - type: ndcg_at_1000
1582
- value: 80.203
1583
- - type: ndcg_at_3
1584
- value: 75.393
1585
- - type: ndcg_at_5
1586
- value: 78.077
1587
- - type: precision_at_1
1588
- value: 60.78
1589
- - type: precision_at_10
1590
- value: 9.59
1591
- - type: precision_at_100
1592
- value: 0.9900000000000001
1593
- - type: precision_at_1000
1594
- value: 0.1
1595
- - type: precision_at_3
1596
- value: 28.52
1597
- - type: precision_at_5
1598
- value: 18.4
1599
- - type: recall_at_1
1600
- value: 60.78
1601
- - type: recall_at_10
1602
- value: 95.902
1603
- - type: recall_at_100
1604
- value: 99.024
1605
- - type: recall_at_1000
1606
- value: 100.0
1607
- - type: recall_at_3
1608
- value: 85.56099999999999
1609
- - type: recall_at_5
1610
- value: 92.0
1611
- - task:
1612
- type: STS
1613
- dataset:
1614
- type: jinaai/german-STSbenchmark
1615
- name: MTEB GermanSTSBenchmark
1616
- config: default
1617
- split: test
1618
- revision: 49d9b423b996fea62b483f9ee6dfb5ec233515ca
1619
- metrics:
1620
- - type: cos_sim_pearson
1621
- value: 88.49524420894356
1622
- - type: cos_sim_spearman
1623
- value: 88.32407839427714
1624
- - type: euclidean_pearson
1625
- value: 87.25098779877104
1626
- - type: euclidean_spearman
1627
- value: 88.22738098593608
1628
- - type: manhattan_pearson
1629
- value: 87.23872691839607
1630
- - type: manhattan_spearman
1631
- value: 88.2002968380165
1632
- - task:
1633
- type: Retrieval
1634
- dataset:
1635
- type: hotpotqa
1636
- name: MTEB HotpotQA
1637
- config: default
1638
- split: test
1639
- revision: None
1640
- metrics:
1641
- - type: map_at_1
1642
- value: 31.81
1643
- - type: map_at_10
1644
- value: 46.238
1645
- - type: map_at_100
1646
- value: 47.141
1647
- - type: map_at_1000
1648
- value: 47.213
1649
- - type: map_at_3
1650
- value: 43.248999999999995
1651
- - type: map_at_5
1652
- value: 45.078
1653
- - type: mrr_at_1
1654
- value: 63.619
1655
- - type: mrr_at_10
1656
- value: 71.279
1657
- - type: mrr_at_100
1658
- value: 71.648
1659
- - type: mrr_at_1000
1660
- value: 71.665
1661
- - type: mrr_at_3
1662
- value: 69.76599999999999
1663
- - type: mrr_at_5
1664
- value: 70.743
1665
- - type: ndcg_at_1
1666
- value: 63.619
1667
- - type: ndcg_at_10
1668
- value: 55.38999999999999
1669
- - type: ndcg_at_100
1670
- value: 58.80800000000001
1671
- - type: ndcg_at_1000
1672
- value: 60.331999999999994
1673
- - type: ndcg_at_3
1674
- value: 50.727
1675
- - type: ndcg_at_5
1676
- value: 53.284
1677
- - type: precision_at_1
1678
- value: 63.619
1679
- - type: precision_at_10
1680
- value: 11.668000000000001
1681
- - type: precision_at_100
1682
- value: 1.434
1683
- - type: precision_at_1000
1684
- value: 0.164
1685
- - type: precision_at_3
1686
- value: 32.001000000000005
1687
- - type: precision_at_5
1688
- value: 21.223
1689
- - type: recall_at_1
1690
- value: 31.81
1691
- - type: recall_at_10
1692
- value: 58.339
1693
- - type: recall_at_100
1694
- value: 71.708
1695
- - type: recall_at_1000
1696
- value: 81.85
1697
- - type: recall_at_3
1698
- value: 48.001
1699
- - type: recall_at_5
1700
- value: 53.059
1701
- - task:
1702
- type: Classification
1703
- dataset:
1704
- type: mteb/imdb
1705
- name: MTEB ImdbClassification
1706
- config: default
1707
- split: test
1708
- revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
1709
- metrics:
1710
- - type: accuracy
1711
- value: 68.60640000000001
1712
- - type: ap
1713
- value: 62.84296904042086
1714
- - type: f1
1715
- value: 68.50643633327537
1716
- - task:
1717
- type: Reranking
1718
- dataset:
1719
- type: jinaai/miracl
1720
- name: MTEB MIRACL
1721
- config: default
1722
- split: test
1723
- revision: 8741c3b61cd36ed9ca1b3d4203543a41793239e2
1724
- metrics:
1725
- - type: map
1726
- value: 64.29704335389768
1727
- - type: mrr
1728
- value: 72.11962197159565
1729
- - task:
1730
- type: Classification
1731
- dataset:
1732
- type: mteb/mtop_domain
1733
- name: MTEB MTOPDomainClassification (en)
1734
- config: en
1735
- split: test
1736
- revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
1737
- metrics:
1738
- - type: accuracy
1739
- value: 89.3844049247606
1740
- - type: f1
1741
- value: 89.2124328528015
1742
- - task:
1743
- type: Classification
1744
- dataset:
1745
- type: mteb/mtop_domain
1746
- name: MTEB MTOPDomainClassification (de)
1747
- config: de
1748
- split: test
1749
- revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
1750
- metrics:
1751
- - type: accuracy
1752
- value: 88.36855452240067
1753
- - type: f1
1754
- value: 87.35458822097442
1755
- - task:
1756
- type: Classification
1757
- dataset:
1758
- type: mteb/mtop_intent
1759
- name: MTEB MTOPIntentClassification (en)
1760
- config: en
1761
- split: test
1762
- revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
1763
- metrics:
1764
- - type: accuracy
1765
- value: 66.48654810761514
1766
- - type: f1
1767
- value: 50.07229882504409
1768
- - task:
1769
- type: Classification
1770
- dataset:
1771
- type: mteb/mtop_intent
1772
- name: MTEB MTOPIntentClassification (de)
1773
- config: de
1774
- split: test
1775
- revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
1776
- metrics:
1777
- - type: accuracy
1778
- value: 63.832065370526905
1779
- - type: f1
1780
- value: 46.283579383385806
1781
- - task:
1782
- type: Classification
1783
- dataset:
1784
- type: mteb/amazon_massive_intent
1785
- name: MTEB MassiveIntentClassification (de)
1786
- config: de
1787
- split: test
1788
- revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1789
- metrics:
1790
- - type: accuracy
1791
- value: 63.89038332212509
1792
- - type: f1
1793
- value: 61.86279849685129
1794
- - task:
1795
- type: Classification
1796
- dataset:
1797
- type: mteb/amazon_massive_intent
1798
- name: MTEB MassiveIntentClassification (en)
1799
- config: en
1800
- split: test
1801
- revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1802
- metrics:
1803
- - type: accuracy
1804
- value: 69.11230665770006
1805
- - type: f1
1806
- value: 67.44780095350535
1807
- - task:
1808
- type: Classification
1809
- dataset:
1810
- type: mteb/amazon_massive_scenario
1811
- name: MTEB MassiveScenarioClassification (de)
1812
- config: de
1813
- split: test
1814
- revision: 7d571f92784cd94a019292a1f45445077d0ef634
1815
- metrics:
1816
- - type: accuracy
1817
- value: 71.25084061869536
1818
- - type: f1
1819
- value: 71.43965023016408
1820
- - task:
1821
- type: Classification
1822
- dataset:
1823
- type: mteb/amazon_massive_scenario
1824
- name: MTEB MassiveScenarioClassification (en)
1825
- config: en
1826
- split: test
1827
- revision: 7d571f92784cd94a019292a1f45445077d0ef634
1828
- metrics:
1829
- - type: accuracy
1830
- value: 73.73907195696032
1831
- - type: f1
1832
- value: 73.69920814839061
1833
- - task:
1834
- type: Clustering
1835
- dataset:
1836
- type: mteb/medrxiv-clustering-p2p
1837
- name: MTEB MedrxivClusteringP2P
1838
- config: default
1839
- split: test
1840
- revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
1841
- metrics:
1842
- - type: v_measure
1843
- value: 31.32577306498249
1844
- - task:
1845
- type: Clustering
1846
- dataset:
1847
- type: mteb/medrxiv-clustering-s2s
1848
- name: MTEB MedrxivClusteringS2S
1849
- config: default
1850
- split: test
1851
- revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
1852
- metrics:
1853
- - type: v_measure
1854
- value: 28.759349326367783
1855
- - task:
1856
- type: Reranking
1857
- dataset:
1858
- type: mteb/mind_small
1859
- name: MTEB MindSmallReranking
1860
- config: default
1861
- split: test
1862
- revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69
1863
- metrics:
1864
- - type: map
1865
- value: 30.401342674703425
1866
- - type: mrr
1867
- value: 31.384379585660987
1868
- - task:
1869
- type: Retrieval
1870
- dataset:
1871
- type: nfcorpus
1872
- name: MTEB NFCorpus
1873
- config: default
1874
- split: test
1875
- revision: None
1876
- metrics:
1877
- - type: map_at_1
1878
- value: 4.855
1879
- - type: map_at_10
1880
- value: 10.01
1881
- - type: map_at_100
1882
- value: 12.461
1883
- - type: map_at_1000
1884
- value: 13.776
1885
- - type: map_at_3
1886
- value: 7.252
1887
- - type: map_at_5
1888
- value: 8.679
1889
- - type: mrr_at_1
1890
- value: 41.176
1891
- - type: mrr_at_10
1892
- value: 49.323
1893
- - type: mrr_at_100
1894
- value: 49.954
1895
- - type: mrr_at_1000
1896
- value: 49.997
1897
- - type: mrr_at_3
1898
- value: 46.904
1899
- - type: mrr_at_5
1900
- value: 48.375
1901
- - type: ndcg_at_1
1902
- value: 39.318999999999996
1903
- - type: ndcg_at_10
1904
- value: 28.607
1905
- - type: ndcg_at_100
1906
- value: 26.554
1907
- - type: ndcg_at_1000
1908
- value: 35.731
1909
- - type: ndcg_at_3
1910
- value: 32.897999999999996
1911
- - type: ndcg_at_5
1912
- value: 31.53
1913
- - type: precision_at_1
1914
- value: 41.176
1915
- - type: precision_at_10
1916
- value: 20.867
1917
- - type: precision_at_100
1918
- value: 6.796
1919
- - type: precision_at_1000
1920
- value: 1.983
1921
- - type: precision_at_3
1922
- value: 30.547
1923
- - type: precision_at_5
1924
- value: 27.245
1925
- - type: recall_at_1
1926
- value: 4.855
1927
- - type: recall_at_10
1928
- value: 14.08
1929
- - type: recall_at_100
1930
- value: 28.188000000000002
1931
- - type: recall_at_1000
1932
- value: 60.07900000000001
1933
- - type: recall_at_3
1934
- value: 7.947
1935
- - type: recall_at_5
1936
- value: 10.786
1937
- - task:
1938
- type: Retrieval
1939
- dataset:
1940
- type: nq
1941
- name: MTEB NQ
1942
- config: default
1943
- split: test
1944
- revision: None
1945
- metrics:
1946
- - type: map_at_1
1947
- value: 26.906999999999996
1948
- - type: map_at_10
1949
- value: 41.147
1950
- - type: map_at_100
1951
- value: 42.269
1952
- - type: map_at_1000
1953
- value: 42.308
1954
- - type: map_at_3
1955
- value: 36.638999999999996
1956
- - type: map_at_5
1957
- value: 39.285
1958
- - type: mrr_at_1
1959
- value: 30.359
1960
- - type: mrr_at_10
1961
- value: 43.607
1962
- - type: mrr_at_100
1963
- value: 44.454
1964
- - type: mrr_at_1000
1965
- value: 44.481
1966
- - type: mrr_at_3
1967
- value: 39.644
1968
- - type: mrr_at_5
1969
- value: 42.061
1970
- - type: ndcg_at_1
1971
- value: 30.330000000000002
1972
- - type: ndcg_at_10
1973
- value: 48.899
1974
- - type: ndcg_at_100
1975
- value: 53.612
1976
- - type: ndcg_at_1000
1977
- value: 54.51200000000001
1978
- - type: ndcg_at_3
1979
- value: 40.262
1980
- - type: ndcg_at_5
1981
- value: 44.787
1982
- - type: precision_at_1
1983
- value: 30.330000000000002
1984
- - type: precision_at_10
1985
- value: 8.323
1986
- - type: precision_at_100
1987
- value: 1.0959999999999999
1988
- - type: precision_at_1000
1989
- value: 0.11800000000000001
1990
- - type: precision_at_3
1991
- value: 18.395
1992
- - type: precision_at_5
1993
- value: 13.627
1994
- - type: recall_at_1
1995
- value: 26.906999999999996
1996
- - type: recall_at_10
1997
- value: 70.215
1998
- - type: recall_at_100
1999
- value: 90.61200000000001
2000
- - type: recall_at_1000
2001
- value: 97.294
2002
- - type: recall_at_3
2003
- value: 47.784
2004
- - type: recall_at_5
2005
- value: 58.251
2006
- - task:
2007
- type: PairClassification
2008
- dataset:
2009
- type: paws-x
2010
- name: MTEB PawsX
2011
- config: default
2012
- split: test
2013
- revision: 8a04d940a42cd40658986fdd8e3da561533a3646
2014
- metrics:
2015
- - type: cos_sim_accuracy
2016
- value: 60.5
2017
- - type: cos_sim_ap
2018
- value: 57.606096528877494
2019
- - type: cos_sim_f1
2020
- value: 62.24240307369892
2021
- - type: cos_sim_precision
2022
- value: 45.27439024390244
2023
- - type: cos_sim_recall
2024
- value: 99.55307262569832
2025
- - type: dot_accuracy
2026
- value: 57.699999999999996
2027
- - type: dot_ap
2028
- value: 51.289351057160616
2029
- - type: dot_f1
2030
- value: 62.25953130465197
2031
- - type: dot_precision
2032
- value: 45.31568228105906
2033
- - type: dot_recall
2034
- value: 99.4413407821229
2035
- - type: euclidean_accuracy
2036
- value: 60.45
2037
- - type: euclidean_ap
2038
- value: 57.616461421424034
2039
- - type: euclidean_f1
2040
- value: 62.313697657913416
2041
- - type: euclidean_precision
2042
- value: 45.657826313052524
2043
- - type: euclidean_recall
2044
- value: 98.10055865921787
2045
- - type: manhattan_accuracy
2046
- value: 60.3
2047
- - type: manhattan_ap
2048
- value: 57.580565271667325
2049
- - type: manhattan_f1
2050
- value: 62.24240307369892
2051
- - type: manhattan_precision
2052
- value: 45.27439024390244
2053
- - type: manhattan_recall
2054
- value: 99.55307262569832
2055
- - type: max_accuracy
2056
- value: 60.5
2057
- - type: max_ap
2058
- value: 57.616461421424034
2059
- - type: max_f1
2060
- value: 62.313697657913416
2061
- - task:
2062
- type: Retrieval
2063
- dataset:
2064
- type: quora
2065
- name: MTEB QuoraRetrieval
2066
- config: default
2067
- split: test
2068
- revision: None
2069
- metrics:
2070
- - type: map_at_1
2071
- value: 70.21300000000001
2072
- - type: map_at_10
2073
- value: 84.136
2074
- - type: map_at_100
2075
- value: 84.796
2076
- - type: map_at_1000
2077
- value: 84.812
2078
- - type: map_at_3
2079
- value: 81.182
2080
- - type: map_at_5
2081
- value: 83.027
2082
- - type: mrr_at_1
2083
- value: 80.91000000000001
2084
- - type: mrr_at_10
2085
- value: 87.155
2086
- - type: mrr_at_100
2087
- value: 87.27000000000001
2088
- - type: mrr_at_1000
2089
- value: 87.271
2090
- - type: mrr_at_3
2091
- value: 86.158
2092
- - type: mrr_at_5
2093
- value: 86.828
2094
- - type: ndcg_at_1
2095
- value: 80.88
2096
- - type: ndcg_at_10
2097
- value: 87.926
2098
- - type: ndcg_at_100
2099
- value: 89.223
2100
- - type: ndcg_at_1000
2101
- value: 89.321
2102
- - type: ndcg_at_3
2103
- value: 85.036
2104
- - type: ndcg_at_5
2105
- value: 86.614
2106
- - type: precision_at_1
2107
- value: 80.88
2108
- - type: precision_at_10
2109
- value: 13.350000000000001
2110
- - type: precision_at_100
2111
- value: 1.5310000000000001
2112
- - type: precision_at_1000
2113
- value: 0.157
2114
- - type: precision_at_3
2115
- value: 37.173
2116
- - type: precision_at_5
2117
- value: 24.476
2118
- - type: recall_at_1
2119
- value: 70.21300000000001
2120
- - type: recall_at_10
2121
- value: 95.12
2122
- - type: recall_at_100
2123
- value: 99.535
2124
- - type: recall_at_1000
2125
- value: 99.977
2126
- - type: recall_at_3
2127
- value: 86.833
2128
- - type: recall_at_5
2129
- value: 91.26100000000001
2130
- - task:
2131
- type: Clustering
2132
- dataset:
2133
- type: mteb/reddit-clustering
2134
- name: MTEB RedditClustering
2135
- config: default
2136
- split: test
2137
- revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
2138
- metrics:
2139
- - type: v_measure
2140
- value: 47.754688783184875
2141
- - task:
2142
- type: Clustering
2143
- dataset:
2144
- type: mteb/reddit-clustering-p2p
2145
- name: MTEB RedditClusteringP2P
2146
- config: default
2147
- split: test
2148
- revision: 282350215ef01743dc01b456c7f5241fa8937f16
2149
- metrics:
2150
- - type: v_measure
2151
- value: 54.875736374329364
2152
- - task:
2153
- type: Retrieval
2154
- dataset:
2155
- type: scidocs
2156
- name: MTEB SCIDOCS
2157
- config: default
2158
- split: test
2159
- revision: None
2160
- metrics:
2161
- - type: map_at_1
2162
- value: 3.773
2163
- - type: map_at_10
2164
- value: 9.447
2165
- - type: map_at_100
2166
- value: 11.1
2167
- - type: map_at_1000
2168
- value: 11.37
2169
- - type: map_at_3
2170
- value: 6.787
2171
- - type: map_at_5
2172
- value: 8.077
2173
- - type: mrr_at_1
2174
- value: 18.5
2175
- - type: mrr_at_10
2176
- value: 28.227000000000004
2177
- - type: mrr_at_100
2178
- value: 29.445
2179
- - type: mrr_at_1000
2180
- value: 29.515
2181
- - type: mrr_at_3
2182
- value: 25.2
2183
- - type: mrr_at_5
2184
- value: 27.055
2185
- - type: ndcg_at_1
2186
- value: 18.5
2187
- - type: ndcg_at_10
2188
- value: 16.29
2189
- - type: ndcg_at_100
2190
- value: 23.250999999999998
2191
- - type: ndcg_at_1000
2192
- value: 28.445999999999998
2193
- - type: ndcg_at_3
2194
- value: 15.376000000000001
2195
- - type: ndcg_at_5
2196
- value: 13.528
2197
- - type: precision_at_1
2198
- value: 18.5
2199
- - type: precision_at_10
2200
- value: 8.51
2201
- - type: precision_at_100
2202
- value: 1.855
2203
- - type: precision_at_1000
2204
- value: 0.311
2205
- - type: precision_at_3
2206
- value: 14.533
2207
- - type: precision_at_5
2208
- value: 12.0
2209
- - type: recall_at_1
2210
- value: 3.773
2211
- - type: recall_at_10
2212
- value: 17.282
2213
- - type: recall_at_100
2214
- value: 37.645
2215
- - type: recall_at_1000
2216
- value: 63.138000000000005
2217
- - type: recall_at_3
2218
- value: 8.853
2219
- - type: recall_at_5
2220
- value: 12.168
2221
- - task:
2222
- type: STS
2223
- dataset:
2224
- type: mteb/sickr-sts
2225
- name: MTEB SICK-R
2226
- config: default
2227
- split: test
2228
- revision: a6ea5a8cab320b040a23452cc28066d9beae2cee
2229
- metrics:
2230
- - type: cos_sim_pearson
2231
- value: 85.32789517976525
2232
- - type: cos_sim_spearman
2233
- value: 80.32750384145629
2234
- - type: euclidean_pearson
2235
- value: 81.5025131452508
2236
- - type: euclidean_spearman
2237
- value: 80.24797115147175
2238
- - type: manhattan_pearson
2239
- value: 81.51634463412002
2240
- - type: manhattan_spearman
2241
- value: 80.24614721495055
2242
- - task:
2243
- type: STS
2244
- dataset:
2245
- type: mteb/sts12-sts
2246
- name: MTEB STS12
2247
- config: default
2248
- split: test
2249
- revision: a0d554a64d88156834ff5ae9920b964011b16384
2250
- metrics:
2251
- - type: cos_sim_pearson
2252
- value: 88.47050448992432
2253
- - type: cos_sim_spearman
2254
- value: 80.58919997743621
2255
- - type: euclidean_pearson
2256
- value: 85.83258918113664
2257
- - type: euclidean_spearman
2258
- value: 80.97441389240902
2259
- - type: manhattan_pearson
2260
- value: 85.7798262013878
2261
- - type: manhattan_spearman
2262
- value: 80.97208703064196
2263
- - task:
2264
- type: STS
2265
- dataset:
2266
- type: mteb/sts13-sts
2267
- name: MTEB STS13
2268
- config: default
2269
- split: test
2270
- revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
2271
- metrics:
2272
- - type: cos_sim_pearson
2273
- value: 85.95341439711532
2274
- - type: cos_sim_spearman
2275
- value: 86.59127484634989
2276
- - type: euclidean_pearson
2277
- value: 85.57850603454227
2278
- - type: euclidean_spearman
2279
- value: 86.47130477363419
2280
- - type: manhattan_pearson
2281
- value: 85.59387925447652
2282
- - type: manhattan_spearman
2283
- value: 86.50665427391583
2284
- - task:
2285
- type: STS
2286
- dataset:
2287
- type: mteb/sts14-sts
2288
- name: MTEB STS14
2289
- config: default
2290
- split: test
2291
- revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
2292
- metrics:
2293
- - type: cos_sim_pearson
2294
- value: 85.39810909161844
2295
- - type: cos_sim_spearman
2296
- value: 82.98595295546008
2297
- - type: euclidean_pearson
2298
- value: 84.04681129969951
2299
- - type: euclidean_spearman
2300
- value: 82.98197460689866
2301
- - type: manhattan_pearson
2302
- value: 83.9918798171185
2303
- - type: manhattan_spearman
2304
- value: 82.91148131768082
2305
- - task:
2306
- type: STS
2307
- dataset:
2308
- type: mteb/sts15-sts
2309
- name: MTEB STS15
2310
- config: default
2311
- split: test
2312
- revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
2313
- metrics:
2314
- - type: cos_sim_pearson
2315
- value: 88.02072712147692
2316
- - type: cos_sim_spearman
2317
- value: 88.78821332623012
2318
- - type: euclidean_pearson
2319
- value: 88.12132045572747
2320
- - type: euclidean_spearman
2321
- value: 88.74273451067364
2322
- - type: manhattan_pearson
2323
- value: 88.05431550059166
2324
- - type: manhattan_spearman
2325
- value: 88.67610233020723
2326
- - task:
2327
- type: STS
2328
- dataset:
2329
- type: mteb/sts16-sts
2330
- name: MTEB STS16
2331
- config: default
2332
- split: test
2333
- revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
2334
- metrics:
2335
- - type: cos_sim_pearson
2336
- value: 82.96134704624787
2337
- - type: cos_sim_spearman
2338
- value: 84.44062976314666
2339
- - type: euclidean_pearson
2340
- value: 84.03642536310323
2341
- - type: euclidean_spearman
2342
- value: 84.4535014579785
2343
- - type: manhattan_pearson
2344
- value: 83.92874228901483
2345
- - type: manhattan_spearman
2346
- value: 84.33634314951631
2347
- - task:
2348
- type: STS
2349
- dataset:
2350
- type: mteb/sts17-crosslingual-sts
2351
- name: MTEB STS17 (en-de)
2352
- config: en-de
2353
- split: test
2354
- revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2355
- metrics:
2356
- - type: cos_sim_pearson
2357
- value: 87.3154168064887
2358
- - type: cos_sim_spearman
2359
- value: 86.72393652571682
2360
- - type: euclidean_pearson
2361
- value: 86.04193246174164
2362
- - type: euclidean_spearman
2363
- value: 86.30482896608093
2364
- - type: manhattan_pearson
2365
- value: 85.95524084651859
2366
- - type: manhattan_spearman
2367
- value: 86.06031431994282
2368
- - task:
2369
- type: STS
2370
- dataset:
2371
- type: mteb/sts17-crosslingual-sts
2372
- name: MTEB STS17 (en-en)
2373
- config: en-en
2374
- split: test
2375
- revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2376
- metrics:
2377
- - type: cos_sim_pearson
2378
- value: 89.91079682750804
2379
- - type: cos_sim_spearman
2380
- value: 89.30961836617064
2381
- - type: euclidean_pearson
2382
- value: 88.86249564158628
2383
- - type: euclidean_spearman
2384
- value: 89.04772899592396
2385
- - type: manhattan_pearson
2386
- value: 88.85579791315043
2387
- - type: manhattan_spearman
2388
- value: 88.94190462541333
2389
- - task:
2390
- type: STS
2391
- dataset:
2392
- type: mteb/sts22-crosslingual-sts
2393
- name: MTEB STS22 (en)
2394
- config: en
2395
- split: test
2396
- revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2397
- metrics:
2398
- - type: cos_sim_pearson
2399
- value: 67.00558145551088
2400
- - type: cos_sim_spearman
2401
- value: 67.96601170393878
2402
- - type: euclidean_pearson
2403
- value: 67.87627043214336
2404
- - type: euclidean_spearman
2405
- value: 66.76402572303859
2406
- - type: manhattan_pearson
2407
- value: 67.88306560555452
2408
- - type: manhattan_spearman
2409
- value: 66.6273862035506
2410
- - task:
2411
- type: STS
2412
- dataset:
2413
- type: mteb/sts22-crosslingual-sts
2414
- name: MTEB STS22 (de)
2415
- config: de
2416
- split: test
2417
- revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2418
- metrics:
2419
- - type: cos_sim_pearson
2420
- value: 50.83759332748726
2421
- - type: cos_sim_spearman
2422
- value: 59.066344562858006
2423
- - type: euclidean_pearson
2424
- value: 50.08955848154131
2425
- - type: euclidean_spearman
2426
- value: 58.36517305855221
2427
- - type: manhattan_pearson
2428
- value: 50.05257267223111
2429
- - type: manhattan_spearman
2430
- value: 58.37570252804986
2431
- - task:
2432
- type: STS
2433
- dataset:
2434
- type: mteb/sts22-crosslingual-sts
2435
- name: MTEB STS22 (de-en)
2436
- config: de-en
2437
- split: test
2438
- revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2439
- metrics:
2440
- - type: cos_sim_pearson
2441
- value: 59.22749007956492
2442
- - type: cos_sim_spearman
2443
- value: 55.97282077657827
2444
- - type: euclidean_pearson
2445
- value: 62.10661533695752
2446
- - type: euclidean_spearman
2447
- value: 53.62780854854067
2448
- - type: manhattan_pearson
2449
- value: 62.37138085709719
2450
- - type: manhattan_spearman
2451
- value: 54.17556356828155
2452
- - task:
2453
- type: STS
2454
- dataset:
2455
- type: mteb/stsbenchmark-sts
2456
- name: MTEB STSBenchmark
2457
- config: default
2458
- split: test
2459
- revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
2460
- metrics:
2461
- - type: cos_sim_pearson
2462
- value: 87.91145397065878
2463
- - type: cos_sim_spearman
2464
- value: 88.13960018389005
2465
- - type: euclidean_pearson
2466
- value: 87.67618876224006
2467
- - type: euclidean_spearman
2468
- value: 87.99119480810556
2469
- - type: manhattan_pearson
2470
- value: 87.67920297334753
2471
- - type: manhattan_spearman
2472
- value: 87.99113250064492
2473
- - task:
2474
- type: Reranking
2475
- dataset:
2476
- type: mteb/scidocs-reranking
2477
- name: MTEB SciDocsRR
2478
- config: default
2479
- split: test
2480
- revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab
2481
- metrics:
2482
- - type: map
2483
- value: 78.09133563707582
2484
- - type: mrr
2485
- value: 93.2415288052543
2486
- - task:
2487
- type: Retrieval
2488
- dataset:
2489
- type: scifact
2490
- name: MTEB SciFact
2491
- config: default
2492
- split: test
2493
- revision: None
2494
- metrics:
2495
- - type: map_at_1
2496
- value: 47.760999999999996
2497
- - type: map_at_10
2498
- value: 56.424
2499
- - type: map_at_100
2500
- value: 57.24399999999999
2501
- - type: map_at_1000
2502
- value: 57.278
2503
- - type: map_at_3
2504
- value: 53.68000000000001
2505
- - type: map_at_5
2506
- value: 55.442
2507
- - type: mrr_at_1
2508
- value: 50.666999999999994
2509
- - type: mrr_at_10
2510
- value: 58.012
2511
- - type: mrr_at_100
2512
- value: 58.736
2513
- - type: mrr_at_1000
2514
- value: 58.769000000000005
2515
- - type: mrr_at_3
2516
- value: 56.056
2517
- - type: mrr_at_5
2518
- value: 57.321999999999996
2519
- - type: ndcg_at_1
2520
- value: 50.666999999999994
2521
- - type: ndcg_at_10
2522
- value: 60.67700000000001
2523
- - type: ndcg_at_100
2524
- value: 64.513
2525
- - type: ndcg_at_1000
2526
- value: 65.62400000000001
2527
- - type: ndcg_at_3
2528
- value: 56.186
2529
- - type: ndcg_at_5
2530
- value: 58.692
2531
- - type: precision_at_1
2532
- value: 50.666999999999994
2533
- - type: precision_at_10
2534
- value: 8.200000000000001
2535
- - type: precision_at_100
2536
- value: 1.023
2537
- - type: precision_at_1000
2538
- value: 0.11199999999999999
2539
- - type: precision_at_3
2540
- value: 21.889
2541
- - type: precision_at_5
2542
- value: 14.866999999999999
2543
- - type: recall_at_1
2544
- value: 47.760999999999996
2545
- - type: recall_at_10
2546
- value: 72.006
2547
- - type: recall_at_100
2548
- value: 89.767
2549
- - type: recall_at_1000
2550
- value: 98.833
2551
- - type: recall_at_3
2552
- value: 60.211000000000006
2553
- - type: recall_at_5
2554
- value: 66.3
2555
- - task:
2556
- type: PairClassification
2557
- dataset:
2558
- type: mteb/sprintduplicatequestions-pairclassification
2559
- name: MTEB SprintDuplicateQuestions
2560
- config: default
2561
- split: test
2562
- revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
2563
- metrics:
2564
- - type: cos_sim_accuracy
2565
- value: 99.79009900990098
2566
- - type: cos_sim_ap
2567
- value: 94.86690691995835
2568
- - type: cos_sim_f1
2569
- value: 89.37875751503007
2570
- - type: cos_sim_precision
2571
- value: 89.5582329317269
2572
- - type: cos_sim_recall
2573
- value: 89.2
2574
- - type: dot_accuracy
2575
- value: 99.76336633663367
2576
- - type: dot_ap
2577
- value: 94.26453740761586
2578
- - type: dot_f1
2579
- value: 88.00783162016641
2580
- - type: dot_precision
2581
- value: 86.19367209971237
2582
- - type: dot_recall
2583
- value: 89.9
2584
- - type: euclidean_accuracy
2585
- value: 99.7940594059406
2586
- - type: euclidean_ap
2587
- value: 94.85459757524379
2588
- - type: euclidean_f1
2589
- value: 89.62779156327544
2590
- - type: euclidean_precision
2591
- value: 88.96551724137932
2592
- - type: euclidean_recall
2593
- value: 90.3
2594
- - type: manhattan_accuracy
2595
- value: 99.79009900990098
2596
- - type: manhattan_ap
2597
- value: 94.76971336654465
2598
- - type: manhattan_f1
2599
- value: 89.35323383084577
2600
- - type: manhattan_precision
2601
- value: 88.91089108910892
2602
- - type: manhattan_recall
2603
- value: 89.8
2604
- - type: max_accuracy
2605
- value: 99.7940594059406
2606
- - type: max_ap
2607
- value: 94.86690691995835
2608
- - type: max_f1
2609
- value: 89.62779156327544
2610
- - task:
2611
- type: Clustering
2612
- dataset:
2613
- type: mteb/stackexchange-clustering
2614
- name: MTEB StackExchangeClustering
2615
- config: default
2616
- split: test
2617
- revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
2618
- metrics:
2619
- - type: v_measure
2620
- value: 55.38197670064987
2621
- - task:
2622
- type: Clustering
2623
- dataset:
2624
- type: mteb/stackexchange-clustering-p2p
2625
- name: MTEB StackExchangeClusteringP2P
2626
- config: default
2627
- split: test
2628
- revision: 815ca46b2622cec33ccafc3735d572c266efdb44
2629
- metrics:
2630
- - type: v_measure
2631
- value: 33.08330158937971
2632
- - task:
2633
- type: Reranking
2634
- dataset:
2635
- type: mteb/stackoverflowdupquestions-reranking
2636
- name: MTEB StackOverflowDupQuestions
2637
- config: default
2638
- split: test
2639
- revision: e185fbe320c72810689fc5848eb6114e1ef5ec69
2640
- metrics:
2641
- - type: map
2642
- value: 49.50367079063226
2643
- - type: mrr
2644
- value: 50.30444943128768
2645
- - task:
2646
- type: Summarization
2647
- dataset:
2648
- type: mteb/summeval
2649
- name: MTEB SummEval
2650
- config: default
2651
- split: test
2652
- revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
2653
- metrics:
2654
- - type: cos_sim_pearson
2655
- value: 30.37739520909561
2656
- - type: cos_sim_spearman
2657
- value: 31.548500943973913
2658
- - type: dot_pearson
2659
- value: 29.983610104303
2660
- - type: dot_spearman
2661
- value: 29.90185869098618
2662
- - task:
2663
- type: Retrieval
2664
- dataset:
2665
- type: trec-covid
2666
- name: MTEB TRECCOVID
2667
- config: default
2668
- split: test
2669
- revision: None
2670
- metrics:
2671
- - type: map_at_1
2672
- value: 0.198
2673
- - type: map_at_10
2674
- value: 1.5810000000000002
2675
- - type: map_at_100
2676
- value: 9.064
2677
- - type: map_at_1000
2678
- value: 22.161
2679
- - type: map_at_3
2680
- value: 0.536
2681
- - type: map_at_5
2682
- value: 0.8370000000000001
2683
- - type: mrr_at_1
2684
- value: 80.0
2685
- - type: mrr_at_10
2686
- value: 86.75
2687
- - type: mrr_at_100
2688
- value: 86.799
2689
- - type: mrr_at_1000
2690
- value: 86.799
2691
- - type: mrr_at_3
2692
- value: 85.0
2693
- - type: mrr_at_5
2694
- value: 86.5
2695
- - type: ndcg_at_1
2696
- value: 73.0
2697
- - type: ndcg_at_10
2698
- value: 65.122
2699
- - type: ndcg_at_100
2700
- value: 51.853
2701
- - type: ndcg_at_1000
2702
- value: 47.275
2703
- - type: ndcg_at_3
2704
- value: 66.274
2705
- - type: ndcg_at_5
2706
- value: 64.826
2707
- - type: precision_at_1
2708
- value: 80.0
2709
- - type: precision_at_10
2710
- value: 70.19999999999999
2711
- - type: precision_at_100
2712
- value: 53.480000000000004
2713
- - type: precision_at_1000
2714
- value: 20.946
2715
- - type: precision_at_3
2716
- value: 71.333
2717
- - type: precision_at_5
2718
- value: 70.0
2719
- - type: recall_at_1
2720
- value: 0.198
2721
- - type: recall_at_10
2722
- value: 1.884
2723
- - type: recall_at_100
2724
- value: 12.57
2725
- - type: recall_at_1000
2726
- value: 44.208999999999996
2727
- - type: recall_at_3
2728
- value: 0.5890000000000001
2729
- - type: recall_at_5
2730
- value: 0.95
2731
- - task:
2732
- type: Clustering
2733
- dataset:
2734
- type: slvnwhrl/tenkgnad-clustering-p2p
2735
- name: MTEB TenKGnadClusteringP2P
2736
- config: default
2737
- split: test
2738
- revision: 5c59e41555244b7e45c9a6be2d720ab4bafae558
2739
- metrics:
2740
- - type: v_measure
2741
- value: 42.84199261133083
2742
- - task:
2743
- type: Clustering
2744
- dataset:
2745
- type: slvnwhrl/tenkgnad-clustering-s2s
2746
- name: MTEB TenKGnadClusteringS2S
2747
- config: default
2748
- split: test
2749
- revision: 6cddbe003f12b9b140aec477b583ac4191f01786
2750
- metrics:
2751
- - type: v_measure
2752
- value: 23.689557114798838
2753
- - task:
2754
- type: Retrieval
2755
- dataset:
2756
- type: webis-touche2020
2757
- name: MTEB Touche2020
2758
- config: default
2759
- split: test
2760
- revision: None
2761
- metrics:
2762
- - type: map_at_1
2763
- value: 1.941
2764
- - type: map_at_10
2765
- value: 8.222
2766
- - type: map_at_100
2767
- value: 14.277999999999999
2768
- - type: map_at_1000
2769
- value: 15.790000000000001
2770
- - type: map_at_3
2771
- value: 4.4670000000000005
2772
- - type: map_at_5
2773
- value: 5.762
2774
- - type: mrr_at_1
2775
- value: 24.490000000000002
2776
- - type: mrr_at_10
2777
- value: 38.784
2778
- - type: mrr_at_100
2779
- value: 39.724
2780
- - type: mrr_at_1000
2781
- value: 39.724
2782
- - type: mrr_at_3
2783
- value: 33.333
2784
- - type: mrr_at_5
2785
- value: 37.415
2786
- - type: ndcg_at_1
2787
- value: 22.448999999999998
2788
- - type: ndcg_at_10
2789
- value: 21.026
2790
- - type: ndcg_at_100
2791
- value: 33.721000000000004
2792
- - type: ndcg_at_1000
2793
- value: 45.045
2794
- - type: ndcg_at_3
2795
- value: 20.053
2796
- - type: ndcg_at_5
2797
- value: 20.09
2798
- - type: precision_at_1
2799
- value: 24.490000000000002
2800
- - type: precision_at_10
2801
- value: 19.796
2802
- - type: precision_at_100
2803
- value: 7.469
2804
- - type: precision_at_1000
2805
- value: 1.48
2806
- - type: precision_at_3
2807
- value: 21.769
2808
- - type: precision_at_5
2809
- value: 21.224
2810
- - type: recall_at_1
2811
- value: 1.941
2812
- - type: recall_at_10
2813
- value: 14.915999999999999
2814
- - type: recall_at_100
2815
- value: 46.155
2816
- - type: recall_at_1000
2817
- value: 80.664
2818
- - type: recall_at_3
2819
- value: 5.629
2820
- - type: recall_at_5
2821
- value: 8.437
2822
- - task:
2823
- type: Classification
2824
- dataset:
2825
- type: mteb/toxic_conversations_50k
2826
- name: MTEB ToxicConversationsClassification
2827
- config: default
2828
- split: test
2829
- revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c
2830
- metrics:
2831
- - type: accuracy
2832
- value: 69.64800000000001
2833
- - type: ap
2834
- value: 12.914826731261094
2835
- - type: f1
2836
- value: 53.05213503422915
2837
- - task:
2838
- type: Classification
2839
- dataset:
2840
- type: mteb/tweet_sentiment_extraction
2841
- name: MTEB TweetSentimentExtractionClassification
2842
- config: default
2843
- split: test
2844
- revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
2845
- metrics:
2846
- - type: accuracy
2847
- value: 60.427277872099594
2848
- - type: f1
2849
- value: 60.78292007556828
2850
- - task:
2851
- type: Clustering
2852
- dataset:
2853
- type: mteb/twentynewsgroups-clustering
2854
- name: MTEB TwentyNewsgroupsClustering
2855
- config: default
2856
- split: test
2857
- revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
2858
- metrics:
2859
- - type: v_measure
2860
- value: 40.48134168406559
2861
- - task:
2862
- type: PairClassification
2863
- dataset:
2864
- type: mteb/twittersemeval2015-pairclassification
2865
- name: MTEB TwitterSemEval2015
2866
- config: default
2867
- split: test
2868
- revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
2869
- metrics:
2870
- - type: cos_sim_accuracy
2871
- value: 84.79465935506944
2872
- - type: cos_sim_ap
2873
- value: 70.24589055290592
2874
- - type: cos_sim_f1
2875
- value: 65.0994575045208
2876
- - type: cos_sim_precision
2877
- value: 63.76518218623482
2878
- - type: cos_sim_recall
2879
- value: 66.49076517150397
2880
- - type: dot_accuracy
2881
- value: 84.63968528342374
2882
- - type: dot_ap
2883
- value: 69.84683095084355
2884
- - type: dot_f1
2885
- value: 64.50606169727523
2886
- - type: dot_precision
2887
- value: 59.1719885487778
2888
- - type: dot_recall
2889
- value: 70.89709762532982
2890
- - type: euclidean_accuracy
2891
- value: 84.76485664898374
2892
- - type: euclidean_ap
2893
- value: 70.20556438685551
2894
- - type: euclidean_f1
2895
- value: 65.06796614516543
2896
- - type: euclidean_precision
2897
- value: 63.29840319361277
2898
- - type: euclidean_recall
2899
- value: 66.93931398416886
2900
- - type: manhattan_accuracy
2901
- value: 84.72313286046374
2902
- - type: manhattan_ap
2903
- value: 70.17151475534308
2904
- - type: manhattan_f1
2905
- value: 65.31379180759113
2906
- - type: manhattan_precision
2907
- value: 62.17505366086334
2908
- - type: manhattan_recall
2909
- value: 68.7862796833773
2910
- - type: max_accuracy
2911
- value: 84.79465935506944
2912
- - type: max_ap
2913
- value: 70.24589055290592
2914
- - type: max_f1
2915
- value: 65.31379180759113
2916
- - task:
2917
- type: PairClassification
2918
- dataset:
2919
- type: mteb/twitterurlcorpus-pairclassification
2920
- name: MTEB TwitterURLCorpus
2921
- config: default
2922
- split: test
2923
- revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
2924
- metrics:
2925
- - type: cos_sim_accuracy
2926
- value: 88.95874568246207
2927
- - type: cos_sim_ap
2928
- value: 85.82517548264127
2929
- - type: cos_sim_f1
2930
- value: 78.22288041466125
2931
- - type: cos_sim_precision
2932
- value: 75.33875338753387
2933
- - type: cos_sim_recall
2934
- value: 81.33661841700031
2935
- - type: dot_accuracy
2936
- value: 88.836496293709
2937
- - type: dot_ap
2938
- value: 85.53430720252186
2939
- - type: dot_f1
2940
- value: 78.10616085869725
2941
- - type: dot_precision
2942
- value: 74.73269555430501
2943
- - type: dot_recall
2944
- value: 81.79858330766862
2945
- - type: euclidean_accuracy
2946
- value: 88.92769821865176
2947
- - type: euclidean_ap
2948
- value: 85.65904346964223
2949
- - type: euclidean_f1
2950
- value: 77.98774074208407
2951
- - type: euclidean_precision
2952
- value: 73.72282795035315
2953
- - type: euclidean_recall
2954
- value: 82.77640899291654
2955
- - type: manhattan_accuracy
2956
- value: 88.86366282454303
2957
- - type: manhattan_ap
2958
- value: 85.61599642231819
2959
- - type: manhattan_f1
2960
- value: 78.01480509061737
2961
- - type: manhattan_precision
2962
- value: 74.10460685833044
2963
- - type: manhattan_recall
2964
- value: 82.36064059131506
2965
- - type: max_accuracy
2966
- value: 88.95874568246207
2967
- - type: max_ap
2968
- value: 85.82517548264127
2969
- - type: max_f1
2970
- value: 78.22288041466125
2971
- - task:
2972
- type: Retrieval
2973
- dataset:
2974
- type: None
2975
- name: MTEB WikiCLIR
2976
- config: default
2977
- split: test
2978
- revision: None
2979
- metrics:
2980
- - type: map_at_1
2981
- value: 3.9539999999999997
2982
- - type: map_at_10
2983
- value: 7.407
2984
- - type: map_at_100
2985
- value: 8.677999999999999
2986
- - type: map_at_1000
2987
- value: 9.077
2988
- - type: map_at_3
2989
- value: 5.987
2990
- - type: map_at_5
2991
- value: 6.6979999999999995
2992
- - type: mrr_at_1
2993
- value: 35.65
2994
- - type: mrr_at_10
2995
- value: 45.097
2996
- - type: mrr_at_100
2997
- value: 45.83
2998
- - type: mrr_at_1000
2999
- value: 45.871
3000
- - type: mrr_at_3
3001
- value: 42.63
3002
- - type: mrr_at_5
3003
- value: 44.104
3004
- - type: ndcg_at_1
3005
- value: 29.215000000000003
3006
- - type: ndcg_at_10
3007
- value: 22.694
3008
- - type: ndcg_at_100
3009
- value: 22.242
3010
- - type: ndcg_at_1000
3011
- value: 27.069
3012
- - type: ndcg_at_3
3013
- value: 27.641
3014
- - type: ndcg_at_5
3015
- value: 25.503999999999998
3016
- - type: precision_at_1
3017
- value: 35.65
3018
- - type: precision_at_10
3019
- value: 12.795000000000002
3020
- - type: precision_at_100
3021
- value: 3.354
3022
- - type: precision_at_1000
3023
- value: 0.743
3024
- - type: precision_at_3
3025
- value: 23.403
3026
- - type: precision_at_5
3027
- value: 18.474
3028
- - type: recall_at_1
3029
- value: 3.9539999999999997
3030
- - type: recall_at_10
3031
- value: 11.301
3032
- - type: recall_at_100
3033
- value: 22.919999999999998
3034
- - type: recall_at_1000
3035
- value: 40.146
3036
- - type: recall_at_3
3037
- value: 7.146
3038
- - type: recall_at_5
3039
- value: 8.844000000000001
3040
- - task:
3041
- type: Retrieval
3042
- dataset:
3043
- type: jinaai/xmarket_de
3044
- name: MTEB XMarket
3045
- config: default
3046
- split: test
3047
- revision: 2336818db4c06570fcdf263e1bcb9993b786f67a
3048
- metrics:
3049
- - type: map_at_1
3050
- value: 4.872
3051
- - type: map_at_10
3052
- value: 10.658
3053
- - type: map_at_100
3054
- value: 13.422999999999998
3055
- - type: map_at_1000
3056
- value: 14.245
3057
- - type: map_at_3
3058
- value: 7.857
3059
- - type: map_at_5
3060
- value: 9.142999999999999
3061
- - type: mrr_at_1
3062
- value: 16.744999999999997
3063
- - type: mrr_at_10
3064
- value: 24.416
3065
- - type: mrr_at_100
3066
- value: 25.432
3067
- - type: mrr_at_1000
3068
- value: 25.502999999999997
3069
- - type: mrr_at_3
3070
- value: 22.096
3071
- - type: mrr_at_5
3072
- value: 23.421
3073
- - type: ndcg_at_1
3074
- value: 16.695999999999998
3075
- - type: ndcg_at_10
3076
- value: 18.66
3077
- - type: ndcg_at_100
3078
- value: 24.314
3079
- - type: ndcg_at_1000
3080
- value: 29.846
3081
- - type: ndcg_at_3
3082
- value: 17.041999999999998
3083
- - type: ndcg_at_5
3084
- value: 17.585
3085
- - type: precision_at_1
3086
- value: 16.695999999999998
3087
- - type: precision_at_10
3088
- value: 10.374
3089
- - type: precision_at_100
3090
- value: 3.988
3091
- - type: precision_at_1000
3092
- value: 1.1860000000000002
3093
- - type: precision_at_3
3094
- value: 14.21
3095
- - type: precision_at_5
3096
- value: 12.623000000000001
3097
- - type: recall_at_1
3098
- value: 4.872
3099
- - type: recall_at_10
3100
- value: 18.624
3101
- - type: recall_at_100
3102
- value: 40.988
3103
- - type: recall_at_1000
3104
- value: 65.33
3105
- - type: recall_at_3
3106
- value: 10.162
3107
- - type: recall_at_5
3108
- value: 13.517999999999999
3109
- ---
3111
- <br><br>
3112
-
3113
- <p align="center">
3114
- <img src="https://huggingface.co/datasets/jinaai/documentation-images/resolve/main/logo.webp" alt="Jina AI: Your Search Foundation, Supercharged!" width="150px">
3115
- </p>
3116
-
3117
-
3118
- <p align="center">
3119
- <b>The text embedding set trained by <a href="https://jina.ai/"><b>Jina AI</b></a>.</b>
3120
- </p>
3121
-
3122
- ## Quick Start
3123
-
3124
- The easiest way to start using `jina-embeddings-v2-base-de` is via Jina AI's [Embedding API](https://jina.ai/embeddings/).
3125
-
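- If you just want to try the model without downloading it, the API route is usually the fastest. The snippet below is a minimal sketch using the `requests` library; the endpoint URL, payload shape, and the `JINA_API_KEY` placeholder are assumptions based on the Embedding API page linked above, so double-check them against the current API documentation before relying on them.
-
- ```python
- import requests
-
- # Minimal sketch, not an official client. Endpoint and payload shape are assumed
- # from https://jina.ai/embeddings/; JINA_API_KEY is a placeholder for your own key.
- JINA_API_KEY = "jina_..."
-
- response = requests.post(
-     "https://api.jina.ai/v1/embeddings",
-     headers={"Authorization": f"Bearer {JINA_API_KEY}"},
-     json={
-         "model": "jina-embeddings-v2-base-de",
-         "input": ["How is the weather today?", "Wie ist das Wetter heute?"],
-     },
- )
- embeddings = [item["embedding"] for item in response.json()["data"]]
- print(len(embeddings), len(embeddings[0]))
- ```
-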
3126
- ## Intended Usage & Model Info
3127
-
3128
- `jina-embeddings-v2-base-de` is a German/English bilingual text **embedding model** supporting a sequence length of up to **8192 tokens**.
3129
- It is based on a BERT architecture (JinaBERT) that supports the symmetric bidirectional variant of [ALiBi](https://arxiv.org/abs/2108.12409) to allow for longer sequence lengths.
3130
- We have designed it for high performance in monolingual & cross-lingual applications and trained it specifically to support mixed German-English input without bias.
3131
- Additionally, we provide the following embedding models:
3132
-
3133
- `jina-embeddings-v2-base-de` ist ein zweisprachiges **Text Embedding Modell** für Deutsch und Englisch,
3134
- welches Texteingaben mit einer Länge von bis zu **8192 Token** unterstützt.
3135
- Es basiert auf der adaptierten BERT-Modell-Architektur JinaBERT,
3136
- welche mithilfe einer symmetrischen Variante von [ALiBi](https://arxiv.org/abs/2108.12409) längere Eingabetexte erlaubt.
3137
- Wir haben das Modell für hohe Performance in einsprachigen und sprachübergreifenden Anwendungen entwickelt und speziell darauf trainiert,
3138
- gemischte deutsch-englische Eingaben ohne einen Bias zu kodieren.
3139
- Des Weiteren stellen wir folgende Embedding-Modelle bereit:
3140
-
3141
- - [`jina-embeddings-v2-small-en`](https://huggingface.co/jinaai/jina-embeddings-v2-small-en): 33 million parameters.
3142
- - [`jina-embeddings-v2-base-en`](https://huggingface.co/jinaai/jina-embeddings-v2-base-en): 137 million parameters.
3143
- - [`jina-embeddings-v2-base-zh`](https://huggingface.co/jinaai/jina-embeddings-v2-base-zh): 161 million parameters, Chinese-English bilingual embeddings.
3144
- - [`jina-embeddings-v2-base-de`](https://huggingface.co/jinaai/jina-embeddings-v2-base-de): 161 million parameters, German-English bilingual embeddings **(you are here)**.
3145
- - [`jina-embeddings-v2-base-es`](): Spanish-English bilingual embeddings (coming soon).
3146
- - [`jina-embeddings-v2-base-code`](https://huggingface.co/jinaai/jina-embeddings-v2-base-code): 161 million parameters, code embeddings.
3147
-
3148
- ## Data & Parameters
3149
-
3150
- The data and training details are described in this [technical report](https://arxiv.org/abs/2402.17016).
3151
-
3152
- ## Usage
3153
-
3154
- **<details><summary>Please apply mean pooling when integrating the model.</summary>**
3155
- <p>
3156
-
3157
- ### Why mean pooling?
3158
-
3159
- `mean pooling` takes all token embeddings from the model output and averages them at the sentence/paragraph level.
3160
- It has proven to be the most effective way to produce high-quality sentence embeddings.
3161
- We offer an `encode` function to deal with this.
3162
-
3163
- However, if you would like to do it without using the default `encode` function:
3164
-
3165
- ```python
3166
- import torch
3167
- import torch.nn.functional as F
3168
- from transformers import AutoTokenizer, AutoModel
3169
-
3170
- def mean_pooling(model_output, attention_mask):
3171
-     token_embeddings = model_output[0]
3172
-     input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
3173
-     return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
3174
-
3175
- sentences = ['How is the weather today?', 'What is the current weather like today?']
3176
-
3177
- tokenizer = AutoTokenizer.from_pretrained('jinaai/jina-embeddings-v2-base-de')
3178
- model = AutoModel.from_pretrained('jinaai/jina-embeddings-v2-base-de', trust_remote_code=True, torch_dtype=torch.bfloat16)
3179
-
3180
- encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
3181
-
3182
- with torch.no_grad():
3183
-     model_output = model(**encoded_input)
3184
-
3185
- embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
3186
- embeddings = F.normalize(embeddings, p=2, dim=1)
3187
- ```
3188
-
3189
- </p>
3190
- </details>
3191
-
3192
- You can use Jina Embedding models directly from the `transformers` package.
3193
-
3194
- ```python
3195
- !pip install transformers
3196
- import torch
3197
- from transformers import AutoModel
3198
- from numpy.linalg import norm
3199
-
3200
- cos_sim = lambda a,b: (a @ b.T) / (norm(a)*norm(b))
3201
- model = AutoModel.from_pretrained('jinaai/jina-embeddings-v2-base-de', trust_remote_code=True, torch_dtype=torch.bfloat16)
3202
- embeddings = model.encode(['How is the weather today?', 'Wie ist das Wetter heute?'])
3203
- print(cos_sim(embeddings[0], embeddings[1]))
3204
- ```
3205
-
3206
- If you only need to handle shorter sequences, such as 2k tokens, pass the `max_length` parameter to the `encode` function:
3207
-
3208
- ```python
3209
- embeddings = model.encode(
3210
-     ['Very long ... document'],
3211
-     max_length=2048
3212
- )
3213
- ```
3214
-
3215
- As of its latest release (v2.3.0), `sentence-transformers` also supports Jina embeddings (please make sure that you are logged in to Hugging Face as well):
3216
-
3217
- ```python
3218
- !pip install -U sentence-transformers
3219
- from sentence_transformers import SentenceTransformer
3220
- from sentence_transformers.util import cos_sim
3221
-
3222
- model = SentenceTransformer(
3223
- "jinaai/jina-embeddings-v2-base-de", # switch to en/zh for English or Chinese
3224
-     trust_remote_code=True
3225
- )
3226
-
3227
- # control your input sequence length up to 8192
3228
- model.max_seq_length = 1024
3229
-
3230
- embeddings = model.encode([
3231
-     'How is the weather today?',
3232
-     'Wie ist das Wetter heute?'
3233
- ])
3234
- print(cos_sim(embeddings[0], embeddings[1]))
3235
- ```
3236
-
3237
- ## Alternatives to Using the Transformers Package
3238
-
3239
- 1. _Managed SaaS_: Get started with a free key on Jina AI's [Embedding API](https://jina.ai/embeddings/).
3240
- 2. _Private and high-performance deployment_: Get started by picking from our suite of models and deploying them on [AWS SageMaker](https://aws.amazon.com/marketplace/seller-profile?id=seller-stch2ludm6vgy).
3241
-
3242
- ## Benchmark Results
3243
-
3244
- We evaluated our bilingual model on all German and English evaluation tasks available on the [MTEB benchmark](https://huggingface.co/blog/mteb). In addition, we compared the model against several other German, English, and multilingual models on additional German evaluation tasks:
3245
-
3246
- <img src="de_evaluation_results.png" width="780px">
3247
-
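- The per-task numbers listed in the metadata of this model card come from MTEB. As a rough sketch for reproducing a single score locally (this assumes the `mteb` package's classic `MTEB(tasks=[...]).run(...)` interface; the exact API may differ between versions):
-
- ```python
- # Sketch only: assumes the classic `mteb` evaluation interface.
- from mteb import MTEB
- from sentence_transformers import SentenceTransformer
-
- model = SentenceTransformer("jinaai/jina-embeddings-v2-base-de", trust_remote_code=True)
-
- # Pick any task name that appears in the results above, e.g. STSBenchmark.
- evaluation = MTEB(tasks=["STSBenchmark"])
- evaluation.run(model, output_folder="results/jina-embeddings-v2-base-de")
- ```
-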
3248
- ## Use Jina Embeddings for RAG
3249
-
3250
- According to a recent blog post from [LlamaIndex](https://blog.llamaindex.ai/boosting-rag-picking-the-best-embedding-reranker-models-42d079022e83),
3251
-
3252
- > In summary, to achieve the peak performance in both hit rate and MRR, the combination of OpenAI or JinaAI-Base embeddings with the CohereRerank/bge-reranker-large reranker stands out.
3253
-
3254
- <img src="https://miro.medium.com/v2/resize:fit:4800/format:webp/1*ZP2RVejCZovF3FDCg-Bx3A.png" width="780px">
3255
-
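- As a minimal illustration of the retrieval step in such a pipeline, the sketch below ranks a toy in-memory document store against a query using only `sentence-transformers` and NumPy; it is not a full LlamaIndex integration, and the example documents are made up.
-
- ```python
- import numpy as np
- from sentence_transformers import SentenceTransformer
-
- model = SentenceTransformer("jinaai/jina-embeddings-v2-base-de", trust_remote_code=True)
-
- # Toy document store; in a real RAG setup these would be chunks of your corpus.
- documents = [
-     "Berlin ist die Hauptstadt von Deutschland.",
-     "The Eiffel Tower is located in Paris.",
-     "Die Deutsche Bahn betreibt den Fernverkehr in Deutschland.",
- ]
-
- def normalize(x):
-     # L2-normalize so that cosine similarity reduces to a dot product.
-     return x / np.linalg.norm(x, axis=-1, keepdims=True)
-
- doc_embeddings = normalize(model.encode(documents))
- query_embedding = normalize(model.encode(["What is the capital of Germany?"]))[0]
-
- scores = doc_embeddings @ query_embedding
- best = int(np.argmax(scores))
- print(documents[best], float(scores[best]))
- ```
-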
3256
- ## Contact
3257
-
3258
- Join our [Discord community](https://discord.jina.ai) and chat with other community members about ideas.
3259
-
3260
- ## Citation
3261
-
3262
- If you find Jina Embeddings useful in your research, please cite the following paper:
3263
-
3264
- ```
3265
- @article{mohr2024multi,
3266
- title={Multi-Task Contrastive Learning for 8192-Token Bilingual Text Embeddings},
3267
- author={Mohr, Isabelle and Krimmel, Markus and Sturua, Saba and Akram, Mohammad Kalim and Koukounas, Andreas and G{\"u}nther, Michael and Mastrapas, Georgios and Ravishankar, Vinit and Mart{\'\i}nez, Joan Fontanals and Wang, Feng and others},
3268
- journal={arXiv preprint arXiv:2402.17016},
3269
- year={2024}
3270
- }
3271
- ```