tomaarsen (HF Staff) committed · verified
Commit 277396c · 1 parent: ce21f06

Add MTEB metrics

Files changed (1): README.md (+47, −1)
README.md CHANGED
@@ -182,7 +182,7 @@ Then you can load this model and run inference.
 from sentence_transformers import SparseEncoder
 
 # Download from the 🤗 Hub
-model = SparseEncoder("tomaarsen/splade-robbert-dutch-base")
+model = SparseEncoder("sparse-encoder/splade-robbert-dutch-base-v1")
 # Run inference
 queries = [
     "hoe maak je een keldervloer glad",
@@ -231,6 +231,52 @@ You can finetune this model on your own dataset.
 
 ### Metrics
 
+#### MTEB
+
+To evaluate this model, we ran it on [BelebeleRetrieval](https://arxiv.org/abs/2308.16884) and WikipediaRetrievalMultilingual, the two Dutch retrieval tasks recommended by [MMTEB](https://huggingface.co/spaces/mteb/leaderboard).
+
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6317233cc92fd6fee317e030/mnrPBcvDBGImGGVi4EBdW.png)
+
+As the figure shows, `splade-robbert-dutch-base-v1` substantially outperforms the only other Dutch-capable sparse embedding model, and it also outperforms all similarly sized dense embedding models, despite using only ~250 active (non-zero) dimensions per document on average (measured during training).
+
+<details><summary>Click to see the full table</summary>
+
+| Model                                             | Number of Parameters | BelebeleRetrieval | WikipediaRetrievalMultilingual |
+|---------------------------------------------------|----------------------|-------------------|--------------------------------|
+| multilingual-e5-large-instruct                    | 560M                 | 94.725            | 92.342                         |
+| multilingual-e5-large                             | 560M                 | 94.607            | -                              |
+| Solon-embeddings-large-0.1                        | 559M                 | 93.651            | 91.239                         |
+| snowflake-arctic-embed-l-v2.0                     | 568M                 | 93.318            | 90.902                         |
+| bge-m3                                            | 568M                 | 93.859            | 90.106                         |
+| multilingual-e5-base                              | 278M                 | 93.731            | 89.905                         |
+| jina-embeddings-v3                                | 572M                 | 93.105            | 90.296                         |
+| **splade-robbert-dutch-base-v1**                  | 124M                 | 93.389            | 88.937                         |
+| multilingual-e5-small                             | 118M                 | 92.859            | 88.662                         |
+| KaLM-embedding-multilingual-mini-v1               | 494M                 | 91.453            | 88.413                         |
+| Qwen3-Embedding-0.6B                              | 595M                 | 91.686            | 88.121                         |
+| snowflake-arctic-embed-m-v2.0                     | 305M                 | 88.358            | 88.898                         |
+| granite-embedding-278m-multilingual               | 278M                 | 87.039            | 86.324                         |
+| gte-multilingual-base                             | 305M                 | 89.204            | 83.976                         |
+| KaLM-embedding-multilingual-mini-instruct-v1      | 494M                 | 85.648            | 85.877                         |
+| granite-embedding-107m-multilingual               | 107M                 | 85.068            | 85.097                         |
+| robbert-2022-dutch-sentence-transformers          | 124M                 | 86.146            | 82.553                         |
+| opensearch-neural-sparse-encoding-multilingual-v1 | 167M                 | 80.101            | 85.529                         |
+| paraphrase-multilingual-mpnet-base-v2             | 278M                 | 83.910            | 76.676                         |
+| e5-large-v2                                       | 335M                 | 76.433            | 79.711                         |
+| STS-multilingual-mpnet-base-v2                    | 278M                 | 80.625            | 73.803                         |
+| paraphrase-multilingual-MiniLM-L12-v2             | 118M                 | 81.021            | 71.091                         |
+| snowflake-arctic-embed-m                          | 109M                 | 65.511            | 74.801                         |
+| potion-multilingual-128M                          | 128M                 | 72.454            | 65.559                         |
+| static-similarity-mrl-multilingual-v1             | 108M                 | 67.375            | 69.050                         |
+| snowflake-arctic-embed-m-long                     | 137M                 | 67.947            | 65.988                         |
+| snowflake-arctic-embed-m-v1.5                     | 109M                 | 65.511            | 67.920                         |
+| bge-base-en-v1.5                                  | 109M                 | 61.073            | 72.093                         |
+| snowflake-arctic-embed-s                          | 32M                  | 58.683            | 70.887                         |
+| potion-base-8M                                    | 7M                   | 22.563            | 40.107                         |
+
+</details>
+
+
 #### Sparse Information Retrieval
 
 * Dataset: `msmarco-eval-1k`
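
The "~250 active (non-zero) dimensions" figure in the added README text refers to how many entries of a document's vocabulary-sized sparse vector are non-zero. A minimal sketch of the bookkeeping, with made-up toy weights rather than actual model output (real SPLADE-style vectors span the full RobBERT vocabulary):

```python
# Toy SPLADE-style sparse embeddings, stored as {vocab_index: weight} dicts.
# The indices and weights below are hypothetical, purely for illustration.
query = {101: 1.2, 2045: 0.7, 7733: 0.3}          # 3 active dimensions
doc = {101: 0.9, 2045: 1.1, 515: 0.4, 9001: 0.2}  # 4 active dimensions

# "Active dimensions" = non-zero entries out of the full vocabulary size.
active_dims = len(doc)

# Retrieval scores are dot products, so only dimensions active in BOTH the
# query and the document contribute -- this overlap structure is what makes
# inverted-index retrieval with sparse embeddings efficient.
score = sum(weight * doc[idx] for idx, weight in query.items() if idx in doc)

print(active_dims)      # 4
print(round(score, 2))  # 1.2*0.9 + 0.7*1.1 = 1.85
```

In the real model, these dictionaries come from the encoder's token activations, and the dot product is what the `msmarco-eval-1k` retrieval metrics below are computed over.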