Extra typo, and add more focus on Nomic Embed v2 in the table
Hello!
Apologies for writing this in a separate PR - it's a bit easier to do it like this via the web UI.
- Tom Aarsen
README.md (changed):

```diff
@@ -117,14 +117,14 @@ language:
 nomic-embed-text-v2-moe is SoTA multilingual MoE text embedding model:
 
 - **High Performance**: SoTA Multilingual performance compared to ~300M parameter models, competitive with models 2x in size
-- **Multilinguality**: Supports ~100 languages and trained over 1.6B pairs
-- **Flexible Embedding Dimension**: Trained with [Matryoshka Embeddings](https://arxiv.org/abs/2205.13147) with 3x reductions in storage cost with minimal performance
+- **Multilinguality**: Supports ~100 languages and trained on over 1.6B pairs
+- **Flexible Embedding Dimension**: Trained with [Matryoshka Embeddings](https://arxiv.org/abs/2205.13147) with 3x reductions in storage cost with minimal performance degradations
 - **Fully Open-Source**: Model weights, [code](https://github.com/nomic-ai/contrastors), and training data (see code repo) released
 
 
 | Model | Params (M) | Emb Dim | BEIR | MIRACL | Pretrain Data | Finetune Data | Code |
 |-------|------------|----------|------|---------|---------------|---------------|------|
-| Nomic Embed v2 | 305 | 768 | 52.86 | **65.80** | ✅ | ✅ | ✅ |
+| **Nomic Embed v2** | 305 | 768 | 52.86 | **65.80** | ✅ | ✅ | ✅ |
 | mE5 Base | 278 | 768 | 48.88 | 62.30 | ❌ | ❌ | ❌ |
 | mGTE Base | 305 | 768 | 51.10 | 63.40 | ❌ | ❌ | ❌ |
 | Arctic Embed v2 Base | 305 | 768 | **55.40** | 59.90 | ❌ | ❌ | ❌ |
```
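As a side note, the "Flexible Embedding Dimension" bullet in the diff above can be illustrated with a short snippet. This is a minimal sketch, not part of the PR itself: it assumes the model loads through Sentence Transformers with `trust_remote_code=True` (for the custom MoE architecture) and that the `truncate_dim` argument of recent sentence-transformers releases is used to apply the Matryoshka truncation; the choice of 256 dimensions is illustrative, and the model card should be checked for the exact query/document prompts this model expects.

```python
from sentence_transformers import SentenceTransformer

# Minimal sketch: truncate embeddings to 256 dims (Matryoshka-style),
# roughly a 3x storage reduction vs. the full 768 dims mentioned in the
# README table. Assumptions: trust_remote_code is needed for the custom
# MoE architecture, and truncate_dim is available in your
# sentence-transformers version.
model = SentenceTransformer(
    "nomic-ai/nomic-embed-text-v2-moe",
    trust_remote_code=True,
    truncate_dim=256,
)

sentences = ["Hello!", "¡Hola!"]  # the model is multilingual
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 256) rather than (2, 768)
```

Because the model is trained with a Matryoshka objective, the leading dimensions carry most of the signal, so truncating at encode time trades a small amount of retrieval quality for the storage savings the README bullet describes.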