hotchpotch
committed on
Commit
•
fa1683d
1
Parent(s):
f37f157
Update README.md
README.md
CHANGED
@@ -153,7 +153,24 @@ japanese-splade-base-v2: [evaluating JMTEB with sparse vectors
 | sarashina-embedding-v1-1b | 0.7168 | **0.7279** | 0.4195 | 0.9696 | 0.9394 | 0.8833 | 0.7085 | **0.7761** |
 | OpenAI/text-embedding-3-large | 0.7241 | 0.4821 | 0.3488 | 0.9655 | **0.9933** | **0.9547** | 0.6301 | 0.7448 |

-##
+## Sparsity
+
+v1 was too sparse, so v2 is given a more balanced level of sparsity.
+
+- https://github.com/hotchpotch/yast/blob/main/utils/JMTEB_L0.py
+
+The values below were measured with the script above.
+
+| Target | jaqket-query | jaqket-docs | mrtydi-query | mrtydi-docs | jagovfaqs_22k-query | jagovfaqs_22k-docs | nlp_journal_title_abs-query | nlp_journal_title_abs-docs | nlp_journal_title_intro-query | nlp_journal_title_intro-docs | nlp_journal_abs_intro-query | nlp_journal_abs_intro-docs |
+|-----------------------------------------|--------------|-------------|--------------|-------------|---------------------|--------------------|-----------------------------|----------------------------|------------------------------|-----------------------------|-----------------------------|----------------------------|
+| v1 | 23.3 | 146.2 | 13.8 | 89.3 | 27.9 | 73.2 | 19 | 75.2 | 19 | 95.7 | 75.3 | 95.7 |
+| v1-mmarco-only | 38.9 | 231.8 | 20.5 | 100.4 | 43.4 | 97.9 | 26.4 | 126.9 | 26.4 | 182 | 127.2 | 182 |
+| v1_5 | 36.7 | 268.7 | 22.8 | 237.6 | 47.9 | 237.3 | 34.9 | 225.6 | 34.9 | 235.2 | 224.5 | 235.2 |
+| v2 | 29.8 | 379.6 | 19.4 | 176.4 | 42 | 189.8 | 29 | 235.8 | 29 | 304.9 | 233.8 | 304.9 |
+
+
+
+# Training datasets

 - [hpprc/emb](https://huggingface.co/datasets/hpprc/emb)
 - auto-wiki-qa
@@ -169,4 +186,8 @@ japanese-splade-base-v2: [evaluating JMTEB with sparse vectors
 - mqa
 - msmarco-ja
 - [hotchpotch/mmarco-hard-negatives-reranker-score](https://huggingface.co/datasets/hotchpotch/mmarco-hard-negatives-reranker-score)
-- english
+- english
+
+# License
+
+MIT
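
The script name JMTEB_L0.py suggests that the sparsity table reports L0 norms, that is, the number of non-zero vocabulary dimensions in each encoded query or document, averaged per dataset. That reading is an assumption; the README does not state it. Under that assumption, a minimal sketch of the measurement on toy vectors might look like the following. The helper names `l0_norm` and `mean_l0` are hypothetical and are not taken from the linked script.

```python
from statistics import mean
from typing import Sequence


def l0_norm(vector: Sequence[float], eps: float = 0.0) -> int:
    """Count the dimensions whose absolute weight exceeds eps."""
    return sum(1 for w in vector if abs(w) > eps)


def mean_l0(vectors: Sequence[Sequence[float]], eps: float = 0.0) -> float:
    """Average L0 norm over a collection of sparse vectors."""
    return mean(l0_norm(v, eps) for v in vectors)


# Toy vectors standing in for sparse encoder outputs over a 6-term vocabulary
# (illustrative values only, not real model output).
toy_queries = [
    [0.0, 1.2, 0.0, 0.0, 0.3, 0.0],
    [0.5, 0.0, 0.0, 0.0, 0.0, 0.0],
]
toy_docs = [
    [0.1, 0.9, 0.0, 0.4, 0.3, 0.0],
    [0.0, 0.2, 0.7, 0.0, 0.6, 0.8],
]

print(f"query mean L0: {mean_l0(toy_queries):.1f}")
print(f"docs mean L0: {mean_l0(toy_docs):.1f}")
```

Under this reading, lower values mean sparser vectors, which is consistent with v1 having the smallest value in every column of the table.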