yano0 committed on
Commit eb0e691 · verified · 1 Parent(s): 9d12ba8

Update README.md

  1. README.md +10 -2
README.md CHANGED
@@ -8,7 +8,7 @@ pipeline_tag: sentence-similarity
  These are LoRA adaption weights for [mT5](https://huggingface.co/google/mt5-xxl) encoder.
 
  ## Multilingual Sentence T5
- This model is a multilingual extension of Sentence T5 and was created using the [mT5](https://huggingface.co/google/mt5-xxl) encoder. It is proposed in this [paper](hoge).
+ This model is a multilingual extension of Sentence T5 and was created using the [mT5](https://huggingface.co/google/mt5-xxl) encoder. It is proposed in this [paper](https://arxiv.org/abs/2403.17528).
  It is an encoder for sentence embedding, and its performance has been verified in cross-lingual STS and sentence retrieval.
 
  ### Traning Data
@@ -51,4 +51,12 @@ last_hidden_state = outputs.last_hidden_state
  last_hidden_state[inputs.attention_mask == 0, :] = 0
  sent_len = inputs.attention_mask.sum(dim=1, keepdim=True)
  sent_emb = last_hidden_state.sum(dim=1) / sent_len
- ```
+ ```
+
+ ## BenchMarks
+ Please check the paper for details.
+
+ | | Tatoeba-14 | Tatoeba-36 | BUCC | XSTS(ar-ar)|XSTS(ar-en)|XSTS(es-es)|XSTS(es-en)|XSTS(tr-en)|
+ | ----- | :----------: | :----------: | :----: | :---:|:----:|:----:|:----:|:----:|
+ | m-ST5 | 96.3 | 94.7 | 97.6 | 76.2|78.6|84.4|76.2|75.1|
+ | LaBSE | 95.3 | 95.0 | 93.5 | 69.1|74.5|80.8|65.5|72.0|
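
The context lines of the second hunk (`last_hidden_state`, `sent_len`, `sent_emb`) compute a masked mean over token embeddings: padded positions are zeroed, and the sum is divided by the number of real tokens. The hunk shows only the tail of the README's usage example, so the following is a minimal, framework-free sketch of that pooling step only, not the README's code. It assumes nested Python lists in place of tensors and at least one unmasked token per sentence. The function name `masked_mean_pool` and the toy values are hypothetical.

```python
from typing import List


def masked_mean_pool(
    hidden: List[List[List[float]]],  # [batch][tokens][dim], stand-in for last_hidden_state
    mask: List[List[int]],            # [batch][tokens], stand-in for attention_mask
) -> List[List[float]]:
    """Average token vectors per sentence, skipping positions where mask == 0."""
    pooled = []
    for token_vecs, token_mask in zip(hidden, mask):
        dim = len(token_vecs[0])
        total = [0.0] * dim
        # Number of real (unmasked) tokens, the analogue of sent_len.
        # Assumption: count > 0 for every sentence.
        count = sum(token_mask)
        for vec, keep in zip(token_vecs, token_mask):
            if keep:  # padded positions contribute nothing to the sum
                for i, value in enumerate(vec):
                    total[i] += value
        pooled.append([t / count for t in total])
    return pooled


# Toy batch: two sentences, three token slots, two dimensions.
hidden = [
    [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]],  # third slot is padding
    [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]],
]
mask = [[1, 1, 0], [1, 1, 1]]
sent_emb = masked_mean_pool(hidden, mask)
```

Dividing by the count of unmasked tokens rather than the padded length keeps the embedding of a short sentence from being scaled down by its padding.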