---

license: mit
base_model: intfloat/multilingual-e5-base
datasets:
 - E-FAQ
language:
 - pt
 - es
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@10
- cosine_recall@1
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@1
- cosine_map@10
- dot_accuracy@1
- dot_accuracy@10
- dot_precision@1
- dot_precision@10
- dot_recall@1
- dot_recall@10
- dot_ndcg@10
- dot_mrr@10
- dot_map@1
- dot_map@10
- euclidean_accuracy@1
- euclidean_accuracy@10
- euclidean_precision@1
- euclidean_precision@10
- euclidean_recall@1
- euclidean_recall@10
- euclidean_ndcg@10
- euclidean_mrr@10
- euclidean_map@1
- euclidean_map@10
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:119448
- loss:CompositionLoss
widget:
- source_sentence: Tem mandril com outras medidas
  sentences:
  - Bom dia vem tudo no kit conforme a foto?maquina de solda ,esquadro,máscara, 2
    rolos de arame é isso?
  - Você tem da magneti Marelli código 40421702 PARATI BOLA G2 96 MONOPONTO AP 1.6
    GASOLINA
  - 'Hola buenas. Es compatible para NEW Mitsubishi Montero cr 4x4 3.2 N. Chasis:

    JMBMNV88W8J000791'
- source_sentence: Hola tienes disponible de mono talla 12 a 18 meses?
  sentences:
  - Hola buen dia! Necesito una malla sombra como la de esta publicación pero de 4
    x 3.40 mts, en cuanto sale?
  - Serve na Duster automática 2.0
  - Lo que pasa es que no me deja agregar más de 1
- source_sentence: Viene con kit de instalacion y tornillería?
  sentences:
  - Bom dia. Tem como fixar no chão. Na grama?
  - La base para conectar ese foco la tendrá???
  - Pod ser usado para instalação de farol d milha ?
- source_sentence: corsa 2004 1.8 con ultimos 8 digitos NIV 4C210262
  sentences:
  - Le queda a un Derby 2007 1.8?
  - Serve no Corsa clacic 97 sedã
  - Boa tarde vc so tem.um ?
- source_sentence: Buenos días, es compatible con las apps bancarias?
  sentences:
  - Hola....el bulon de q diámetro es?
  - Se le puede quitar el microfono?
  - Serve para cachorrinha que está no cio?
model-index:
- name: SentenceTransformer based on intfloat/multilingual-e5-base
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: E-FAQ
      type: text-retrieval
    metrics:
    - type: cosine_accuracy@1
      value: 0.7941531042796866
      name: Cosine Accuracy@1
    - type: cosine_accuracy@10
      value: 0.9483875828812538
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.7941531042796866
      name: Cosine Precision@1
    - type: cosine_precision@10
      value: 0.17701928872814954
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.5563725301557428
      name: Cosine Recall@1
    - type: cosine_recall@10
      value: 0.9093050609545924
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8420320427198602
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8476323229713864
      name: Cosine Mrr@10
    - type: cosine_map@1
      value: 0.7941531042796866
      name: Cosine Map@1
    - type: cosine_map@10
      value: 0.8004156235676744
      name: Cosine Map@10
    - type: dot_accuracy@1
      value: 0.7941531042796866
      name: Dot Accuracy@1
    - type: dot_accuracy@10
      value: 0.9483875828812538
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.7941531042796866
      name: Dot Precision@1
    - type: dot_precision@10
      value: 0.17701928872814954
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.5563725301557428
      name: Dot Recall@1
    - type: dot_recall@10
      value: 0.9093050609545924
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.8420320427198602
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.8476323229713864
      name: Dot Mrr@10
    - type: dot_map@1
      value: 0.7941531042796866
      name: Dot Map@1
    - type: dot_map@10
      value: 0.8004156235676744
      name: Dot Map@10
    - type: euclidean_accuracy@1
      value: 0.7941531042796866
      name: Euclidean Accuracy@1
    - type: euclidean_accuracy@10
      value: 0.9483875828812538
      name: Euclidean Accuracy@10
    - type: euclidean_precision@1
      value: 0.7941531042796866
      name: Euclidean Precision@1
    - type: euclidean_precision@10
      value: 0.17701928872814954
      name: Euclidean Precision@10
    - type: euclidean_recall@1
      value: 0.5563725301557428
      name: Euclidean Recall@1
    - type: euclidean_recall@10
      value: 0.9093050609545924
      name: Euclidean Recall@10
    - type: euclidean_ndcg@10
      value: 0.8420320427198602
      name: Euclidean Ndcg@10
    - type: euclidean_mrr@10
      value: 0.8476323229713864
      name: Euclidean Mrr@10
    - type: euclidean_map@1
      value: 0.7941531042796866
      name: Euclidean Map@1
    - type: euclidean_map@10
      value: 0.8004156235676744
      name: Euclidean Map@10
---


# Multilingual E5 Base Self-Distilled on E-FAQ

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [intfloat/multilingual-e5-base](https://huggingface.co/intfloat/multilingual-e5-base). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
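As a rough illustration of the semantic-search use case, the sketch below ranks a small corpus against a query by cosine similarity. It is a minimal pure-Python sketch, not the library's API: the 3-dimensional toy vectors stand in for the 768-dimensional embeddings this model would produce, and the helper names `cosine` and `rank_by_similarity` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_by_similarity(query_vec, corpus_vecs, top_k=3):
    """Return (corpus_index, score) pairs, most similar first."""
    scored = [(idx, cosine(query_vec, vec)) for idx, vec in enumerate(corpus_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy vectors (assumption): real embeddings would come from the model's encoder.
query = [1.0, 0.0, 0.0]
corpus = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # nearly parallel to the query
    [0.5, 0.5, 0.5],  # partially aligned
]

hits = rank_by_similarity(query, corpus, top_k=2)
for idx, score in hits:
    print(f"corpus[{idx}] score={score:.4f}")
```

In practice the query and corpus vectors would be produced by encoding text with the model, and the ranking step would stay the same.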

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
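The pooling and normalization stages listed above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the library's implementation: the real modules operate on batched tensors, and the helper names (`mean_pool`, `l2_normalize`, `dot`, `euclidean`) and the 2-dimensional toy token vectors are hypothetical. It shows mean pooling over non-padding tokens followed by L2 normalization, and checks that for unit vectors the squared Euclidean distance equals `2 - 2 * dot`, which is consistent with the cosine, dot, and Euclidean metrics in the metadata reporting identical values.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i in range(dim):
                total[i] += vec[i]
    return [x / max(count, 1) for x in total]

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Toy token vectors (assumption): two real tokens and one padding token.
tokens = [[3.0, 0.0], [1.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)   # padding row is ignored
embedding = l2_normalize(pooled)   # unit-length sentence vector
other = l2_normalize([1.0, 0.0])

# For unit vectors, dot product equals cosine similarity, and
# squared Euclidean distance is a decreasing function of it.
print(dot(embedding, other), euclidean(embedding, other) ** 2)
```

Because the three scores are monotonically related on unit vectors, ranking candidates by any of them gives the same order.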

### Framework Versions
- Python: 3.12.4
- Sentence Transformers: 3.0.1
- Transformers: 4.42.4
- PyTorch: 2.3.1+cu121
- Accelerate: 0.32.1
- Datasets: 2.20.0
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```