---
license: mit
base_model: intfloat/multilingual-e5-base
datasets:
- E-FAQ
language:
- pt
- es
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@10
- cosine_recall@1
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@1
- cosine_map@10
- dot_accuracy@1
- dot_accuracy@10
- dot_precision@1
- dot_precision@10
- dot_recall@1
- dot_recall@10
- dot_ndcg@10
- dot_mrr@10
- dot_map@1
- dot_map@10
- euclidean_accuracy@1
- euclidean_accuracy@10
- euclidean_precision@1
- euclidean_precision@10
- euclidean_recall@1
- euclidean_recall@10
- euclidean_ndcg@10
- euclidean_mrr@10
- euclidean_map@1
- euclidean_map@10
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:119448
- loss:CompositionLoss
widget:
- source_sentence: Tem mandril com outras medidas
  sentences:
  - Bom dia vem tudo no kit conforme a foto?maquina de solda ,esquadro,máscara, 2 rolos de arame é isso?
  - Você tem da magneti Marelli código 40421702 PARATI BOLA G2 96 MONOPONTO AP 1.6 GASOLINA
  - 'Hola buenas. Es compatible para NEW Mitsubishi Montero cr 4x4 3.2 N. Chasis: JMBMNV88W8J000791'
- source_sentence: Hola tienes disponible de mono talla 12 a 18 meses?
  sentences:
  - Hola buen dia! Necesito una malla sombra como la de esta publicación pero de 4 x 3.40 mts, en cuanto sale?
  - Serve na Duster automática 2.0
  - Lo que pasa es que no me deja agregar más de 1
- source_sentence: Viene con kit de instalacion y tornillería?
  sentences:
  - Bom dia. Tem como fixar no chão. Na grama?
  - La base para conectar ese foco la tendrá???
  - Pod ser usado para instalação de farol d milha ?
- source_sentence: corsa 2004 1.8 con ultimos 8 digitos NIV 4C210262
  sentences:
  - Le queda a un Derby 2007 1.8?
  - Serve no Corsa clacic 97 sedã
  - Boa tarde vc so tem.um ?
- source_sentence: Buenos días, es compatible con las apps bancarias?
  sentences:
  - Hola....el bulon de q diámetro es?
  - Se le puede quitar el microfono?
  - Serve para cachorrinha que está no cio?
model-index:
- name: SentenceTransformer based on intfloat/multilingual-e5-base
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: E-FAQ
      type: text-retrieval
    metrics:
    - type: cosine_accuracy@1
      value: 0.7941531042796866
      name: Cosine Accuracy@1
    - type: cosine_accuracy@10
      value: 0.9483875828812538
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.7941531042796866
      name: Cosine Precision@1
    - type: cosine_precision@10
      value: 0.17701928872814954
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.5563725301557428
      name: Cosine Recall@1
    - type: cosine_recall@10
      value: 0.9093050609545924
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8420320427198602
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8476323229713864
      name: Cosine Mrr@10
    - type: cosine_map@1
      value: 0.7941531042796866
      name: Cosine Map@1
    - type: cosine_map@10
      value: 0.8004156235676744
      name: Cosine Map@10
    - type: dot_accuracy@1
      value: 0.7941531042796866
      name: Dot Accuracy@1
    - type: dot_accuracy@10
      value: 0.9483875828812538
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.7941531042796866
      name: Dot Precision@1
    - type: dot_precision@10
      value: 0.17701928872814954
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.5563725301557428
      name: Dot Recall@1
    - type: dot_recall@10
      value: 0.9093050609545924
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.8420320427198602
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.8476323229713864
      name: Dot Mrr@10
    - type: dot_map@1
      value: 0.7941531042796866
      name: Dot Map@1
    - type: dot_map@10
      value: 0.8004156235676744
      name: Dot Map@10
    - type: euclidean_accuracy@1
      value: 0.7941531042796866
      name: Euclidean Accuracy@1
    - type: euclidean_accuracy@10
      value: 0.9483875828812538
      name: Euclidean Accuracy@10
    - type: euclidean_precision@1
      value: 0.7941531042796866
      name: Euclidean Precision@1
    - type: euclidean_precision@10
      value: 0.17701928872814954
      name: Euclidean Precision@10
    - type: euclidean_recall@1
      value: 0.5563725301557428
      name: Euclidean Recall@1
    - type: euclidean_recall@10
      value: 0.9093050609545924
      name: Euclidean Recall@10
    - type: euclidean_ndcg@10
      value: 0.8420320427198602
      name: Euclidean Ndcg@10
    - type: euclidean_mrr@10
      value: 0.8476323229713864
      name: Euclidean Mrr@10
    - type: euclidean_map@1
      value: 0.7941531042796866
      name: Euclidean Map@1
    - type: euclidean_map@10
      value: 0.8004156235676744
      name: Euclidean Map@10
---

# Multilingual E5 Base Self-Distilled on E-FAQ

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [intfloat/multilingual-e5-base](https://huggingface.co/intfloat/multilingual-e5-base). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

### Framework Versions

- Python: 3.12.4
- Sentence Transformers: 3.0.1
- Transformers: 4.42.4
- PyTorch: 2.3.1+cu121
- Accelerate: 0.32.1
- Datasets: 2.20.0
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
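
## Note on Pooling and Normalization

The architecture listed above ends with mean pooling (`pooling_mode_mean_tokens: True`) followed by `Normalize()`. The sketch below illustrates those two steps on made-up 3-dimensional vectors, using only the Python standard library. It is a simplified illustration, not the library's implementation: the function names, the toy vectors, and the candidate labels are all assumptions introduced here. It also shows that, for unit-length embeddings, cosine similarity, dot product, and Euclidean distance produce the same ranking, which is consistent with the identical cosine, dot, and Euclidean scores reported in the metadata.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i, x in enumerate(vec):
                total[i] += x
    return [x / max(count, 1) for x in total]


def l2_normalize(vec):
    """Scale a vector to unit length (what a normalization layer does)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy "token embeddings" (hypothetical; the real model uses 768 dimensions).
query_tokens = [[1.0, 0.0, 2.0], [3.0, 2.0, 0.0], [9.0, 9.0, 9.0]]
query_mask = [1, 1, 0]  # the last token is padding and is ignored

pooled = mean_pool(query_tokens, query_mask)
query = l2_normalize(pooled)

# Hypothetical candidate embeddings, already pooled, then normalized.
candidates = {
    "a": l2_normalize([2.0, 1.0, 1.0]),
    "b": l2_normalize([0.0, 1.0, 0.0]),
    "c": l2_normalize([-1.0, 0.5, 3.0]),
}

# Higher is better for cosine and dot; lower is better for Euclidean distance.
rank_cos = sorted(candidates, key=lambda k: cosine(query, candidates[k]), reverse=True)
rank_dot = sorted(candidates, key=lambda k: dot(query, candidates[k]), reverse=True)
rank_euc = sorted(candidates, key=lambda k: euclidean(query, candidates[k]))

print(rank_cos, rank_dot, rank_euc)
```

All three rankings agree because, for unit vectors `u` and `v`, the dot product equals the cosine similarity and the squared Euclidean distance equals `2 - 2 * dot(u, v)`, so sorting by any of the three gives the same order.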