juampahc/bge-m3-m2v-256

Sentence Similarity

sentence-transformers

feature-extraction


juampahc committed 22 days ago

Commit cc84c30 (1 parent: f32789f)

Update README.md

Files changed (1)

README.md +12 -2

@@ -1,13 +1,23 @@
 ---
 library_name: sentence-transformers
 pipeline_tag: sentence-similarity
 tags:
 - sentence-transformers
 - sentence-similarity
 - feature-extraction
 ---
-# SentenceTransformer
 This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps sentences & paragraphs to a 256-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
@@ -16,7 +26,7 @@ This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps
 ### Model Description
 - **Model Type:** Sentence Transformer
 <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
-- **Maximum Sequence Length:** inf tokens
 - **Output Dimensionality:** 256 dimensions
 - **Similarity Function:** Cosine Similarity
 <!-- - **Training Dataset:** Unknown -->

 ---
+license: mit
+base_model:
+  - BAAI/bge-m3
 library_name: sentence-transformers
 pipeline_tag: sentence-similarity
 tags:
 - sentence-transformers
 - sentence-similarity
 - feature-extraction
+- model2vec
+- multilingual
 ---
+For more details, please refer to the original GitHub repo: https://github.com/FlagOpen/FlagEmbedding
+# BGE-M3 ([paper](https://arxiv.org/pdf/2402.03216.pdf), [code](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3))
+This repo contains the original `BAAI/bge-m3` distilled to a Static Embedding module using [Model2Vec](https://github.com/MinishLab/model2vec/) and exported with SentenceTransformer.
+## SentenceTransformer
 This is a [sentence-transformers](https://www.SBERT.net) model trained. It maps sentences & paragraphs to a 256-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 ### Model Description
 - **Model Type:** Sentence Transformer
 <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
+- **Maximum Sequence Length:** 8194 tokens
 - **Output Dimensionality:** 256 dimensions
 - **Similarity Function:** Cosine Similarity
 <!-- - **Training Dataset:** Unknown -->
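
The updated README describes a static embedding model that maps text to 256-dimensional vectors compared with cosine similarity. The sketch below is not part of the commit; it only illustrates how such a model might be used. It assumes the repo id `juampahc/bge-m3-m2v-256` taken from the page header, and that the `sentence-transformers` package is installed. The helpers `mean_pool`, `cosine_similarity`, `embed_sentences`, and `demo` are hypothetical names introduced here. `mean_pool` shows the averaging step that static embeddings of this kind typically rely on; the real model performs its pooling internally.

```python
import math


def mean_pool(token_vectors):
    """Average per-token vectors into a single sentence vector.

    Illustrative only: a static embedding model looks up one fixed vector
    per token and averages them; the actual model does this internally.
    """
    if not token_vectors:
        raise ValueError("need at least one token vector")
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def embed_sentences(sentences, model_id="juampahc/bge-m3-m2v-256"):
    """Encode sentences with the model from the Hub.

    Assumption: the repo id above is the one shown in the page header.
    The import is local so the helpers above work without the package.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    return model.encode(sentences)


def demo():
    """Download the model and compare two sentences (needs network access)."""
    sentences = [
        "The weather is lovely today.",
        "It is sunny outside.",
    ]
    embeddings = embed_sentences(sentences)
    score = cosine_similarity(embeddings[0].tolist(), embeddings[1].tolist())
    print(f"embedding shape: {embeddings.shape}")
    print(f"cosine similarity: {score:.4f}")
```

Calling `demo()` downloads the model on first use, so it is kept out of module import. The pure-Python `cosine_similarity` avoids depending on any particular `sentence-transformers` version for the comparison step.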