metadata
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
embaas/sentence-transformers-e5-large-v2
This is a the sentence-transformers version of the intfloat/e5-large-v2 model: It maps sentences & paragraphs to a 1024 dimensional dense vector space and can be used for tasks like clustering or semantic search.
Usage (Sentence-Transformers)
Using this model becomes easy when you have sentence-transformers installed:
pip install -U sentence-transformers
Then you can use the model like this:
from sentence_transformers import SentenceTransformer
sentences = ["This is an example sentence", "Each sentence is converted"]
model = SentenceTransformer('embaas/sentence-transformers-e5-large-v2')
embeddings = model.encode(sentences)
print(embeddings)
Using with API
You can use the embaas API to encode your input. Get your free API key from embaas.io
import requests
url = "https://api.embaas.io/v1/embeddings/"
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer ${YOUR_API_KEY}"
}
data = {
"texts": ["This is an example sentence.", "Here is another sentence."],
"instruction": "query"
"model": "e5-large-v2"
}
response = requests.post(url, json=data, headers=headers)
Evaluation Results
Find the results of the e5 at the MTEB leaderboard
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False})
(2): Normalize()
)