Add first-party Sentence Transformers support + README snippet
Hello!
Foreword
First of all, congratulations on this release! e5-mistral-7b-instruct is a fascinating model, and this seems like a solid continuation of it. I'd love to learn more about the training details in the future.
Pull Request overview
- Add first-party Sentence Transformers support
- Add a README snippet showing how to use SFR-Embedding-Mistral with Sentence Transformers
- Add "add_eos_token": true to the tokenizer configuration, so users no longer need to set tokenizer.add_eos_token = True in their own code (a quick sanity-check sketch follows this list)
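As a quick sanity check of that tokenizer change, a minimal sketch like the following (relying only on standard transformers behavior, plus the refs/pr/1 revision used further down) should show the EOS token being appended automatically:

from transformers import AutoTokenizer

# With "add_eos_token": true in tokenizer_config.json, the EOS token is
# appended automatically; no manual tokenizer.add_eos_token = True is needed.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/SFR-Embedding-Mistral", revision="refs/pr/1")

input_ids = tokenizer("Hello world")["input_ids"]
print(input_ids[-1] == tokenizer.eos_token_id)  # True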
Details
In this PR, I'm proposing to add Sentence Transformers support; many users produce their embeddings via ST, and this would make your model conveniently accessible to them. The configuration files specify that the model requires last-token pooling, and I've set the max_seq_length in ST to 4096 by default (it can be overridden with model.max_seq_length = ...), adopting the recommended maximum sequence length from the e5-mistral model.
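For reference, checking and overriding that default would look like this (a minimal sketch; the 2048 value is only an illustrative choice):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Salesforce/SFR-Embedding-Mistral", revision="refs/pr/1")
print(model.max_seq_length)  # 4096, the default configured in this PR

# Override the maximum sequence length, e.g. to reduce memory usage
model.max_seq_length = 2048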
Feel free to let me know if you have any questions!
- Tom Aarsen
Excited for this to merge! Wondering what the ETA is on this PR? Also could you please enable the ability to get the raw embedding in addition to the cosine similarity scores?
Thanks!
cc: @memray @yliu279 @yeliu918
I'd like to ask you to have a look at this PR!
Anyone can already use this model like so:
from sentence_transformers import SentenceTransformer, util
model = SentenceTransformer("Salesforce/SFR-Embedding-Mistral", revision="refs/pr/1")
def get_detailed_instruct(task_description: str, query: str) -> str:
return f'Instruct: {task_description}\nQuery: {query}'
# Each query must come with a one-sentence instruction that describes the task
task = 'Given a web search query, retrieve relevant passages that answer the query'
queries = [
get_detailed_instruct(task, 'How to bake a chocolate cake'),
get_detailed_instruct(task, 'Symptoms of the flu')
]
# No need to add instruction for retrieval documents
passages = [
"To bake a delicious chocolate cake, you'll need the following ingredients: all-purpose flour, sugar, cocoa powder, baking powder, baking soda, salt, eggs, milk, vegetable oil, and vanilla extract. Start by preheating your oven to 350°F (175°C). In a mixing bowl, combine the dry ingredients (flour, sugar, cocoa powder, baking powder, baking soda, and salt). In a separate bowl, whisk together the wet ingredients (eggs, milk, vegetable oil, and vanilla extract). Gradually add the wet mixture to the dry ingredients, stirring until well combined. Pour the batter into a greased cake pan and bake for 30-35 minutes. Let it cool before frosting with your favorite chocolate frosting. Enjoy your homemade chocolate cake!",
"The flu, or influenza, is an illness caused by influenza viruses. Common symptoms of the flu include a high fever, chills, cough, sore throat, runny or stuffy nose, body aches, headache, fatigue, and sometimes nausea and vomiting. These symptoms can come on suddenly and are usually more severe than the common cold. It's important to get plenty of rest, stay hydrated, and consult a healthcare professional if you suspect you have the flu. In some cases, antiviral medications can help alleviate symptoms and reduce the duration of the illness."
]
embeddings = model.encode(queries + passages)
scores = util.cos_sim(embeddings[:2], embeddings[2:]) * 100
print(scores.tolist())
# [[86.71537780761719, 36.645721435546875], [35.00497055053711, 82.07388305664062]]
After merging, users won't have to include revision="refs/pr/1" anymore.
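That is, loading will become simply:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Salesforce/SFR-Embedding-Mistral")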
- Tom Aarsen
Thanks for the above response! Yes, I have been using this code on a local GPU for the last few weeks and it works great!
I just want to be able to run this on a HuggingFace Inference Endpoint for easy deployment and management (like with gtr-t5-xxl and all-mpnet-base-v2). It's nice to have a lot of the deployment steps abstracted away. Thanks again!