Add Sentence Transformers support

#2
by tomaarsen HF staff - opened

Hello!

Preface

First of all, congratulations on this release! I will be updating the MTEB leaderboard shortly, which should place this model above bge-large-en-v1.5 as you mentioned in your paper as well.
I'm looking forward to delving deeper into your paper soon :)

Pull Request overview

  • Add Sentence Transformers support

Details

Adding support for Sentence Transformers is fairly simple with this model: it's mostly configuring the Pooling & adding a Normalization module, after which the snippet from the README should work well. This support should also allow this model to be more easily used in third-party implementations like LangChain.
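To illustrate what those two modules contribute, here is a minimal standard-library sketch of the computation a pooling step followed by a normalization step performs. It is not the Sentence Transformers implementation. The function names `pool` and `normalize` are hypothetical, and the pooling modes shown (`"mean"` and `"lasttoken"`) are assumptions for illustration only, since this description does not state which mode the PR configures.

```python
import math


def pool(token_embeddings, attention_mask, mode="mean"):
    """Collapse per-token vectors into one sentence vector.

    The mode names are illustrative assumptions, not taken from this PR.
    """
    # Keep only the vectors of real (non-padding) tokens.
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask excludes every token")
    if mode == "lasttoken":
        # Use the vector of the final non-padding token.
        return list(kept[-1])
    if mode == "mean":
        # Average each dimension over the kept tokens.
        return [sum(column) / len(kept) for column in zip(*kept)]
    raise ValueError(f"unknown pooling mode: {mode}")


def normalize(vector):
    """Scale a vector to unit L2 norm; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


# Three token vectors; the last one is padding and is masked out.
tokens = [[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

sentence_embedding = normalize(pool(tokens, mask, mode="mean"))
```

With unit-length embeddings, the dot product of two vectors equals their cosine similarity, which is why a normalization step is commonly placed last in an embedding pipeline.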

Sidenote

In the near future I will be updating Sentence Transformers to add prompt templating via configuration. Then it will be possible to add the prompts directly in the config_sentence_transformers.json file, e.g.:

{
    ...
    "prompts": {
        "web_search_query": "Instruct: Given a web search query, retrieve relevant passages that answer the query\nQuery: {}",
        "...",
    },
    "default_prompt_name": null,
}

Users can then just use model.encode(my_queries, prompt_name="web_search_query"). Once that update lands, I will open a PR for this model to add some prompts to the config.
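As a rough sketch of the proposed behaviour, the snippet below shows how a `{}` template from such a config could be applied to each query before encoding. It uses only the standard library and does not call Sentence Transformers. The helper `apply_prompt` is hypothetical, the config contains only the one prompt shown above, and the released feature may work differently from this proposal.

```python
import json

# A trimmed copy of the proposed config, limited to the single prompt shown above.
# The raw string keeps "\n" as a JSON escape, which json.loads turns into a newline.
CONFIG_TEXT = r"""
{
    "prompts": {
        "web_search_query": "Instruct: Given a web search query, retrieve relevant passages that answer the query\nQuery: {}"
    },
    "default_prompt_name": null
}
"""


def apply_prompt(config, texts, prompt_name=None):
    """Hypothetical helper: fill the named template with each input text."""
    name = prompt_name if prompt_name is not None else config.get("default_prompt_name")
    if name is None:
        # No prompt requested and no default configured: leave the texts untouched.
        return list(texts)
    template = config["prompts"][name]
    # Substitute the text for the first "{}" placeholder in the template.
    return [template.replace("{}", text, 1) for text in texts]


config = json.loads(CONFIG_TEXT)
my_queries = ["how to bake bread"]
prompted = apply_prompt(config, my_queries, prompt_name="web_search_query")
```

Keeping the prompts in the config means callers only pass a short name, and the exact instruction wording stays consistent with what the model was trained on.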

  • Tom Aarsen
tomaarsen changed pull request status to open

That's amazing! Thanks for your contribution!

intfloat changed pull request status to merged