Reranker
#30, opened by Totole
Hi, thanks a lot for your work!
Two questions:
- Is model.compute_score(sentence_pairs, max_passage_length, weights_for_different_modes) just computing a score (e.g. cosine similarity) from the embeddings (dense, sparse, colbert) produced by the model? In other words, is it cross-encoding or bi-encoding?
- Why does the max_length_token of this model seem to be 514 rather than 8000?
Thanks for your interest in our work!
- bge-m3 is a bi-encoder model. Its compute_score function combines the scores from the different embedding modes (dense, sparse, colbert).
- The max length is 8192. You can see the config: https://huggingface.co/BAAI/bge-m3/blob/main/tokenizer_config.json
Besides, we have released some new rerankers (cross-encoders): https://huggingface.co/BAAI/bge-reranker-v2-m3#model-list. Feel free to use them and share your feedback.
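To make the bi-encoding point above concrete, here is a rough sketch of how three per-mode scores could be computed from embeddings that were produced separately for the query and the passage, and then combined with weights. This is not the library's actual code: the function names, the default weights, and the normalization by the weight sum are all assumptions for illustration only.

```python
from math import sqrt


def dense_score(q, p):
    """Cosine similarity between two dense vectors (hypothetical helper)."""
    dot = sum(a * b for a, b in zip(q, p))
    norm = sqrt(sum(a * a for a in q)) * sqrt(sum(b * b for b in p))
    return dot / norm if norm else 0.0


def sparse_score(q_weights, p_weights):
    """Lexical match: sum of weight products over tokens present on both sides."""
    return sum(w * p_weights[t] for t, w in q_weights.items() if t in p_weights)


def colbert_score(q_vecs, p_vecs):
    """Late interaction: best passage-token match per query token, averaged."""
    best = [
        max(sum(a * b for a, b in zip(qv, pv)) for pv in p_vecs)
        for qv in q_vecs
    ]
    return sum(best) / len(best)


def combine(dense, sparse, colbert, weights=(0.4, 0.2, 0.4)):
    """Weighted average of the three scores; weights are illustrative, not defaults of any library."""
    total = sum(weights)
    return (weights[0] * dense + weights[1] * sparse + weights[2] * colbert) / total


if __name__ == "__main__":
    d = dense_score([1.0, 0.0], [1.0, 0.0])
    s = sparse_score({"a": 0.5, "b": 1.0}, {"b": 0.5, "c": 2.0})
    c = colbert_score([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.5, 0.5]])
    print(combine(d, s, c, weights=(1, 1, 2)))
```

The key property is that the query and the passage never pass through the model together: each side is embedded on its own, and the score is only arithmetic on those embeddings. A cross-encoder reranker instead feeds the pair through the model jointly and outputs the relevance score directly.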
Hello, I need more detailed information about the error.
- Can you run the code here successfully?
- Maybe you can paste your full code here, and then I will test it to see if this error can be reproduced.
For a very weird reason, it works on Colab but not on Azure ML...