How did you evaluate BEIR?

by freethenation - opened Aug 16, 2024

I am trying to reproduce your BEIR results. Some results match exactly, but others are lower than the numbers in your model card. What tool did you use to evaluate BEIR? I am currently using pyserini and the beir package.

zhichao-geng

opensearch-project org Aug 19, 2024

Hi @freethenation, we use OpenSearch as the evaluation engine. The maximum input length is 512 tokens. Please note that for some BEIR datasets, we need to filter the query ID out of the search results, because in these datasets the queries and documents come from the same space.
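
For anyone reproducing this with a different toolchain, here is a minimal sketch of the filtering step described above. It is not the maintainers' code: the function name `drop_self_matches` and the sample IDs and scores are made up for illustration, and it assumes search results are held as a nested dictionary mapping each query ID to a `{doc_id: score}` dictionary.

```python
def drop_self_matches(results):
    """Remove, for each query, any hit whose document ID equals the query ID.

    `results` maps query_id -> {doc_id: score}. A new dictionary is
    returned and the input is left unchanged.
    """
    return {
        qid: {doc_id: score for doc_id, score in hits.items() if doc_id != qid}
        for qid, hits in results.items()
    }


# Hypothetical results: each query also appears in the corpus under its own ID,
# so the engine returns the query itself as a hit.
results = {
    "q1": {"q1": 12.5, "d7": 9.1, "d3": 4.2},
    "q2": {"d5": 8.0, "q2": 7.7},
}

filtered = drop_self_matches(results)
print(filtered)
```

Run the filter before computing metrics, and only on the datasets where queries and documents share an ID space. Applying it elsewhere is harmless as long as query IDs never collide with document IDs.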

zhichao-geng

opensearch-project org Apr 18

Evaluation code is available here: https://github.com/zhichao-aws/opensearch-sparse-model-tuning-sample/tree/main
