---
license: apache-2.0
pipeline_tag: text-ranking
library_name: lightning-ir
base_model:
- google-bert/bert-base-uncased
tags:
- bi-encoder
---

# Lightning IR BERT Bi-Encoder

This is a BERT-based bi-encoder[^1] fine-tuned using [Lightning IR](https://github.com/webis-de/lightning-ir).

See the [Lightning IR Model Zoo](https://webis-de.github.io/lightning-ir/models.html) for a comparison with other models.

## Reproduction

To reproduce the model training, install Lightning IR and run the following command using the [fine-tune.yaml](./configs/fine-tune.yaml) configuration file:

```bash
lightning-ir fit --config fine-tune.yaml
```

To index MS MARCO passages, use the following command and the [index.yaml](./configs/index.yaml) configuration file:

```bash
lightning-ir index --config index.yaml
```

After indexing, to evaluate the model on TREC Deep Learning 2019 and 2020, use the following command and the [search.yaml](./configs/search.yaml) configuration file:

```bash
lightning-ir search --config search.yaml
```
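
The three commands above depend on each other (search needs the index, and the index needs the fine-tuned model), so it can be convenient to chain them. The following is a minimal sketch, not part of Lightning IR: `CONFIG_DIR`, `DRY_RUN`, `run_stage`, and `run_pipeline` are hypothetical names introduced here, and the sketch assumes the three configuration files sit in the current directory unless `CONFIG_DIR` says otherwise.

```bash
# Sketch only: CONFIG_DIR, DRY_RUN, run_stage, and run_pipeline are
# hypothetical helpers for this example, not part of Lightning IR.
CONFIG_DIR="${CONFIG_DIR:-.}"

# Run one lightning-ir subcommand ($1) with its configuration file ($2).
# With DRY_RUN=1 the command is printed instead of executed.
run_stage() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "lightning-ir $1 --config $CONFIG_DIR/$2"
  else
    lightning-ir "$1" --config "$CONFIG_DIR/$2"
  fi
}

# Fine-tune, index, then search; && stops the chain at the first failure.
run_pipeline() {
  run_stage fit fine-tune.yaml &&
    run_stage index index.yaml &&
    run_stage search search.yaml
}
```

Calling `run_pipeline` runs all three stages in order, while `DRY_RUN=1 run_pipeline` only prints the commands, which is a cheap way to check the configuration paths before starting a long training run.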

[^1]: Reimers and Gurevych, [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://arxiv.org/abs/1908.10084)