---
license: apache-2.0
pipeline_tag: text-ranking
library_name: lightning-ir
base_model:
- google/electra-base-discriminator
tags:
- cross-encoder
---

# Set-Encoder

This repository contains the code for the paper: [`Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders`](https://arxiv.org/abs/2404.06912).

We use [`lightning-ir`](https://github.com/webis-de/lightning-ir) to train and fine-tune models. Download and install the library to use the code in this repository.

## Model Zoo

We provide the following pre-trained models:

| Model Name                                                          | TREC DL 19 (BM25) | TREC DL 20 (BM25) | TREC DL 19 (ColBERTv2) | TREC DL 20 (ColBERTv2) |
| ------------------------------------------------------------------- | ----------------- | ----------------- | ---------------------- | ---------------------- |
| [set-encoder-base](https://huggingface.co/webis/set-encoder-base)   | 0.724             | 0.710             | 0.788                  | 0.777                  |
| [set-encoder-large](https://huggingface.co/webis/set-encoder-large) | 0.727             | 0.735             | 0.789                  | 0.790                  |

## Inference

We recommend using the `lightning-ir` CLI for inference. The following command runs the `set-encoder-base` model on the TREC DL 19 and TREC DL 20 datasets:

```bash
lightning-ir re_rank --config configs/re-rank.yaml --config configs/set-encoder-finetuned.yaml --config configs/trec-dl.yaml
```
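
As a rough sketch of how the command above is assembled, the hypothetical helper below (`build_rerank_cmd` is not part of `lightning-ir`) prints the re-rank command for an arbitrary list of config files without executing it. It assumes each file is passed through its own `--config` flag, as in the command above, and that files given later take precedence over earlier ones. That is how Lightning-style CLIs usually behave, but it should be checked against the `lightning-ir` documentation.

```bash
#!/usr/bin/env bash
# Assumption: lightning-ir is already installed (e.g. `pip install lightning-ir`).

# Hypothetical helper: print (do not run) the re-rank command for the given
# config files, one --config flag per file, in the order they are passed.
build_rerank_cmd() {
  local cmd="lightning-ir re_rank"
  local cfg
  for cfg in "$@"; do
    cmd+=" --config ${cfg}"
  done
  printf '%s\n' "${cmd}"
}

# Dry run: reproduces the command shown above.
build_rerank_cmd configs/re-rank.yaml configs/set-encoder-finetuned.yaml configs/trec-dl.yaml
```

Printing the command first makes it easy to confirm which config files are in play before starting a long re-ranking run. To execute it, run the printed line in a shell.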

## Fine-Tuning

WIP