---
base_model:
- Qwen/Qwen2.5-7B-Instruct
datasets:
- liuwenhan/reasonrank_data_sft
- liuwenhan/reasonrank_data_rl
- liuwenhan/reasonrank_data_13k
language:
- en
license: mit
pipeline_tag: text-ranking
library_name: transformers
tags:
- qwen
- reranker
- passage-ranking
---

# ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability

<p align="center">
  <a href="https://arxiv.org/pdf/2508.07050" target="_blank"><img src="https://img.shields.io/badge/Paper-arXiv-b5212f.svg?logo=arxiv"></a>
  <a href="https://github.com/8421BCD/ReasonRank" target="_blank"><img src="https://img.shields.io/badge/GitHub-Repo-181717.svg?logo=github"></a>
  <a href="https://brightbenchmark.github.io/" target="_blank"><img src="https://img.shields.io/badge/Project%20Page-BRIGHT-blue.svg"></a>
  <a href="https://opensource.org/licenses/MIT"><img alt="License" src="https://img.shields.io/badge/LICENSE-MIT-green.svg"></a>
</p>

<p align="center">
🤗 <a href="https://huggingface.co/liuwenhan/reasonrank-7B" target="_blank">reasonrank-7B</a> |
🤗 <a href="https://huggingface.co/liuwenhan/reasonrank-32B" target="_blank">reasonrank-32B</a>
  </p>
<p align="center">
🤗 <a href="https://huggingface.co/datasets/liuwenhan/reasonrank_data_13k" target="_blank">reasonrank_data_13k</a> |
🤗 <a href="https://huggingface.co/datasets/liuwenhan/reasonrank_data_sft" target="_blank">reasonrank_data_sft</a> |
🤗 <a href="https://huggingface.co/datasets/liuwenhan/reasonrank_data_rl" target="_blank">reasonrank_data_rl</a>
</p>
<h5 align="center"> If you like our project, please give us a star ⭐ on GitHub.</h5>

## 📣 Latest News
- **[Aug 9, 2025]**: 🏆 Our ReasonRank (32B) has achieved a **SOTA score of 40.8** on the **[BRIGHT leaderboard](https://brightbenchmark.github.io/)**!
- **[Aug 9, 2025]**: 📄 We uploaded our paper to the **[arXiv](https://arxiv.org/pdf/2508.07050)** and **[Hugging Face](https://huggingface.co/papers/2508.07050)**.
- **[Aug 9, 2025]**: 🔥 We released our **[🤗full reasonrank training data (13k)](https://huggingface.co/datasets/liuwenhan/reasonrank_data_13k)**, **[🤗cold-start SFT data](https://huggingface.co/datasets/liuwenhan/reasonrank_data_sft)** and **[🤗RL data](https://huggingface.co/datasets/liuwenhan/reasonrank_data_rl)**.
- **[Aug 9, 2025]**: 🔥 We released our reasoning-intensive reranker **[🤗reasonrank-7B](https://huggingface.co/liuwenhan/reasonrank-7B)** and **[🤗reasonrank-32B](https://huggingface.co/liuwenhan/reasonrank-32B)**.
- **[Aug 9, 2025]**: 🚀 We released our full codebase, including inference, SFT training, and RL training.

## 1. ReasonRank

### 💡 1.1 Overview

**ReasonRank** is a **reasoning-intensive passage reranker** tailored for reasoning-intensive ranking tasks. To train it, we first design an automated reasoning-intensive training data synthesis framework and use it to synthesize 13k high-quality training examples.

<p align="center">
<img width="80%" alt="image" src="https://8421bcd.oss-cn-beijing.aliyuncs.com/img/image-20250809002302377.png" />
</p>

Based on this training data, we design a two-stage training approach, consisting of **cold-start SFT** followed by **multi-view ranking reward RL**, to inject listwise ranking ability into ReasonRank.

<p align="center">
<img width="80%" alt="image" src="https://8421bcd.oss-cn-beijing.aliyuncs.com/img/image-20250809002546838.png" />
</p>

### 📊 1.2 Overall Performance

When using ReasonIR as the initial passage retriever, ReasonRank demonstrates strong overall ranking performance on the BRIGHT benchmark, while being more efficient than the pointwise reasoning-intensive reranker Rank1.

<p align="center">
<img width="50%" alt="image" src="https://8421bcd.oss-cn-beijing.aliyuncs.com/img/image-20250809003636871.png" />
</p>

In addition, when using higher-quality retrieval results (RaDeR + BM25 hybrid, provided by [RaDeR](https://github.com/Debrup-61/RaDeR/blob/main/BRIGHT_score_files/RaDeR-gte-Qwen2-LLMq_CoT_lexical/aops/hybrid_BM25_Rader.json)), our ReasonRank (32B) achieves a SOTA score of **40.8** on the [BRIGHT leaderboard](https://brightbenchmark.github.io/).

## 📂 2. The Introduction of ReasonRank Training Data

An important contribution of our work is our reasoning-intensive training data ([reasonrank_data_13k](https://huggingface.co/datasets/liuwenhan/reasonrank_data_13k)). The dataset fields of ``training_data_all.jsonl`` are as follows:

#### **Dataset Fields & Descriptions**

1. **`dataset`** *(str)*
   - The name of the source dataset for this example (e.g., `"math-qa"`).
2. **`qid`** *(str)*
   - The query ID. The query text is provided in the ``id_query/`` directory.
3. **`initial_list`** *(List[str])*
   - The initial list of passage IDs before DeepSeek-R1 reranking. The text of each passage is provided in the ``id_doc/`` directory.
4. **`final_list`** *(List[str])*
   - The re-ranked list of passage IDs after listwise reranking with DeepSeek-R1.
   - Reflects the improved ranking based on reasoning-enhanced relevance scoring.
5. **`reasoning`** *(str)*
   - A **step-by-step reasoning chain** output by DeepSeek-R1 while performing the listwise reranking.
6. **`relevant_docids`** *(List[str])*
   - The IDs of the relevant passages in ``initial_list``, as mined by DeepSeek-R1. The remaining passage IDs in ``initial_list`` are irrelevant ones.
   - Note that **`relevant_docids`** are not necessarily ranked at the top of **`final_list`** by DeepSeek-R1, which may stem from inconsistencies in DeepSeek-R1’s judgments. To address this, you can apply the **self-consistency data filtering** technique proposed in our paper to select higher-quality data.

The dataset statistics are shown in the figure below:
<p align="center">
<img width="80%" alt="image" src="https://github.com/user-attachments/assets/c04b9d1a-2f21-46f1-b23d-ad1f50d22fb8" />
</p>

#### **Example Entry**

```json
{
  "dataset": "math-qa",
  "qid": "math_1001",
  "initial_list": ["math_test_intermediate_algebra_808", "math_train_intermediate_algebra_1471", ...],
  "final_list": ["math_test_intermediate_algebra_808", "math_test_intermediate_algebra_1678", ...],
  "reasoning": "Okay, I need to rank the 20 passages based on their relevance...",
  "relevant_docids": ["math_test_intermediate_algebra_808", "math_train_intermediate_algebra_1471", "math_train_intermediate_algebra_993"]
}
```
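The consistency check mentioned under `relevant_docids` can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: it assumes that consistency is measured as the binary-relevance NDCG@10 of `final_list` against `relevant_docids`, and the helper names (`ndcg_at_k`, `is_self_consistent`) and the threshold of 0.5 are hypothetical choices made for this example.

```python
import math


def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """Binary-relevance NDCG@k of a ranked ID list against a set of relevant IDs."""
    relevant = set(relevant_ids)
    # Discounted gain of every relevant passage found in the top k.
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
        if doc_id in relevant
    )
    # Ideal ranking: all relevant passages placed at the very top.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def is_self_consistent(record, threshold=0.5, k=10):
    """Keep a record only if its listwise ranking agrees with its relevance labels.

    The threshold is an assumed value for illustration, not one taken from the paper.
    """
    score = ndcg_at_k(record["final_list"], record["relevant_docids"], k)
    return score >= threshold


# Toy record using the same fields as training_data_all.jsonl (IDs are made up).
record = {
    "final_list": ["doc_a", "doc_c", "doc_b"],
    "relevant_docids": ["doc_a", "doc_b"],
}
print(is_self_consistent(record))
```

Records for which the check fails are those where DeepSeek-R1's listwise ranking and its relevance labels disagree; consult the paper for the metric and threshold actually used.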

#### **Application**

1. Training passage reranker: Given the reranked passage list, one can use our data to train a listwise reranker.
2. Training passage retriever: Using the **`relevant_docids`** and the remaining irrelevant ids, one can train a passage retriever.
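
For the retriever use case, a record can be split into positive and negative passage IDs. The sketch below is only an illustration: the function name `to_retriever_example`, the `max_negatives` parameter, and the output keys `positive_ids` and `negative_ids` are hypothetical and not part of the released data or codebase.

```python
def to_retriever_example(record, max_negatives=None):
    """Split a record's initial_list into positive and negative passage IDs."""
    relevant = set(record["relevant_docids"])
    # Positives: passages DeepSeek-R1 marked as relevant, in initial_list order.
    positives = [doc_id for doc_id in record["initial_list"] if doc_id in relevant]
    # Negatives: every other passage in the initial list.
    negatives = [doc_id for doc_id in record["initial_list"] if doc_id not in relevant]
    if max_negatives is not None:
        negatives = negatives[:max_negatives]
    return {
        "qid": record["qid"],
        "positive_ids": positives,
        "negative_ids": negatives,
    }


# Toy record with made-up IDs, using the same field names as the dataset.
record = {
    "qid": "q1",
    "initial_list": ["doc_a", "doc_b", "doc_c", "doc_d"],
    "relevant_docids": ["doc_c", "doc_a"],
}
example = to_retriever_example(record, max_negatives=1)
print(example)
```

The query and passage texts still need to be looked up by ID in the ``id_query/`` and ``id_doc/`` directories before training.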

## ⚡ 3. Quick Start

This section provides a general guide on how to use ReasonRank for inference. For detailed environment setup, specific inference commands (including usage with ReasonIR or custom retrieval results), and in-depth training procedures (Cold-Start SFT, Multi-reward ranking RL), please refer to the [official GitHub repository](https://github.com/8421BCD/ReasonRank).

## Sample Usage

This model can be loaded and used with the `transformers` library. Below is a basic example demonstrating how to use the model for passage re-ranking. The model expects a specific chat-like format for input, including a system prompt and a user query with listed passages.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "liuwenhan/reasonrank-7B" # Or "liuwenhan/reasonrank-32B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
).eval()

# System prompt as used in the paper for reasoning-intensive ranking
system_prompt = (
    "You are a helpful and harmless AI assistant. You will be provided with a search query and a list of passages, "
    "and your task is to re-rank the passages based on their relevance to the query. "
    "You should follow a chain of thought to determine the most relevant passages. "
    "Your final answer should be a list of the re-ranked passage numbers, separated by commas. "
    "Do not include any other information or explanation in your final answer."
)

query = "What is the capital of France?"
passages = [
    "Paris is the capital and most populous city of France.",
    "The Eiffel Tower is a famous landmark in Paris.",
    "France is a country located in Western Europe.",
    "London is the capital of the United Kingdom."
]

# Construct the user message: the query, then one numbered passage per line
user_content = f"Search Query: {query}\n"
for i, passage in enumerate(passages):
    user_content += f"[{i+1}] {passage}\n"
user_content += "Please re-rank the passages based on their relevance to the query. Provide a chain of thought and then the final re-ranked list."

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_content}
]

# Apply chat template
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Tokenize input
input_ids = tokenizer.encode(input_text, return_tensors="pt").to(model.device)

# Generate response (greedy decoding, so no sampling temperature is set)
output_ids = model.generate(
    input_ids,
    max_new_tokens=256, # Increase if the reasoning chain gets truncated
    do_sample=False,    # Deterministic output for ranking/reasoning
    repetition_penalty=1.05,
    eos_token_id=tokenizer.eos_token_id
)

# Decode output
generated_text = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)
print(generated_text)
```
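
The generated text still has to be turned into an ordering of the input passages. The following sketch shows one way to do that. It assumes the final ranking appears on the last non-empty line of the output as passage numbers (for example `1, 3, 2, 4` or `[1] > [3] > [2] > [4]`); the function name `parse_ranking` and this format assumption are illustrative, so check the official repository for the exact output format and parsing logic.

```python
import re


def parse_ranking(generated_text, num_passages):
    """Extract a 1-based passage ordering from the last non-empty output line."""
    lines = [line for line in generated_text.strip().splitlines() if line.strip()]
    final_line = lines[-1] if lines else ""

    ranking = []
    for token in re.findall(r"\d+", final_line):
        idx = int(token)
        # Ignore out-of-range numbers and repeated passage numbers.
        if 1 <= idx <= num_passages and idx not in ranking:
            ranking.append(idx)

    # Passages the model did not mention keep their original relative order.
    missing = [i for i in range(1, num_passages + 1) if i not in ranking]
    return ranking + missing


sample_output = "The first passage answers the query directly.\n[1] > [3] > [2]"
order = parse_ranking(sample_output, num_passages=4)
print(order)
```

Appending unmentioned passages at the end keeps the result a complete permutation, which is convenient when the output is truncated by `max_new_tokens`.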

## Citation

If you find this work helpful, please cite our paper:

```bibtex
@misc{liu2025reasonrankempoweringpassageranking,
      title={ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability}, 
      author={Wenhan Liu and Xinyu Ma and Weiwei Sun and Yutao Zhu and Yuchen Li and Dawei Yin and Zhicheng Dou},
      year={2025},
      eprint={2508.07050},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2508.07050}, 
}
```

## 🤝 Acknowledgements

The inference code and training implementation build upon [RankLLM](https://github.com/castorini/rank_llm), [Llama Factory](https://github.com/hiyouga/LLaMA-Factory) and [verl](https://github.com/volcengine/verl). Our work is based on the [Qwen2.5](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) model series, and we sincerely thank the Qwen team for their outstanding contributions to the open-source community.

## 📄 License

This project is released under the [MIT License](LICENSE).

## 📞 Contact

For any questions or feedback, please reach out to us at [[email protected]]([email protected]).

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=8421bcd/reasonrank&type=Date)](https://www.star-history.com/#8421bcd/reasonrank&Date)