---
base_model:
- Qwen/Qwen2.5-7B-Instruct
datasets:
- liuwenhan/reasonrank_data_sft
- liuwenhan/reasonrank_data_rl
- liuwenhan/reasonrank_data_13k
language:
- en
license: mit
pipeline_tag: text-ranking
library_name: transformers
tags:
- qwen
- reranker
- passage-ranking
---
# ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability
<p align="center">
<a href="https://arxiv.org/pdf/2508.07050" target="_blank"><img src="https://img.shields.io/badge/Paper-arXiv-b5212f.svg?logo=arxiv"></a>
<a href="https://github.com/8421BCD/ReasonRank" target="_blank"><img src="https://img.shields.io/badge/GitHub-Repo-181717.svg?logo=github"></a>
<a href="https://brightbenchmark.github.io/" target="_blank"><img src="https://img.shields.io/badge/Project%20Page-BRIGHT-blue.svg"></a>
<a href="https://opensource.org/licenses/MIT"><img alt="License" src="https://img.shields.io/badge/LICENSE-MIT-green.svg"></a>
</p>
<p align="center">
🤗 <a href="https://huggingface.co/liuwenhan/reasonrank-7B" target="_blank">reasonrank-7B</a> |
🤗 <a href="https://huggingface.co/liuwenhan/reasonrank-32B" target="_blank">reasonrank-32B</a>
</p>
<p align="center">
🤗 <a href="https://huggingface.co/datasets/liuwenhan/reasonrank_data_13k" target="_blank">reasonrank_data_13k</a> |
🤗 <a href="https://huggingface.co/datasets/liuwenhan/reasonrank_data_sft" target="_blank">reasonrank_data_sft</a> |
🤗 <a href="https://huggingface.co/datasets/liuwenhan/reasonrank_data_rl" target="_blank">reasonrank_data_rl</a>
</p>
<h5 align="center"> If you like our project, please give us a star ⭐ on GitHub.</h5>
## 📣 Latest News
- **[Aug 9, 2025]**: 🏆 Our ReasonRank (32B) has achieved **SOTA performance (40.8)** on the **[BRIGHT leaderboard](https://brightbenchmark.github.io/)**!
- **[Aug 9, 2025]**: 📄 We uploaded our paper to the **[arXiv](https://arxiv.org/pdf/2508.07050)** and **[Hugging Face](https://huggingface.co/papers/2508.07050)**.
- **[Aug 9, 2025]**: 🔥 We released our **[🤗full reasonrank training data (13k)](https://huggingface.co/datasets/liuwenhan/reasonrank_data_13k)**, **[🤗cold-start SFT data](https://huggingface.co/datasets/liuwenhan/reasonrank_data_sft)** and **[🤗RL data](https://huggingface.co/datasets/liuwenhan/reasonrank_data_rl)**.
- **[Aug 9, 2025]**: 🔥 We released our reasoning-intensive reranker **[🤗reasonrank-7B](https://huggingface.co/liuwenhan/reasonrank-7B)** and **[🤗reasonrank-32B](https://huggingface.co/liuwenhan/reasonrank-32B)**.
- **[Aug 9, 2025]**: 🚀 We released our full codebase, including inference, SFT training, and RL training.
## 1. ReasonRank
### 💡 1.1 Overview
**ReasonRank** is a **reasoning-intensive passage reranker** tailored for reasoning-intensive ranking tasks. To train it, we first design an automated framework for synthesizing reasoning-intensive training data and use it to synthesize 13k high-quality training examples.
<p align="center">
<img width="80%" alt="image" src="https://8421bcd.oss-cn-beijing.aliyuncs.com/img/image-20250809002302377.png" />
</p>
Based on this training data, we design a two-stage training approach, consisting of **cold-start SFT** followed by **multi-view ranking reward RL**, to inject listwise ranking ability into ReasonRank.
<p align="center">
<img width="80%" alt="image" src="https://8421bcd.oss-cn-beijing.aliyuncs.com/img/image-20250809002546838.png" />
</p>
### 📊 1.2 Overall Performance
When using ReasonIR as the initial passage retriever, ReasonRank demonstrates strong overall ranking performance on the BRIGHT benchmark while being more efficient than the pointwise reasoning-intensive reranker Rank1.
<p align="center">
<img width="50%" alt="image" src="https://8421bcd.oss-cn-beijing.aliyuncs.com/img/image-20250809003636871.png" />
</p>
In addition, when using higher-quality retrieval results (RaDeR + BM25 hybrid, provided by [RaDeR](https://github.com/Debrup-61/RaDeR/blob/main/BRIGHT_score_files/RaDeR-gte-Qwen2-LLMq_CoT_lexical/aops/hybrid_BM25_Rader.json)), ReasonRank (32B) achieves SOTA performance of **40.8** on the [BRIGHT leaderboard](https://brightbenchmark.github.io/).
## 📂 2. Introduction to the ReasonRank Training Data
An important contribution of our work is our reasoning-intensive training data ([reasonrank_data_13k](https://huggingface.co/datasets/liuwenhan/reasonrank_data_13k)). The dataset fields of ``training_data_all.jsonl`` are as follows:
#### **Dataset Fields & Descriptions**
1. **`dataset`** *(str)*
- The name of the source dataset for this record (e.g., `"math-qa"`).
2. **`qid`** *(str)*
- The query ID. The query text is provided in the ``id_query/`` directory.
3. **`initial_list`** *(List[str])*
- The initial list of passage IDs before DeepSeek-R1 reranking. The text of each passage is provided in the ``id_doc/`` directory.
4. **`final_list`** *(List[str])*
- The list of passage IDs after listwise reranking with DeepSeek-R1.
- Reflects the improved ranking based on reasoning-enhanced relevance scoring.
5. **`reasoning`** *(str)*
- A **step-by-step reasoning chain** produced by DeepSeek-R1 while performing the listwise reranking.
6. **`relevant_docids`** *(List[str])*
- The IDs of the relevant passages in ``initial_list``, as mined by DeepSeek-R1. The remaining passage IDs in ``initial_list`` are irrelevant.
- Note that **`relevant_docids`** are not necessarily ranked at the top of **`final_list`** by DeepSeek-R1, which may stem from inconsistencies in DeepSeek-R1’s judgments. To address this, you can apply the **self-consistency data filtering** technique proposed in our paper to select higher-quality data.
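One way to act on the note above is to score how well ``final_list`` agrees with ``relevant_docids`` and keep only the records that agree well enough. The sketch below is an illustration under stated assumptions, not the exact procedure from the paper: it assumes binary-relevance NDCG@10 as the agreement measure, and the helper names `ndcg_at_k` and `is_self_consistent` as well as the `0.4` threshold are hypothetical.

```python
import math


def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """Binary-relevance NDCG@k of a ranked ID list against a set of relevant IDs."""
    relevant = set(relevant_ids)
    # Discounted gain of every relevant passage found in the top k.
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
        if doc_id in relevant
    )
    # Best achievable DCG: all relevant passages placed first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def is_self_consistent(record, k=10, threshold=0.4):
    """Keep a record when its final_list ranks its own relevant_docids highly.

    The threshold value is a placeholder (assumption), not taken from the paper.
    """
    score = ndcg_at_k(record["final_list"], record["relevant_docids"], k)
    return score >= threshold
```

A filtered training set could then be obtained with `[r for r in records if is_self_consistent(r)]`, where `records` is a list of parsed entries.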
The statistics of the dataset are shown in the figure below:
<p align="center">
<img width="80%" alt="image" src="https://github.com/user-attachments/assets/c04b9d1a-2f21-46f1-b23d-ad1f50d22fb8" />
</p>
#### **Example Entry**
```json
{
"dataset": "math-qa",
"qid": "math_1001",
"initial_list": ["math_test_intermediate_algebra_808", "math_train_intermediate_algebra_1471", ...],
"final_list": ["math_test_intermediate_algebra_808", "math_test_intermediate_algebra_1678", ...],
"reasoning": "Okay, I need to rank the 20 passages based on their relevance...",
"relevant_docids": ["math_test_intermediate_algebra_808", "math_train_intermediate_algebra_1471", "math_train_intermediate_algebra_993"]
}
```
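Entries in this shape can be read with the standard `json` module, one object per line. The following is a minimal sketch: the helper name `parse_records` and the field check are illustrative additions, and the sample line is a shortened, made-up record rather than real data.

```python
import json

# Field names as listed in the dataset description above.
REQUIRED_FIELDS = (
    "dataset", "qid", "initial_list", "final_list", "reasoning", "relevant_docids",
)


def parse_records(lines):
    """Parse JSONL lines into dicts, checking that every documented field is present."""
    records = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = [field for field in REQUIRED_FIELDS if field not in record]
        if missing:
            raise ValueError(f"line {line_number}: missing fields {missing}")
        records.append(record)
    return records


# Shortened, made-up record used only to demonstrate the call.
sample_line = json.dumps({
    "dataset": "math-qa",
    "qid": "math_1001",
    "initial_list": ["doc_a", "doc_b"],
    "final_list": ["doc_b", "doc_a"],
    "reasoning": "...",
    "relevant_docids": ["doc_b"],
})
records = parse_records([sample_line])

# With the real file, pass the open file handle instead:
# with open("training_data_all.jsonl", encoding="utf-8") as f:
#     records = parse_records(f)
```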
#### **Application**
1. Training passage reranker: Given the reranked passage list, one can use our data to train a listwise reranker.
2. Training passage retriever: Using the **`relevant_docids`** and the remaining irrelevant ids, one can train a passage retriever.
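
For the retriever use case, each record can be split into positives and negatives by checking ``initial_list`` against ``relevant_docids``. The sketch below shows one possible layout; the function name `build_retriever_examples` and the output keys `positive` and `negatives` are hypothetical and not part of the released code.

```python
def build_retriever_examples(record):
    """Turn one record into (query, positive, negatives) training examples.

    Passages in initial_list that appear in relevant_docids are positives;
    all other passages in initial_list are treated as negatives.
    """
    relevant = set(record["relevant_docids"])
    positives = [doc_id for doc_id in record["initial_list"] if doc_id in relevant]
    negatives = [doc_id for doc_id in record["initial_list"] if doc_id not in relevant]
    # One example per positive, sharing the same pool of negatives.
    return [
        {"qid": record["qid"], "positive": positive, "negatives": negatives}
        for positive in positives
    ]
```

The IDs still need to be resolved to text through the ``id_query/`` and ``id_doc/`` directories before training.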
## ⚡ 3. Quick Start
This section provides a general guide on how to use ReasonRank for inference. For detailed environment setup, specific inference commands (including usage with ReasonIR or custom retrieval results), and in-depth training procedures (cold-start SFT, multi-view ranking reward RL), please refer to the [official GitHub repository](https://github.com/8421BCD/ReasonRank).
## Sample Usage
This model can be loaded and used with the `transformers` library. Below is a basic example demonstrating how to use the model for passage re-ranking. The model expects a specific chat-like format for input, including a system prompt and a user query with listed passages.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "liuwenhan/reasonrank-7B"  # Or "liuwenhan/reasonrank-32B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
).eval()

# System prompt as used in the paper for reasoning-intensive ranking
system_prompt = (
    "You are a helpful and harmless AI assistant. You will be provided with a search query and a list of passages, "
    "and your task is to re-rank the passages based on their relevance to the query. "
    "You should follow a chain of thought to determine the most relevant passages. "
    "Your final answer should be a list of the re-ranked passage numbers, separated by commas. "
    "Do not include any other information or explanation in your final answer."
)

query = "What is the capital of France?"
passages = [
    "Paris is the capital and most populous city of France.",
    "The Eiffel Tower is a famous landmark in Paris.",
    "France is a country located in Western Europe.",
    "London is the capital of the United Kingdom."
]

# Construct the user message with the query and numbered passages
user_content = f"Search Query: {query}\n"
for i, passage in enumerate(passages):
    user_content += f"[{i+1}] {passage}\n"
user_content += "Please re-rank the passages based on their relevance to the query. Provide a chain of thought and then the final re-ranked list."

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_content}
]

# Apply chat template
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Tokenize input
input_ids = tokenizer.encode(input_text, return_tensors="pt").to(model.device)

# Generate response
output_ids = model.generate(
    input_ids,
    max_new_tokens=256,  # Adjust as needed for reasoning length
    do_sample=False,  # Greedy decoding; sampling parameters such as temperature do not apply
    repetition_penalty=1.05,
    eos_token_id=tokenizer.eos_token_id
)

# Decode only the newly generated tokens
generated_text = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)
print(generated_text)
```
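The example above prints free-form text, so the ranking still has to be extracted from it. The sketch below assumes the final ranking appears on the last non-empty line of the output as passage numbers (for instance `1, 3, 2, 4` or `[1] > [3] > [2] > [4]`); that output shape is an assumption, and `parse_ranking` is a hypothetical helper, not part of the official codebase. If the last line contains other numbers, this heuristic will misread it.

```python
import re


def parse_ranking(generated_text, num_passages):
    """Extract a 1-based passage ranking from the last non-empty line of the output.

    Out-of-range and duplicate numbers are dropped, and any passage the model
    did not mention is appended at the end in its original order.
    """
    lines = [line for line in generated_text.splitlines() if line.strip()]
    last_line = lines[-1] if lines else ""
    ranking = []
    for token in re.findall(r"\d+", last_line):
        index = int(token)
        if 1 <= index <= num_passages and index not in ranking:
            ranking.append(index)
    # Fall back to the original order for passages missing from the answer.
    missing = [i for i in range(1, num_passages + 1) if i not in ranking]
    return ranking + missing
```

With the variables from the example above, `[passages[i - 1] for i in parse_ranking(generated_text, len(passages))]` would give the passages in ranked order.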
## Citation
If you find this work helpful, please cite our paper:
```bibtex
@misc{liu2025reasonrankempoweringpassageranking,
title={ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability},
author={Wenhan Liu and Xinyu Ma and Weiwei Sun and Yutao Zhu and Yuchen Li and Dawei Yin and Zhicheng Dou},
year={2025},
eprint={2508.07050},
archivePrefix={arXiv},
primaryClass={cs.IR},
url={https://arxiv.org/abs/2508.07050},
}
```
## 🤝 Acknowledgements
The inference code and training implementation build upon [RankLLM](https://github.com/castorini/rank_llm), [Llama Factory](https://github.com/hiyouga/LLaMA-Factory) and [verl](https://github.com/volcengine/verl). Our work is based on the [Qwen2.5](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) model series, and we sincerely thank the Qwen team for their outstanding contributions to the open-source community.
## 📄 License
This project is released under the [MIT License](LICENSE).
## 📞 Contact
For any questions or feedback, please reach out to us at [[email protected]]([email protected]).
## Star History
[Star history chart](https://www.star-history.com/#8421bcd/reasonrank&Date)