base_model:
- Qwen/Qwen2.5-7B-Instruct
datasets:
- liuwenhan/reasonrank_data_sft
- liuwenhan/reasonrank_data_rl
- liuwenhan/reasonrank_data_13k
language:
- en
license: mit
pipeline_tag: text-ranking
library_name: transformers
tags:
- qwen
- reranker
- passage-ranking
ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability
reasonrank-7B | reasonrank-32B
reasonrank_data_13k | reasonrank_data_sft | reasonrank_data_rl
If you like our project, please give us a star on GitHub.
Latest News
- [Aug 9, 2025]: Our ReasonRank (32B) has achieved SOTA performance (40.8) on the BRIGHT leaderboard!
- [Aug 9, 2025]: We uploaded our paper to arXiv and Hugging Face.
- [Aug 9, 2025]: We released our full ReasonRank training data (13k), cold-start SFT data, and RL data.
- [Aug 9, 2025]: We released our reasoning-intensive rerankers reasonrank-7B and reasonrank-32B.
- [Aug 9, 2025]: We released our full codebase, including inference, SFT training, and RL training.
1. ReasonRank
1.1 Overview
ReasonRank is a passage reranker tailored for reasoning-intensive ranking tasks. To train it, we first design an automated framework for synthesizing reasoning-intensive training data and use it to synthesize 13k high-quality training examples.
Based on this training data, we design a two-stage training approach, consisting of cold-start SFT followed by RL with a multi-view ranking reward, to inject listwise ranking ability into ReasonRank.
1.2 Overall Performance
When using ReasonIR as the initial passage retriever, ReasonRank demonstrates strong overall ranking performance on the BRIGHT benchmark while being more efficient than the pointwise reasoning-intensive reranker Rank1.
In addition, when using higher-quality retrieval results (a RaDeR + BM25 hybrid, provided by RaDeR), ReasonRank (32B) achieves SOTA performance (40.8) on the BRIGHT leaderboard.
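For readers unfamiliar with the metric, BRIGHT results are typically reported as nDCG@10. The sketch below shows how nDCG@k can be computed for a single query under binary relevance. It is an illustration only, not the benchmark's evaluation script; the function name ndcg_at_k and its arguments are hypothetical.

```python
import math


def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """nDCG@k for one query with binary relevance (illustrative sketch)."""
    relevant = set(relevant_ids)
    # DCG: each relevant document at 0-based rank r contributes 1 / log2(r + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
        if doc_id in relevant
    )
    # Ideal DCG: all relevant documents placed at the very top of the list.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0
```

A leaderboard-style score would then be the average of this value over all queries, usually scaled to 0-100.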
2. Introduction to the ReasonRank Training Data
An important contribution of our work is our reasoning-intensive training data (reasonrank_data_13k). The dataset fields of training_data_all.jsonl are as follows:
Dataset Fields & Descriptions
- dataset (str): The dataset name of each piece of data (e.g., "math-qa").
- qid (str): The query ID. The content is provided in the id_query/ directory.
- initial_list (List[str]): The initial list of passage IDs before DeepSeek-R1 reranking. The content of each passage ID is provided in the id_doc/ directory.
- final_list (List[str]): The re-ranked list of passage IDs after listwise reranking with DeepSeek-R1. Reflects the improved ranking based on reasoning-enhanced relevance scoring.
- reasoning (str): A step-by-step reasoning chain produced by DeepSeek-R1 while performing the listwise reranking.
- relevant_docids (List[str]): The IDs of relevant passages in initial_list mined by DeepSeek-R1. The remaining passage IDs in initial_list are irrelevant ones. Note that relevant_docids are not necessarily ranked at the top of final_list by DeepSeek-R1, which may stem from inconsistencies in DeepSeek-R1's judgments. To address this, you can apply the self-consistency data filtering technique proposed in our paper to select higher-quality data.
The statistics of the dataset are shown in the figure below:
Example Entry
{
  "dataset": "math-qa",
  "qid": "math_1001",
  "initial_list": ["math_test_intermediate_algebra_808", "math_train_intermediate_algebra_1471", ...],
  "final_list": ["math_test_intermediate_algebra_808", "math_test_intermediate_algebra_1678", ...],
  "reasoning": "Okay, I need to rank the 20 passages based on their relevance...",
  "relevant_docids": ["math_test_intermediate_algebra_808", "math_train_intermediate_algebra_1471", "math_train_intermediate_algebra_993"]
}
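A minimal sketch of reading entries in this format and checking how well final_list agrees with relevant_docids is given below. It assumes one JSON object per line, as the .jsonl extension suggests. The helper names parse_jsonl and top_agreement are hypothetical, and the agreement score is a simple illustrative heuristic, not the self-consistency filter described in the paper.

```python
import json
from typing import Iterable


def parse_jsonl(lines: Iterable[str]) -> list[dict]:
    """Parse one JSON object per non-blank line."""
    return [json.loads(line) for line in lines if line.strip()]


def top_agreement(entry: dict) -> float:
    """Fraction of relevant_docids that appear in the top-R positions of
    final_list, where R = len(relevant_docids). 1.0 means every relevant
    passage is ranked above every irrelevant one."""
    relevant = set(entry["relevant_docids"])
    if not relevant:
        return 0.0
    top = entry["final_list"][: len(relevant)]
    return sum(doc_id in relevant for doc_id in top) / len(relevant)


# Usage (assumes training_data_all.jsonl has been downloaded locally):
# with open("training_data_all.jsonl", encoding="utf-8") as f:
#     entries = parse_jsonl(f)
# consistent = [e for e in entries if top_agreement(e) == 1.0]
```

Entries with a low score are the ones where DeepSeek-R1's ranking and its relevance judgments disagree, which is the kind of inconsistency noted in the relevant_docids description.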
Application
- Training a passage reranker: Given the reranked passage list, one can use our data to train a listwise reranker.
- Training a passage retriever: Using the relevant_docids and the remaining irrelevant IDs, one can train a passage retriever.
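The two applications above can be sketched as follows. Both helpers are hypothetical illustrations, not part of the released codebase: build_triples produces (query ID, positive ID, negative ID) triples for contrastive retriever training, and listwise_target renders final_list as a ranking over positions in initial_list. The "[3] > [1] > [2]" target format and the max_negatives cap are assumptions made for this example.

```python
from itertools import product


def build_triples(entry: dict, max_negatives: int = 4) -> list[tuple[str, str, str]]:
    """Pair each relevant passage with up to max_negatives irrelevant ones."""
    relevant = set(entry["relevant_docids"])
    positives = [d for d in entry["initial_list"] if d in relevant]
    negatives = [d for d in entry["initial_list"] if d not in relevant]
    negatives = negatives[:max_negatives]
    return [(entry["qid"], pos, neg) for pos, neg in product(positives, negatives)]


def listwise_target(entry: dict) -> str:
    """Express final_list as 1-based positions in initial_list."""
    position = {doc_id: i + 1 for i, doc_id in enumerate(entry["initial_list"])}
    # IDs missing from initial_list are skipped rather than guessed.
    return " > ".join(f"[{position[d]}]" for d in entry["final_list"] if d in position)
```

The IDs in the triples still need to be resolved to text through the id_query/ and id_doc/ directories before training.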
3. Quick Start
This section provides a general guide on how to use ReasonRank for inference. For detailed environment setup, specific inference commands (including usage with ReasonIR or custom retrieval results), and in-depth training procedures (cold-start SFT and multi-view ranking reward RL), please refer to the official GitHub repository.
Sample Usage
This model can be loaded and used with the transformers library. Below is a basic example demonstrating how to use the model for passage re-ranking. The model expects a specific chat-like format for input, including a system prompt and a user query with listed passages.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "liuwenhan/reasonrank-7B"  # Or "liuwenhan/reasonrank-32B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
).eval()

# System prompt as used in the paper for reasoning-intensive ranking
system_prompt = (
    "You are a helpful and harmless AI assistant. You will be provided with a search query and a list of passages, "
    "and your task is to re-rank the passages based on their relevance to the query. "
    "You should follow a chain of thought to determine the most relevant passages. "
    "Your final answer should be a list of the re-ranked passage numbers, separated by commas. "
    "Do not include any other information or explanation in your final answer."
)

query = "What is the capital of France?"
passages = [
    "Paris is the capital and most populous city of France.",
    "The Eiffel Tower is a famous landmark in Paris.",
    "France is a country located in Western Europe.",
    "London is the capital of the United Kingdom.",
]

# Construct the user message: the query, then one numbered passage per line
user_content = f"Search Query: {query}\n"
for i, passage in enumerate(passages, start=1):
    user_content += f"[{i}] {passage}\n"
user_content += "Please re-rank the passages based on their relevance to the query. Provide a chain of thought and then the final re-ranked list."

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_content},
]

# Apply chat template
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Tokenize input (returns both input_ids and attention_mask)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Generate response with greedy decoding; temperature is omitted because
# it has no effect when do_sample=False
output_ids = model.generate(
    **inputs,
    max_new_tokens=256,  # Increase if the reasoning chain gets truncated
    do_sample=False,
    repetition_penalty=1.05,
    eos_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens
prompt_length = inputs["input_ids"].shape[1]
generated_text = tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
print(generated_text)
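The generated text still has to be turned into an ordering of the passages. A minimal parsing sketch follows, assuming the final ranking appears on the last non-empty line of the output as passage numbers (for example "1, 2, 3, 4" or "[1] > [2]"). The function parse_ranking is hypothetical, and the exact output format of the released checkpoints may differ, so check a few raw generations before relying on it.

```python
import re


def parse_ranking(generated_text: str, num_passages: int) -> list[int]:
    """Return a full permutation of 1..num_passages from the model output."""
    # Assumption: the ranking is on the last non-empty line of the output.
    lines = [ln for ln in generated_text.strip().splitlines() if ln.strip()]
    last_line = lines[-1] if lines else ""

    seen: set[int] = set()
    ranking: list[int] = []
    for token in re.findall(r"\d+", last_line):
        idx = int(token)
        # Ignore out-of-range numbers and repeated passage numbers.
        if 1 <= idx <= num_passages and idx not in seen:
            seen.add(idx)
            ranking.append(idx)

    # Passages the model omitted keep their original relative order at the end.
    ranking.extend(i for i in range(1, num_passages + 1) if i not in seen)
    return ranking
```

With the sample above, parse_ranking(generated_text, len(passages)) yields 1-based passage numbers; subtract 1 from each to index into passages.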
Citation
If you find this work helpful, please cite our paper:
@misc{liu2025reasonrankempoweringpassageranking,
  title={ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability},
  author={Wenhan Liu and Xinyu Ma and Weiwei Sun and Yutao Zhu and Yuchen Li and Dawei Yin and Zhicheng Dou},
  year={2025},
  eprint={2508.07050},
  archivePrefix={arXiv},
  primaryClass={cs.IR},
  url={https://arxiv.org/abs/2508.07050},
}
Acknowledgements
The inference code and training implementation build upon RankLLM, Llama Factory, and verl. Our work is based on the Qwen2.5 model series, and we sincerely thank the Qwen team for their outstanding contributions to the open-source community.
License
This project is released under the MIT License.
Contact
For any questions or feedback, please reach out to us at [email protected].