---
base_model:
- Qwen/Qwen2.5-7B-Instruct
datasets:
- liuwenhan/reasonrank_data_sft
- liuwenhan/reasonrank_data_rl
- liuwenhan/reasonrank_data_13k
language:
- en
license: mit
pipeline_tag: text-ranking
library_name: transformers
tags:
- qwen
- reranker
- passage-ranking
---
# ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability
<p align="center">
<a href="https://arxiv.org/pdf/2508.07050" target="_blank"><img src="https://img.shields.io/badge/Paper-arXiv-b5212f.svg?logo=arxiv"></a>
<a href="https://github.com/8421BCD/ReasonRank" target="_blank"><img src="https://img.shields.io/badge/GitHub-Repo-181717.svg?logo=github"></a>
<a href="https://brightbenchmark.github.io/" target="_blank"><img src="https://img.shields.io/badge/Project%20Page-BRIGHT-blue.svg"></a>
<a href="https://opensource.org/licenses/MIT"><img alt="License" src="https://img.shields.io/badge/LICENSE-MIT-green.svg"></a>
</p>
<p align="center">
🤗 <a href="https://huggingface.co/liuwenhan/reasonrank-7B" target="_blank">reasonrank-7B</a> ｜
🤗 <a href="https://huggingface.co/liuwenhan/reasonrank-32B" target="_blank">reasonrank-32B</a>
</p>
<p align="center">
🤗 <a href="https://huggingface.co/datasets/liuwenhan/reasonrank_data_13k" target="_blank">reasonrank_data_13k</a> ｜
🤗 <a href="https://huggingface.co/datasets/liuwenhan/reasonrank_data_sft" target="_blank">reasonrank_data_sft</a> ｜
🤗 <a href="https://huggingface.co/datasets/liuwenhan/reasonrank_data_rl" target="_blank">reasonrank_data_rl</a>
</p>
<h5 align="center"> If you like our project, please give us a star ⭐ on GitHub.</h5>
## 📣 Latest News
- **[Aug 9, 2025]**: 🏆 Our ReasonRank (32B) has achieved **SOTA performance of 40.8** on the **[BRIGHT leaderboard](https://brightbenchmark.github.io/)**!
- **[Aug 9, 2025]**: 📄 We uploaded our paper to **[arXiv](https://arxiv.org/pdf/2508.07050)** and **[Hugging Face](https://huggingface.co/papers/2508.07050)**.
- **[Aug 9, 2025]**: 🔥 We released our **[🤗 full ReasonRank training data (13k)](https://huggingface.co/datasets/liuwenhan/reasonrank_data_13k)**, **[🤗 cold-start SFT data](https://huggingface.co/datasets/liuwenhan/reasonrank_data_sft)**, and **[🤗 RL data](https://huggingface.co/datasets/liuwenhan/reasonrank_data_rl)**.
- **[Aug 9, 2025]**: 🔥 We released our reasoning-intensive rerankers **[🤗 reasonrank-7B](https://huggingface.co/liuwenhan/reasonrank-7B)** and **[🤗 reasonrank-32B](https://huggingface.co/liuwenhan/reasonrank-32B)**.
- **[Aug 9, 2025]**: 🚀 We released our full codebase, including inference, SFT training, and RL training.
## 1. ReasonRank
### 💡 1.1 Overview
**ReasonRank** is a **reasoning-intensive passage reranker** tailored for reasoning-intensive ranking tasks. To train it, we first design an automated reasoning-intensive training data synthesis framework and use it to synthesize 13k high-quality training examples.
<p align="center">
<img width="80%" alt="image" src="https://8421bcd.oss-cn-beijing.aliyuncs.com/img/image-20250809002302377.png" />
</p>
Based on this training data, we design a two-stage training approach, consisting of **cold-start SFT** and **multi-view ranking reward RL**, to instill listwise ranking ability into ReasonRank.
<p align="center">
<img width="80%" alt="image" src="https://8421bcd.oss-cn-beijing.aliyuncs.com/img/image-20250809002546838.png" />
</p>
### 📊 1.2 Overall Performance
When using ReasonIR as the initial passage retriever, ReasonRank demonstrates strong overall ranking performance on the BRIGHT benchmark, while being markedly more efficient than the pointwise reasoning-intensive reranker Rank1.
<p align="center">
<img width="50%" alt="image" src="https://8421bcd.oss-cn-beijing.aliyuncs.com/img/image-20250809003636871.png" />
</p>
Besides, when using higher-quality retrieval results (a RaDeR + BM25 hybrid, provided by [RaDeR](https://github.com/Debrup-61/RaDeR/blob/main/BRIGHT_score_files/RaDeR-gte-Qwen2-LLMq_CoT_lexical/aops/hybrid_BM25_Rader.json)), our ReasonRank (32B) achieves SOTA performance of **40.8** on the [BRIGHT leaderboard](https://brightbenchmark.github.io/).
## 📂 2. The ReasonRank Training Data
An important contribution of our work is our reasoning-intensive training data ([reasonrank_data_13k](https://huggingface.co/datasets/liuwenhan/reasonrank_data_13k)). The dataset fields of ``training_data_all.jsonl`` are as follows:
#### **Dataset Fields & Descriptions**
1. **`dataset`** *(str)*
   - The name of the source dataset for each example (e.g., `"math-qa"`).
2. **`qid`** *(str)*
   - The query ID. The query text is provided in the ``id_query/`` directory.
3. **`initial_list`** *(List[str])*
   - The initial list of passage IDs before DeepSeek-R1 reranking. The text of each passage is provided in the ``id_doc/`` directory.
4. **`final_list`** *(List[str])*
   - The re-ranked list of passage IDs after listwise reranking with DeepSeek-R1.
   - Reflects the improved ranking based on reasoning-enhanced relevance scoring.
5. **`reasoning`** *(str)*
   - The **step-by-step reasoning chain** produced by DeepSeek-R1 while performing the listwise reranking.
6. **`relevant_docids`** *(List[str])*
   - The IDs of the relevant passages in ``initial_list`` mined by DeepSeek-R1. The remaining passage IDs in ``initial_list`` are irrelevant.
   - Note that **`relevant_docids`** are not necessarily ranked at the top of **`final_list`** by DeepSeek-R1, which may stem from inconsistencies in DeepSeek-R1's judgments. To address this, you can apply the **self-consistency data filtering** technique proposed in our paper to select higher-quality data; a minimal sketch of one possible filter is shown below.
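The snippet below is only an illustrative sketch of such a filter, not the exact procedure from the paper: it keeps an example when every passage in `relevant_docids` also appears within the top positions of `final_list`. The file name and the top-k threshold are assumptions.

```python
import json

def is_consistent(example, top_k=None):
    """Heuristic check: all mined relevant passages should sit near the top of the final ranking."""
    relevant = set(example["relevant_docids"])
    k = top_k if top_k is not None else len(relevant)
    return relevant <= set(example["final_list"][:k])

filtered = []
with open("training_data_all.jsonl", "r", encoding="utf-8") as f:
    for line in f:
        example = json.loads(line)
        if is_consistent(example):
            filtered.append(example)

print(f"Kept {len(filtered)} examples after the consistency check.")
```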
The statistics of the dataset are shown in the figure below:
<p align="center">
<img width="80%" alt="image" src="https://github.com/user-attachments/assets/c04b9d1a-2f21-46f1-b23d-ad1f50d22fb8" />
</p>
#### **Example Entry**
```json
{
"dataset": "math-qa",
"qid": "math_1001",
"initial_list": ["math_test_intermediate_algebra_808", "math_train_intermediate_algebra_1471", ...],
"final_list": ["math_test_intermediate_algebra_808", "math_test_intermediate_algebra_1678", ...],
"reasoning": "Okay, I need to rank the 20 passages based on their relevance...",
"relevant_docids": ["math_test_intermediate_algebra_808", "math_train_intermediate_algebra_1471", "math_train_intermediate_algebra_993"]
}
```
#### **Application**
1. Training a passage reranker: given the re-ranked passage list, one can use our data to train a listwise reranker.
2. Training a passage retriever: using the **`relevant_docids`** and the remaining irrelevant IDs, one can train a passage retriever (both uses are sketched below).
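As a rough illustration of both applications, the sketch below derives ID-level training signals from one parsed example. Resolving IDs to text via the `id_query/` and `id_doc/` directories, and the exact listwise output format, are assumptions left out here.

```python
def build_retriever_triples(example):
    """Yield (query_id, positive_doc_id, negative_doc_id) triples for retriever training."""
    positives = set(example["relevant_docids"])
    negatives = [doc_id for doc_id in example["initial_list"] if doc_id not in positives]
    for pos in positives:
        for neg in negatives:
            yield example["qid"], pos, neg

def build_reranker_target(example):
    """Return (reasoning, ranking) as a listwise SFT target.
    The "[i] > [j] > ..." format is an assumption, not necessarily the exact prompt format from the paper."""
    ranking = " > ".join(
        f"[{example['initial_list'].index(doc_id) + 1}]" for doc_id in example["final_list"]
    )
    return example["reasoning"], ranking
```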
## ⚡ 3. Quick Start
This section provides a general guide on how to use ReasonRank for inference. For detailed environment setup, specific inference commands (including usage with ReasonIR or custom retrieval results), and in-depth training procedures (cold-start SFT and multi-view ranking reward RL), please refer to the [official GitHub repository](https://github.com/8421BCD/ReasonRank).
## Sample Usage
This model can be loaded and used with the `transformers` library. Below is a basic example demonstrating how to use the model for passage re-ranking. The model expects a specific chat-like format for input, including a system prompt and a user query with listed passages.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "liuwenhan/reasonrank-7B" # Or "liuwenhan/reasonrank-32B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
).eval()
# Example system prompt for reasoning-intensive ranking
# (see the GitHub repository for the exact prompts used in the paper)
system_prompt = (
    "You are a helpful and harmless AI assistant. You will be provided with a search query and a list of passages, "
    "and your task is to re-rank the passages based on their relevance to the query. "
    "You should follow a chain of thought to determine the most relevant passages. "
    "Your final answer should be a list of the re-ranked passage numbers, separated by commas. "
    "Do not include any other information or explanation in your final answer."
)
query = "What is the capital of France?"
passages = [
    "Paris is the capital and most populous city of France.",
    "The Eiffel Tower is a famous landmark in Paris.",
    "France is a country located in Western Europe.",
    "London is the capital of the United Kingdom.",
]
# Construct the user message with query and passages
user_content = f"Search Query: {query}
"
for i, passage in enumerate(passages):
user_content += f"[{i+1}] {passage}
"
user_content += "Please re-rank the passages based on their relevance to the query. Provide a chain of thought and then the final re-ranked list."
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_content},
]
# Apply chat template
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Tokenize input
input_ids = tokenizer.encode(input_text, return_tensors="pt").to(model.device)
# Generate response
output_ids = model.generate(
    input_ids,
    max_new_tokens=256,  # increase if the reasoning chain gets truncated
    do_sample=False,     # greedy decoding for deterministic ranking; temperature is ignored when sampling is off
    repetition_penalty=1.05,
    eos_token_id=tokenizer.eos_token_id,
)
# Decode output
generated_text = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)
print(generated_text)
```
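The helper below is not part of the original example; it is one way to map the model's final comma-separated ranking back to the passages, assuming the ranked numbers appear on the last non-empty line of the output.

```python
import re

def parse_ranking(generated_text, passages):
    """Extract the ranked passage numbers from the last non-empty output line."""
    last_line = [l for l in generated_text.strip().splitlines() if l.strip()][-1]
    order, seen = [], set()
    for n in (int(m) for m in re.findall(r"\d+", last_line)):
        if 1 <= n <= len(passages) and n not in seen:
            seen.add(n)
            order.append(n)
    return [passages[n - 1] for n in order]

for rank, passage in enumerate(parse_ranking(generated_text, passages), start=1):
    print(f"{rank}. {passage}")
```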
## Citation
If you find this work helpful, please cite our paper:
```bibtex
@misc{liu2025reasonrankempoweringpassageranking,
title={ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability},
author={Wenhan Liu and Xinyu Ma and Weiwei Sun and Yutao Zhu and Yuchen Li and Dawei Yin and Zhicheng Dou},
year={2025},
eprint={2508.07050},
archivePrefix={arXiv},
primaryClass={cs.IR},
url={https://arxiv.org/abs/2508.07050},
}
```
## 🤝 Acknowledgement
The inference codes and training implementation build upon [RankLLM](https://github.com/castorini/rank_llm), [Llama Factory](https://github.com/hiyouga/LLaMA-Factory) and [verl](https://github.com/volcengine/verl). Our work is based on the [Qwen2.5](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) model series, and we sincerely thank the Qwen team for their outstanding contributions to the open-source community.
## 📄 License
This project is released under the [MIT License](LICENSE).
## 📞 Contact
For any questions or feedback, please reach out to us at [email protected].
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=8421bcd/reasonrank&type=Date)](https://www.star-history.com/#8421bcd/reasonrank&Date)