---
license: mit
---

# Cross-Encoder for MS Marco

This model was trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task.

The model can be used for Information Retrieval: given a query, encode the query with all candidate passages (e.g. retrieved with ElasticSearch), then sort the passages in decreasing order of score; a minimal re-ranking sketch is included at the end of this card. See our paper [R2ANKER](https://arxiv.org/pdf/2206.08063.pdf) for more details.

## Usage with Transformers

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("YCZhou/R2ANKER")
model = AutoModelForSequenceClassification.from_pretrained("YCZhou/R2ANKER")

# Score two (query, passage) pairs in a single batch
features = tokenizer(
    ['How many people live in Berlin?', 'How many people live in Berlin?'],
    ['Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.',
     'New York City is famous for the Metropolitan Museum of Art.'],
    padding=True, truncation=True, return_tensors="pt"
)

model.eval()
with torch.no_grad():
    scores = model(**features).logits
    print(scores)
```

## Citation

```
@article{zhou2022towards,
  title={Towards robust ranker for text retrieval},
  author={Zhou, Yucheng and Shen, Tao and Geng, Xiubo and Tao, Chongyang and Xu, Can and Long, Guodong and Jiao, Binxing and Jiang, Daxin},
  journal={arXiv preprint arXiv:2206.08063},
  year={2022}
}
```
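## Ranking retrieved passages

To re-rank a candidate list, the same scoring call can be applied to one query paired with every passage, then sorted by logit. Below is a minimal sketch, assuming the model emits a single relevance logit per pair (as the usage example above suggests); the query, passages, and variable names are illustrative, not from the paper.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("YCZhou/R2ANKER")
model = AutoModelForSequenceClassification.from_pretrained("YCZhou/R2ANKER")
model.eval()

query = 'How many people live in Berlin?'
# Illustrative candidates, e.g. the top results from a first-stage retriever
passages = [
    'Berlin has a population of 3,520,031 registered inhabitants.',
    'New York City is famous for the Metropolitan Museum of Art.',
    'Berlin is the capital of Germany.',
]

# Cross-encoders score each (query, passage) pair jointly, so the same
# query is paired with every candidate passage in one batch.
features = tokenizer([query] * len(passages), passages,
                     padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    # Assumes a single-logit head: shape (num_passages, 1) -> (num_passages,)
    scores = model(**features).logits.squeeze(-1)

# Sort passages in decreasing order of score (most relevant first)
ranked = sorted(zip(scores.tolist(), passages), key=lambda x: x[0], reverse=True)
for score, passage in ranked:
    print(f"{score:.3f}\t{passage}")
```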