Update README.md
README.md
CHANGED
@@ -28,9 +28,94 @@ language:
<b>The crispy rerank family from <a href="https://mixedbread.com"><b>Mixedbread</b></a>.</b>
</p>

# Mixedbread Reranker Rewritten as a Classifier

This repo contains the Mixedbread reranker rewritten as a classifier. As of March 2025, it is the most powerful reranker available, e.g. for RAG.

# FP8 Deployment on H100

```yaml
build_commands: []
environment_variables: {}
external_package_dirs: []
model_metadata:
  example_model_input:
    input: 'ERROR: This redirects to the embedding endpoint. Use the /sync API to
      reach /sync/predict'
model_name: BEI-mixedbread-ai-mxbai-rerank-base-v2-reranker-fp8-truss-example
python_version: py39
requirements: []
resources:
  accelerator: L4
  cpu: '1'
  memory: 10Gi
  use_gpu: true
secrets: {}
system_packages: []
trt_llm:
  build:
    base_model: encoder
    checkpoint_repository:
      repo: michaelfeil/mxbai-rerank-base-v2-seq
      revision: main
      source: HF
    max_num_tokens: 32768
    max_seq_len: 1000001
    num_builder_gpus: 4
    quantization_type: fp8
```

To push the deployment to Baseten.co:
```bash
truss push --publish
```
More info:
https://github.com/basetenlabs/truss-examples/tree/main/11-embeddings-reranker-classification-tensorrt/BEI-mixedbread-ai-mxbai-rerank-base-v2-reranker-fp8
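
The `example_model_input` in the config points callers at the `/sync/predict` route. As a rough sketch only, a request for that route could be assembled with the standard library as shown here. The helper name `build_predict_request`, the `base_url` placeholder, the `Api-Key` authorization header, and the `{"inputs": [[text], ...]}` payload shape are all assumptions (the payload is modeled on text-embeddings-inference style `/predict` routes); check the linked truss example for the exact schema before relying on it.

```python
import json
import urllib.request
from typing import Sequence


def build_predict_request(
    base_url: str, api_key: str, texts: Sequence[str]
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /sync/predict route."""
    # Assumption: one single-element list per text, TEI-style batch input.
    payload = {"inputs": [[text] for text in texts]}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/sync/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: Baseten-style API-key authorization header.
            "Authorization": f"Api-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` to actually send it. Building the request separately keeps the sketch testable without network access.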

## Usage as classifier
For usage with Baseten.co or with github.com/michaelfeil/infinity, use the classification API.
You need to manually create the following prompt template, which is very specific to this model.
This template follows the reference implementation at https://github.com/mixedbread-ai/mxbai-rerank/tree/main.

```python
def create_mxbai_v2_reranker_prompt_template(query: str, document: str, instruction: str = "") -> str:
    """
    Create a carefully formatted chat template string (without tokenizer) for ranking relevance.

    Parameters:
        query (str): The search query.
        document (str): The document text to evaluate.
        instruction (str): Special instructions (e.g., "You are an expert for Mockingbirds.")

    Returns:
        str: The formatted chat template.
    """
    instruction = f"instruction: {instruction}\n" if instruction else ""
    # fixed system prompt, keep as is.
    system_prompt = "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."
    assert "\n" not in system_prompt
    # only the trailing newline added above is allowed in the instruction line
    assert "\n" not in instruction[:-1]
    assert isinstance(query, str)
    assert isinstance(document, str)
    templated = (
        # keep spacing, newlines as is.
        # template for mixedbread reranker v2
        # https://huggingface.co/michaelfeil/mxbai-rerank-base-v2-seq/
        f"<|endoftext|><|im_start|>system\n{system_prompt}\n"
        "<|im_end|>\n"
        "<|im_start|>user\n"
        f"{instruction}"
        f"query: {query} \n"
        f"document: {document} \n"
        "You are a search relevance expert who evaluates how well documents match search queries. "
        "For each query-document pair, carefully analyze the semantic relationship between them, then provide your binary relevance judgment (0 for not relevant, 1 for relevant).\n"
        "Relevance:<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
    return templated
```
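
A minimal sketch of how the template might be applied to a batch of candidate documents, and how the scores that come back from a classification endpoint could be turned into a ranking. The helper names `build_classifier_inputs` and `rank_by_score` are hypothetical, `template_fn` stands in for `create_mxbai_v2_reranker_prompt_template`, and the sketch assumes the endpoint yields one relevance score per input, in input order; the actual response shape depends on the serving API.

```python
from typing import Callable, List, Sequence, Tuple


def build_classifier_inputs(
    query: str,
    documents: Sequence[str],
    template_fn: Callable[[str, str], str],
) -> List[str]:
    """Format one classifier input per (query, document) pair."""
    return [template_fn(query, doc) for doc in documents]


def rank_by_score(
    documents: Sequence[str], scores: Sequence[float]
) -> List[Tuple[str, float]]:
    """Pair documents with their scores, most relevant first."""
    # Assumption: exactly one score per document, in the same order.
    if len(documents) != len(scores):
        raise ValueError("expected exactly one score per document")
    return sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)
```

Each string returned by `build_classifier_inputs` is sent to the classifier as a single text, and `rank_by_score` then reorders the original documents by the returned scores.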

# 🍞 mxbai-rerank-large-v2
