Text Generation · Transformers · Safetensors · qwen2 · reranker · conversational · text-generation-inference
ptrdvn committed
Commit 20a6460 · verified · 1 Parent(s): 08d0055

Update README.md

Files changed (1):
  1. README.md +62 -3
README.md CHANGED
@@ -160,9 +160,9 @@ And to output a string of a number between 1-7.
 
  In order to make a continuous score that can be used for reranking query-context pairs (i.e. a method with few ties), we calculate the expectation value of the scores.
 
- We include scripts to do this in both vLLM and LMDeploy:
+ We include scripts to do this in vLLM, LMDeploy, and OpenAI (hosted for free on Huggingface):
 
- #### vLLM
+ ### vLLM
 
  Install [vLLM](https://github.com/vllm-project/vllm/) using `pip install vllm`.
 
@@ -208,7 +208,7 @@ print(expected_vals)
  # [6.66570732 1.86686378 1.01102923]
  ```
 
- #### LMDeploy
+ ### LMDeploy
 
  Install [LMDeploy](https://github.com/InternLM/lmdeploy) using `pip install lmdeploy`.
 
@@ -266,6 +266,65 @@ print(expected_vals)
  # [6.66415229 1.84342025 1.01133205]
  ```
 
+ ### OpenAI (Hosted on Huggingface)
+
+ Install [openai](https://github.com/openai/openai-python) using `pip install openai`.
+
+ ```python
+ from openai import OpenAI
+ import numpy as np
+ from multiprocessing import Pool
+ from tqdm.auto import tqdm
+
+ client = OpenAI(
+     base_url="https://api-inference.huggingface.co/v1/",
+     api_key="hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"  # Change this to an access token from https://huggingface.co/settings/tokens
+ )
+
+ def make_reranker_input(t, q):
+     return f"<<<Query>>>\n{q}\n\n<<<Context>>>\n{t}"
+
+ def make_reranker_inference_conversation(context, question):
+     system_message = "Given a query and a piece of text, output a score of 1-7 based on how related the query is to the text. 1 means least related and 7 is most related."
+
+     return [
+         {"role": "system", "content": system_message},
+         {"role": "user", "content": make_reranker_input(context, question)},
+     ]
+
+ def get_reranker_score(context_question_tuple):
+     question, context = context_question_tuple
+
+     messages = make_reranker_inference_conversation(context, question)
+
+     completion = client.chat.completions.create(
+         model="lightblue/lb-reranker-0.5B-v1.0",
+         messages=messages,
+         max_tokens=1,
+         temperature=0.0,
+         logprobs=True,
+         top_logprobs=5,  # Max allowed by the openai API as top_n_tokens must be >= 0 and <= 5. If this gets changed, fix to > 7.
+     )
+
+     logprobs = completion.choices[0].logprobs.content[0].top_logprobs
+
+     calculated_score = sum([int(x.token) * np.exp(x.logprob) for x in logprobs])
+
+     return calculated_score
+
+ query_texts = [
+     ("What is the scientific name of apples?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
+     ("What is the Chinese word for 'apple'?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
+     ("What is the square root of 999?", "An apple is a round, edible fruit produced by an apple tree (Malus spp., among them the domestic or orchard apple; Malus domestica)."),
+ ]
+
+ with Pool(processes=16) as p:  # Allows for parallel processing
+     expected_vals = list(tqdm(p.imap(get_reranker_score, query_texts), total=len(query_texts)))
+
+ print(expected_vals)
+ # [6.64866580, 1.85144404, 1.010719508]
+ ```
+
  # Evaluation
 
  We perform an evaluation on 9 datasets from the [BEIR benchmark](https://github.com/beir-cellar/beir) that none of the evaluated models have been trained upon (to our knowledge).
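
As an illustration of the expectation-value scoring described in the README (the added OpenAI script computes it as `sum([int(x.token) * np.exp(x.logprob) for x in logprobs])`), here is a minimal standalone sketch with made-up log-probabilities for the single 1-7 score token:

```python
import numpy as np

# Hypothetical top log-probabilities for the generated score token
# (illustrative values, not real model output).
top_logprobs = {"7": -0.05, "6": -3.2, "5": -5.0}

# Expectation value: sum of score * probability. This turns the discrete
# 1-7 label into a continuous score, so ties between documents are rare.
expected_score = sum(int(tok) * np.exp(lp) for tok, lp in top_logprobs.items())
print(round(expected_score, 2))  # ~6.94
```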
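
The unchanged vLLM script referred to above sits outside the context lines of this diff. Purely as a hedged sketch (not the README's actual code), scoring with vLLM's offline `LLM.chat` API and `SamplingParams(logprobs=...)` might look like this, reusing the prompt format from the OpenAI example:

```python
from vllm import LLM, SamplingParams
import numpy as np

# Same prompt format as the OpenAI example in the diff above.
def make_reranker_input(t, q):
    return f"<<<Query>>>\n{q}\n\n<<<Context>>>\n{t}"

system_message = "Given a query and a piece of text, output a score of 1-7 based on how related the query is to the text. 1 means least related and 7 is most related."

llm = LLM(model="lightblue/lb-reranker-0.5B-v1.0")
# Greedy decoding of a single token, returning the top-10 candidate log-probabilities.
sampling_params = SamplingParams(temperature=0.0, max_tokens=1, logprobs=10)

question = "What is the scientific name of apples?"
context = "An apple is a round, edible fruit produced by an apple tree (Malus domestica)."

conversation = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": make_reranker_input(context, question)},
]

output = llm.chat(conversation, sampling_params)[0]
token_logprobs = output.outputs[0].logprobs[0]  # dict: token_id -> Logprob

# Expectation value over the digit tokens among the returned candidates.
expected_score = sum(
    int(lp.decoded_token) * np.exp(lp.logprob)
    for lp in token_logprobs.values()
    if lp.decoded_token and lp.decoded_token.strip().isdigit()
)
print(expected_score)
```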