hooman650 committed on
Commit
324441d
·
1 Parent(s): 2e4d5be

Update README.md

  1. README.md +84 -1
README.md CHANGED
@@ -1,3 +1,86 @@
1
  ---
2
- license: apache-2.0
3
  ---
1
  ---
2
+ license: mit
3
+ language:
4
+ - en
5
+ pipeline_tag: text-classification
6
+ tags:
7
+ - medical
8
+ - finance
9
+ - chemistry
10
+ - biology
11
  ---
12
+ # BGE-Reranker-Large
13
+
14
+ <!-- Provide a quick summary of what the model is/does. -->
15
+
16
+ This is an `int8`-converted version of [bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large). Thanks to `ctranslate2`, it should
17
+ be at least 3 times faster than the original Hugging Face Transformers version, while also being smaller, with minimal performance loss.
18
+
19
+ ## Model Details
20
+ Unlike the embedding model `bge-large-en-v1.5`, the reranker takes a question and a document as input and directly outputs a similarity score instead of an embedding.
21
+ You can get a relevance score by passing a query and a passage to the reranker. The reranker is optimized with a cross-entropy loss, so the relevance score is not bounded to a specific range.
22
+ In addition, this is a highly optimized version built with the `ctranslate2` library, suitable for production environments.
23
+
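Because the relevance score is an unbounded logit, it can be convenient to squash it into (0, 1) before applying a threshold or comparing across queries. The sketch below is one common option, not something this model card prescribes: it applies a sigmoid to the two example logits printed in the Usage section and ranks the passages. The helper name `sigmoid` and the variables `scores` and `ranking` are illustrative assumptions.

```python
import math


def sigmoid(logit: float) -> float:
    """Map an unbounded relevance logit to the open interval (0, 1)."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


# Example logits copied from the sample output in the Usage section.
logits = [1.0474, -9.4694]

# Squashed scores: higher still means more relevant, since sigmoid is monotonic.
scores = [sigmoid(x) for x in logits]

# Passage indices ordered from most to least relevant.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

Since the sigmoid is monotonic, the ranking is identical to the one obtained from the raw logits; only the scale changes.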
24
+ ### Model Description
25
+
26
+ <!-- Provide a longer summary of what this model is. -->
27
+
28
+
29
+
30
+ - **Developed by:** [More Information Needed]
31
+ - **Funded by [optional]:** [More Information Needed]
32
+ - **Shared by [optional]:** [More Information Needed]
33
+ - **Model type:** [More Information Needed]
34
+ - **Language(s) (NLP):** [More Information Needed]
35
+ - **License:** [More Information Needed]
36
+ - **Finetuned from model [optional]:** [More Information Needed]
37
+
38
+ ### Model Sources [optional]
39
+
40
+ This model is based on the original `BAAI` `BGE-Reranker` model. Please visit [bge-reranker-original-repo](https://huggingface.co/BAAI/bge-reranker-large)
41
+ for more details.
42
+
43
+ ## Usage
44
+
45
+ Install the dependencies with `pip install ctranslate2 transformers torch`, then run:
46
+
47
+ ```python
48
+ import ctranslate2
49
+ import transformers
50
+ import torch
51
+
52
+ device_mapping="cuda" if torch.cuda.is_available() else "cpu"
53
+
54
+ model_dir = "hooman650/ct2fast-bge-reranker"
55
+
56
+ # ctranslate2 encoder heavy lifting
57
+ encoder = ctranslate2.Encoder(model_dir, device = device_mapping)
58
+
59
+ # the classification head comes from HF
60
+ model_name = "BAAI/bge-reranker-large"
61
+ tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
62
+ classifier = transformers.AutoModelForSequenceClassification.from_pretrained(model_name).classifier
63
+
64
+ classifier.eval()
65
+ classifier.to(device_mapping)
66
+
67
+ pairs = [
68
+     ["I like Ctranslate2", "Ctranslate2 makes mid range models faster"],
69
+     ["I like Ctranslate2", "Using naive transformers might not be suitable for deployment"],
70
+ ]
71
+ with torch.no_grad():
72
+     tokens = tokenizer(pairs, padding=True, truncation=True, max_length=512).input_ids
73
+     output = encoder.forward_batch(tokens)
74
+     hidden_state = torch.as_tensor(output.last_hidden_state, device=device_mapping)
75
+     logits = classifier(hidden_state).squeeze()
76
+
77
+ print(logits)
78
+
79
+ # tensor([ 1.0474, -9.4694], device='cuda:0')
80
+ ```
81
+
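One practical caveat about the snippet above: `ctranslate2.Encoder` expects a path to a local model directory, so passing a Hub repo id such as `hooman650/ct2fast-bge-reranker` only works if the files already exist at that relative path. The helper below is a minimal sketch of one way to handle this, assuming the converted files are hosted in that repo and that `huggingface_hub` is installed. The function name `resolve_model_dir` and its `downloader` parameter are hypothetical and not part of either library.

```python
import os


def resolve_model_dir(model_dir, downloader=None):
    """Return a local directory for a converted CTranslate2 model.

    If `model_dir` already exists on disk it is returned unchanged. Otherwise
    it is treated as a Hugging Face Hub repo id and passed to `downloader`,
    which must return the local path of the downloaded files.
    """
    if os.path.isdir(model_dir):
        return model_dir
    if downloader is None:
        # Imported lazily so the helper can be used (and tested) offline.
        from huggingface_hub import snapshot_download
        downloader = snapshot_download
    return downloader(model_dir)
```

With this in place, the encoder could be created as `ctranslate2.Encoder(resolve_model_dir(model_dir), device=device_mapping)`. The first call downloads the repo into the local Hugging Face cache and later calls reuse it.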
82
+
83
+ #### Hardware
84
+
85
+ Supports both GPU and CPU.
86
+