---
license: llama2
---

# RankLLaMA-7B-Document

[Fine-Tuning LLaMA for Multi-Stage Text Retrieval](TODO).
Xueguang Ma, Liang Wang, Nan Yang, Furu Wei, Jimmy Lin, arXiv 2023

This model is fine-tuned from LLaMA-2-7B with LoRA for document reranking. It accepts inputs of up to 4,096 tokens.

## Usage

Below is an example of computing the relevance score of a query-document pair:
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel, PeftConfig

def get_model(peft_model_name):
    # Read the PEFT config to locate the base model, attach the LoRA adapter,
    # then merge the adapter weights into the base model for plain inference.
    config = PeftConfig.from_pretrained(peft_model_name)
    # num_labels=1: the reranker uses a single-logit relevance head.
    base_model = AutoModelForSequenceClassification.from_pretrained(
        config.base_model_name_or_path, num_labels=1
    )
    model = PeftModel.from_pretrained(base_model, peft_model_name)
    model = model.merge_and_unload()
    model.eval()
    return model

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained('meta-llama/Llama-2-7b-hf')
model = get_model('castorini/rankllama-v1-7b-lora-doc')

# Define a query-document pair
query = "What is llama?"
url = "https://en.wikipedia.org/wiki/Llama"
title = "Llama"
document = "The llama is a domesticated South American camelid, widely used as a meat and pack animal by Andean cultures since the pre-Columbian era."

# Tokenize the query-document pair
inputs = tokenizer(f'query: {query}', f'document: {url} {title} {document}</s>', return_tensors='pt')

# Run the model forward
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
    score = logits[0][0]
    print(score)
```
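In practice the reranker scores a list of candidate documents produced by a first-stage retriever and sorts them by score. Here is a minimal sketch reusing the `tokenizer`, `model`, and `query` defined above; the candidate list is hypothetical and would normally come from the retriever:

```python
# Hypothetical first-stage candidates: (url, title, document) triples.
candidates = [
    ("https://en.wikipedia.org/wiki/Llama", "Llama",
     "The llama is a domesticated South American camelid, widely used as a "
     "meat and pack animal by Andean cultures since the pre-Columbian era."),
    ("https://en.wikipedia.org/wiki/Alpaca", "Alpaca",
     "The alpaca is a species of South American camelid mammal related to "
     "the llama and the vicuna."),
]

scores = []
for url, title, document in candidates:
    # Same input template as the single-pair example above; truncate long
    # documents to the model's 4,096-token input limit.
    inputs = tokenizer(f'query: {query}', f'document: {url} {title} {document}</s>',
                       return_tensors='pt', truncation=True, max_length=4096)
    with torch.no_grad():
        scores.append(model(**inputs).logits[0][0].item())

# Higher score means more relevant to the query.
for score, (url, title, _) in sorted(zip(scores, candidates),
                                     key=lambda x: x[0], reverse=True):
    print(f'{score:.3f}\t{title}\t{url}')
```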
## Citation

If you find our paper or models helpful, please consider citing as follows:

```
TODO
```