Commit d1cae24 by Hwanjun · verified · 1 Parent(s): ee34ee3

Update README.md

  1. README.md +11 -0
README.md CHANGED
@@ -36,6 +36,17 @@ Standard RAG models often struggle due to:
  - Information Overload – The presence of irrelevant chunks can distract the model, leading to errors or hallucinations.
  - Lack of Alignment – Most generation models are not explicitly trained to prioritize relevant content over noise.
 
+ ### Need Faster Inference?
+ Our Ext2Gen model first extracts the sentences relevant to the query and then generates the answer, so it takes longer to produce the answer.
+
+ If you don't need the extracted sentences and want the answer directly with lower latency, use the variant we call Gen-8B-R2.
+
+ Link: https://huggingface.co/DISLab/Gen-8B-R2
+
+ This model skips the sentence extraction phase but retains robustness comparable to Ext2Gen-8B-R2.
+
+
+
  ### Prompt
 
  - query: the query to answer