Update README.md
README.md
@@ -36,6 +36,17 @@ Standard RAG models often struggle due to:
- Information Overload – The presence of irrelevant chunks can distract the model, leading to errors or hallucinations.
- Lack of Alignment – Most generation models are not explicitly trained to prioritize relevant content over noise.

+### Need Faster Inference?
+Our Ext2Gen model writes out the sentences related to the query before generating the answer, so it takes longer to return the answer.
+
+If you don't need to see the extracted sentences and want the answer directly with low latency, use the variant called Gen-8B-R2.
+
+Link: https://huggingface.co/DISLab/Gen-8B-R2
+
+This model skips the sentence extraction phase but retains robustness comparable to Ext2Gen-8B-R2.
+
+
+
### Prompt

- query: the query to answer
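
The choice between the two variants described in the added section can be sketched as a small helper. This is only an illustration, not part of the commit: `choose_repo` and `load_generator` are hypothetical names, and the repository ID `DISLab/Ext2Gen-8B-R2` is an assumption inferred from the model name (only the `DISLab/Gen-8B-R2` link appears in the change). Loading goes through the Hugging Face `transformers` `pipeline` function.

```python
# Repository IDs. GEN_REPO comes from the link added in this change;
# EXT2GEN_REPO is an assumption inferred from the model name "Ext2Gen-8B-R2".
GEN_REPO = "DISLab/Gen-8B-R2"
EXT2GEN_REPO = "DISLab/Ext2Gen-8B-R2"  # assumed, not stated in the diff


def choose_repo(low_latency: bool) -> str:
    """Pick the variant: Gen-8B-R2 skips extraction, Ext2Gen-8B-R2 extracts first."""
    return GEN_REPO if low_latency else EXT2GEN_REPO


def load_generator(low_latency: bool = True):
    """Build a text-generation pipeline for the chosen variant (hypothetical helper).

    transformers is imported lazily so choose_repo works without it installed.
    Calling this downloads the model weights, which are large.
    """
    from transformers import pipeline  # requires `pip install transformers`

    return pipeline("text-generation", model=choose_repo(low_latency))
```

Pass `low_latency=False` when the extracted sentences are wanted alongside the answer. The prompt format expected by either model is the one described under the README's Prompt section and is not reproduced here.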