DISLab
/

Ext2Gen-8B-R2

Question Answering

text-generation

text-generation-inference


Hwanjun committed 20 days ago

Commit d1cae24 · verified · 1 Parent(s): ee34ee3

Update README.md

Files changed (1)

README.md +11 -0


@@ -36,6 +36,17 @@ Standard RAG models often struggle due to:
 - Information Overload – The presence of irrelevant chunks can distract the model, leading to errors or hallucinations.
 - Lack of Alignment – Most generation models are not explicitly trained to prioritize relevant content over noise.
+### Need Faster Inference?
+Our Ext2Gen model first writes out the sentences relevant to the query and only then generates the answer, so it incurs extra latency before the answer appears.
+If you don't need to see the extracted sentences and want the answer directly with lower latency, use its variant, Gen-8B-R2.
+Link: https://huggingface.co/DISLab/Gen-8B-R2
+This model skips the sentence extraction phase but retains robustness comparable to Ext2Gen-8B-R2.
 ### Prompt
 - query: the query to answer
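
The choice between the two checkpoints described in this change can be sketched in Python. This is a minimal illustration, not the authors' code: the model IDs come from the README text, while `choose_model_id` and `load_generator` are hypothetical helper names. It also assumes the checkpoints load through the standard `transformers` text-generation pipeline, which this page's `text-generation` tag suggests but which is not confirmed here. Prompt construction is left to the caller, since the prompt template is not shown in this diff.

```python
# Model IDs are taken from the README; helper names below are hypothetical.
EXT2GEN_ID = "DISLab/Ext2Gen-8B-R2"  # extracts relevant sentences, then answers
GEN_ID = "DISLab/Gen-8B-R2"          # skips extraction and answers directly


def choose_model_id(show_extracted_sentences: bool) -> str:
    """Pick the checkpoint matching the latency vs. transparency trade-off."""
    return EXT2GEN_ID if show_extracted_sentences else GEN_ID


def load_generator(show_extracted_sentences: bool = False):
    """Build a text-generation pipeline for the chosen checkpoint.

    Assumes `transformers` is installed and that the hardware can hold
    an 8B-parameter model; both are assumptions, not README statements.
    """
    # Imported lazily so that choose_model_id works without transformers.
    from transformers import pipeline

    model_id = choose_model_id(show_extracted_sentences)
    return pipeline("text-generation", model=model_id)
```

Calling `load_generator()` with the default argument selects Gen-8B-R2 for lower latency, while `load_generator(show_extracted_sentences=True)` selects Ext2Gen-8B-R2 when the extracted sentences are wanted alongside the answer.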