Question about the performance of retrieval over local data (RAG)
Hi,
I'd like to ask whether anyone actually gets correct and stable answers for most of the questions they ask when doing RAG (Retrieval-Augmented Generation) over large local data, tested several times.
In fact, my local data is a text file with around 150k lines in Chinese.
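To give an idea of how the file is prepared, here is a minimal sketch of splitting it into overlapping line-based chunks before embedding (the chunk size and overlap are placeholder values, not my exact settings):

```python
# Sketch: split a large text file's lines into overlapping chunks
# for embedding. chunk_size / overlap are illustrative values only.

def chunk_lines(lines, chunk_size=20, overlap=5):
    """Group consecutive lines into overlapping chunks of text."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(lines), step):
        chunk = "\n".join(lines[start:start + chunk_size])
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(lines):
            break  # last chunk already covers the end of the file
    return chunks

lines = [f"line {i}" for i in range(100)]  # stand-in for the real file
chunks = chunk_lines(lines)
print(len(chunks))  # → 7
```

Each chunk is then embedded and indexed for similarity search.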
I use Baichuan2-13b-chat as the LLM and bge-large-zh-v1.5 as the embedding model.
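The retrieval step is the usual top-k cosine-similarity search. In my real pipeline the vectors come from bge-large-zh-v1.5; the sketch below uses small random vectors purely to show the ranking step:

```python
import numpy as np

# Toy stand-in for the retrieval step: real vectors would come from
# bge-large-zh-v1.5; random vectors here just demonstrate top-k search.

def top_k(query_vec, doc_vecs, k=3):
    """Return indices of the k chunks most cosine-similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q                  # cosine similarity to every chunk
    return np.argsort(-sims)[:k]  # indices of the best k matches

rng = np.random.default_rng(0)
docs = rng.normal(size=(10, 8))                # 10 fake chunk embeddings
query = docs[4] + 0.01 * rng.normal(size=8)    # query close to chunk 4
idx = top_k(query, docs)
print(idx[0])  # → 4
```

The retrieved chunks are then concatenated into the "context" that is passed to the LLM.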
However, when I ask questions related to my local data, I run into the following issues:
1. Asking the same question multiple times gives different answers (for example, the numeric values reported are completely different).
2. The answers are poor compared with the "context" retrieved by similarity search: they miss a lot of important information, and sometimes the information is completely wrong.
In short, the performance is not good enough for industrial use.
If anyone is getting good performance, please share your setup. I'd appreciate your help!