---
title: RAG
app_file: app.py
sdk: gradio
sdk_version: 5.8.0
---
# RAG

A retrieval-augmented generation (RAG) question-answering bot.
## Topics
- Data source ✔️
- Chunking ✔️
- LLM ✔️
- Retriever ✔️
- Reranker ✔️
- Citation ❌
- Web UI and deployment ✔️
## Data source
I used documents found on the Internet. You can browse them in the `docs` directory and ask questions based on that context. You can also upload your own `.txt` file and use it as the context.
## Chunking
Chunking was performed using the same method shown in the live-coding session; no additional libraries were involved.
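For readers without access to the session, a common approach is a fixed-size sliding window with overlap. This is a minimal sketch of that idea; the `chunk_size` and `overlap` values are illustrative assumptions, not the exact parameters used here:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping fixed-size character chunks.

    The overlap keeps sentences that straddle a chunk boundary
    retrievable from at least one chunk.
    """
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```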
## LLM
As the LLM I used the pretrained model llama3-70b-8192.
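In a RAG setup the retrieved chunks are stitched into the prompt that the LLM answers from. A sketch of that assembly step (the template wording is my assumption, not the exact one in `app.py`; the actual API call to the model is omitted):

```python
def build_rag_prompt(question: str, chunks: list[str]) -> str:
    """Combine retrieved chunks and the user question into one prompt.

    Numbering the chunks ([1], [2], ...) makes it easy for the model
    to ground its answer in a specific passage.
    """
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```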
## Retriever
Retrieval can be performed in three ways: a BM25 retriever, a dense retriever based on semantic similarity scores, or a hybrid of the two.

The dense retriever used in this lab is sentence-transformers/all-distilroberta-v1.

Dense retrieval tends to beat BM25 when the query paraphrases the document with little lexical overlap (for example, a query about "fixing a car" matching a document that only mentions "automobile repair"), because BM25 relies on exact term matches.
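A common way to build the hybrid option is to normalise the two score lists onto the same scale and mix them with a weight. This is a sketch of that fusion step under those assumptions; the weighting scheme is not necessarily the one used in `app.py`:

```python
def hybrid_scores(bm25: list[float], dense: list[float], alpha: float = 0.5) -> list[float]:
    """Fuse lexical (BM25) and dense retriever scores into one ranking.

    Each list holds one score per document. Min-max normalisation
    brings both lists into [0, 1] so their scales are comparable;
    `alpha` is the weight on the dense side.
    """
    def norm(xs: list[float]) -> list[float]:
        lo, hi = min(xs), max(xs)
        if hi == lo:
            return [0.0 for _ in xs]
        return [(x - lo) / (hi - lo) for x in xs]

    b, d = norm(bm25), norm(dense)
    return [alpha * dv + (1 - alpha) * bv for bv, dv in zip(b, d)]
```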
## Reranker
As a reranker I used the cross-encoder cross-encoder/stsb-roberta-base. It may not be efficient in my case: since the number of documents is quite small, the reranker adds processing time without noticeably improving the retrieved context.
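The reranking step itself is simple: score every (query, chunk) pair and re-sort. In this sketch the cross-encoder is injected as a plain `score_fn` so the example stays self-contained; in the real app that role would be played by the model's pairwise scoring (e.g. `CrossEncoder(...).predict` from sentence-transformers):

```python
def rerank(query: str, chunks: list[str], score_fn, top_k: int = 3) -> list[str]:
    """Re-order retrieved chunks by pairwise relevance to the query.

    `score_fn(query, chunk)` stands in for the cross-encoder's
    scoring call; higher scores mean more relevant.
    """
    ranked = sorted(chunks, key=lambda c: score_fn(query, c), reverse=True)
    return ranked[:top_k]
```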
## Citation

Not implemented.
## Web UI and deployment
I used the gradio library for the demo and hosting.