title: RAG
app_file: app.py
sdk: gradio
sdk_version: 5.8.0

RAG. Question answering bot.

Topics

Data source

I used documents found on the Internet. You can browse them in the docs directory and ask questions based on that context. You can also upload your own txt file and use it as context.

Chunking

Chunking was performed using the same method explained in the live-coding session. No additional libraries were involved.
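The live-coding method itself is not reproduced in this README, so the following is only a minimal sketch of one library-free approach, assuming fixed-size word windows with overlap. The function name `chunk_text` and the parameters `chunk_size` and `overlap` are illustrative assumptions, not the values used in this project.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping chunks of at most `chunk_size` words.

    Hypothetical helper: the window size and overlap are assumed values.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a window boundary still appears whole in at least one chunk, at the cost of some duplicated text.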

LLM

As the LLM, I used the pretrained model llama3-70b-8192.

Retriever

Retrieval can be performed in three ways: with a BM25 retriever, with a dense retriever that ranks chunks by semantic similarity scores, or with a hybrid approach that combines both.
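To illustrate the BM25 option, here is a minimal sketch of Okapi BM25 scoring in plain Python. It is not necessarily the implementation used in this project: the whitespace tokenizer, the function names, and the parameter values `k1=1.5` and `b=0.75` are assumptions chosen for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()


def bm25_scores(query, chunks, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per chunk for the given query."""
    docs = [tokenize(c) for c in chunks]
    n_docs = len(docs)
    if n_docs == 0:
        return []
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of chunks that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Because BM25 only matches exact tokens, a chunk that answers the question with different wording scores zero, which is the case a dense retriever is meant to cover.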

The dense retriever used in this lab is sentence-transformers/all-distilroberta-v1.
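The README does not say how the dense and BM25 scores are combined in hybrid mode. A minimal sketch, assuming the dense score is the cosine similarity between query and chunk embeddings and that the two score lists are min-max normalized and mixed with a weight `alpha`, could look like this. The embedding step is left out, so vectors are passed in as plain lists, and all names here are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def min_max(scores):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def hybrid_scores(bm25, dense, alpha=0.5):
    """Weighted sum of normalized sparse and dense scores.

    `alpha` (an assumed value) is the weight given to the dense score.
    """
    if len(bm25) != len(dense):
        raise ValueError("score lists must have the same length")
    if not bm25:
        return []
    return [alpha * d + (1 - alpha) * s
            for s, d in zip(min_max(bm25), min_max(dense))]
```

Normalizing first matters because BM25 scores are unbounded while cosine similarities lie in [-1, 1], so a raw sum would let one retriever dominate.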

Here's an example where the dense retriever works better than BM25:

[Screenshots: image_2024-12-08_21-59-16, image_2024-12-08_22-00-02]

Reranker

The reranker is the cross-encoder cross-encoder/stsb-roberta-base. It may not be effective in my case: because the number of documents is quite small, it adds processing time without improving the extracted context.
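The reranking step can be sketched as follows. This is an illustration under stated assumptions, not the project's code: `rerank` and `top_k` are hypothetical names, and the scorer is passed in as a function so the example runs without downloading a model. `word_overlap_scores` is a toy stand-in for the cross-encoder.

```python
def rerank(query, chunks, score_fn, top_k=3):
    """Reorder retrieved chunks by a pairwise relevance score.

    `score_fn` takes a list of (query, chunk) pairs and returns one score
    per pair. With sentence-transformers installed, the `predict` method of
    CrossEncoder("cross-encoder/stsb-roberta-base") has this shape.
    """
    if not chunks:
        return []
    pairs = [(query, chunk) for chunk in chunks]
    scores = score_fn(pairs)
    ranked = sorted(zip(chunks, scores), key=lambda item: item[1], reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def word_overlap_scores(pairs):
    # Toy stand-in scorer: counts words shared by the query and the chunk.
    return [len(set(q.lower().split()) & set(c.lower().split()))
            for q, c in pairs]
```

A cross-encoder runs one forward pass per (query, chunk) pair, which is why it adds latency even when the candidate list is short.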

Citation

Not implemented.

Web UI and deployment

I used the Gradio library for the demo and hosting.