Commit 87370c1 · Parent: 5847e55 · added readme

Files changed: README.md (+60 −12), src/chat.py (+1 −7)
README.md
CHANGED
@@ -1,12 +1,60 @@
# QA with RAG

### Quick start

The script is designed to be a Hugging Face chat interface, allowing users to simply use the chat without installing any dependencies.
This is a link to the chat: [QA with RAG](https://huggingface.co/spaces/alexandraroze/rag_test_task).

This chat uses a pre-built RAG index (instructions on how to run the RAG building script are below).
### Key features

1. To start the chat, type a question and press Enter.
2. The model saves the history of all conversations, so you can ask questions about previous answers.
3. For each question, the model identifies the topic you are interested in and extracts it from the query.
4. Relevant documents are retrieved from the RAG index using the extracted topic, so the model can base its answer on them.
5. If your question does not relate to a specific topic, or merely clarifies the model's previous message, no topic is extracted and the model reuses the previously retrieved documents.
6. The retrieved documents are listed below the chat, along with the extracted topic (if no topic was extracted, the field is empty).
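Points 3–5 describe a retrieval fallback; roughly, the control flow can be sketched as follows (the class and function names here are hypothetical illustrations, not the actual code from `src/chat.py`):

```python
# Hypothetical sketch of the retrieval fallback described above:
# if no topic is extracted from a query, reuse the documents
# retrieved for the previous turn instead of querying the index again.

class RetrievalState:
    def __init__(self, extract_topic, retrieve):
        # extract_topic: callable returning a topic string or "" (an LLM call in practice)
        # retrieve: callable returning documents for a topic (a FAISS search in practice)
        self.extract_topic = extract_topic
        self.retrieve = retrieve
        self.last_docs = []

    def docs_for(self, query):
        topic = self.extract_topic(query)
        if topic:  # a concrete topic was found: query the index
            self.last_docs = self.retrieve(topic)
        # otherwise keep the previously retrieved documents
        return topic, self.last_docs
```

The point of keeping `last_docs` is that follow-up questions ("can you elaborate?") still have grounding documents even though they carry no topic of their own.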
### How to run the RAG building script

Before launching the script, you should create a file `.env` in the root directory with the following content:

```
OPENAI_API_KEY="your_openai_token"
OPENAI_EMBEDDINGS_MODEL="text-embedding-3-large"
CHAT_MODEL="gpt-4o"
PATH_TO_DATASET="Dataset"
PATH_TO_INDEX="faiss_db"
```
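For reference, a `.env` file in this plain `KEY="value"` format can be parsed with standard-library Python alone; a minimal sketch (the actual scripts may rely on a library such as `python-dotenv` instead):

```python
import os

def load_env(path=".env"):
    # Parse simple KEY="value" lines into os.environ.
    # A sketch only -- not a full .env dialect parser.
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip().strip('"')
```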
Please do not change the `OPENAI_EMBEDDINGS_MODEL` value: the index stores vectors produced by this model, so queries must be embedded with the same model.

To run the script which builds the RAG, launch the following commands:
```bash
pip install -r requirements.txt
python ./build_rag.py --path_to_dataset Dataset --path_to_index faiss_db
```

The script will build the RAG index and save it to the specified path. It uses the dataset from the `Dataset` folder and saves the index to the `faiss_db` folder.
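Judging by the command above, `build_rag.py` exposes two flags; its argument handling presumably looks something like this sketch (flag names taken from the command, defaults assumed to fall back to the `.env` values):

```python
import argparse
import os

def parse_args(argv=None):
    # Flags mirror the command shown above; defaults fall back to .env values.
    parser = argparse.ArgumentParser(description="Build the FAISS index for RAG.")
    parser.add_argument("--path_to_dataset",
                        default=os.getenv("PATH_TO_DATASET", "Dataset"))
    parser.add_argument("--path_to_index",
                        default=os.getenv("PATH_TO_INDEX", "faiss_db"))
    return parser.parse_args(argv)
```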
### How to test retrieval from RAG separately

If you want to look at the retrieval process without the chat interface, you can run the following command:

```bash
python ./test_rag.py --path_to_index faiss_db
```
After launching the script, you can enter queries and see the retrieved documents. To exit the script, enter `exit`.
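The interactive part of `test_rag.py` presumably boils down to a loop like this sketch (here `retrieve` stands in for the actual FAISS lookup, and I/O is injected so the loop stays testable):

```python
def query_loop(retrieve, read=input, write=print):
    # Read queries until the user types "exit"; print the retrieved documents.
    while True:
        query = read("Query: ").strip()
        if query == "exit":
            break
        for doc in retrieve(query):
            write(doc)
```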
### Implementation details

#### Splitting documents

I wrote my own document splitter, since existing splitters do not consider the semantic meaning of the text. (There are some splitters that do consider semantic meaning, but I was not satisfied with their quality.)
- This splitter works like agglomerative clustering, but it also considers the order of sentences in the text.
- It splits the text into clusters of sentences, where each cluster contains sentences that are semantically close to each other and form a sequential order.
- The splitter uses embeddings from the OpenAI embeddings model to calculate the similarity between sentences.
- Each cluster represents a separate document in the RAG index.
All implementation details are in the `src/rag.py` file.
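A heavily simplified sketch of the order-preserving idea: instead of full agglomerative clustering, greedily extend the current chunk while the next sentence stays semantically close to the previous one (the embedding function is injected and the `threshold` is an assumed parameter; see `src/rag.py` for the real implementation):

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def split_sentences(sentences, embed, threshold=0.5):
    # Order-preserving chunking: start a new chunk whenever the next
    # sentence drops below `threshold` similarity to the previous one.
    if not sentences:
        return []
    vectors = [embed(s) for s in sentences]
    chunks, current = [], [sentences[0]]
    for prev, vec, sent in zip(vectors, vectors[1:], sentences[1:]):
        if cosine(prev, vec) >= threshold:
            current.append(sent)
        else:
            chunks.append(" ".join(current))
            current = [sent]
    chunks.append(" ".join(current))
    return chunks
```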
#### Indexing

I used the Faiss library and the OpenAI embeddings model `text-embedding-3-large`, since it is one of the latest and best-performing models for text embeddings.
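Conceptually, a flat FAISS index performs an exhaustive nearest-neighbour search over the stored embedding vectors. A brute-force stand-in in pure Python (the real code uses `faiss` and the OpenAI embeddings API; this is only an illustration of what the index does):

```python
import math

class FlatIndex:
    """Brute-force stand-in for a flat vector index (cosine similarity)."""

    def __init__(self):
        self.vectors, self.docs = [], []

    def add(self, vector, doc):
        self.vectors.append(vector)
        self.docs.append(doc)

    def search(self, query, k=1):
        # Rank every stored vector against the query and return the top-k docs.
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb)
        ranked = sorted(zip(self.vectors, self.docs),
                        key=lambda pair: cos(query, pair[0]), reverse=True)
        return [doc for _, doc in ranked[:k]]
```

FAISS exists precisely because this linear scan does not scale; for large corpora it provides optimized and approximate search structures over the same idea.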
#### Chat interface

I used the LangChain library for the chat interface, since it makes it easy to create a chat that keeps the history of the whole conversation.
Implementation details are in the `src/chat.py` file.
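The per-session history pattern (a dict mapping session ids to message histories, as suggested by `self.store = {}` in the `src/chat.py` diff below) can be sketched without LangChain like this (names hypothetical, not the actual code):

```python
class HistoryStore:
    # Minimal sketch of per-session chat history, in the spirit of
    # LangChain's message-history pattern; not the actual src/chat.py code.
    def __init__(self):
        self.store = {}  # session_id -> list of (role, text) messages

    def get_history(self, session_id):
        # Create an empty history on first access, like a defaultdict.
        return self.store.setdefault(session_id, [])

    def add(self, session_id, role, text):
        self.get_history(session_id).append((role, text))
```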
src/chat.py
CHANGED

```diff
@@ -15,11 +15,6 @@ GENERATE_ARGS = {
     'max_tokens': int(os.getenv("MAX_NEW_TOKENS", 1024)),
 }
 
-GENERATE_KWARGS = {
-    'top_p': float(os.getenv("TOP_P", 0.6)),
-    'frequency_penalty': max(-2, min(float(os.getenv("FREQ_PENALTY", 0)), 2))
-}
-
 
 class Chat:
 
@@ -31,8 +26,7 @@ class Chat:
         self.assistant_model = base(
             model=model,
             streaming=True,
-            **GENERATE_ARGS
-            model_kwargs=GENERATE_KWARGS
+            **GENERATE_ARGS
         )
 
         self.store = {}
```