alexandraroze committed
Commit 87370c1 · 1 Parent(s): 5847e55

added readme

Files changed (2)
  1. README.md +60 -12
  2. src/chat.py +1 -7
README.md CHANGED
@@ -1,12 +1,60 @@
- ---
- title: Rag Test Task
- emoji: 💬
- colorFrom: yellow
- colorTo: purple
- sdk: gradio
- sdk_version: 4.36.1
- app_file: app.py
- pinned: false
- ---
-
- An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
# QA with RAG

### Quick start
The script is designed as a Hugging Face chat interface, so users can use the chat without installing any dependencies.
Link to the chat: [QA with RAG](https://huggingface.co/spaces/alexandraroze/rag_test_task).

This chat uses a pre-built RAG index (instructions on how to run the script that builds the RAG are given below).

### Key features
1. To start the chat, type a question and press Enter.
2. The model saves the history of the whole conversation, so you can ask questions about previous answers.
3. For each question, the model identifies the topic you are interested in and extracts it from the query.
4. Relevant documents are retrieved from the RAG index using the extracted topic, so the model can base its answer on them.
5. If your question does not relate to a specific topic, or merely clarifies the model's previous message, no topic is extracted and the model reuses the previously retrieved documents.
6. The retrieved documents are listed below the chat, along with the extracted topic (if no topic was extracted, it is empty).
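The flow in points 3–5 can be sketched as follows. This is a hypothetical illustration, not the project's code: `extract_topic`, `answer`, and the stub retriever are invented names standing in for the real LLM calls in `src/chat.py`.

```python
# Hypothetical sketch of the retrieval flow: extract a topic from each
# question, retrieve fresh documents when a topic is found, and fall back
# to the previously retrieved documents for follow-up questions.

def extract_topic(question: str) -> str:
    """Stand-in for the LLM call that pulls a topic out of the query.
    Returns "" for follow-ups that do not name a topic."""
    followups = ("why", "can you explain", "what about that")
    if question.lower().startswith(followups):
        return ""
    return question.rstrip("?")

def answer(question: str, retriever, state: dict) -> list[str]:
    topic = extract_topic(question)
    if topic:
        # A topic was found: retrieve fresh documents for it.
        state["docs"] = retriever(topic)
        state["topic"] = topic
    else:
        # Follow-up question: reuse the previously retrieved documents.
        state["topic"] = ""
    return state.get("docs", [])

# Usage with a toy retriever:
toy_index = {"faiss": ["doc about Faiss"], "gradio": ["doc about Gradio"]}
retriever = lambda topic: toy_index.get(topic.lower(), [])

state: dict = {}
print(answer("Faiss?", retriever, state))               # fresh retrieval
print(answer("Why is that better?", retriever, state))  # reuses the docs
```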

### How to run the RAG building script
Before launching the script, create a `.env` file in the root directory with the following content:
```
OPENAI_API_KEY="your_openai_token"
OPENAI_EMBEDDINGS_MODEL="text-embedding-3-large"
CHAT_MODEL="gpt-4o"
PATH_TO_DATASET="Dataset"
PATH_TO_INDEX="faiss_db"
```
Please do not change the `OPENAI_EMBEDDINGS_MODEL` value.
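For reference, the `.env` file is plain `KEY="value"` lines; the app presumably reads it with python-dotenv. As a dependency-free illustration of what that loading amounts to, here is a small stdlib-only parser (this is a sketch, not the project's code):

```python
# Minimal stand-in for python-dotenv: parse KEY="value" lines into a dict,
# ignoring blank lines and comments. Illustrative only.
import os
import tempfile

def load_env(path: str) -> dict[str, str]:
    env: dict[str, str] = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip().strip('"')
    return env

# Usage: write a sample .env and read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, ".env")
    with open(path, "w") as f:
        f.write('OPENAI_EMBEDDINGS_MODEL="text-embedding-3-large"\n')
    print(load_env(path)["OPENAI_EMBEDDINGS_MODEL"])  # text-embedding-3-large
```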

To build the RAG index, run the following commands:
```bash
pip install -r requirements.txt
python ./build_rag.py --path_to_dataset Dataset --path_to_index faiss_db
```

The script builds the RAG index and saves it to the specified path: it uses the dataset from the `Dataset` folder and writes the index to the `faiss_db` folder.

### How to test retrieval from RAG separately
If you want to inspect the retrieval process without the chat interface, run:
```bash
python ./test_rag.py --path_to_index faiss_db
```
After launching the script, you can enter queries and see the retrieved documents. To exit, enter `exit`.
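The query/`exit` loop can be sketched like this. The function name and stub retriever are hypothetical; the real `test_rag.py` reads from stdin, while this version takes the queries as a parameter so it can run without a terminal:

```python
# Hypothetical sketch of the interactive retrieval loop: run retrieval
# for each query, print the retrieved documents, and stop on "exit".
from typing import Callable, Iterable

def repl(queries: Iterable[str], retrieve: Callable[[str], list[str]]) -> int:
    """Handle queries until 'exit'; return how many were handled."""
    handled = 0
    for query in queries:
        if query.strip().lower() == "exit":
            break
        for doc in retrieve(query):
            print(doc)
        handled += 1
    return handled

# Usage with a stub retriever:
stub = lambda q: [f"doc matching {q!r}"]
print(repl(["faiss", "gradio", "exit", "ignored"], stub))  # 2
```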

### Implementation details
#### Splitting documents
I wrote my own splitter because existing splitters do not consider the semantic meaning of the text. (Some splitters do consider semantic meaning, but I was not satisfied with their quality.)

- The splitter works like agglomerative clustering but respects the order of sentences in the text.
- It splits the text into clusters of sentences, where each cluster contains sentences that are semantically close to each other and appear consecutively.
- It uses embeddings from the OpenAI embeddings model to calculate the similarity between sentences.
- Each cluster becomes a separate document in the RAG index.

All implementation details are in the `src/rag.py` file.
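A simplified, order-preserving sketch in the spirit of the splitter described above (the actual implementation is in `src/rag.py` and uses OpenAI embeddings; here the embeddings are passed in, and this greedy version only compares adjacent sentences):

```python
# Order-preserving clustering sketch: merge each sentence into the previous
# cluster while it is semantically close to its predecessor; otherwise start
# a new cluster. Illustrative only, not the project's algorithm verbatim.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def split_sentences(sents: list[str], embs: list[list[float]],
                    threshold: float = 0.8) -> list[list[str]]:
    clusters: list[list[str]] = []
    for i, sent in enumerate(sents):
        if clusters and cosine(embs[i - 1], embs[i]) >= threshold:
            clusters[-1].append(sent)   # semantically close: same chunk
        else:
            clusters.append([sent])     # topic shift: start a new chunk
    return clusters

# Toy embeddings: the first two sentences point the same way, the third differs.
sents = ["Faiss stores vectors.", "It searches them fast.", "Gradio builds UIs."]
embs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(split_sentences(sents, embs))
```

Each resulting cluster would then be embedded and stored as one document in the index.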

#### Indexing
I used the Faiss library together with the OpenAI embeddings model text-embedding-3-large, since it is the latest and one of the best models for text embeddings.
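Conceptually, Faiss's flat indexes amount to exhaustive similarity search over the stored vectors. This pure-Python stand-in illustrates inner-product search over a tiny index (illustration only; the real index is a Faiss index over text-embedding-3-large vectors):

```python
# Illustration of what a flat inner-product index does: score the query
# against every stored vector and return the top-k indices.

def search(index: list[list[float]], query: list[float], k: int = 1) -> list[int]:
    """Return indices of the k stored vectors with the largest inner product."""
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in index]
    return sorted(range(len(index)), key=lambda i: -scores[i])[:k]

index = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(search(index, [0.9, 0.1], k=2))  # [0, 2]
```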

#### Chat interface
I used the LangChain library for the chat interface, since it makes it easy to create a chat that keeps the history of the whole conversation.
Implementation details are in the `src/chat.py` file.
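The history-keeping boils down to a per-session store, in the spirit of the `self.store = {}` dict visible in `src/chat.py` (LangChain offers this pattern via `RunnableWithMessageHistory`; the class below is an illustrative stand-in, not the project's code):

```python
# Minimal per-session chat history: one message list per session id,
# created lazily on first access. Illustrative sketch only.

class HistoryStore:
    def __init__(self) -> None:
        self.store: dict[str, list[tuple[str, str]]] = {}

    def get(self, session_id: str) -> list[tuple[str, str]]:
        # Create the session's history on first access.
        return self.store.setdefault(session_id, [])

    def add(self, session_id: str, role: str, text: str) -> None:
        self.get(session_id).append((role, text))

store = HistoryStore()
store.add("user-1", "human", "What is Faiss?")
store.add("user-1", "ai", "A vector search library.")
print(len(store.get("user-1")))  # 2
```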
src/chat.py CHANGED
@@ -15,11 +15,6 @@ GENERATE_ARGS = {
     'max_tokens': int(os.getenv("MAX_NEW_TOKENS", 1024)),
 }
 
- GENERATE_KWARGS = {
-     'top_p': float(os.getenv("TOP_P", 0.6)),
-     'frequency_penalty': max(-2, min(float(os.getenv("FREQ_PENALTY", 0)), 2))
- }
-
 
 class Chat:
 
@@ -31,8 +26,7 @@ class Chat:
         self.assistant_model = base(
             model=model,
             streaming=True,
-            **GENERATE_ARGS,
-            model_kwargs=GENERATE_KWARGS
+            **GENERATE_ARGS
         )
 
         self.store = {}