lgfunderburk committed on
Commit
4b2a9ae
1 Parent(s): cb2f301

remove readme

Files changed (1): README.md +0 -117
README.md DELETED
@@ -1,117 +0,0 @@
# Welcome!

This chatbot uses Retrieval-Augmented Generation (RAG) to answer questions about the Seven Wonders of the Ancient World.

Here are sample questions you can ask it:

1. What is the Great Pyramid of Giza?
2. What are the Hanging Gardens of Babylon?
3. What is the Temple of Artemis at Ephesus?
4. What is the Statue of Zeus at Olympia?
5. What is the Mausoleum at Halicarnassus?
6. Where were the Hanging Gardens of Babylon?
7. Why did people build the Great Pyramid of Giza?
8. What did the Colossus of Rhodes look like?
9. Why did people visit the Temple of Artemis?
10. What is the importance of the Colossus of Rhodes?
11. What happened to the Tomb of Mausolus?
12. How did the Colossus of Rhodes collapse?

## How is it built?

### Poetry package management

This project uses [Poetry](https://python-poetry.org/) for package management, with dependencies declared in [this `pyproject.toml` file](pyproject.toml).

To install the dependencies:

```bash
pip install poetry
poetry install
```
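
The authoritative dependency list lives in the repository's `pyproject.toml`. As a rough sketch only, a Poetry file for a project like this might look as follows (the package names and version constraints here are illustrative assumptions, not the project's actual pins):

```toml
[tool.poetry]
name = "seven-wonders-chatbot"
version = "0.1.0"
description = "RAG chatbot about the Seven Wonders of the Ancient World"

[tool.poetry.dependencies]
python = "^3.10"
farm-haystack = "*"
chainlit = "*"
datasets = "*"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
```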

### Data source

The data comes from the [Seven Wonders dataset](https://huggingface.co/datasets/bilgeyucel/seven-wonders) on Hugging Face.

### Method

The chatbot's retrieval mechanism is built with Retrieval-Augmented Generation (RAG) using [Haystack](https://haystack.deepset.ai/tutorials/22_pipeline_with_promptnode), and its user interface is built with [Chainlit](https://docs.chainlit.io/overview). The generator is OpenAI's GPT-3.5-turbo.

### Pipeline steps (Haystack) - see the full script in [src/app.py](src/app.py)

1. Initialize the in-memory document store

```python
from haystack.document_stores import InMemoryDocumentStore

# In-memory document store with BM25 enabled for retrieval
document_store = InMemoryDocumentStore(use_bm25=True)
```

2. Load the dataset from the Hugging Face Hub

```python
from datasets import load_dataset

dataset = load_dataset("bilgeyucel/seven-wonders", split="train")
```

3. Write the documents into the document store

```python
document_store.write_documents(dataset)
```

4. Initialize a RAG prompt

```python
from haystack.nodes import PromptTemplate, AnswerParser

rag_prompt = PromptTemplate(
    prompt="""Synthesize a brief answer from the following text for the given question.
Provide a clear and concise response that summarizes the key points and information presented in the text.
Your answer should be in your own words and be no longer than 50 words.
\n\n Related text: {join(documents)} \n\n Question: {query} \n\n Answer:""",
    output_parser=AnswerParser(),
)
```
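
At run time, Haystack fills the `{join(documents)}` and `{query}` slots of that template with the retrieved documents and the user's question. A plain-Python sketch of roughly what the final prompt string looks like (the join-and-format logic below is an illustration, not Haystack's actual template renderer, and the documents are toy stand-ins):

```python
# Two toy "retrieved documents" standing in for BM25 results.
documents = [
    "The Great Pyramid of Giza was built as a tomb for the pharaoh Khufu.",
    "It is the oldest of the Seven Wonders of the Ancient World.",
]
query = "Why did people build the Great Pyramid of Giza?"

# Approximate the template above: join the documents, then fill in the slots.
prompt = (
    "Synthesize a brief answer from the following text for the given question.\n"
    "Provide a clear and concise response that summarizes the key points and "
    "information presented in the text.\n"
    "Your answer should be in your own words and be no longer than 50 words.\n"
    f"\n\n Related text: {' '.join(documents)} \n\n Question: {query} \n\n Answer:"
)

print(prompt)
```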

5. Set up the nodes using GPT-3.5-turbo

```python
from haystack.nodes import BM25Retriever, PromptNode

# Set up nodes
retriever = BM25Retriever(document_store=document_store, top_k=2)
pn = PromptNode("gpt-3.5-turbo",
                api_key=MY_API_KEY,  # your OpenAI API key
                model_kwargs={"stream": False},
                default_prompt_template=rag_prompt)
```
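
The `BM25Retriever` ranks documents with the BM25 lexical scoring function: rarer query terms weigh more, and term frequency saturates with document length. A self-contained sketch of the idea on a toy corpus (naive whitespace tokenization; this is an illustration of the scoring formula, not Haystack's implementation):

```python
import math

def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score each document in `corpus` against `query` with BM25."""
    docs = [doc.lower().split() for doc in corpus]
    avgdl = sum(len(d) for d in docs) / len(docs)  # average document length
    n = len(docs)
    scores = []
    for d in docs:
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for other in docs if term in other)  # document frequency
            if df == 0:
                continue
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            tf = d.count(term)
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

corpus = [
    "the great pyramid of giza was built for pharaoh khufu",
    "the colossus of rhodes was a statue of the sun god helios",
    "the hanging gardens of babylon may be legendary",
]
scores = bm25_scores("pyramid of giza", corpus)
best = max(range(len(corpus)), key=scores.__getitem__)
print(best)  # index of the best-matching document
```

A `top_k=2` retriever would simply return the two highest-scoring documents.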

6. Build the pipeline

```python
from haystack import Pipeline

# Set up pipeline
pipe = Pipeline()
pipe.add_node(component=retriever, name="retriever", inputs=["Query"])
pipe.add_node(component=pn, name="prompt_node", inputs=["retriever"])
```
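
Conceptually, running the pipeline feeds the query to the retriever and passes the top documents on to the prompt node. A plain-function sketch of that data flow, with toy stand-ins for the Haystack components (word-overlap "retrieval" and a prompt-building "prompt node", purely for illustration):

```python
def retrieve(query, store, top_k=2):
    """Toy retriever: rank documents by number of words shared with the query."""
    terms = set(query.lower().split())
    ranked = sorted(store,
                    key=lambda doc: len(terms & set(doc.lower().split())),
                    reverse=True)
    return ranked[:top_k]

def prompt_node(query, documents):
    """Toy prompt node: build the prompt an LLM would receive."""
    return f"Related text: {' '.join(documents)} Question: {query} Answer:"

store = [
    "The Temple of Artemis at Ephesus was rebuilt several times.",
    "The Statue of Zeus at Olympia was made by the sculptor Phidias.",
    "The Lighthouse of Alexandria guided ships into the harbour.",
]

# Roughly what pipe.run(query=...) wires together:
query = "Who made the Statue of Zeus at Olympia?"
docs = retrieve(query, store)
prompt = prompt_node(query, docs)
print(prompt)
```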

### Connecting the pipeline to Chainlit

```python
import chainlit as cl

@cl.on_message
async def main(message: str):
    # Use the pipeline to get a response
    output = pipe.run(query=message)

    # Create a Chainlit message with the response
    response = output['answers'][0].answer
    msg = cl.Message(content=response)

    # Send the message to the user
    await msg.send()
```

### Run application

```bash
poetry run chainlit run src/app.py --port 7860
```