supertskone committed
Commit 3faef87 · verified · 1 Parent(s): 041d81c

Update README.md

Files changed (1)
  1. README.md +11 -99
README.md CHANGED
@@ -1,99 +1,11 @@
- # Prompt Search Engine
-
- ## Overview
-
- The Prompt Search Engine is a Flask-based web application designed to search for the most similar prompts from a dataset using cosine similarity. It leverages Hugging Face's `sentence-transformers` to vectorize prompts and stores them in a Pinecone vector database for efficient querying. The frontend is built using Streamlit, providing an intuitive interface for users to input their queries and get results.
-
- ## Features
-
- - **Efficient Vector Search**: Uses Pinecone for storing and querying vector embeddings.
- - **Cosine Similarity Calculation**: Custom implementation for calculating cosine similarity between query vectors and stored vectors.
- - **Streamlit Interface**: Simple and user-friendly interface for querying and displaying results.
- - **Logging**: Comprehensive logging for easy debugging and monitoring.
-
- ## Prerequisites
-
- - Python 3.9 or higher
- - Pinecone API key
-
- ## Setup
-
- ### Clone the Repository
-
- ```
- git clone https://github.com/your-username/prompt-search-engine.git
- cd prompt-search-engine
- ```
- ## Create a Virtual Environment and Install Dependencies
- ```
- python3 -m venv venv
- source venv/bin/activate
- pip install -r requirements.txt
- ```
-
- ## Configure Pinecone
- ```
- # Replace YOUR_PINECONE_API_KEY with your actual Pinecone API key in the vectorizer.py file.
- python
- pinecone = Pinecone(api_key='YOUR_PINECONE_API_KEY')
- ```
- ### Initial Data Load
- ```
- # Run the script to load the initial dataset into the Pinecone database:
-
- python load_data.py
- ```
-
- ### Run the Flask Backend
- ```
- python run.py
- ```
-
- ### Run the Streamlit Frontend
- ```
- # In a new terminal (while the backend is running), start the Streamlit app:
-
- streamlit run ui/app.py
- ```
- ### Running the Tests
- To run the tests, navigate to the root directory and execute:
- ```
- python run_tests.py
- ```
- Make sure that you executed python load_data.py before.
- You should receive something like this:
- <img width="576" alt="image" src="https://github.com/user-attachments/assets/a9cd8acb-9280-4b55-9bec-3009a0a61b87">
-
- ### Rebuild and run Docker container
- ```
- docker build -t prompt-search-engine .
- docker run -p 5000:5000 prompt-search-engine
- ```
-
- ### Usage
- Open your web browser and go to http://localhost:8501.
- Enter a query in the input box.
- Adjust the number of results using the slider.
- Click "Search" to get the most similar prompts from the dataset.
-
- ### File Descriptions
- ##### app/: Contains the core logic for vectorization and search functionality.
- ##### search_engine.py: Implements the PromptSearchEngine class for querying the Pinecone database.
- ##### vectorizer.py: Implements the Vectorizer class for loading and storing vectors in Pinecone.
- ##### ui/: Contains the Streamlit frontend application.
- ##### app.py: Streamlit app for user interface.
- ##### load_data.py: Script to load the initial dataset into Pinecone.
- ##### run.py: Flask application entry point.
- ##### requirements.txt: Lists the Python dependencies for the project.
-
- ### Logging
- Logging is configured in the vectorizer.py and search_engine.py files.
-
- ### License
- This project is licensed under the MIT License.
-
- ## Acknowledgements
- ##### - Hugging Face for sentence-transformers
- ##### - Pinecone for vector database services
- ##### - Streamlit for the web app interface
- ##### - Feel free to reach out if you have any questions or need further assistance. Enjoy using the Prompt Search Engine!
+ ---
+ title: Prompt Search Engine
+ emoji: 🌍
+ colorFrom: yellow
+ colorTo: blue
+ sdk: docker
+ sdk_version: 4.38.1
+ app_file: app.py
+ pinned: false
+ license: mit
+ ---
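
Note on the removed README: it mentions a custom cosine similarity calculation between query vectors and stored vectors, but the repository code is not part of this diff. The snippet below is only a minimal sketch of what such a calculation could look like. The function names `cosine_similarity` and `top_k`, and the in-memory dictionary standing in for the Pinecone index, are hypothetical and not taken from the project.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat its similarity as 0.0.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    stored: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank stored prompts by similarity to the query, best first.

    `stored` is a hypothetical in-memory stand-in for a vector database.
    """
    scored = [(prompt, cosine_similarity(query, vec)) for prompt, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In a real deployment the embeddings would come from a sentence-embedding model and the ranking would usually be delegated to the vector database; the pure-Python version here is only meant to make the scoring step concrete.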