ej68okap committed
Commit · b61a6a6
1 Parent(s): 241c492
new code added
README.md CHANGED
@@ -1,79 +1,10 @@
-# Multimodal RAG with Colpali, Milvus, and Visual Language Models
-
-This repository demonstrates how to build a **Multimodal Retrieval-Augmented Generation (RAG)** application using **Colpali**, **Milvus**, and **Visual Language Models (VLMs)** like Gemini or GPT-4o. The application allows users to upload a PDF and perform Q&A queries on both textual and visual elements of the document.
-
 ---
-
-
-
-
-
-
-
-
 ---
-
-## Architecture Overview
-
-1. **Colpali**:
-- Generates embeddings for images (PDF pages) and text (user queries).
-- Processes visual and textual data seamlessly.
-
-2. **Milvus**:
-- A vector database used for indexing and retrieving embeddings.
-- Supports HNSW-based indexing for efficient similarity searches.
-
-3. **Visual Language Models**:
-- Gemini or GPT-4o performs context-aware Q&A using retrieved pages.
-
----
-
-## Installation
-
-### Prerequisites
-- Python 3.8 or higher
-- CUDA-compatible GPU for acceleration
-- Milvus installed and running ([Installation Guide](https://milvus.io/docs/install_standalone.md))
-- Required Python packages (see `requirements.txt`)
-
-### Steps to Run the Application Locally
-1. Clone the repository
-2. Install dependencies as **pip install -r requirements.txt**
-3. Set up environment variables
-Add the following variables to your .env file or environment:
-GEMINI_API_KEY=<Your_Gemini_API_Key>
-4. Launch the Gradio App as **python app.py**
-
-
-### Deploying the Gradio App on Hugging Face Spaces
-1. Prepare the Repository
-git clone https://github.com/saumitras/colpali-milvus-rag.git
-cd colpali-milvus-rag
-
-2. Organize the Repository:
-Ensure the app file (e.g., app.py) contains the Gradio application code.
-Include the requirements.txt file for dependencies.
-
-Update the Hugging Face API Configuration:
-
-3. Add necessary environment variables like GEMINI_API_KEY or OPENAI_API_KEY to the Hugging Face Spaces Secrets:
-Navigate to your Hugging Face Space.
-Go to the Settings tab and add the required secrets under Repository secrets.
-
-4. Create a New Space
-Visit Hugging Face Spaces.
-Click New Space.
-Fill in the details:
-Name: Give your Space a unique name (e.g., multimodal_rag).
-SDK: Select Gradio as the SDK.
-Visibility: Choose between Public or Private.
-Click Create Space.
-5. Push Code to Hugging Face
-Initialize Git and push the code:
-git remote add hf https://huggingface.co/spaces/ultron1996/multimodal_rag
-git push hf main
-
-6. Wait for the Hugging Face Space to build and deploy the application.
-
-
-The app has been deployed on Hugging Face Spaces and Demo is running at https://huggingface.co/spaces/ultron1996/multimodal_rag
 ---
+title: Multimodal Rag
+emoji: π¨
+colorFrom: indigo
+colorTo: blue
+sdk: gradio
+sdk_version: 5.12.0
+app_file: app.py
+pinned: false
 ---
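
The added lines form the YAML front matter that Hugging Face Spaces reads from README.md to configure the Space. As a minimal sketch, assuming the block holds only flat `key: value` pairs as in this commit, the hypothetical helper `parse_front_matter` below extracts them with the standard library alone. It is an illustration, not how Spaces itself parses the file, and a real project would more likely use a YAML parser. The `emoji` field is left out of the sample because its value is garbled in this view of the diff.

```python
def parse_front_matter(text: str) -> dict[str, str]:
    """Return the flat key/value pairs between the first two '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the file
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that are not key: value pairs
            fields[key.strip()] = value.strip()
    return {}  # opening delimiter was never closed


# Sample mirroring the lines added in this commit (emoji omitted).
README = """---
title: Multimodal Rag
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 5.12.0
app_file: app.py
pinned: false
---
"""

config = parse_front_matter(README)
print(config["sdk"], config["sdk_version"], config["app_file"])
```

All values come back as strings in this sketch, so `pinned` is the text `false` rather than a boolean. A check like this can be handy before pushing, for example to confirm that `app_file` names a file that actually exists in the repository.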
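
The removed README asked for `GEMINI_API_KEY` (and mentioned `OPENAI_API_KEY`) to be set in the environment or in the Space's secrets, which Spaces exposes to the app as environment variables. A minimal sketch of a startup check follows, assuming the app reads the key from the environment. The helper name `missing_env` and the `REQUIRED` list are hypothetical and do not come from the repository.

```python
import os


def missing_env(names, environ=None) -> list[str]:
    """Return the names that are unset or blank in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name, "").strip()]


# Assumption: only the Gemini key is required; adjust for GPT-4o usage.
REQUIRED = ["GEMINI_API_KEY"]

missing = missing_env(REQUIRED)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Running a check like this at the top of `app.py` would surface a missing secret as a readable message instead of a failure deeper in the model client.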