Spaces: ultron1996/multimodal_rag

ej68okap committed on Jan 28

Commit b61a6a6

1 Parent(s): 241c492

new code added

Files changed (1)

README.md +8 -77

@@ -1,79 +1,10 @@
-# Multimodal RAG with Colpali, Milvus, and Visual Language Models
-This repository demonstrates how to build a **Multimodal Retrieval-Augmented Generation (RAG)** application using **Colpali**, **Milvus**, and **Visual Language Models (VLMs)** like Gemini or GPT-4o. The application allows users to upload a PDF and perform Q&A queries on both textual and visual elements of the document.
 ---
-## Features
-- **Multimodal Q&A**: Combines visual and textual embeddings for robust query answering.
-- **PDF as Images**: Treats PDF pages as images to preserve layout and visual context.
-- **Efficient Retrieval**: Utilizes Milvus for fast and accurate vector search.
-- **Advanced Query Processing**: Integrates Colpali and VLMs for embeddings and response generation.
 ---
-## Architecture Overview
-1. **Colpali**:
-   - Generates embeddings for images (PDF pages) and text (user queries).
-   - Processes visual and textual data seamlessly.
-2. **Milvus**:
-   - A vector database used for indexing and retrieving embeddings.
-   - Supports HNSW-based indexing for efficient similarity searches.
-3. **Visual Language Models**:
-   - Gemini or GPT-4o performs context-aware Q&A using retrieved pages.
----
-## Installation
-### Prerequisites
-- Python 3.8 or higher
-- CUDA-compatible GPU for acceleration
-- Milvus installed and running ([Installation Guide](https://milvus.io/docs/install_standalone.md))
-- Required Python packages (see `requirements.txt`)
-### Steps to Run the Application Locally
-1. Clone the repository
-2. Install dependencies as **pip install -r requirements.txt**
-3. Set up environment variables
-    Add the following variables to your .env file or environment:
-    GEMINI_API_KEY=<Your_Gemini_API_Key>
-4.  Launch the Gradio App as **python app.py**
-### Deploying the Gradio App on Hugging Face Spaces
-1. Prepare the Repository
-git clone https://github.com/saumitras/colpali-milvus-rag.git
-cd colpali-milvus-rag
-2. Organize the Repository:
-Ensure the app file (e.g., app.py) contains the Gradio application code.
-Include the requirements.txt file for dependencies.
-Update the Hugging Face API Configuration:
-3. Add necessary environment variables like GEMINI_API_KEY or OPENAI_API_KEY to the Hugging Face Spaces Secrets:
-Navigate to your Hugging Face Space.
-Go to the Settings tab and add the required secrets under Repository secrets.
-4. Create a New Space
-    Visit Hugging Face Spaces.
-    Click New Space.
-    Fill in the details:
-    Name: Give your Space a unique name (e.g., multimodal_rag).
-    SDK: Select Gradio as the SDK.
-    Visibility: Choose between Public or Private.
-    Click Create Space.
-5. Push Code to Hugging Face
-    Initialize Git and push the code:
-    git remote add hf https://huggingface.co/spaces/ultron1996/multimodal_rag
-    git push hf main
-6. Wait for the Hugging Face Space to build and deploy the application.
-The app has been deployed on Hugging Face Spaces and Demo is running at https://huggingface.co/spaces/ultron1996/multimodal_rag

 ---
+title: Multimodal Rag
+emoji: 🐨
+colorFrom: indigo
+colorTo: blue
+sdk: gradio
+sdk_version: 5.12.0
+app_file: app.py
+pinned: false
 ---
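
The README text removed in this commit asks for `GEMINI_API_KEY` (or `OPENAI_API_KEY`) to be set in a `.env` file or in the Space's secrets. A minimal sketch of a startup check for such variables follows. The helper name `missing_secrets` is hypothetical and is not taken from `app.py`; whether the app performs any such check is not shown in this diff.

```python
from typing import List, Mapping, Sequence


def missing_secrets(env: Mapping[str, str], required: Sequence[str]) -> List[str]:
    """Return the names in `required` that are unset or blank in `env`.

    Hypothetical helper for illustration; not part of the committed code.
    """
    return [name for name in required if not env.get(name, "").strip()]


# Demo with a sample mapping. In an app, `os.environ` would be passed instead
# (an assumption about how the check could be wired in).
sample_env = {"GEMINI_API_KEY": "example-key"}
missing = missing_secrets(sample_env, ["GEMINI_API_KEY", "OPENAI_API_KEY"])
print("Missing secrets:", missing)
```

Passing the environment in as a mapping keeps the check easy to test without touching real credentials. An app could call it once at startup and stop with a clear message when the returned list is non-empty.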