# RAG Benchmark Evaluation System Architecture

## High-Level Architecture Overview

The system follows a modular architecture with the following key components:

### 1. Data Layer

- **Dataset Loading** (loaddataset.py)
  - Handles RAGBench dataset loading from HuggingFace
  - Processes multiple dataset configurations
  - Extracts and normalizes data
- **Vector Database** (Milvus)
  - Stores document embeddings
  - Enables efficient similarity search
  - Manages metadata and scores
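
The loading and normalization steps above can be sketched as follows. This is a minimal illustration, not the contents of loaddataset.py: the field names (`question`, `documents`, `response`), the helper names, and the injected `loader` callable are all assumptions. In practice the loader would probably wrap the HuggingFace `datasets` library, but it is passed in here so the sketch runs without network access.

```python
from typing import Callable, Dict, Iterable, List


def normalize_record(raw: Dict, config: str) -> Dict:
    """Flatten one raw row into the fields the later layers need.

    The column names are hypothetical; the real dataset schema may differ.
    """
    documents = [str(d).strip() for d in raw.get("documents", [])]
    return {
        "config": config,
        "question": str(raw.get("question", "")).strip(),
        "documents": [d for d in documents if d],  # drop empty passages
        "response": str(raw.get("response", "")).strip(),
    }


def load_configs(
    configs: Iterable[str],
    loader: Callable[[str], Iterable[Dict]],
) -> List[Dict]:
    """Load every configuration and keep only rows usable for retrieval."""
    records: List[Dict] = []
    for config in configs:
        for raw in loader(config):
            record = normalize_record(raw, config)
            # A row without a question or without passages cannot be evaluated.
            if record["question"] and record["documents"]:
                records.append(record)
    return records
```

Passing the loader as a parameter keeps the normalization logic testable with in-memory rows and independent of any particular dataset source.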

### 2. Processing Layer

- **Document Processing**
  - Chunking (insertmilvushelper.py)
    - Sliding window implementation
    - Overlap management
- **Embedding Generation**
  - SentenceTransformer models
  - Vector representation creation
  - Dimension reduction
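
Sliding-window chunking with overlap can be illustrated with a short sketch. It assumes word-level windows and hypothetical parameter names (`window`, `overlap`); the actual unit (characters, words, or tokens) and the sizes used in insertmilvushelper.py are not stated in this document.

```python
from typing import List


def sliding_window_chunks(text: str, window: int = 200, overlap: int = 50) -> List[str]:
    """Split text into word windows where consecutive chunks share `overlap` words."""
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")

    words = text.split()
    step = window - overlap  # how far the window advances each iteration
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + window]))
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk consists only of already-covered words.
        if start + window >= len(words):
            break
    return chunks
```

For example, `sliding_window_chunks("a b c d e f g h i j", window=4, overlap=2)` yields four chunks, each sharing its first two words with the end of the previous one.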

### 3. Search & Retrieval Layer

- **Vector Search** (searchmilvushelper.py)
  - Cosine similarity computation
  - Top-K retrieval
  - Result ranking
- **Reranking System** (finetuneresults.py)
  - Multiple reranker options (MS MARCO, MonoT5)
  - Context relevance scoring
  - Result refinement
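
The two-stage retrieval described above (cosine-similarity top-K, then reranking) can be sketched in memory. This is only an illustration of the idea: Milvus performs the similarity search server-side, and the real rerankers are neural models. The function names and the pluggable `score_fn` are assumptions made for this example.

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    k: int = 5,
) -> List[Tuple[int, float]]:
    """Return (document index, similarity) pairs for the k most similar vectors."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def rerank(
    query: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str], float],
) -> List[str]:
    """Reorder retrieved passages by a second, more precise relevance score.

    `score_fn` stands in for a reranking model that scores (query, passage) pairs.
    """
    return sorted(candidates, key=lambda text: score_fn(query, text), reverse=True)
```

Keeping the reranker behind a plain `(query, passage) -> score` callable is one way to switch between reranker options without touching the retrieval code.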

### 4. Generation Layer

- **LLM Integration** (generationhelper.py)
  - Multiple model support (LLaMA, Mistral)
  - Context-aware response generation
  - Prompt engineering
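
Context-aware generation typically starts by packing the reranked passages into a prompt. The sketch below shows one possible way to do that; the template wording, the `build_prompt` name, and the `max_chars` budget are assumptions, not the prompt used in generationhelper.py.

```python
from typing import List, Sequence


def build_prompt(question: str, contexts: Sequence[str], max_chars: int = 4000) -> str:
    """Assemble a grounded-answer prompt from the highest-ranked passages.

    Passages are added in rank order until the character budget is exhausted.
    """
    blocks: List[str] = []
    used = 0
    for rank, passage in enumerate(contexts, start=1):
        block = f"[{rank}] {passage.strip()}"
        if used + len(block) > max_chars:
            break  # budget reached; lower-ranked passages are dropped
        blocks.append(block)
        used += len(block)

    context_text = "\n".join(blocks)
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context_text}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )
```

The resulting string can be sent to whichever model is selected; numbering the passages makes it easier to check which context the response relied on.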

### 5. Evaluation Layer

- **Metrics Calculation** (calculatescores.py)
  - RMSE computation
  - AUC-ROC calculation
  - Context relevance/utilization scoring
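
For reference, RMSE and AUC-ROC can be computed with a few lines of plain Python. This is a sketch of the standard definitions, not the code in calculatescores.py; the function names are chosen for illustration, and the AUC-ROC version uses the pairwise (Mann-Whitney) formulation, which is quadratic in the number of samples.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean squared error between predicted and ground-truth scores."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


def auc_roc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. Labels must be 0 or 1.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

RMSE suits continuous scores such as relevance or utilization, while AUC-ROC suits binary judgments where the model outputs a confidence.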

### 6. Presentation Layer

- **Web Interface** (app.py)
  - Gradio-based UI
  - Interactive model selection
  - Real-time result display

## Data Flow

1. The user submits a query through the Gradio interface.
2. The query is embedded and used to search Milvus for similar document chunks.
3. The retrieved documents are reranked.
4. The LLM generates a response using the reranked documents as context.
5. The response is evaluated and scored.
6. The results are displayed to the user.
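
The flow above can be expressed as a small orchestration function in which each stage is a pluggable callable. This is a hedged sketch: `PipelineResult`, `run_pipeline`, and the stage signatures are hypothetical names introduced here, and the real app.py may wire the modules together differently.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PipelineResult:
    """Everything the interface needs to display for one query."""
    query: str
    contexts: List[str]
    response: str
    scores: Dict[str, float]


def run_pipeline(
    query: str,
    search: Callable[[str, int], List[str]],
    rerank: Callable[[str, List[str]], List[str]],
    generate: Callable[[str, List[str]], str],
    evaluate: Callable[[str, List[str], str], Dict[str, float]],
    top_k: int = 5,
) -> PipelineResult:
    """Run steps 2 to 5 of the data flow for a single user query."""
    candidates = search(query, top_k)              # step 2: embed and search
    ranked = rerank(query, candidates)             # step 3: rerank
    response = generate(query, ranked)             # step 4: generate with context
    scores = evaluate(query, ranked, response)     # step 5: evaluate and score
    return PipelineResult(query, ranked, response, scores)
```

With this shape, swapping a reranker or generation model amounts to passing a different callable, and each stage can be replaced by a stub in tests.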

## Architecture Diagram

```mermaid
graph TB
    %% User Interface Layer
    UI[Web Interface - Gradio]

    %% Data Layer
    subgraph Data Layer
        DS[RAGBench Dataset]
        VDB[(Milvus Vector DB)]
    end

    %% Processing Layer
    subgraph Processing Layer
        DP[Document Processing]
        EG[Embedding Generation]
        style DP fill:#f9f,stroke:#333
        style EG fill:#f9f,stroke:#333
    end

    %% Search & Retrieval Layer
    subgraph Search & Retrieval
        VS[Vector Search]
        RR[Reranking System]
        style VS fill:#bbf,stroke:#333
        style RR fill:#bbf,stroke:#333
    end

    %% Generation Layer
    subgraph Generation Layer
        LLM[LLM Models]
        PR[Prompt Engineering]
        style LLM fill:#bfb,stroke:#333
        style PR fill:#bfb,stroke:#333
    end

    %% Evaluation Layer
    subgraph Evaluation Layer
        ME[Metrics Evaluation]
        SC[Score Calculation]
        style ME fill:#ffb,stroke:#333
        style SC fill:#ffb,stroke:#333
    end

    %% Flow Connections
    UI --> DP
    DS --> DP
    DP --> EG
    EG --> VDB
    UI --> VS
    VS --> VDB
    VS --> RR
    RR --> LLM
    LLM --> PR
    PR --> ME
    ME --> SC
    SC --> UI

    %% Model Components
    subgraph Models
        ST[SentenceTransformers]
        RM[Reranking Models]
        GM[Generation Models]
        style ST fill:#dfd,stroke:#333
        style RM fill:#dfd,stroke:#333
        style GM fill:#dfd,stroke:#333
    end

    %% Model Connections
    EG --> ST
    RR --> RM
    LLM --> GM

    %% Styling
    classDef default fill:#fff,stroke:#333,stroke-width:2px;
    classDef interface fill:#f96,stroke:#333,stroke-width:2px;
    class UI interface;
```