# RAG Benchmark Evaluation System Architecture

## High-Level Architecture Overview

The system is organized into six layers, each mapped to one or more Python modules:

### 1. Data Layer

- **Dataset Loading** (loaddataset.py)

  - Handles RAGBench dataset loading from HuggingFace
  - Processes multiple dataset configurations
  - Extracts and normalizes data

- **Vector Database** (Milvus)
  - Stores document embeddings
  - Enables efficient similarity search
  - Manages metadata and scores

### 2. Processing Layer

- **Document Processing**

  - Chunking (insertmilvushelper.py)
  - Sliding window implementation
  - Overlap management

- **Embedding Generation**
  - SentenceTransformer models
  - Vector representation creation
  - Dimension reduction
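
The sliding-window chunking with overlap listed above can be sketched as follows. This is a minimal illustration, not the code in insertmilvushelper.py: the function name `sliding_window_chunks`, the word-level tokenization, and the default sizes are all assumptions made for the example.

```python
def sliding_window_chunks(words, window_size=5, overlap=2):
    """Split a list of tokens into windows that share `overlap` tokens.

    Hypothetical helper for illustration; the real chunker may work on
    characters or model tokens instead of whitespace-separated words.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    if not 0 <= overlap < window_size:
        raise ValueError("overlap must satisfy 0 <= overlap < window_size")

    step = window_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + window_size])
        if start + window_size >= len(words):
            break  # this window already reaches the end of the input
    return chunks


text = "retrieval augmented generation combines search with a language model"
chunks = sliding_window_chunks(text.split(), window_size=4, overlap=1)
for chunk in chunks:
    print(" ".join(chunk))
```

Each chunk starts with the last word of the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk when the overlap is large enough.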

### 3. Search & Retrieval Layer

- **Vector Search** (searchmilvushelper.py)

  - Cosine similarity computation
  - Top-K retrieval
  - Result ranking

- **Reranking System** (finetuneresults.py)
  - Multiple reranker options (MS MARCO, MonoT5)
  - Context relevance scoring
  - Result refinement
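
To make the two retrieval stages concrete, here is a small pure-Python sketch of cosine-similarity top-K search followed by a rerank pass. It assumes nothing about searchmilvushelper.py or finetuneresults.py: in the real system Milvus performs the similarity search and a reranker model produces the second-stage scores, whereas `cosine_similarity`, `top_k`, `rerank`, and the toy vectors below are hypothetical stand-ins.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the k (doc_id, score) pairs with the highest similarity."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def rerank(query, doc_ids, score_fn):
    """Reorder retrieved ids by a second-stage score (higher is better)."""
    return sorted(doc_ids, key=lambda doc_id: score_fn(query, doc_id),
                  reverse=True)


# Toy 2-dimensional "embeddings"; real ones have hundreds of dimensions.
docs = {"doc_a": [1.0, 0.0], "doc_b": [0.7, 0.7], "doc_c": [0.0, 1.0]}
hits = top_k([1.0, 0.1], docs, k=2)

# Stand-in for a reranker model: a fixed relevance table.
relevance = {"doc_a": 0.2, "doc_b": 0.9}
reranked = rerank("example query", [doc_id for doc_id, _ in hits],
                  score_fn=lambda q, d: relevance[d])
```

The vector search is cheap and recall-oriented; the reranker sees only the short candidate list, which is why a slower, more accurate scoring model is affordable at that stage.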

### 4. Generation Layer

- **LLM Integration** (generationhelper.py)
  - Multiple model support (LLaMA, Mistral)
  - Context-aware response generation
  - Prompt engineering
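
As an illustration of context-aware prompting, the sketch below assembles a prompt from a question and the reranked passages. The template wording, the function name `build_prompt`, and the `max_contexts` cap are assumptions for the example; the prompts actually used in generationhelper.py are not shown in this document.

```python
def build_prompt(question, contexts, max_contexts=3):
    """Format the top passages and the question into a single prompt string.

    Hypothetical template: passages are numbered so the model (and a later
    evaluation step) can refer to them individually.
    """
    selected = contexts[:max_contexts]
    numbered = "\n".join(
        f"[{i}] {passage.strip()}" for i, passage in enumerate(selected, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question.strip()}\nAnswer:"
    )


prompt = build_prompt(
    "What is Milvus?",
    ["Milvus is a vector database.", "Gradio builds web interfaces."],
    max_contexts=1,
)
print(prompt)
```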

### 5. Evaluation Layer

- **Metrics Calculation** (calculatescores.py)
  - RMSE computation
  - AUC-ROC calculation
  - Context relevance/utilization scoring
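
For reference, the two numeric metrics named above can be computed as in the following sketch. It uses only the standard library and is not taken from calculatescores.py; which predicted and reference values the project feeds into each metric is not stated here, so the sample inputs are made up.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


def auc_roc(scores, labels):
    """AUC-ROC via its pairwise definition.

    Equals the probability that a randomly chosen positive example gets a
    higher score than a randomly chosen negative one; ties count as 0.5.
    This O(P * N) form is fine for small samples.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
auc = auc_roc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

RMSE suits continuous scores such as relevance or utilization fractions, while AUC-ROC suits a binary label paired with a confidence score, since it does not depend on choosing a threshold.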

### 6. Presentation Layer

- **Web Interface** (app.py)
  - Gradio-based UI
  - Interactive model selection
  - Real-time result display

## Data Flow

1. The user submits a query through the Gradio interface.
2. The query is embedded and used to search Milvus for similar document chunks.
3. The retrieved documents are reranked.
4. The LLM generates a response using the reranked documents as context.
5. The response is evaluated and scored.
6. The results are displayed to the user.
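
The flow above can be sketched as one function that chains the stages. Every name here (`run_pipeline`, `PipelineResult`, `word_overlap`, and the lambda stand-ins) is hypothetical; the real stages call Milvus, a reranker model, and an LLM, which are replaced by toy callables so that the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PipelineResult:
    query: str
    contexts: List[str]
    answer: str
    scores: Dict[str, float]


def run_pipeline(query: str,
                 retrieve: Callable[[str], List[str]],
                 rerank: Callable[[str, List[str]], List[str]],
                 generate: Callable[[str, List[str]], str],
                 evaluate: Callable[[str, List[str], str], Dict[str, float]],
                 top_k: int = 3) -> PipelineResult:
    candidates = retrieve(query)                   # step 2: vector search
    contexts = rerank(query, candidates)[:top_k]   # step 3: reranking
    answer = generate(query, contexts)             # step 4: generation
    scores = evaluate(query, contexts, answer)     # step 5: evaluation
    return PipelineResult(query, contexts, answer, scores)  # step 6: display


def word_overlap(query: str, doc: str) -> int:
    """Toy relevance score: number of shared lowercase words."""
    def tokens(s: str) -> set:
        return set(s.lower().replace("?", "").replace(".", "").split())
    return len(tokens(query) & tokens(doc))


corpus = ["Gradio renders the UI.", "Milvus stores embeddings.",
          "Rerankers reorder hits."]
result = run_pipeline(
    "Where are embeddings stored?",
    retrieve=lambda q: corpus,
    rerank=lambda q, docs: sorted(docs, key=lambda d: -word_overlap(q, d)),
    generate=lambda q, ctx: f"Based on the context: {ctx[0]}",
    evaluate=lambda q, ctx, ans: {"context_count": float(len(ctx))},
    top_k=2,
)
print(result.answer)
```

Passing each stage in as a callable mirrors the modular layout of the layers: any retriever, reranker, or generator can be swapped without touching the orchestration code.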

## Architecture Diagram

```mermaid
graph TB
    %% User Interface Layer
    UI[Web Interface - Gradio]

    %% Data Layer
    subgraph Data Layer
        DS[RAGBench Dataset]
        VDB[(Milvus Vector DB)]
    end

    %% Processing Layer
    subgraph Processing Layer
        DP[Document Processing]
        EG[Embedding Generation]
        style DP fill:#f9f,stroke:#333
        style EG fill:#f9f,stroke:#333
    end

    %% Search & Retrieval Layer
    subgraph Search & Retrieval
        VS[Vector Search]
        RR[Reranking System]
        style VS fill:#bbf,stroke:#333
        style RR fill:#bbf,stroke:#333
    end

    %% Generation Layer
    subgraph Generation Layer
        LLM[LLM Models]
        PR[Prompt Engineering]
        style LLM fill:#bfb,stroke:#333
        style PR fill:#bfb,stroke:#333
    end

    %% Evaluation Layer
    subgraph Evaluation Layer
        ME[Metrics Evaluation]
        SC[Score Calculation]
        style ME fill:#ffb,stroke:#333
        style SC fill:#ffb,stroke:#333
    end

    %% Flow Connections
    UI --> DP
    DS --> DP
    DP --> EG
    EG --> VDB
    UI --> VS
    VS --> VDB
    VS --> RR
    RR --> LLM
    LLM --> PR
    PR --> ME
    ME --> SC
    SC --> UI

    %% Model Components
    subgraph Models
        ST[SentenceTransformers]
        RM[Reranking Models]
        GM[Generation Models]
        style ST fill:#dfd,stroke:#333
        style RM fill:#dfd,stroke:#333
        style GM fill:#dfd,stroke:#333
    end

    %% Model Connections
    EG --> ST
    RR --> RM
    LLM --> GM

    %% Styling
    classDef default fill:#fff,stroke:#333,stroke-width:2px;
    classDef interface fill:#f96,stroke:#333,stroke-width:2px;
    class UI interface;
```