---
title: RAGTesting
emoji: πŸ’¬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.0.1
app_file: app.py
pinned: false
license: mit
short_description: A simple RAG demo
---

# Mini RAG Demo – Retrieval-Augmented Generation on Wikipedia

This is a lightweight Retrieval-Augmented Generation (RAG) app built with Gradio. It combines semantic search over the `rag-datasets/rag-mini-wikipedia` mini-Wikipedia corpus with reranking and language generation to answer natural-language questions using real documents.

---

## What It Does

- Embeds a query using a SentenceTransformer (`all-MiniLM-L6-v2`)
- Retrieves the top-5 most semantically similar Wikipedia passages using FAISS
- Reranks them using a CrossEncoder model (`cross-encoder/ms-marco-MiniLM-L-6-v2`)
- Generates an answer using a Hugging Face language model (a rough code sketch of this pipeline follows below)

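At a high level, the retrieval and reranking steps look roughly like the sketch below. The function names, dataset config/split identifiers, and other details are illustrative assumptions rather than the exact code in `app.py`:

```python
# Illustrative sketch of the retrieval + reranking steps; names, dataset
# config/split identifiers, and other details are assumptions, not the
# exact code in app.py.
import faiss
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, CrossEncoder

# Load the mini Wikipedia passages
corpus = load_dataset("rag-datasets/rag-mini-wikipedia", "text-corpus", split="passages")
passages = list(corpus["passage"])

# Embed the corpus once and build a FAISS index over the vectors
embedder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = embedder.encode(passages, convert_to_numpy=True, normalize_embeddings=True)
index = faiss.IndexFlatIP(embeddings.shape[1])  # inner product == cosine on normalized vectors
index.add(embeddings)

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def retrieve(query: str, k: int = 5) -> list[str]:
    """Return the top-k passages for a query, reranked by the cross-encoder."""
    query_vec = embedder.encode([query], convert_to_numpy=True, normalize_embeddings=True)
    _, ids = index.search(query_vec, k)
    candidates = [passages[i] for i in ids[0]]
    scores = reranker.predict([(query, p) for p in candidates])
    # Highest reranker score first
    return [p for _, p in sorted(zip(scores, candidates), key=lambda x: x[0], reverse=True)]
```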
---

## Tech Stack

- **Gradio** – Web interface
- **FAISS** – Fast dense vector retrieval
- **Sentence-Transformers** – Embedding & reranking
- **Transformers (Hugging Face)** – Language model for generation
- **Hugging Face Datasets** – Mini Wikipedia corpus (`rag-datasets/rag-mini-wikipedia`); a sample `requirements.txt` covering these packages follows below

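A minimal `requirements.txt` for this stack might look like the following; the exact packages and any version pins are assumptions and should be matched to what `app.py` actually imports:

```text
gradio==5.0.1
datasets
faiss-cpu
sentence-transformers
transformers
torch
```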
---

## Models Used

| Purpose       | Model                                       |
|---------------|---------------------------------------------|
| Embedding     | `all-MiniLM-L6-v2`                          |
| Reranking     | `cross-encoder/ms-marco-MiniLM-L-6-v2`      |
| Generation    | `mistralai/Mistral-7B-Instruct-v0.2` *(optional)* or a smaller model |

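Only the generation step depends on which of these models you pick. Below is a minimal sketch using the `transformers` text-generation pipeline; the prompt template and decoding settings are assumptions, not necessarily what `app.py` does:

```python
# Illustrative generation step; the prompt template and decoding settings
# are assumptions, not necessarily what app.py uses.
from transformers import pipeline

# Substitute a smaller model here if Mistral-7B is too large for your hardware.
generator = pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.2")

def answer(query: str, contexts: list[str]) -> str:
    """Generate an answer grounded in the retrieved passages."""
    prompt = (
        "Answer the question using only the context below.\n\n"
        "Context:\n" + "\n\n".join(contexts)
        + f"\n\nQuestion: {query}\nAnswer:"
    )
    out = generator(prompt, max_new_tokens=128, return_full_text=False)
    return out[0]["generated_text"]
```

Combined with the `retrieve` sketch above, the end-to-end flow is simply `answer(query, retrieve(query))`, with Gradio wrapping that call in the web interface.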
---

## πŸ“¦ Running Locally

To run the app locally:

```bash
git clone https://huggingface.co/spaces/YOUR_USERNAME/mini-rag-demo
cd mini-rag-demo
pip install -r requirements.txt
python app.py
```
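
By default, Gradio serves the app locally at `http://localhost:7860`.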