david committed
Commit · 9fadb81
1 Parent(s): ab7c2db
說明 (Description)
README.md CHANGED
@@ -7,7 +7,37 @@ sdk: streamlit
 sdk_version: 1.43.1
 app_file: app.py
 pinned: false
-short_description: Scrape, store, and query web data using RAG and AI chat.
+short_description: WEB的爬蟲與ai對話機器人。Scrape, store, and query web data using RAG and AI chat.
 ---
 
 Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
+
+---
+
+
+## 繁體中文說明
+
+這是一個結合「網頁爬蟲」與「RAG(檢索增強生成)」的 AI 對話機器人專案。
+- 你可以輸入任意網址,系統會自動爬取該網頁(可設定多層遞迴與同網域限制),將內容分段並向量化存入本地資料庫。
+- 之後可直接用中文或英文提問,系統會根據爬取內容檢索最相關段落,並用 Gemini LLM 生成回覆。
+- 支援中文語意檢索,適合知識管理、網站摘要、FAQ 應用。
+
+### 安裝與執行
+1. 安裝依賴:`pip install -r requirements.txt`
+2. 複製 `example.env` 為 `.env` 並填入你的 Gemini API 金鑰
+3. 執行:`streamlit run app.py`
+
+---
+
+## English Description
+
+This project is a Web Scraper & RAG-based AI Chatbot.
+- Enter any website URL, and the system will crawl the page (with configurable recursion depth and same-domain restriction), split and vectorize the content, and store it in a local database.
+- You can then ask questions in Chinese or English. The system retrieves the most relevant content and generates answers using Gemini LLM.
+- Optimized for Chinese semantic search, suitable for knowledge management, website summarization, and FAQ scenarios.
+
+### Installation & Usage
+1. Install dependencies: `pip install -r requirements.txt`
+2. Copy `example.env` to `.env` and fill in your Gemini API key
+3. Run: `streamlit run app.py`
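The README text added in this commit describes crawling with a configurable recursion depth and a same-domain restriction, then splitting the content into segments. The following is a minimal sketch of those two steps, not the project's actual code, which is not shown in this commit. The names `crawl`, `split_into_chunks`, `LinkAndTextParser`, and the `fetch` callable (which returns HTML for a URL, or `None` on failure) are all hypothetical, and the character-window chunking is one assumed way to "split" content. Vectorization and the Gemini call are left out.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkAndTextParser(HTMLParser):
    """Collect href targets and visible text from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def crawl(start_url, fetch, max_depth=1, same_domain=True):
    """Breadth-first crawl from start_url; returns {url: page_text}.

    fetch is a hypothetical callable: fetch(url) -> HTML string or None.
    """
    start_host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    pages = {}
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        pages[url] = " ".join(parser.text_parts)
        if depth >= max_depth:
            continue  # recursion limit reached: do not follow links
        for href in parser.links:
            link = urldefrag(urljoin(url, href)).url  # absolute, no #fragment
            parsed = urlparse(link)
            if parsed.scheme not in ("http", "https"):
                continue
            if same_domain and parsed.netloc != start_host:
                continue
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return pages


def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows (assumed strategy)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library; a real fetcher would also need timeouts, error handling, and respect for `robots.txt`.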