Space: tbdavid2019/web-scraper-and-chatbot-rag


david committed on May 29

Commit

9fadb81

1 Parent(s): ab7c2db

Description


Files changed (1)

README.md +31 -1


@@ -7,7 +7,37 @@ sdk: streamlit
 sdk_version: 1.43.1
 app_file: app.py
 pinned: false
-short_description: Scrape, store, and query web data using RAG and AI chat.
+short_description: Web scraper and AI chatbot. Scrape, store, and query web data using RAG and AI chat.
 ---
 Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+---
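+## Traditional Chinese Description
+This project is an AI chatbot that combines a web scraper with RAG (Retrieval-Augmented Generation).
+- Enter any URL and the system automatically crawls the page (multi-level recursion and a same-domain restriction are configurable), splits the content into chunks, vectorizes them, and stores them in a local database.
+- You can then ask questions directly in Chinese or English; the system retrieves the most relevant passages from the crawled content and generates a reply with the Gemini LLM.
+- Supports Chinese semantic retrieval; suitable for knowledge management, website summarization, and FAQ applications.
+### Installation and Running
+1. Install dependencies: `pip install -r requirements.txt`
+2. Copy `example.env` to `.env` and fill in your Gemini API key
+3. Run: `streamlit run app.py`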
+---
+## English Description
+This project is a Web Scraper & RAG-based AI Chatbot.
+- Enter any website URL, and the system will crawl the page (with configurable recursion depth and same-domain restriction), split and vectorize the content, and store it in a local database.
+- You can then ask questions in Chinese or English. The system retrieves the most relevant content and generates answers using Gemini LLM.
+- Optimized for Chinese semantic search, suitable for knowledge management, website summarization, and FAQ scenarios.
+### Installation & Usage
+1. Install dependencies: `pip install -r requirements.txt`
+2. Copy `example.env` to `.env` and fill in your Gemini API key
+3. Run: `streamlit run app.py`
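
The README added in this commit mentions a crawl with "configurable recursion depth and same-domain restriction" but the diff does not show how `app.py` does it. The following is a minimal sketch of one way such a crawl could work, not the project's actual code. The names `crawl`, `LinkParser`, `fetch`, `max_depth`, and `same_domain` are all hypothetical, and `fetch` is assumed to be any callable that maps a URL to an HTML string (or `None` on failure), so the logic can be tried without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_depth=1, same_domain=True):
    """Breadth-first crawl. Returns {url: html} for every page fetched.

    `fetch` is a caller-supplied (hypothetical) function: url -> HTML or None.
    Pages at depth `max_depth` are fetched but their links are not followed.
    """
    start_host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    pages = {}
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:  # fetch failed or page unknown
            continue
        pages[url] = html
        if depth >= max_depth:  # depth limit reached, do not expand links
            continue
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            link, _fragment = urldefrag(urljoin(url, href))
            parts = urlparse(link)
            if parts.scheme not in ("http", "https"):
                continue
            if same_domain and parts.netloc != start_host:
                continue
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return pages


# Demo with an in-memory "site" standing in for real HTTP requests.
site = {
    "https://example.com/": '<a href="/a">A</a> <a href="https://other.org/">X</a>',
    "https://example.com/a": '<a href="/b">B</a>',
    "https://example.com/b": "leaf page",
    "https://other.org/": "external page",
}
print(sorted(crawl("https://example.com/", site.get, max_depth=1)))
```

Passing `fetch` in as a parameter keeps the depth and domain rules separate from the HTTP layer; in a real app it might wrap an HTTP client with timeouts and error handling, which this sketch leaves out.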