rag-tool

Running

App Files Files Community

Chris4K commited on Apr 28

Commit

ecd7ff9

verified ·

1 Parent(s): 69fde89

Create Smolagent.md

Browse files

Files changed (1) hide show

Smolagent.md +127 -0

Smolagent.md ADDED Viewed

	@@ -0,0 +1,127 @@

+# Advanced RAG Tool for smolagents
+This repository contains an improved Retrieval-Augmented Generation (RAG) tool built for the `smolagents` library from Hugging Face. This tool allows you to:
+- Create vector stores from various document types (PDF, TXT, HTML, etc.)
+- Choose different embedding models for better semantic understanding
+- Configure chunk sizes and overlaps for optimal text splitting
+- Select between different vector stores (FAISS or Chroma)
+- Share your tool on the Hugging Face Hub
+## Installation
+```bash
+pip install smolagents langchain-community langchain-text-splitters faiss-cpu chromadb sentence-transformers pypdf2 gradio
+```
+## Basic Usage
+```python
+from rag_tool import RAGTool
+# Initialize the RAG tool
+rag_tool = RAGTool()
+# Configure with custom settings
+rag_tool.configure(
+    documents_path="./my_document.pdf",
+    embedding_model="BAAI/bge-small-en-v1.5",
+    vector_store_type="faiss",
+    chunk_size=1000,
+    chunk_overlap=200,
+    persist_directory="./vector_store",
+    device="cpu"  # Use "cuda" for GPU acceleration
+)
+# Query the documents
+result = rag_tool("What is attention in transformer architecture?", top_k=3)
+print(result)
+```
+## Using with an Agent
+```python
+import warnings
+# Suppress LangChain deprecation warnings
+warnings.filterwarnings("ignore", category=DeprecationWarning)
+from smolagents import CodeAgent, InferenceClientModel
+from rag_tool import RAGTool
+# Initialize and configure the RAG tool
+rag_tool = RAGTool()
+rag_tool.configure(documents_path="./my_document.pdf")
+# Create an agent model
+model = InferenceClientModel(
+    model_id="mistralai/Mistral-7B-Instruct-v0.2",
+    token="your_huggingface_token"
+)
+# Create the agent with our RAG tool
+agent = CodeAgent(tools=[rag_tool], model=model, add_base_tools=True)
+# Run the agent
+result = agent.run("Explain the key components of the transformer architecture")
+print(result)
+```
+## Gradio Interface
+For an interactive experience, run the Gradio app:
+```bash
+python gradio_app.py
+```
+This provides a web interface where you can:
+- Upload documents
+- Configure embedding models and chunk settings
+- Query your documents with semantic search
+## Customization Options
+### Embedding Models
+You can choose from various embedding models:
+- `sentence-transformers/all-MiniLM-L6-v2` (fast, smaller model)
+- `BAAI/bge-small-en-v1.5` (good balance of performance and speed)
+- `BAAI/bge-base-en-v1.5` (better performance, slower)
+- `thenlper/gte-small` (good for general text embeddings)
+- `thenlper/gte-base` (larger GTE model)
+### Vector Store Types
+- `faiss`: Fast, in-memory vector database (better for smaller collections)
+- `chroma`: Persistent vector database with metadata filtering capabilities
+### Document Types
+The tool supports multiple document types:
+- PDF documents
+- Text files (.txt)
+- Markdown files (.md)
+- HTML files (.html)
+- Entire directories of mixed document types
+## Sharing Your Tool
+You can share your tool on the Hugging Face Hub:
+```python
+rag_tool.push_to_hub("your-username/rag-retrieval-tool", token="your_huggingface_token")
+```
+## Limitations
+- The tool currently doesn't support image content from PDFs
+- Very large documents may require additional memory
+- Some embedding models may be slow on CPU-only environments
+## Contributing
+Contributions are welcome! Feel free to open an issue or submit a pull request.
+## License
+MIT