Chris4K

Create Smolagent.md

ecd7ff9 verified 2 months ago

	# Advanced RAG Tool for smolagents

	This repository contains an improved Retrieval-Augmented Generation (RAG) tool built for Hugging Face's `smolagents` library. It lets you:

	- Create vector stores from various document types (PDF, TXT, HTML, etc.)
	- Choose different embedding models for better semantic understanding
	- Configure chunk sizes and overlaps for optimal text splitting
	- Select between different vector stores (FAISS or Chroma)
	- Share your tool on the Hugging Face Hub

	## Installation

	```bash
	pip install smolagents langchain-community langchain-text-splitters faiss-cpu chromadb sentence-transformers pypdf2 gradio
	```

	## Basic Usage

	```python
	from rag_tool import RAGTool

	# Initialize the RAG tool
	rag_tool = RAGTool()

	# Configure with custom settings
	rag_tool.configure(
	    documents_path="./my_document.pdf",
	    embedding_model="BAAI/bge-small-en-v1.5",
	    vector_store_type="faiss",
	    chunk_size=1000,
	    chunk_overlap=200,
	    persist_directory="./vector_store",
	    device="cpu",  # Use "cuda" for GPU acceleration
	)

	# Query the documents
	result = rag_tool("What is attention in transformer architecture?", top_k=3)
	print(result)
	```
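
To make the `chunk_size` and `chunk_overlap` settings concrete, here is a minimal sketch of fixed-size character chunking with overlap. The function `split_with_overlap` is hypothetical and is not part of `rag_tool`; the tool itself most likely relies on a splitter from `langchain-text-splitters`, which also takes separators such as paragraph breaks into account.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into character windows of at most chunk_size characters.

    Each window shares chunk_overlap characters with the previous one.
    Illustrative only; not the tool's actual splitter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # distance between window starts
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)
```

In this sketch, `chunk_size=1000` with `chunk_overlap=200` means each new chunk starts 800 characters after the previous one, so a sentence cut at a chunk boundary still appears whole in one of the two neighboring chunks if it is shorter than the overlap.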

	## Using with an Agent

	```python
	import warnings
	# Suppress LangChain deprecation warnings
	warnings.filterwarnings("ignore", category=DeprecationWarning)

	from smolagents import CodeAgent, InferenceClientModel
	from rag_tool import RAGTool

	# Initialize and configure the RAG tool
	rag_tool = RAGTool()
	rag_tool.configure(documents_path="./my_document.pdf")

	# Create an agent model
	model = InferenceClientModel(
	    model_id="mistralai/Mistral-7B-Instruct-v0.2",
	    token="your_huggingface_token",
	)

	# Create the agent with our RAG tool
	agent = CodeAgent(tools=[rag_tool], model=model, add_base_tools=True)

	# Run the agent
	result = agent.run("Explain the key components of the transformer architecture")
	print(result)
	```

	## Gradio Interface

	For an interactive experience, run the Gradio app:

	```bash
	python gradio_app.py
	```

	This provides a web interface where you can:
	- Upload documents
	- Configure embedding models and chunk settings
	- Query your documents with semantic search

	## Customization Options

	### Embedding Models

	You can choose from various embedding models:
	- `sentence-transformers/all-MiniLM-L6-v2` (fast, smaller model)
	- `BAAI/bge-small-en-v1.5` (good balance of performance and speed)
	- `BAAI/bge-base-en-v1.5` (better performance, slower)
	- `thenlper/gte-small` (good for general text embeddings)
	- `thenlper/gte-base` (larger GTE model)
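
Whichever model you choose, retrieval generally works by embedding the query and ranking the stored chunks by vector similarity. The sketch below illustrates that ranking step with hand-written toy vectors in place of real embeddings. `cosine_similarity` and `top_k_chunks` are hypothetical helpers, not part of `rag_tool`, and the actual vector stores may use a different distance metric.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(
    query_vec: list[float],
    chunk_vecs: dict[str, list[float]],
    top_k: int = 3,
) -> list[str]:
    """Return the names of the top_k chunks most similar to the query."""
    ranked = sorted(
        chunk_vecs,
        key=lambda name: cosine_similarity(query_vec, chunk_vecs[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy 3-dimensional "embeddings"; real models produce hundreds of dimensions.
toy_index = {
    "attention": [0.9, 0.1, 0.0],
    "tokenization": [0.1, 0.9, 0.0],
    "licensing": [0.0, 0.0, 1.0],
}
print(top_k_chunks([1.0, 0.2, 0.0], toy_index, top_k=2))
```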

	### Vector Store Types

	- `faiss`: Fast, in-memory vector index (better for smaller collections that fit in RAM)
	- `chroma`: Persistent vector database with metadata filtering capabilities

	### Document Types

	The tool supports multiple document types:
	- PDF documents
	- Text files (.txt)
	- Markdown files (.md)
	- HTML files (.html)
	- Entire directories of mixed document types
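
When `documents_path` points to a directory, the files in it have to be discovered and matched against the supported extensions before loading. A minimal sketch of that discovery step follows. The `SUPPORTED_EXTENSIONS` set and the `collect_documents` function are assumptions made for illustration, not the tool's actual implementation.

```python
from pathlib import Path

# Assumed mapping from the list above; the real tool may accept more types.
SUPPORTED_EXTENSIONS = {".pdf", ".txt", ".md", ".html"}


def collect_documents(documents_path: str) -> list[Path]:
    """Return supported files under documents_path (a file or a directory)."""
    root = Path(documents_path)
    if not root.exists():
        raise FileNotFoundError(f"No such file or directory: {documents_path}")
    if root.is_file():
        return [root] if root.suffix.lower() in SUPPORTED_EXTENSIONS else []
    # Walk the directory tree recursively and keep only supported file types.
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Comparing the lowercased suffix means files such as `REPORT.PDF` are picked up as well.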

	## Sharing Your Tool

	You can share your tool on the Hugging Face Hub:

	```python
	rag_tool.push_to_hub("your-username/rag-retrieval-tool", token="your_huggingface_token")
	```
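
The examples in this README pass the token as a literal string for brevity. It is generally safer to keep the token out of source code. Here is a minimal sketch, assuming the token is stored in an environment variable named `HF_TOKEN`; the helper `get_hf_token` is hypothetical and not part of `rag_tool` or `smolagents`.

```python
import os


def get_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Read the Hugging Face token from the environment, failing loudly if unset."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Hugging Face token"
        )
    return token
```

The result can then be passed wherever a token is expected, for example `rag_tool.push_to_hub("your-username/rag-retrieval-tool", token=get_hf_token())`.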

	## Limitations

	- The tool does not currently extract image content from PDFs
	- Very large documents may require additional memory
	- Some embedding models may be slow in CPU-only environments

	## Contributing

	Contributions are welcome! Feel free to open an issue or submit a pull request.

	## License

	MIT