Search Arena
Search Arena is a platform for evaluating and comparing search-based web agents. It scores agents on several metrics so that users can identify the one best suited to their needs.
Key Features
- Output Evaluations: Analyze the quality and relevance of search results.
- Perplexity: Measure the predictive uncertainty of language models used by the agents.
- Exa (Exhaustiveness Analysis): Assess the breadth and depth of search coverage.
- Multi-Agent Comparison: Compare multiple agents side-by-side.
- Customizable Benchmarks: Define specific benchmarks and criteria for evaluation.
- User Feedback Integration: Incorporate user feedback to improve agent performance.
- Performance Metrics: Detailed reports on response time, precision, recall, and F1 score.
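As a rough illustration of two metrics named above, the sketch below computes set-based precision, recall, and F1 for a single query, plus perplexity from per-token log-probabilities. It is not taken from the Search Arena codebase: the function names `precision_recall_f1` and `perplexity` and the sample inputs are assumptions made for this example.

```python
import math


def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F1 for one query.

    `retrieved` is what an agent returned and `relevant` is the ground truth.
    Both names are hypothetical, not part of Search Arena's API.
    """
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def perplexity(token_log_probs):
    """Perplexity as exp of the negative mean natural-log token probability."""
    if not token_log_probs:
        raise ValueError("token_log_probs must not be empty")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


if __name__ == "__main__":
    # Illustrative inputs only: four returned URLs, three relevant ones.
    p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], ["a", "b", "e"])
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
    print(f"perplexity={perplexity([math.log(0.25)] * 4):.3f}")
```

Lower perplexity means the model found the text less surprising. Precision, recall, and F1 all range from 0 to 1, with higher values being better.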
Benefits
- Enhanced Decision-Making: Make informed decisions with clear, data-driven evaluations.
- Optimization: Help developers optimize their search agents.
- Innovation: Foster innovation by promoting the best-performing search technologies.
Getting Started
Follow these steps to set up the project on your local machine.
Prerequisites
- Git
- Python 3.x
- A virtual environment (optional but recommended; the steps below use Python's built-in venv module)
Installation
Clone the Repository:
git clone https://github.com/leowalker89/SearchArena
Navigate to the Project Directory:
cd SearchArena
Create a Virtual Environment:
python -m venv env
Activate the Virtual Environment:
On Windows:
.\env\Scripts\activate
On macOS and Linux:
source env/bin/activate
Install the Required Dependencies:
pip install -r requirements.txt
Running the Project
Start the Development Server:
streamlit run app.py
Open Your Browser:
Navigate to
http://localhost:8501
(Streamlit's default port; use the URL printed in the terminal if it differs) to view the platform.
Usage
- Follow the on-screen instructions to evaluate and compare search-based web agents.
- Customize benchmarks and criteria as needed.
- Analyze detailed reports and visualizations to make informed decisions.
Contributing
We welcome contributions! Please read our Contributing Guide to get started.
License
This project is licensed under the MIT License. See the LICENSE file for more details.
Contact
If you have any questions, feel free to open an issue or contact us at [email protected].