AI2 WildBench Leaderboard (V2)
🦁
Display and explore model leaderboards and chat history
Display chatbot leaderboard statistics
Track, rank and evaluate open LLMs and chatbots
Select benchmarks and languages for text embeddings evaluation
Explore LLM performance across hardware
Submit code models for evaluation on benchmarks
Request evaluation for speech models
Explore and analyze RewardBench leaderboard data
Jailbreak the LLM and privacy guardrails