Open Japanese LLM Leaderboard 🌸 Explore and compare LLM models through interactive leaderboards and submissions • 76
BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models Paper • 2502.07346 • Published Feb 11 • 51
multilingual_domain_datasets Collection Multilingual datasets. Excluding those which are just a cleaned version of CC. • 3 items • Updated 25 days ago
multilingual_benchmark Collection For evaluating multilingual ability of LLMs • 1 item • Updated 29 days ago
Corpus: Evaluation datasets for ES & LATAM Collection Corpus of La Leaderboard, the open LLM leaderboard for ES & LATAM • 56 items • Updated Feb 5 • 4