-
216
MMLU-Pro Leaderboard
More advanced and challenging multi-task evaluation
-
49
Stick To Your Role! Leaderboard
Benchmarking LLMs on the stability of simulated populations
-
52
ZeroEval Leaderboard
Embed and use ZeroEval for evaluation tasks
-
26
Decentralized Arena Leaderboard
Display model leaderboard evaluations
Hristo Panev
hppdqdq
AI & ML interests
None yet
Recent Activity
liked a Space about 3 hours ago
Wildminder/comfyui-sampler-scheduler
liked a model about 7 hours ago
nunchaku-tech/nunchaku-flux.1-krea-dev
liked a model about 23 hours ago
Lingyuzhou/Hyper_Flux.1_Dev_4_step_Lora
Organizations
None yet