judgerbench_leaderboard/data/detail_b_acc.csv
linjunyao
added leaderboard data; added Class coloring
0bb476f
Models,AlignBench,Fofo,WildBench,ArenaHard,Average,Class
CJ-1-32B,0.857,0.806,0.596,0.621,0.72,Judge
CJ-1-14B,0.839,0.787,0.566,0.602,0.699,Judge
CJ-1-7B,0.816,0.783,0.564,0.586,0.687,Judge
Qwen2.5-72B-Chat,0.878,0.677,0.599,0.57,0.681,General
CJ-1-1.5B,0.822,0.712,0.55,0.43,0.629,Judge
Qwen2-72B-Chat,0.867,0.692,0.564,0.376,0.625,General
Selftaught-llama3.1-70B,0.755,0.627,0.538,0.472,0.598,Judge
Qwen2.5-7B-Chat,0.777,0.67,0.47,0.444,0.59,General