linjunyao
added leaderboard data; added Class coloring
0bb476f
Models,JDB-A EN,JDB-A CN,JDB-B Acc,JDB-B Corr,JudgerBench,Class
GPT-4o-0806,0.664,0.608,1,1,0.818,API
CJ-1-32B,0.614,0.612,0.72,0.963,0.727,Judge
CJ-1-14B,0.599,0.615,0.699,0.959,0.718,Judge
Qwen2.5-72B-Chat,0.615,0.59,0.681,0.937,0.706,General
CJ-1-7B,0.57,0.583,0.687,0.948,0.697,Judge
Qwen2-72B-Chat,0.588,0.584,0.625,0.935,0.683,General
CJ-1-1.5B,0.553,0.527,0.629,0.905,0.654,Judge
Qwen2.5-7B-Chat,0.567,0.535,0.59,0.874,0.641,General
Selftaught-llama3.1-70B,0.443,0.57,0.598,0.869,0.62,Judge
Skywork-llama3.1-8B,0.63,0.605,-,-,-,Judge