-
846
Open VLM Leaderboard
🌎VLMEvalKit Evaluation Results Collection
-
117
Open VLM Video Leaderboard
🌎VLMEvalKit Eval Results in video understanding benchmark
-
40
Open LMM Reasoning Leaderboard
🥇A Leaderboard that demonstrates LMM reasoning capabilities
-
22
MMBench Leaderboard
🚀View and filter MMBench leaderboard data
Organization Card
👋 Join us on Discord and WeChat
Follow us on GitHub
OpenCompass is a platform focused on the evaluation of AGI, including large language models and multi-modality models. We aim to:
- develop high-quality libraries that make evaluation easier
- provide convincing leaderboards that improve the understanding of large models
- create powerful toolchains targeting a variety of abilities and tasks
- build solid benchmarks to support large model research
CompassVerifier: A Unified and Robust Verifier for Large Language Models
spaces
17
pinned
Running
24
RISEBench Gallery
👀
A Gallery of Generation Results on RISEBench
pinned
Running
3
Open LMM Spatial Leaderboard
🥇
A Leaderboard for LMM spatial understanding capabilities
pinned
Running
26
Open LMM Subjective Leaderboard
🌎
VLMEvalKit Subjective Benchmark Results
pinned
Running
3
CompassAcademic Leaderboard Full Version
🦀
Compass Academic Leaderboard Full Version
pinned
Running
40
Open LMM Reasoning Leaderboard
🥇
A Leaderboard that demonstrates LMM reasoning capabilities
pinned
Running
6
Compass Academic Leaderboard
🦀
Compass Academic Leaderboard
models
13

opencompass/CompassJudger-2-7B-Instruct
Text Ranking
•
8B
•
Updated
•
266
•
2

opencompass/CompassJudger-2-32B-Instruct
Text Ranking
•
33B
•
Updated
•
117
•
2

opencompass/CompassVerifier-32B
33B
•
Updated
•
14
•
5

opencompass/CompassVerifier-7B
8B
•
Updated
•
79
•
4

opencompass/CompassVerifier-3B
3B
•
Updated
•
45
•
2

opencompass/anah-7b
Text Classification
•
8B
•
Updated
•
3

opencompass/anah-20b
Text Classification
•
20B
•
Updated
•
4

opencompass/anah-v2
Text Classification
•
8B
•
Updated
•
5
•
4

opencompass/CompassJudger-1-14B-Instruct
Text Generation
•
15B
•
Updated
•
6
•
2

opencompass/CompassJudger-1-32B-Instruct
Text Generation
•
33B
•
Updated
•
15
•
17
datasets
14
opencompass/LiveMathBench
Viewer
•
Updated
•
483
•
1.24k
•
9
opencompass/CodeForce_SAGA
Viewer
•
Updated
•
5.57k
•
238
•
1
opencompass/CodeCompass
Updated
•
316
•
1
opencompass/VerifierBench
Viewer
•
Updated
•
2.82k
•
246
•
1
opencompass/NeedleBench
Viewer
•
Updated
•
6.8k
•
8.39k
•
5
opencompass/compass_academic_predictions
Viewer
•
Updated
•
4.42M
•
65
opencompass/Creation-MMBench
Viewer
•
Updated
•
765
•
105
•
2
opencompass/anah
Viewer
•
Updated
•
783
•
87
•
3
opencompass/AIME2025
Viewer
•
Updated
•
30
•
7.1k
•
26
opencompass/mmmlu_lite
Viewer
•
Updated
•
20k
•
46
•
2