---
title: README
emoji: 🌍
colorFrom: blue
colorTo: blue
sdk: static
pinned: false
---

On Path to Multimodal Generalist: Levels and Benchmarks

[📖 Project] [🏆 Leaderboard] [📄 Paper] [🤗 Dataset-HF] [📝 Dataset-Github]

[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/license/mit)

---

Does higher performance across tasks indicate a stronger MLLM, and one closer to AGI?
NO! Synergy does.

This project introduces:

1. **General-Level**, a 5-level evaluation system that sets a new norm for assessing multimodal generalists (multimodal LLMs/agents). Its core is the use of Synergy as the evaluative criterion: MLLMs are categorized by whether they preserve synergy across comprehension and generation, as well as across multimodal interactions.
2. **General-Bench**, a companion massive multimodal benchmark dataset that encompasses a broad spectrum of skills, modalities, formats, and capabilities, with over 700 tasks and 325K instances.