WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences Paper • 2406.11069 • Published Jun 16, 2024 • 14
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos Paper • 2406.08407 • Published Jun 12, 2024 • 25
Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? Paper • 2406.07546 • Published Jun 11, 2024 • 8