Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search Paper • 2412.18319 • Published 20 days ago • 35
YuLan-Mini: An Open Data-efficient Language Model Paper • 2412.17743 • Published 20 days ago • 61
Open Image Preferences Collection • Contains all artifacts for the Stable Diffusion 3.5L vs Flux Dev image preference community sprint. • 14 items • Updated 25 days ago • 6
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters Paper • 2408.03314 • Published Aug 6, 2024 • 54
data-is-better-together/open-image-preferences-v1-binarized Viewer • Updated Dec 9, 2024 • 7.46k • 1.6k • 43
🥇 Open LMM Reasoning Leaderboard • A leaderboard that demonstrates LMM reasoning capabilities • Running • 25
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published about 1 month ago • 137
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding Paper • 2412.09604 • Published Dec 12, 2024 • 35
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper • 2412.09596 • Published Dec 12, 2024 • 92