Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V Paper • 2310.11441 • Published Oct 17, 2023 • 27
YOLOv12: Attention-Centric Real-Time Object Detectors Paper • 2502.12524 • Published 13 days ago • 10
AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO Paper • 2502.14669 • Published 11 days ago • 11
Ola Collection Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment • 4 items • Updated 10 days ago • 2
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation Paper • 2502.14846 • Published 11 days ago • 13
Rethinking Diverse Human Preference Learning through Principal Component Analysis Paper • 2502.13131 • Published 13 days ago • 35
Article Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 14 days ago • 90
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published 18 days ago • 182
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters! Paper • 2502.07374 • Published 20 days ago • 36
Article Fine-tune Deepseek-R1 with a Synthetic Reasoning Dataset By sdiazlor • 21 days ago • 45
Article From Llasa to Llasagna 🍕: Finetuning LLaSA to generates Italian speech and other languages By Steveeeeeeen and 1 other • 20 days ago • 25
Article Topic 27: What are Chain-of-Agents and Chain-of-RAG? By Kseniase and 1 other • 18 days ago • 12
Article From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub 20 days ago • 49
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published 21 days ago • 141
Expect the Unexpected: FailSafe Long Context QA for Finance Paper • 2502.06329 • Published 21 days ago • 126