Submitted by EricW123456 43 CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering · 5 authors 12 1
Submitted by llwswyn 39 Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning · 18 authors 61 1
Submitted by yuntian-deng 37 NeuralOS: Towards Simulating Operating Systems via Neural Generative Models · 5 authors 4 4
Submitted by yukimasano 24 KV Cache Steering for Inducing Reasoning in Small Language Models · 6 authors 3
Submitted by iliashum 23 Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities · 3303 authors 2
Submitted by JacobYuan 20 Lumos-1: On Autoregressive Video Generation from a Unified Model Perspective · 14 authors 56 1
Submitted by xwen99 11 Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation · 8 authors 1
Submitted by Ksgk-fy 6 What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models · 4 authors 1
Submitted by Raincleared 3 BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity · 8 authors 3 1
Submitted by ustc-zhangzm 3 Robust Multimodal Large Language Models Against Modality Conflict · 4 authors 2 1
Submitted by maitysubhajit 1 Doodle Your Keypoints: Sketch-Based Few-Shot Keypoint Detection · 6 authors 1
Submitted by nverma 1 DOTResize: Reducing LLM Width via Discrete Optimal Transport-based Neuron Merging · 3 authors 1
Submitted by Sreyan88 - Audio Flamingo 3: Advancing Audio Intelligence with Fully Open Large Audio Language Models · 11 authors 1