OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents Paper • 2407.00114 • Published Jun 27 • 12
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs Paper • 2406.07476 • Published Jun 11 • 32
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination Paper • 2406.05132 • Published Jun 7 • 27
Tulu V2 Suite Collection The set of models associated with the paper "Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2" • 19 items • Updated 2 days ago • 42