Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning Paper • 2412.11974 • Published 11 days ago • 8
M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework Paper • 2411.06176 • Published Nov 9 • 44