RetrieveGPT: Merging Prompts and Mathematical Models for Enhanced Code-Mixed Information Retrieval Paper • 2411.04752 • Published 9 days ago • 15
M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding Paper • 2411.04952 • Published 9 days ago • 25
Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models Paper • 2411.05005 • Published 8 days ago • 13
DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation Paper • 2411.04999 • Published 8 days ago • 16
Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks? Paper • 2411.05000 • Published 8 days ago • 20
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models Paper • 2411.04996 • Published 8 days ago • 46
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion Paper • 2411.04928 • Published 9 days ago • 42
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published 9 days ago • 100