meta-llama/Llama-4-Scout-17B-16E-Instruct Image-Text-to-Text • 109B • Updated May 22 • 777k • 1.05k
meta-llama/Llama-4-Maverick-17B-128E-Instruct Image-Text-to-Text • 402B • Updated May 22 • 51.1k • 394
Multimodal Autoregressive Pre-training of Large Vision Encoders Paper • 2411.14402 • Published Nov 21, 2024 • 47
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders Paper • 2408.15998 • Published Aug 28, 2024 • 88
Text2SQL is Not Enough: Unifying AI and Databases with TAG Paper • 2408.14717 • Published Aug 27, 2024 • 27
Attention Overflow: Language Model Input Blur during Long-Context Missing Items Recommendation Paper • 2407.13481 • Published Jul 18, 2024 • 10
Fast Matrix Multiplications for Lookup Table-Quantized LLMs Paper • 2407.10960 • Published Jul 15, 2024 • 13
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities Paper • 2407.14482 • Published Jul 19, 2024 • 27
Benchmarking Trustworthiness of Multimodal Large Language Models: A Comprehensive Study Paper • 2406.07057 • Published Jun 11, 2024 • 17
E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS Paper • 2406.18009 • Published Jun 26, 2024 • 23