Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions Paper • 2412.08737 • Published 15 days ago • 51
TLDR: Token-Level Detective Reward Model for Large Vision Language Models Paper • 2410.04734 • Published Oct 7 • 16
On Retrieval Augmentation and the Limitations of Language Model Training Paper • 2311.09615 • Published Nov 16, 2023 • 1
DeLLMa: A Framework for Decision Making Under Uncertainty with Large Language Models Paper • 2402.02392 • Published Feb 4 • 5
IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations Paper • 2404.01266 • Published Apr 1 • 2
LiveBench: A Challenging, Contamination-Free LLM Benchmark Paper • 2406.19314 • Published Jun 27 • 20
Pre-trained Large Language Models Use Fourier Features to Compute Addition Paper • 2406.03445 • Published Jun 5
BLINK: Multimodal Large Language Models Can See but Not Perceive Paper • 2404.12390 • Published Apr 18 • 24
Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models Paper • 2312.03052 • Published Dec 5, 2023
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering Paper • 2303.11897 • Published Mar 21, 2023
Training Language Models to Generate Text with Citations via Fine-grained Rewards Paper • 2402.04315 • Published Feb 6
Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Models Paper • 2310.17086 • Published Oct 26, 2023 • 1
DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback Paper • 2311.17946 • Published Nov 29, 2023 • 1