ViCrop: Perceiving Small Visual Details in Zero-shot Visual Question Answering with Multimodal Large Language Models Paper • 2310.16033 • Published Oct 24, 2023
Exploring Perceptual Limitation of Multimodal Large Language Models Paper • 2402.07384 • Published Feb 12, 2024
PINTO: Faithful Language Reasoning Using Prompt-Generated Rationales Paper • 2211.01562 • Published Nov 3, 2022
COLUMBUS: Evaluating COgnitive Lateral Understanding through Multiple-choice reBUSes Paper • 2409.04053 • Published Sep 6, 2024
MARVEL: Multidimensional Abstraction and Reasoning through Visual Evaluation and Learning Paper • 2404.13591 • Published Apr 21, 2024