RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models Paper • 2411.04097 • Published Nov 6 • 5
Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination Paper • 2411.03823 • Published Nov 6 • 43
CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis Paper • 2407.13301 • Published Jul 18 • 55
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments Paper • 2404.07972 • Published Apr 11 • 46
StanfordAIMI/XrayCLIP__vit-b-16__laion2b-s34b-b88k Zero-Shot Image Classification • Updated Mar 11 • 2.58k • 1
Clinical Text Summarization: Adapting Large Language Models Can Outperform Human Experts Paper • 2309.07430 • Published Sep 14, 2023 • 27
CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation Paper • 2401.12208 • Published Jan 22 • 22
CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation Paper • 2401.12208 • Published Jan 22 • 22