"Principal Components" Enable A New Language of Images Paper • 2503.08685 • Published 3 days ago • 10
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning Paper • 2503.06960 • Published 4 days ago • 2
Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More Paper • 2502.03738 • Published Feb 6 • 11
Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability Paper • 2412.18551 • Published Dec 24, 2024
Generalization Beyond Data Imbalance: A Controlled Study on CLIP for Transferable Insights Paper • 2405.21070 • Published May 31, 2024
CLIPS: An Enhanced CLIP Framework for Learning with Synthetic Captions Paper • 2411.16828 • Published Nov 25, 2024 • 1
AdaVAE: Exploring Adaptive GPT-2s in Variational Auto-Encoders for Language Modeling Paper • 2205.05862 • Published May 12, 2022
A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor? Paper • 2409.15277 • Published Sep 23, 2024 • 36
VHELM: A Holistic Evaluation of Vision Language Models Paper • 2410.07112 • Published Oct 9, 2024 • 3
Code-Survey: An LLM-Driven Methodology for Analyzing Large-Scale Codebases Paper • 2410.01837 • Published Sep 24, 2024
Story-Adapter: A Training-free Iterative Framework for Long Story Visualization Paper • 2410.06244 • Published Oct 8, 2024 • 19
Tuning LayerNorm in Attention: Towards Efficient Multi-Modal LLM Finetuning Paper • 2312.11420 • Published Dec 18, 2023 • 2