ABC: Achieving Better Control of Multimodal Embeddings using VLMs Paper • 2503.00329 • Published 12 days ago • 18
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate Paper • 2501.17703 • Published Jan 29 • 55
VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation Paper • 2412.00927 • Published Dec 1, 2024 • 26