PaliGemma 2: A Family of Versatile VLMs for Transfer • Paper • 2412.03555 • Published Dec 4, 2024 • 122
PaliGemma 2 Release • Collection • Vision-Language Models available in multiple 3B, 10B and 28B variants. • 23 items • Updated about 1 month ago • 126
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion • Paper • 2412.04424 • Published Dec 5, 2024 • 59
CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models • Paper • 2411.18613 • Published Nov 27, 2024 • 50
microsoft/LLM2CLIP-Llama-3-8B-Instruct-CC-Finetuned • Zero-Shot Classification • Updated Nov 19, 2024 • 9.18k • 31
A Case Study of Web App Coding with OpenAI Reasoning Models • Paper • 2409.13773 • Published Sep 19, 2024 • 6