SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features • Paper • 2502.14786 • Published 11 days ago • 121
microsoft/LLM2CLIP-Llama-3-8B-Instruct-CC-Finetuned • Zero-Shot Classification • Updated Nov 19, 2024 • 3.65k • 32
LLM2CLIP • Collection • LLM2CLIP makes SOTA pretrained CLIP models more SOTA than ever. • 10 items • Updated Jan 8 • 55
Salesforce/xgen-mm-phi3-mini-instruct-interleave-r-v1.5 • Image-Text-to-Text • Updated 29 days ago • 5.61k • 51