shail-2512's Collections
VLMs
Updated Dec 2, 2024
HuggingFaceTB/SmolVLM-Instruct • Image-Text-to-Text • Updated Dec 2, 2024 • 41.6k • 319
microsoft/OmniParser • Image-Text-to-Text • Updated Dec 2, 2024 • 967 • 1.52k
vidore/colsmolvlm-alpha • Updated Dec 3, 2024 • 3.62k • 46
meta-llama/Llama-3.2-11B-Vision-Instruct • Image-Text-to-Text • Updated Dec 4, 2024 • 2.46M • 1.21k
Qwen/Qwen2-VL-7B-Instruct • Image-Text-to-Text • Updated 1 day ago • 1.66M • 1.04k
mistral-community/pixtral-12b • Image-Text-to-Text • Updated 21 days ago • 28.1k • 81
HuggingFaceM4/Idefics3-8B-Llama3 • Image-Text-to-Text • Updated Dec 2, 2024 • 14.2k • 261
allenai/Molmo-7B-O-0924 • Image-Text-to-Text • Updated Nov 15, 2024 • 13.5k • 151