Multimodal LLMs - a eloukas Collection

Multimodal LLMs

updated Apr 28

nvidia/NVLM-D-72B

Image-Text-to-Text • 79B • Updated Jan 14 • 51.1k • 772
gpt-omni/mini-omni2

Any-to-Any • Updated Oct 24, 2024 • 56 • 276
black-forest-labs/FLUX.1-dev

Text-to-Image • Updated Jun 27 • 1.47M • 11.2k
vikhyatk/moondream1

Text Generation • 2B • Updated Feb 7, 2024 • 32.4k • 485
HuggingFaceTB/SmolVLM-Instruct

Image-Text-to-Text • 2B • Updated Apr 8 • 82.3k • 534
MoritzLaurer/deberta-v3-large-zeroshot-v2.0

Zero-Shot Classification • 0.4B • Updated Apr 11, 2024 • 50k • 104
HuggingFaceTB/SmolVLM-256M-Instruct

Image-Text-to-Text • 0.3B • Updated Apr 8 • 516k • 274
HuggingFaceTB/SmolVLM-500M-Instruct

Image-Text-to-Text • 0.5B • Updated Apr 8 • 39k • 168
KRLabsOrg/lettucedect-large-modernbert-en-v1

Token Classification • 0.4B • Updated Apr 1 • 15.9k • 26
Qwen/Qwen2.5-VL-72B-Instruct

Image-Text-to-Text • 73B • Updated Jun 6 • 744k • 525
ds4sd/SmolDocling-256M-preview

Image-Text-to-Text • 0.3B • Updated May 16 • 36.6k • 1.54k