merve's Collections
Video Language Models
Updated Aug 1
A collection of video-language models
🐨 Video Llava • Running on Zero • 18
llava-hf/LLaVA-NeXT-Video-7B-hf • Video-Text-to-Text • Updated Oct 8 • 128k • 47
llava-hf/LLaVA-NeXT-Video-7B-DPO-hf • Video-Text-to-Text • Updated Aug 16 • 2.9k • 8
llava-hf/LLaVA-NeXT-Video-7B-32K-hf • Image-Text-to-Text • Updated Aug 16 • 144 • 6
🌋 Llava Interleave • Running on Zero • 29