Vision - a diwank Collection

Vision

updated 1 day ago

apple/DepthPro

Depth Estimation • Updated Oct 9, 2024 • 2.05k • 382
rhymes-ai/Aria

Image-Text-to-Text • Updated 25 days ago • 28.4k • 603
mit-han-lab/hart-0.7b-1024px

Unconditional Image Generation • Updated Nov 17, 2024 • 9
deepseek-ai/Janus-1.3B

Any-to-Any • Updated Nov 14, 2024 • 10.3k • 501
neulab/PangeaInstruct

Updated Oct 25, 2024 • 348 • 78
genmo/mochi-1-preview

Text-to-Video • Updated 24 days ago • 43k • 1.14k
stabilityai/stable-diffusion-3.5-large

Text-to-Image • Updated Oct 22, 2024 • 130k • 1.85k
Freepik/flux.1-lite-8B-alpha

Text-to-Image • Updated 13 days ago • 4.72k • 403
microsoft/OmniParser

Image-Text-to-Text • Updated Dec 2, 2024 • 1.03k • 1.52k
mistralai/Pixtral-12B-Base-2409

Updated Oct 30, 2024 • 69
neulab/Pangea-7B

Updated Oct 24, 2024 • 7.12k • 122
jadechoghari/Ferret-UI-Llama8b

Image-Text-to-Text • Updated 4 days ago • 450 • 59
OpenGVLab/InternVL2-1B

Image-Text-to-Text • Updated 25 days ago • 58.4k • 59
OpenGVLab/InternVL2-2B

Image-Text-to-Text • Updated 25 days ago • 64.6k • 65
OpenGVLab/Mono-InternVL-2B

Image-Text-to-Text • Updated Nov 21, 2024 • 4.14k • 30
OpenGVLab/OmniCorpus-YT

Updated Nov 17, 2024 • 283 • 9
OpenGVLab/OmniCorpus-CC-210M

Viewer • Updated Nov 17, 2024 • 208M • 626 • 19
OpenGVLab/OmniCorpus-CC

Viewer • Updated Nov 17, 2024 • 986M • 12.5k • 12
OpenGVLab/InternVideo2_chat_8B_HD

Video-Text-to-Text • Updated 25 days ago • 791 • 17
OpenGVLab/ViCLIP

Updated Jun 7, 2024 • 33
OpenGVLab/ASMv2

Text Generation • Updated Feb 29, 2024 • 70 • 17
OpenGVLab/VideoChat2-IT

Viewer • Updated Jun 29, 2024 • 1.82M • 658 • 47
NimVideo/cogvideox-2b-img2vid

Image-to-Video • Updated Oct 28, 2024 • 250 • 63
BAAI/Infinity-MM

Updated about 1 month ago • 19.4k • 87
nvidia/RADIO-H

Updated Dec 2, 2024 • 2.11k • 9
Spawning/PD12M

Viewer • Updated 3 days ago • 12.4M • 1.5k • 146
Shitao/OmniGen-v1

Text-to-Image • Updated Nov 7, 2024 • 9.66k • 280
InstantX/InstantIR

Image-to-Image • Updated Nov 7, 2024 • 6 • 162
nvidia/Cosmos-0.1-Tokenizer-DI8x8

Updated 18 days ago • 279 • 9
BAAI/Emu3-Chat

Text Generation • Updated Oct 24, 2024 • 805 • 71
briaai/RMBG-2.0

Image Segmentation • Updated 20 days ago • 281k • 569
Watermark Anything with Localized Messages

Paper • 2411.07231 • Published Nov 11, 2024 • 20
rain1011/pyramid-flow-miniflux

Text-to-Video • Updated Nov 13, 2024 • 160
OpenGVLab/InternVL2-8B-MPO

Image-Text-to-Text • Updated 23 days ago • 1.48k • 34
mistralai/Pixtral-Large-Instruct-2411

Image-Text-to-Text • Updated 17 days ago • 2 • 385
briaai/BRIA-2.3

Text-to-Image • Updated Nov 19, 2024 • 451 • 30
microsoft/Reducio-VAE

Updated Nov 21, 2024 • 8 • 15
Lightricks/LTX-Video

Image-to-Video • Updated 24 days ago • 83.9k • 848
apple/aimv2-3B-patch14-448

Image Feature Extraction • Updated Nov 28, 2024 • 404 • 8
THUdyh/Insight-V-Reason

Text Generation • Updated Nov 22, 2024 • 32 • 9
black-forest-labs/FLUX.1-Fill-dev

Updated Nov 25, 2024 • 35.4k • 456
Efficient-Large-Model/Sana_1600M_512px

Text-to-Image • Updated 1 day ago • 674 • 37
Efficient-Large-Model/Sana_1600M_1024px

Text-to-Image • Updated 1 day ago • 14.3k • 154
AIDC-AI/Ovis1.6-Gemma2-27B

Image-Text-to-Text • Updated Dec 10, 2024 • 1.06k • 59
HuggingFaceTB/SmolVLM-Base

Image-Text-to-Text • Updated Nov 28, 2024 • 9.71k • 51
THUDM/glm-edge-v-5b

Image-Text-to-Text • Updated 10 days ago • 132 • 11
rhymes-ai/Aria-Base-64K

Image-Text-to-Text • Updated Dec 1, 2024 • 3.29k • 11
allenai/pixmo-point-explanations

Viewer • Updated Dec 5, 2024 • 79.6k • 240 • 6
tencent/HunyuanVideo

Text-to-Video • Updated 25 days ago • 8.68k • 1.39k
tencent/HunyuanVideo-PromptRewrite

Updated Dec 6, 2024 • 74 • 40
google/paligemma2-28b-pt-896

Image-Text-to-Text • Updated Dec 5, 2024 • 880 • 42
OpenGVLab/InternVL2_5-78B

Image-Text-to-Text • Updated 25 days ago • 6.5k • 159
MAmmoTH-VL/MAmmoTH-VL-8B

Updated Dec 9, 2024 • 383 • 15
MAmmoTH-VL/MAmmoTH-VL-Instruct-12M

Viewer • Updated 7 days ago • 37M • 5.41k • 36
OpenGVLab/PVC-InternVL2-8B

Image-Text-to-Text • Updated 26 days ago • 92 • 8
BGLab/BioTrove

Viewer • Updated 30 days ago • 163M • 600 • 7
TencentARC/NVComposer

Image-to-3D • Updated 27 days ago • 165 • 7
deepseek-ai/deepseek-vl2

Image-Text-to-Text • Updated 25 days ago • 2.4k • 130
FastVideo/FastHunyuan

Text-to-Video • Updated 4 days ago • 757 • 143
BAAI/nova-d48w1536-sdxl1024

Text-to-Image • Updated 22 days ago • 50 • 7
IamCreateAI/Ruyi-Mini-7B

Image-to-Video • Updated 18 days ago • 17.3k • 576
Infinigence/Megrez-3B-Omni

Updated 26 days ago • 716 • 122
microsoft/VidTok

Updated 1 day ago • 27
TIGER-Lab/Mantis-8B-siglip-llama3

Image-Text-to-Text • Updated Nov 15, 2024 • 11.6k • 32
OpenGVLab/HoVLE-HD

Image-Text-to-Text • Updated 18 days ago • 100 • 7
nyu-visionx/cambrian-34b

Text Generation • Updated Jun 28, 2024 • 56 • 28
nyu-visionx/cambrian-phi3-3b

Text Generation • Updated Jul 6, 2024 • 48 • 11
nyu-visionx/Cambrian-Alignment

Viewer • Updated Jul 23, 2024 • 292k • 1.4k • 32
nvidia/Cosmos-1.0-Autoregressive-13B-Video2World

Updated 2 days ago • 280 • 24
nvidia/Cosmos-1.0-Diffusion-14B-Video2World

Updated 2 days ago • 693 • 40
nvidia/Cosmos-1.0-Diffusion-14B-Text2World

Updated 2 days ago • 850 • 32
nvidia/Cosmos-1.0-Autoregressive-12B

Updated 2 days ago • 266 • 22
StephanST/WALDO30

Object Detection • Updated Oct 9, 2024 • 195
ByteDance/Sa2VA-8B

Image-Text-to-Text • Updated 2 days ago • 389 • 29
