Foundation Models for Vision - an itseffi Collection

updated Feb 20, 2024

Grounding DINO Demo 💻

Cutting edge open-vocabulary object detection app • Running • 86
Owlv2 👀

State-of-the-art Zero-shot Object Detection • Running • 89
openai/clip-vit-base-patch32

Zero-Shot Image Classification • Updated Feb 29, 2024 • 18.1M • 739
openai/clip-vit-large-patch14

Zero-Shot Image Classification • 0.4B • Updated Sep 15, 2023 • 9.74M • 1.83k
google/pix2struct-large

Image-to-Text • 1B • Updated Sep 6, 2023 • 4.96k • 34
google/pix2struct-ai2d-base

Visual Question Answering • 0.3B • Updated Dec 24, 2023 • 842 • 43
HuggingFaceM4/idefics-80b-instruct

Text Generation • 80B • Updated Oct 12, 2023 • 1.57k • 189
BLIP2 with transformers 🌖

BLIP2 (cutting edge image captioning) in 🤗transformers • Runtime error • 41