Hugging Face
Andrea Pedrotti
andreapdr
1 follower · 5 following
https://andreapdr.github.io
AI & ML interests
Vision and Language Models
Recent Activity
Liked a dataset 6 days ago: tatsu-lab/alpaca
Liked a dataset 6 days ago: GAIR/lima
Reacted to andito's post with 🔥 10 days ago:
🧠👁️ Can AI visualize solutions?

Humans often solve visual problems by sketching ideas in our minds. What if Vision-Language Models (VLMs) could do something similar, not by generating full images, but by using internal “mental sketches”?

That’s the idea behind Mirage, a new framework that empowers VLMs to reason using latent visual tokens. Instead of just thinking in words, Mirage mixes in abstract visual representations that help the model solve complex tasks. These aren't photorealistic images. They're compact, internal representations optimized purely to support reasoning.

🔧 Mirage is trained in two phases:
1) Grounding: It learns to produce latent tokens anchored in real images.
2) Refinement: The model drops the images and learns to generate visual tokens on its own.

📈 And yes, it works! On challenging benchmarks like Visual Spatial Planning, Jigsaw puzzles, and Spatial Attention Tasks, Mirage clearly outperforms GPT-4o and other strong baselines. Smart sketches > empty words.

By mimicking the way humans visualize solutions, Mirage gives AI a new kind of imagination, one that’s faster, more efficient, and more human-like.

Kudos to the teams at UMass Amherst and MIT behind this exciting work.

Check the paper: https://huggingface.co/papers/2506.17218
Organizations
andreapdr's models (8)
andreapdr/LID-Llama-3.1-8b-M4ABS-ling · Updated 10 days ago • 7 • 1
andreapdr/LID-Llama-3.1-8b-XSUM-ling · Updated 10 days ago • 12 • 2
andreapdr/LID-gemma-2-2b-M4ABS · Updated 10 days ago • 4 • 1
andreapdr/LID-gemma-2-2b-M4ABS-ling · Updated 10 days ago • 6 • 1
andreapdr/LID-gemma-2-2b-XSUM · Updated 10 days ago • 20 • 1
andreapdr/LID-gemma-2-2b-XSUM-ling · Updated 10 days ago • 7 • 1
andreapdr/LID-Llama-3.1-8b-XSUM · Updated 10 days ago • 18 • 2
andreapdr/LID-Llama-3.1-8b-M4ABS · Updated 10 days ago • 9 • 1