wuziheng

AI & ML interests

CV/SSL/MultiMedia

Recent Activity

updated a model 2 days ago

bytedance-research/Valley-Eagle-7B

reacted to tianchez's post with 🚀 about 1 month ago

Introducing VLM-R1! GRPO helped DeepSeek R1 learn to reason. Can it also help VLMs perform better on general computer vision tasks? The answer is YES, and it generalizes better than SFT. We trained Qwen 2.5 VL 3B on RefCOCO (a visual grounding task) and evaluated it on RefCOCO Val and RefGTA (an OOD task). https://github.com/om-ai-lab/VLM-R1

updated a model 2 months ago

bytedance-research/Valley-Eagle-7B

wuziheng's activity

liked a model 3 months ago

bytedance-research/Valley-Eagle-7B

Updated 2 days ago • 398 • 35

liked a model 8 months ago

KangarooGroup/kangaroo

Video-Text-to-Text • Updated Nov 13, 2024 • 368 • 12

liked a model about 1 year ago

liuhaotian/llava-v1.6-34b

Image-Text-to-Text • Updated May 9, 2024 • 9.18k • 349

liked a Space over 1 year ago

PixArt LCM

Generate images from text prompts

liked a model almost 2 years ago

alibaba-pai/pai-bloom-1b1-text2prompt-sd

Text Generation • Updated Mar 6, 2024 • 335 • 35

liked a Space about 2 years ago

Stable Diffusion 2-1

Generate images from text descriptions