
Haoqin Tu

PahaII

https://www.haqtu.me/

ImKeTT

AI & ML interests

generation, latent variable models

Recent Activity

updated a dataset 5 days ago

PahaII/spatialthinker_vqa_10k_filtered

Preview • Updated 5 days ago • 8

published a dataset 5 days ago

PahaII/spatialthinker_vqa_10k_filtered

Preview • Updated 5 days ago • 8

updated a model 8 days ago

PahaII/maplillary_results

Updated 8 days ago

liked a dataset 10 days ago

UCSC-VLAA/GPT-Image-Edit-1.5M

Viewer • Updated 17 days ago • 2.78M • 39.5k • 50

upvoted a paper 19 days ago

GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset

Paper • 2507.21033 • Published 19 days ago • 20

updated a dataset 19 days ago

PahaII/seedbench

Updated 19 days ago • 16

published a dataset 19 days ago

PahaII/seedbench

Updated 19 days ago • 16

liked a dataset 26 days ago

stanford-crfm/CoReBench_v1

Updated 27 days ago • 82 • 1

published a model about 1 month ago

PahaII/maplillary_results

Updated 8 days ago

liked a model 2 months ago

UCSC-VLAA/VLAA-Thinker-Qwen2.5VL-3B

Image-Text-to-Text • 4B • Updated 17 days ago • 7.03k • 5

upvoted an article 3 months ago

Vision Language Models Explained

Article • Apr 11, 2024 • 436

updated a dataset 3 months ago

UCSC-VLAA/PARADE_audio

Viewer • Updated May 11 • 938 • 6

upvoted 2 papers 3 months ago

X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains

Paper • 2505.03981 • Published May 6 • 15

OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning

Paper • 2505.04601 • Published May 7 • 27

commented on a paper 3 months ago

OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning

Paper • 2505.04601 • Published May 7 • 27

upvoted 2 collections 3 months ago

VLAA-Thinker

Collection

6 items • Updated 1 day ago • 5

OpenVision

Collection

27 items • Updated 1 day ago • 29

liked a model 3 months ago

Skywork/Skywork-VL-Reward-7B

Image-Text-to-Text • 8B • Updated Jun 10 • 344 • 44

authored 2 papers 4 months ago

Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability

Paper • 2412.18551 • Published Dec 24, 2024

Language Models Can See Better: Visual Contrastive Decoding For LLM Multimodal Reasoning

Paper • 2502.11751 • Published Feb 17
