
kaipeng

kpzhang996

AI & ML interests

None yet

Recent Activity

liked a model 24 days ago

stdstu123/Yume-I2V-540P

upvoted a paper 24 days ago

Yume: An Interactive World Generation Model

commented on a paper 24 days ago

Yume: An Interactive World Generation Model


Organizations

commented on a paper 24 days ago

Yume: An Interactive World Generation Model

Paper • 2507.17744 • Published 25 days ago • 83

commented on a paper about 1 month ago

Neural-Driven Image Editing

Paper • 2507.05397 • Published Jul 7 • 26

commented on a paper about 2 months ago

Sekai: A Video Dataset towards World Exploration

Paper • 2506.15675 • Published Jun 18 • 64

commented on a paper 2 months ago

A High-Quality Dataset and Reliable Evaluation for Interleaved Image-Text Generation

Paper • 2506.09427 • Published Jun 11 • 9

commented on a paper 3 months ago

SridBench: Benchmark of Scientific Research Illustration Drawing of Image Generation Model

Paper • 2505.22126 • Published May 28 • 4

commented on a paper 4 months ago

MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models

Paper • 2504.05782 • Published Apr 8 • 4

commented on 6 papers 5 months ago

CLS-RL: Image Classification with Rule-Based Reinforcement Learning

Paper • 2503.16188 • Published Mar 20 • 11

Improving Autoregressive Image Generation through Coarse-to-Fine Token Prediction

Paper • 2503.16194 • Published Mar 20 • 8

MPBench: A Comprehensive Multimodal Reasoning Benchmark for Process Errors Identification

Paper • 2503.12505 • Published Mar 16 • 11

PEBench: A Fictitious Dataset to Benchmark Machine Unlearning for Multimodal Large Language Models

Paper • 2503.12545 • Published Mar 16 • 5

ProJudge: A Multi-Modal Multi-Discipline Benchmark and Instruction-Tuning Dataset for MLLM-based Process Judges

Paper • 2503.06553 • Published Mar 9 • 7

ARMOR v0.1: Empowering Autoregressive Multimodal Understanding Model with Interleaved Multimodal Generation via Asymmetric Synergy

Paper • 2503.06542 • Published Mar 9 • 7

commented on a paper 8 months ago

ZipAR: Accelerating Autoregressive Image Generation through Spatial Locality

Paper • 2412.04062 • Published Dec 5, 2024 • 9

commented on a paper 9 months ago

GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation

Paper • 2411.18499 • Published Nov 27, 2024 • 18

commented on a paper 10 months ago

ZipVL: Efficient Large Vision-Language Models with Dynamic Token Sparsification and KV Cache Compression

Paper • 2410.08584 • Published Oct 11, 2024 • 12

commented on a paper 12 months ago

T3M: Text Guided 3D Human Motion Synthesis from Speech

Paper • 2408.12885 • Published Aug 23, 2024 • 13

