Guanzhou Ke

guanzhouk

Guanzhou-Ke

AI & ML interests

Multi-modal learning

Recent Activity

upvoted a paper 1 day ago

MoBA: Mixture of Block Attention for Long-Context LLMs

upvoted a paper 6 days ago

Qwen2.5-VL Technical Report

liked a dataset 7 days ago

lmms-lab/LLaVA-Video-178K

Organizations

None yet

guanzhouk's activity

upvoted a paper 1 day ago

MoBA: Mixture of Block Attention for Long-Context LLMs

Paper • 2502.13189 • Published 8 days ago • 12

upvoted a paper 6 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 7 days ago • 145

upvoted a paper 7 days ago

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published 10 days ago • 134

upvoted a paper 9 days ago

ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models

Paper • 2502.09696 • Published 13 days ago • 38

upvoted 2 papers 15 days ago

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published Jan 26 • 62

Humanity's Last Exam

Paper • 2501.14249 • Published Jan 24 • 63

upvoted a paper 23 days ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published 26 days ago • 106

upvoted 2 papers about 1 month ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 332

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22 • 98

upvoted 5 papers about 2 months ago

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published Jan 8 • 91

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 257

REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 90

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

Paper • 2412.21187 • Published Dec 30, 2024 • 39

On the Compositional Generalization of Multimodal LLMs for Medical Imaging

Paper • 2412.20070 • Published Dec 28, 2024 • 46

upvoted 6 papers 2 months ago

Parallelized Autoregressive Visual Generation

Paper • 2412.15119 • Published Dec 19, 2024 • 51

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 346

No More Adam: Learning Rate Scaling at Initialization is All You Need

Paper • 2412.11768 • Published Dec 16, 2024 • 41

Are Your LLMs Capable of Stable Reasoning?

Paper • 2412.13147 • Published Dec 17, 2024 • 92

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 92

Analyzing The Language of Visual Tokens

Paper • 2411.05001 • Published Nov 7, 2024 • 24