Zhang

Zhenru

Zhenru's activity

authored a paper 7 days ago

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published 8 days ago • 87

upvoted a paper 7 days ago

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published 8 days ago • 87

upvoted a paper about 2 months ago

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Paper • 2501.11873 • Published Jan 21 • 63

updated 2 models about 2 months ago

Qwen/Qwen2.5-Math-7B-PRM800K

Text Classification • Updated Jan 17 • 2.71k • 13

Qwen/Qwen2.5-Math-PRM-72B

Text Classification • Updated Jan 17 • 1.02k • 71

New activity in Qwen/Qwen2.5-Math-PRM-7B about 2 months ago

Fix backslashes in prompt example

#5 opened about 2 months ago by

sorokin

"<extra_0>" is not a special token? I got 5 token_ids, is that right?

#4 opened about 2 months ago by

ShelterW

commented a paper about 2 months ago

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13 • 92

updated a model about 2 months ago

Qwen/Qwen2.5-Math-PRM-7B

Text Classification • Updated Jan 17 • 40.6k • 61

New activity in Qwen/Qwen2.5-Math-PRM-7B about 2 months ago

Could you clarify whether the PRM800K deduplication was performed using the original 5000-test set from MATH or the MATH500 dataset?

#2 opened about 2 months ago by

masterLan

question about the step separator "\n\n"

#3 opened about 2 months ago by

pixas

authored a paper about 2 months ago

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13 • 92

commented a paper about 2 months ago

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13 • 92

upvoted a paper about 2 months ago

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13 • 92

updated a collection about 2 months ago

Qwen2.5-Math

Collection

Math-specific model series based on Qwen2.5 • 11 items • Updated Jan 14 • 78

upvoted a paper 2 months ago

CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published Jan 2 • 50

upvoted 2 papers 3 months ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 352

ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published Dec 9, 2024 • 80