ytaewon

hamzzi

AI & ML interests

None yet

Recent Activity

upvoted a paper 19 days ago

Group Sequence Policy Optimization

commented on a paper 19 days ago

Group Sequence Policy Optimization

upvoted a paper 28 days ago

How Far Are We from Believable AI Agents? A Framework for Evaluating the Believability of Human Behavior Simulation

upvoted a paper 19 days ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published 25 days ago • 289

upvoted 2 papers 28 days ago

How Far Are We from Believable AI Agents? A Framework for Evaluating the Believability of Human Behavior Simulation

Paper • 2312.17115 • Published Dec 28, 2023 • 2

Towards Dynamic Theory of Mind: Evaluating LLM Adaptation to Temporal Evolution of Human States

Paper • 2505.17663 • Published May 23 • 15

upvoted a paper about 1 month ago

LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling

Paper • 2505.19187 • Published May 25 • 13

upvoted 3 papers 3 months ago

Learning from Peers in Reasoning Models

Paper • 2505.07787 • Published May 12 • 46

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6 • 182

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

Paper • 2505.04588 • Published May 7 • 66

upvoted 3 papers 4 months ago

GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning

Paper • 2504.00891 • Published Apr 1 • 14

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Paper • 2504.01990 • Published Mar 31 • 301

Inference-Time Scaling for Generalist Reward Modeling

Paper • 2504.02495 • Published Apr 3 • 57

upvoted 10 papers 5 months ago

JudgeLRM: Large Reasoning Models as a Judge

Paper • 2504.00050 • Published Mar 31 • 62

Improved Visual-Spatial Reasoning via R1-Zero-Like Training

Paper • 2504.00883 • Published Apr 1 • 66

A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

Paper • 2503.21614 • Published Mar 27 • 42

OThink-MR1: Stimulating multimodal generalized reasoning capabilities via dynamic reinforcement learning

Paper • 2503.16081 • Published Mar 20 • 28

Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation

Paper • 2503.22675 • Published Mar 28 • 37

Effectively Controlling Reasoning Models through Thinking Intervention

Paper • 2503.24370 • Published Mar 31 • 20

Efficient Inference for Large Reasoning Models: A Survey

Paper • 2503.23077 • Published Mar 29 • 47

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published Mar 31 • 63

AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation

Paper • 2503.19693 • Published Mar 25 • 77

Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback

Paper • 2503.22230 • Published Mar 28 • 46