arxiv:2502.03275
Yuandong Tian
tydsh
AI & ML interests
Reinforcement Learning, Optimization, Representation Learning
Recent Activity
authored a paper 3 days ago
Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning
authored a paper 12 days ago
Towards General-Purpose Model-Free Reinforcement Learning
authored a paper 16 days ago
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback
Organizations
None yet
Papers
21
Models
None public yet
Datasets
None public yet