Hugging Face
zhu xuekai
1 follower · 1 following
AI & ML interests
None yet
Recent Activity
upvoted an article about 19 hours ago
DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge
upvoted a paper 10 days ago
MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding
upvoted an article 25 days ago
Putting RL back in RLHF
Organizations
Papers (2)
arxiv:2412.14689
arxiv:2305.13888
models
None public yet
datasets (1)
xuekai/pad_train • Viewer • Updated Mar 21, 2024 • 184k • 12