Reproducing DeepSeek R1 Zero with Qwen2.5-0.5B on two 4090 GPUs
rasdani PRO
Recent Activity
liked a dataset 7 days ago: R2E-Gym/R2EGym-TestingAgent-SFT-Trajectories
published a dataset 9 days ago: rasdani/SkyRL-v0-293-data-oracle-4k-context-100-epochs
liked a model 9 days ago: StringChaos/R2E-TestgenAgent