dumbequation/Qwen2.5-7B-GRPO-1M-Context-Medical-Reasoning-f16 Text Generation • Updated 9 days ago • 24 • 1
dumbequation/Qwen2.5-7B-GRPO-1M-Context-Medical-Reasoning-f16-v2 Text Generation • Updated 9 days ago • 27 • 1
Medical LLMs Collection My experiments to push AI in medicine: not to replace doctors, but to empower them • 4 items • Updated about 5 hours ago
Reasoning Work Collection Models I've trained to think like DeepSeek R1 using online learning with Group Relative Policy Optimization (GRPO), the method introduced in DeepSeekMath • 6 items • Updated about 5 hours ago