Running • 1.38k • The Ultra-Scale Playbook • The ultimate guide to training LLM on large GPU Clusters
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning • Paper • 2502.06781 • Published 13 days ago • 59
yentinglin/Mistral-Small-24B-Instruct-2501-reasoning • Text Generation • Updated 3 days ago • 884 • 43
Running on Zero • 1.34k • FLUX LoRa the Explorer • Generate images based on prompts and LoRA models
agentica-org/DeepScaleR-1.5B-Preview • Text Generation • Updated about 17 hours ago • 22.5k • 470
Running on CPU Upgrade • 9 • The Arabic RAG Leaderboard • The only leaderboard you will require for your RAG needs
Article • What is test-time compute and how to scale it? • By Kseniase and 1 other • 17 days ago • 39
Improving Transformer World Models for Data-Efficient RL • Paper • 2502.01591 • Published 20 days ago • 9
Reasoning Datasets • Collection • Distilled synthetic Reasoning datasets • 7 items • Updated 21 days ago • 55