Running • 1.49k • The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published 9 days ago • 134
Step-Audio Collection Step-Audio model family, including Audio-Tokenizer, Audio-Chat and TTS • 3 items • Updated 7 days ago • 28
Lucy-in-the-Sky/Mistral-Small-24B-Instruct-2501-reasoning-Q6_K-GGUF Text Generation • Updated 7 days ago • 69 • 1
yentinglin/Mistral-Small-24B-Instruct-2501-reasoning Text Generation • Updated 4 days ago • 951 • 43
bartowski/SicariusSicariiStuff_Phi-lthy4-GGUF Text Generation • Updated 12 days ago • 2.07k • 6