Article Train 400x faster Static Embedding Models with Sentence Transformers • Jan 15 • 153
Running 1.89k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLMs on large GPU clusters
Running 47 Visualize Dataset (v2.0+ latest dataset format) 💻 Explore robot datasets by entering a dataset ID
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published 15 days ago • 138
SYNTHETIC-1 Collection A collection of tasks & verifiers for reasoning datasets • 9 items • Updated 11 days ago • 49
Hibiki fr-en Collection Hibiki is a model for streaming speech translation that can run on device. See https://github.com/kyutai-labs/hibiki. • 5 items • Updated 25 days ago • 50
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 27 days ago • 196
Article Replicating DeepSeek R1 for Information Extraction By Ihor • about 1 month ago • 36