Article Luth: Efficient French Specialization for Small Language Models By MaxLSB and 1 other • 6 days ago • 9
🧠SmolLM3 Collection Smol, multilingual, long-context reasoner • 12 items • Updated 12 days ago • 69
Seq vs Seq: An Open Suite of Paired Encoders and Decoders Paper • 2507.11412 • Published Jul 15 • 25
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state of the art open post-training recipes. • 33 items • Updated Apr 30 • 88
💧 LFM2 Collection LFM2 is a new generation of hybrid models, designed for on-device deployment. • 15 items • Updated 16 minutes ago • 87
Falcon-H1 Collection Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained & instruction-tuned). • 38 items • Updated 17 days ago • 51
Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • Jul 8 • 626
Article Mixture of Experts Explained By osanseviero and 5 others • Dec 11, 2023 • 827
Article Vision Language Models (Better, Faster, Stronger) By merve and 4 others • May 12 • 505
TinyStories: How Small Can Language Models Be and Still Speak Coherent English? Paper • 2305.07759 • Published May 12, 2023 • 36
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7 • 197
Article SmolLM - blazingly fast and remarkably powerful By loubnabnl and 2 others • Jul 16, 2024 • 410
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 153