SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 11 days ago • 164
Jan 17 Releases Collection Models and datasets of the second week of Jan 2025. • 23 items • Updated 29 days ago • 11
GGUF LoRA adapters Collection Adapters extracted from fine-tuned models, using mergekit-extract-lora • 16 items • Updated 23 days ago • 3
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated 10 days ago • 234
Article Decoding Strategies in Large Language Models By mlabonne • Oct 29, 2024 • 41
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 227
To Code, or Not To Code? Exploring Impact of Code in Pre-training Paper • 2408.10914 • Published Aug 20, 2024 • 42