SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 214
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published Jan 14 • 59