Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published 7 days ago • 103
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19 • 47
FluidML: Fast and Memory Efficient Inference Optimization Paper • 2411.09242 • Published Nov 14 • 1
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published Nov 22 • 55
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated 2 days ago • 195
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated 19 days ago • 548
DataGemma Release Collection A series of pioneering open models that help ground LLMs in real-world data through Data Commons. • 2 items • Updated 12 days ago • 81
Trained Models Collection They may be small, but they're training like giants! • 8 items • Updated 22 days ago • 16
Minerva LLMs Collection The first family of LLMs pretrained from scratch on Italian. • 6 items • Updated 18 days ago • 31