Qwen2.5-Math Collection Math-specific model series based on Qwen2.5 • 11 items • Updated Jan 14 • 80
OLMo 2 Preview Post-trained Models Collection These models' tokenizers did not use HF's fast tokenizer, resulting in variations in how pre-tokenization was applied. Resolved in the latest versions. • 6 items • Updated 13 days ago • 4
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 214
NeMo Curator - Classifier Models Collection Classifier models that can be used in NeMo Curator for labelling/filtering datasets. • 11 items • Updated about 6 hours ago • 16
OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain Paper • 2412.13018 • Published Dec 17, 2024 • 41
🔱 Sailor2 Language Models Collection Sailing in South-East Asia with Inclusive Multilingual LLMs • 34 items • Updated about 1 month ago • 26
DCLM Pools Collection Raw pools for use in DCLM competition • 5 items • Updated Jul 17, 2024 • 1
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing Paper • 2406.08464 • Published Jun 12, 2024 • 67
MagpieLM Collection Aligning LMs with Fully Open Recipe + Synthetic Data Generated from Open-Source LMs. • 9 items • Updated Jan 13 • 16
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch Paper • 2410.18693 • Published Oct 24, 2024 • 42
ScaleQuest Collection We introduce ScaleQuest, a scalable and novel data synthesis method. Project Page: https://scalequest.github.io/ • 11 items • Updated about 11 hours ago • 6
C4AI Aya Expanse Collection Aya Expanse is an open-weight research release of a model with highly advanced multilingual capabilities. • 4 items • Updated 24 days ago • 38