MMTEB: Massive Multilingual Text Embedding Benchmark Paper • 2502.13595 • Published 27 days ago • 32
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 205
Atlas-Chat: Adapting Large Language Models for Low-Resource Moroccan Arabic Dialect Paper • 2409.17912 • Published Sep 26, 2024 • 29
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated 25 days ago • 248
Dataset comparison models Collection 1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated Jun 12, 2024 • 37
Falcon Mamba: The First Competitive Attention-free 7B Language Model Paper • 2410.05355 • Published Oct 7, 2024 • 35
Searching for Better ViT Baselines Collection Exploring ViT hparams and model shapes for the GPU poor (between tiny and base). • 28 items • Updated Feb 14 • 17