Article: Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval • Mar 22, 2024 • 77
Paper: Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention • 2502.11089 • Published 25 days ago • 142
Article: Llama 3.1 - 405B, 70B & 8B with multilinguality and long context • Jul 23, 2024 • 230
Paper: Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention • 2404.07143 • Published Apr 10, 2024 • 107
Collection: Zephyr 7B Gemma. Models, dataset, and demo for Zephyr 7B Gemma; for code to train the models, see https://github.com/huggingface/alignment-handbook • 5 items • Updated Apr 12, 2024 • 15
Collection: Awesome SFT datasets. A curated list of interesting datasets for fine-tuning language models. • 43 items • Updated Apr 12, 2024 • 129