Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published 7 days ago • 103
No More Adam: Learning Rate Scaling at Initialization is All You Need Paper • 2412.11768 • Published 9 days ago • 40
Article 🐺🐦‍⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs By wolfram • 20 days ago • 70
Article Releasing the largest multilingual open pretraining dataset By Pclanglais • Nov 13 • 98
Article 🧑🏼‍🌾 Let's grow some Domain Specific Datasets together By burtenshaw • Apr 29 • 29
Article RAG Empowerment: Cohere C4AI Command-R and Transformers Unveiled By Andyrasika • Apr 7 • 10