Minitron Collection A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated about 19 hours ago • 60
Article Run the strongest open-source LLM model: Llama3 70B with just a single 4GB GPU! By lyogavin • Apr 21, 2024 • 44
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences Paper • 2404.03715 • Published Apr 4, 2024 • 60