SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training Paper • 2501.06842 • Published Jan 2025 • 15 (a sketch of the momentum-reset idea follows this list)
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN Paper • 2412.13795 • Published Dec 18, 2024 • 19 (a sketch of the Pre-/Post-LN mix follows this list)
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts Paper • 2407.03203 • Published Jul 3, 2024 • 12
RLHF Workflow: From Reward Modeling to Online RLHF Paper • 2405.07863 • Published May 13, 2024 • 67
Accelerated Convergence of Stochastic Heavy Ball Method under Anisotropic Gradient Noise Paper • 2312.14567 • Published Dec 22, 2023 • 1
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models Paper • 2306.12420 • Published Jun 21, 2023 • 2
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment Paper • 2304.06767 • Published Apr 13, 2023 • 2
AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets Paper • 2401.01916 • Published Jan 3, 2024 • 1
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning Paper • 2403.17919 • Published Mar 26, 2024 • 16
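The SPAM entry above states its core mechanism in the title: an Adam variant that resets momentum when a gradient spike is detected. Below is a minimal NumPy sketch of that general idea only, not SPAM's actual algorithm; the per-coordinate spike test against the running second moment and the full moment reset are illustrative assumptions.

```python
import numpy as np

def adam_step_with_spike_reset(param, grad, m, v, t, lr=1e-3,
                               beta1=0.9, beta2=0.999, eps=1e-8,
                               spike_factor=10.0):
    """One Adam step with an assumed spike-reset rule (illustration only,
    not the algorithm from arXiv 2501.06842)."""
    # Assumed spike test: a squared gradient far above its running average v.
    if (grad ** 2 > spike_factor * (v + eps)).any():
        # Reset both moment estimates so stale momentum cannot amplify the
        # spike; resetting t also restarts the bias correction.
        m, v, t = np.zeros_like(m), np.zeros_like(v), 0
    t += 1
    m = beta1 * m + (1 - beta1) * grad          # first-moment EMA
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment EMA
    m_hat = m / (1 - beta1 ** t)                # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v, t
```

In a real optimizer this would run once per parameter tensor each step, with m, v, and t kept as optimizer state.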
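The Mix-LN entry likewise names its design in the title: combining Post-LN and Pre-LN within one network. The PyTorch sketch below shows one plausible wiring, with Post-LN in the earliest layers and Pre-LN in the deeper ones; the cutoff fraction post_ln_frac and the block internals are assumptions, not the paper's exact recipe.

```python
import torch.nn as nn

class MixLNBlock(nn.Module):
    """Transformer block that is Post-LN in early layers, Pre-LN in deep ones
    (an illustrative reading of the Mix-LN title, arXiv 2412.13795)."""
    def __init__(self, d_model, n_heads, layer_idx, n_layers, post_ln_frac=0.25):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.ln1, self.ln2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)
        # Assumed split: the first post_ln_frac of the stack uses Post-LN.
        self.use_post_ln = layer_idx < int(post_ln_frac * n_layers)

    def forward(self, x):
        if self.use_post_ln:
            # Post-LN: normalize after each residual addition.
            x = self.ln1(x + self.attn(x, x, x, need_weights=False)[0])
            x = self.ln2(x + self.ffn(x))
        else:
            # Pre-LN: normalize before each sublayer.
            h = self.ln1(x)
            x = x + self.attn(h, h, h, need_weights=False)[0]
            x = x + self.ffn(self.ln2(x))
        return x
```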