PockEngine: Sparse and Efficient Fine-tuning in a Pocket • Paper • 2310.17752 • Published Oct 26, 2023
Optimizing Speculative Decoding for Serving Large Language Models Using Goodput • Paper • 2406.14066 • Published Jun 20