TrimLLM: Progressive Layer Dropping for Domain-Specific LLMs • Paper • 2412.11242 • Published Dec 15, 2024 • 1
PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning • Paper • 2502.12054 • Published 27 days ago • 6
Optimizing Speculative Decoding for Serving Large Language Models Using Goodput • Paper • 2406.14066 • Published Jun 20, 2024 • 2
PockEngine: Sparse and Efficient Fine-tuning in a Pocket • Paper • 2310.17752 • Published Oct 26, 2023 • 14
Efficiently Serving LLM Reasoning Programs with Certaindex • Paper • 2412.20993 • Published Dec 30, 2024 • 36