Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving Paper • 2407.00079 • Published Jun 24, 2024 • 5
Efficiently Programming Large Language Models using SGLang Paper • 2312.07104 • Published Dec 12, 2023 • 7