Draft Models Collection Draft models for Llama, Qwen, QwQ, Mistral ... • 5 items • Updated 1 day ago
Open-R1 Reproduce Collection Reproduce DeepSeek distilled models based on open-r1. • 3 items • Updated 5 days ago
HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading Paper • 2502.12574 • Published 25 days ago • 11