A Closer Look into Mixture-of-Experts in Large Language Models Paper • 2406.18219 • Published Jun 26, 2024
VisionZip: Longer is Better but Not Necessary in Vision Language Models Paper • 2412.04467 • Published Dec 5, 2024
p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay Paper • 2412.04449 • Published Dec 5, 2024
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing Paper • 2412.14711 • Published Dec 19, 2024
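The last entry's title names its core technique: replacing the usual discrete TopK router with a ReLU gate, so routing stays sparse but fully differentiable. Below is a minimal PyTorch sketch of that idea, based only on the title; the class name, layer sizes, the L1 sparsity penalty, and all other details are illustrative assumptions, not the paper's reference implementation.

```python
# Minimal sketch of a ReLU-routed MoE layer (assumptions, not ReMoE's actual code).
import torch
import torch.nn as nn

class ReLURoutedMoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int, n_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor):  # x: (batch, d_model)
        # ReLU gates: many entries are exactly zero, so routing is sparse,
        # yet there is no discrete TopK selection blocking gradients.
        gates = torch.relu(self.router(x))  # (batch, n_experts)
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            g = gates[:, i : i + 1]
            active = g.squeeze(-1) > 0  # run an expert only on its active tokens
            if active.any():
                out[active] += g[active] * expert(x[active])
        # An L1 term on the gates (its loss weight is an assumption here) can be
        # added to the training objective to steer how many experts stay active.
        l1_penalty = gates.abs().mean()
        return out, l1_penalty
```

Because the gate is a plain ReLU rather than a hard TopK, sparsity here emerges from the regularizer instead of a fixed expert count; how ReMoE actually schedules or targets that sparsity is detailed in the paper itself.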