An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models • Paper • 2403.06764 • Published Mar 11
Efficient Inference of Vision Instruction-Following Models with Elastic Cache • Paper • 2407.18121 • Published Jul 25
Don't Look Twice: Faster Video Transformers with Run-Length Tokenization • Paper • 2411.05222 • Published Nov 7