LightTransfer: Your Long-Context LLM is Secretly a Hybrid Model with Effortless Adaptation Paper • 2410.13846 • Published Oct 17, 2024
When Attention Sink Emerges in Language Models: An Empirical View Paper • 2410.10781 • Published Oct 14, 2024
What and How does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization Paper • 2305.19420 • Published May 30, 2023
MS-DETR: Natural Language Video Localization with Sampling Moment-Moment Interaction Paper • 2305.18969 • Published May 30, 2023