Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published 4 days ago • 72
Position Information Emerges in Causal Transformers Without Positional Encodings via Similarity of Nearby Embeddings Paper • 2501.00073 • Published 14 days ago • 1
Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models Paper • 2412.07171 • Published Dec 10, 2024 • 1
Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding Paper • 2501.00712 • Published 12 days ago • 5
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing Paper • 2501.00658 • Published 12 days ago • 7
Revisiting In-Context Learning with Long Context Language Models Paper • 2412.16926 • Published 22 days ago • 28
Deliberation in Latent Space via Differentiable Cache Augmentation Paper • 2412.17747 • Published 20 days ago • 29
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization Paper • 2412.17739 • Published 20 days ago • 39
Star Attention: Efficient LLM Inference over Long Sequences Paper • 2411.17116 • Published Nov 26, 2024 • 48
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference Paper • 2403.09636 • Published Mar 14, 2024 • 2
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models Paper • 2401.15947 • Published Jan 29, 2024 • 49
Enhancing Multimodal Large Language Models with Vision Detection Models: An Empirical Study Paper • 2401.17981 • Published Jan 31, 2024 • 1
What Algorithms can Transformers Learn? A Study in Length Generalization Paper • 2310.16028 • Published Oct 24, 2023 • 2
Empower Your Model with Longer and Better Context Comprehension Paper • 2307.13365 • Published Jul 25, 2023 • 1