ML Optimization Papers - a hasanar1f Collection

updated 4 days ago

FAST: Efficient Action Tokenization for Vision-Language-Action Models
Paper • 2501.09747 • Published Jan 16 • 23

Tensor Product Attention Is All You Need
Paper • 2501.06425 • Published Jan 11 • 85

SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
Paper • 2501.06842 • Published Jan 12 • 16

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token
Paper • 2501.03895 • Published Jan 7 • 52

LTX-Video: Realtime Video Latent Diffusion
Paper • 2501.00103 • Published Dec 30, 2024 • 44

Efficiently Serving LLM Reasoning Programs with Certaindex
Paper • 2412.20993 • Published Dec 30, 2024 • 36

Token-Budget-Aware LLM Reasoning
Paper • 2412.18547 • Published Dec 24, 2024 • 46

TRecViT: A Recurrent Video Transformer
Paper • 2412.14294 • Published Dec 18, 2024 • 13

iFormer: Integrating ConvNet and Transformer for Mobile Application
Paper • 2501.15369 • Published Jan 26 • 12

Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
Paper • 2501.12370 • Published Jan 21 • 11

Return of the Encoder: Maximizing Parameter Efficiency for SLMs
Paper • 2501.16273 • Published Jan 27 • 5

Cost-Optimal Grouped-Query Attention for Long-Context LLMs
Paper • 2503.09579 • Published 12 days ago • 5

Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
Paper • 2503.00540 • Published 23 days ago • 1

MaxInfo: A Training-Free Key-Frame Selection Method Using Maximum Volume for Enhanced Video Understanding
Paper • 2502.03183 • Published Feb 5 • 1

OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models
Paper • 2503.08686 • Published 13 days ago • 19

QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension
Paper • 2503.08689 • Published 13 days ago • 4

PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity
Paper • 2503.07677 • Published 14 days ago • 80

LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization
Paper • 2503.08619 • Published 13 days ago • 20