gary109's Collections • Optimization
Large Language Models for Compiler Optimization
Paper • 2309.07062 • Published • 23 upvotes
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Paper • 2310.17157 • Published • 12 upvotes
FP8-LM: Training FP8 Large Language Models
Paper • 2310.18313 • Published • 33 upvotes
Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
Paper • 2310.19102 • Published • 10 upvotes
AMSP: Super-Scaling LLM Training via Advanced Model States Partitioning
Paper • 2311.00257 • Published • 8 upvotes
FlashDecoding++: Faster Large Language Model Inference on GPUs
Paper • 2311.01282 • Published • 35 upvotes
Tied-LoRA: Enhancing parameter efficiency of LoRA with weight tying
Paper • 2311.09578 • Published • 14 upvotes
Open-Sourcing Highly Capable Foundation Models: An evaluation of risks, benefits, and alternative methods for pursuing open-source objectives
Paper • 2311.09227 • Published • 6 upvotes
Adaptive Shells for Efficient Neural Radiance Field Rendering
Paper • 2311.10091 • Published • 18 upvotes
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 603 upvotes
EvTexture: Event-driven Texture Enhancement for Video Super-Resolution
Paper • 2406.13457 • Published • 16 upvotes