WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation Paper • 2312.14187 • Published Dec 20, 2023 • 49
SparQ Attention: Bandwidth-Efficient LLM Inference Paper • 2312.04985 • Published Dec 8, 2023 • 38
OneLLM: One Framework to Align All Modalities with Language Paper • 2312.03700 • Published Dec 6, 2023 • 20
Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians Paper • 2312.03029 • Published Dec 5, 2023 • 23
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 138
CodeFusion: A Pre-trained Diffusion Model for Code Generation Paper • 2310.17680 • Published Oct 26, 2023 • 70
QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models Paper • 2310.16795 • Published Oct 25, 2023 • 26
BitNet: Scaling 1-bit Transformers for Large Language Models Paper • 2310.11453 • Published Oct 17, 2023 • 96