Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated 23 days ago • 593
Proactive Detection of Voice Cloning with Localized Watermarking Paper • 2401.17264 • Published Jan 30 • 16
Masked Audio Generation using a Single Non-Autoregressive Transformer Paper • 2401.04577 • Published Jan 9 • 41
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 257
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit Paper • 2312.09911 • Published Dec 15, 2023 • 52
Order Matters in the Presence of Dataset Imbalance for Multilingual Learning Paper • 2312.06134 • Published Dec 11, 2023 • 2
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration Paper • 2311.04257 • Published Nov 7, 2023 • 20
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis Paper • 2312.03491 • Published Dec 6, 2023 • 34
Multimodal Data and Resource Efficient Device-Directed Speech Detection with Large Foundation Models Paper • 2312.03632 • Published Dec 6, 2023 • 4
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 138
Merlin:Empowering Multimodal LLMs with Foresight Minds Paper • 2312.00589 • Published Nov 30, 2023 • 24
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis Paper • 2311.12454 • Published Nov 21, 2023 • 29
UniAudio: An Audio Foundation Model Toward Universal Audio Generation Paper • 2310.00704 • Published Oct 1, 2023 • 19
Music ControlNet: Multiple Time-varying Controls for Music Generation Paper • 2311.07069 • Published Nov 13, 2023 • 43
MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models Paper • 2310.11954 • Published Oct 18, 2023 • 24
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models Paper • 2309.11674 • Published Sep 20, 2023 • 31
Neurons in Large Language Models: Dead, N-gram, Positional Paper • 2309.04827 • Published Sep 9, 2023 • 16
YaRN: Efficient Context Window Extension of Large Language Models Paper • 2309.00071 • Published Aug 31, 2023 • 65
Large Content And Behavior Models To Understand, Simulate, And Optimize Content And Behavior Paper • 2309.00359 • Published Sep 1, 2023 • 20
Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers Paper • 2308.13494 • Published Aug 25, 2023 • 9
TokenFlow: Consistent Diffusion Features for Consistent Video Editing Paper • 2307.10373 • Published Jul 19, 2023 • 56
Meta-Transformer: A Unified Framework for Multimodal Learning Paper • 2307.10802 • Published Jul 20, 2023 • 43
Llama 2: Open Foundation and Fine-Tuned Chat Models Paper • 2307.09288 • Published Jul 18, 2023 • 241
BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs Paper • 2307.08581 • Published Jul 17, 2023 • 27
HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models Paper • 2307.06949 • Published Jul 13, 2023 • 50
DNAGPT: A Generalized Pretrained Tool for Multiple DNA Sequence Analysis Tasks Paper • 2307.05628 • Published Jul 11, 2023 • 9
VampNet: Music Generation via Masked Acoustic Token Modeling Paper • 2307.04686 • Published Jul 10, 2023 • 20
Lost in the Middle: How Language Models Use Long Contexts Paper • 2307.03172 • Published Jul 6, 2023 • 35