Efficient Large Language Model Collection: Shortened LLMs from Depth Pruning; https://github.com/Nota-NetsPresso/shortened-llm • 15 items
Assessing the Answerability of Queries in Retrieval-Augmented Code Generation Paper • 2411.05547 • Published Nov 8, 2024
Shortened LLaMA: A Simple Depth Pruning for Large Language Models Paper • 2402.02834 • Published Feb 5, 2024
LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights Paper • 2404.11936 • Published Apr 18, 2024
Automatic Neural Network Pruning that Efficiently Preserves the Model Accuracy Paper • 2111.09635 • Published Nov 18, 2021
A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation Paper • 2304.00471 • Published Apr 2, 2023
Deep Model Compression Also Helps Models Capture Ambiguity Paper • 2306.07061 • Published Jun 12, 2023
On Architectural Compression of Text-to-Image Diffusion Models Paper • 2305.15798 • Published May 25, 2023
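The collection's central technique, depth pruning, removes entire Transformer decoder blocks rather than thinning individual weight matrices. As a rough illustration only (not the Shortened LLaMA implementation; the checkpoint name and layer indices below are placeholder assumptions), dropping layers from a LLaMA-style model in Hugging Face transformers can look like this:

```python
# Hypothetical sketch of depth pruning: delete whole decoder layers from a
# LLaMA-style model. The checkpoint and the set of layers to drop are
# illustrative assumptions; the papers above select layers by importance
# scores and then fine-tune (e.g., with LoRA) to recover accuracy.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder checkpoint
    torch_dtype=torch.float16,
)

layers_to_drop = {21, 22, 23, 27}  # placeholder indices, for illustration only
kept = torch.nn.ModuleList(
    layer for i, layer in enumerate(model.model.layers) if i not in layers_to_drop
)
for new_idx, layer in enumerate(kept):
    # Re-index attention layers so KV-cache bookkeeping stays consistent
    # (attribute present in recent transformers versions).
    layer.self_attn.layer_idx = new_idx
model.model.layers = kept
model.config.num_hidden_layers = len(kept)

# Shortened model, ready for recovery fine-tuning.
model.save_pretrained("llama-2-7b-depth-pruned")
```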