DistAya's Collections
Knowledge Distillation
shayekh/aya8b-distillkit-hidden
shayekh/aya8b-distillkit-logits
AhmadMustafa/distAyaQwen
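The two aya-8b DistillKit checkpoints above differ in their training objective: one distills the teacher's output logits, the other its intermediate hidden states. As a rough sketch of that distinction (not DistillKit's actual code; the shapes, names, and projection layer below are illustrative assumptions), the two losses typically look like this:

```python
import torch
import torch.nn.functional as F

def logit_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Soft-target KD: KL divergence between temperature-softened teacher and
    # student token distributions (what a "logits" distillation run optimizes).
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

def hidden_state_distillation_loss(student_hidden, teacher_hidden, projection):
    # Hidden-state KD: match intermediate representations, projecting the
    # student's hidden size up to the teacher's when they differ.
    return F.mse_loss(projection(student_hidden), teacher_hidden)

# Hypothetical shapes for illustration only.
batch, seq_len, vocab = 2, 16, 32000
student_logits = torch.randn(batch, seq_len, vocab)
teacher_logits = torch.randn(batch, seq_len, vocab)
projection = torch.nn.Linear(2048, 4096)  # assumed student/teacher hidden sizes
student_hidden = torch.randn(batch, seq_len, 2048)
teacher_hidden = torch.randn(batch, seq_len, 4096)

loss = (logit_distillation_loss(student_logits, teacher_logits)
        + hidden_state_distillation_loss(student_hidden, teacher_hidden, projection))
```

In practice either term is usually added to the standard cross-entropy loss on the training labels, with a weighting coefficient between them.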
Less is More: Task-aware Layer-wise Distillation for Language Model Compression (arXiv:2210.01351)
A Survey on Knowledge Distillation of Large Language Models (arXiv:2402.13116)
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling (arXiv:2311.00430)
On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes (arXiv:2306.13649)
Compact Language Models via Pruning and Knowledge Distillation (arXiv:2407.14679)
LLM Pruning and Distillation in Practice: The Minitron Approach (arXiv:2408.11796)
DistiLLM: Towards Streamlined Distillation for Large Language Models (arXiv:2402.03898)
Relational Knowledge Distillation (arXiv:1904.05068)
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes (arXiv:2305.02301)