R1 distill to Mistral Small?
#99
by nfunctor
Thanks a lot for your great work! Your distill models are pretty nice, and I wondered if you would consider making a distill of mistralai/Mistral-Small-24B-Instruct-2501 (also available as a base model). With its Apache license and strong performance for a 24B model, it's a very attractive target, and such a model could handle long-context generation on a single 24 GB GPU when quantised. Thanks!
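For a rough sense of feasibility: 24B parameters in 4-bit NF4 is about 12-13 GB of weights, which leaves headroom for activations and a long KV cache on a 24 GB card. A minimal loading sketch using `transformers` with `bitsandbytes` quantisation; note the repo id is hypothetical, since no such distill exists yet:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Hypothetical repo id: no R1 distill of Mistral Small exists yet.
model_id = "deepseek-ai/DeepSeek-R1-Distill-Mistral-Small-24B"

# 4-bit NF4 quantisation: roughly 12-13 GB of weights for a 24B model,
# leaving room on a 24 GB card for activations and the KV cache.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place the whole model on the single GPU
)

prompt = "Explain the Banach fixed-point theorem step by step."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```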
Would be interested too.
Really hoping to see this distillation! 🙏
Being able to run a powerful R1-distilled model on a single 24GB card would be incredible. Would love to experiment with this - please consider it!
Absolutely needed