R1 distill to Mistral Small?

#99
by nfunctor - opened

Thanks a lot for your great work! Your distill models are quite good, and I wondered whether you would consider distilling R1 into mistralai/Mistral-Small-24B-Instruct-2501 (a base variant is also available). It is Apache-licensed, its performance is very attractive for a 24B model, and when quantized it could handle long-context generation on a single 24 GB GPU. Thanks!
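For context, here is a rough sketch of what running such a model on one 24 GB card could look like, using 4-bit quantization with transformers + bitsandbytes. The model ID below is the existing Mistral base, not a released R1 distill (that's the ask here), and the memory figures in the comments are approximate:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Hypothetical target; swap in the distill's repo ID if it ever ships.
model_id = "mistralai/Mistral-Small-24B-Instruct-2501"

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                    # ~24B params -> roughly 13-14 GB of weights in 4-bit
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",                    # fits on one 24 GB GPU, leaving headroom for the KV cache
)

prompt = "Explain the halting problem in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```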

Would be interested too.

Really hoping to see this distillation! 🙏
Being able to run a powerful R1-distilled model on a single 24 GB card would be incredible. Would love to experiment with this - please consider it!

Absolutely needed
