How to run this version with vLLM
Anybody got it running with vLLM?
Not supported at the moment but will soon :)
Can I convert this into GGUF, just like what you did with the R1 model?
Seems to be working on the latest version for me.
I would like to add to the information. The following command is not working for me:

vllm serve unsloth/Mistral-Small-24B-Instruct-2501-unsloth-bnb-4bit --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice
I served this model as follows (for me it is working with vLLM 0.8.1):

vllm serve unsloth/Mistral-Small-24B-Instruct-2501-unsloth-bnb-4bit --load_format bitsandbytes --quantization bitsandbytes --gpu-memory-utilization 0.9 --max-model-len 9000
In my case, specifying --gpu-memory-utilization and --max-model-len was mandatory.
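Once the server is up, it exposes vLLM's OpenAI-compatible API. Here is a minimal sketch of querying it with the openai Python client, assuming the default host and port (localhost:8000) and that the server was started without an API key:

```python
# Minimal sketch of querying the vLLM server started with the command above.
# Assumes the default OpenAI-compatible endpoint at http://localhost:8000/v1
# and no --api-key set on the server (any placeholder key is then accepted).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-needed",  # placeholder; server was started without --api-key
)

response = client.chat.completions.create(
    # The model name must match the path the server was launched with.
    model="unsloth/Mistral-Small-24B-Instruct-2501-unsloth-bnb-4bit",
    messages=[
        {"role": "user", "content": "Summarize what vLLM does in one sentence."},
    ],
    max_tokens=128,  # stay well under the 9000-token limit set via --max-model-len
)

print(response.choices[0].message.content)
```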
Regards