Can vLLM launch this model?
Currently, it says:
ERROR 12-16 09:49:55 engine.py:366] raise ValueError(f"No supported config format found in {model}")
ERROR 12-16 09:49:55 engine.py:366] ValueError: No supported config format found in unsloth/Llama-3.3-70B-Instruct-GGUF
vLLM version is 0.6.4.post1
transformers 4.47.0
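For reference, this is roughly the launch (just pointing vLLM at the repo ID via the Python API; the `vllm serve` CLI hits the same error):

```python
# Minimal repro sketch: the model argument is the GGUF repo ID shown in the
# traceback above. The repo only ships .gguf files (no config.json), so
# vLLM's config loader raises "No supported config format found in ...".
from vllm import LLM

llm = LLM(model="unsloth/Llama-3.3-70B-Instruct-GGUF")  # raises ValueError
```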
You need a config.json file. Copy the config.json from the original 16-bit model repo and it should work. I wouldn't recommend using vLLM for GGUF; use llama.cpp instead.
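If you do go the llama.cpp route, something like this with the llama-cpp-python bindings should work. The filename below is just a pattern guess, check what the repo actually ships; a 70B quant may also be split across multiple files:

```python
# Sketch of running the GGUF via llama-cpp-python instead of vLLM.
# The filename pattern is an assumption -- check the actual files in the repo.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="unsloth/Llama-3.3-70B-Instruct-GGUF",
    filename="*Q4_K_M*.gguf",   # glob for the quant you want (assumed name)
    n_gpu_layers=-1,            # offload all layers to GPU if they fit
    n_ctx=8192,                 # context length
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```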
Hi. Where is this file link exactly?
Thanks.
Why would you not recommend vLLM for GGUF exactly? I am looking at serving a quantized model API, and vLLM seems to be the best for this (https://blog.vllm.ai/2024/09/05/perf-update.html).
Or is this recommendation only for individual users?
Never mind, found the answer: (https://docs.vllm.ai/en/latest/features/quantization/gguf.html)
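From those docs, the workaround seems to be: download a single .gguf file locally and point vLLM at that file, borrowing the tokenizer (and config) from the original 16-bit repo. Rough sketch; the filename is a guess, and multi-part GGUFs have to be merged into one file first:

```python
# Sketch of serving a GGUF file with vLLM, per the GGUF docs linked above.
# Assumptions: the quant fits in a single .gguf file and the filename
# matches what the repo actually ships -- check before running.
from huggingface_hub import hf_hub_download
from vllm import LLM, SamplingParams

gguf_path = hf_hub_download(
    repo_id="unsloth/Llama-3.3-70B-Instruct-GGUF",
    filename="Llama-3.3-70B-Instruct-Q4_K_M.gguf",  # assumed name
)

llm = LLM(
    model=gguf_path,                             # point at the .gguf file itself
    tokenizer="unsloth/Llama-3.3-70B-Instruct",  # tokenizer/config from the 16-bit repo
)

print(llm.generate(["Hello!"], SamplingParams(max_tokens=32))[0].outputs[0].text)
```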
Whoops, my apologies, I missed your message. The config.json is located here: https://huggingface.co/unsloth/Llama-3.3-70B-Instruct/tree/main
Yeah, regarding GGUF: usually you'd rather use 4-bit / 8-bit quantized versions in vLLM than GGUFs.
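For example, something like an AWQ / GPTQ / bnb-4bit checkpoint served directly. The repo name below is just a placeholder; swap in whichever quantized build you actually use:

```python
# Sketch of serving a 4-bit AWQ checkpoint with vLLM instead of a GGUF.
# The repo ID is a placeholder, not a specific recommendation.
from vllm import LLM, SamplingParams

llm = LLM(
    model="someorg/Llama-3.3-70B-Instruct-AWQ",  # placeholder repo ID
    quantization="awq",
    tensor_parallel_size=2,                      # split across 2 GPUs; adjust to your hardware
)

print(llm.generate(["Hello!"], SamplingParams(max_tokens=32))[0].outputs[0].text)
```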