How to run the model OpenGVLab/InternVL2-40B-AWQ with vllm docker image?
#2 opened by andryevinnik
I used the following command:
docker run --log-opt max-size=10m --log-opt max-file=1 --rm -it --gpus '"device=0"' \
  -p 8080:8000 \
  --mount type=bind,source=/ssd_2/huggingface,target=/root/.cache/huggingface \
  vllm/vllm-openai:v0.5.4 \
  --model OpenGVLab/InternVL2-40B-AWQ --max-model-len 8192 --trust-remote-code
This fails with:
KeyError: 'model.layers.51.mlp.gate_up_proj.qweight'
Adding --dtype half to the command gives the same error.
Passing -q awq instead fails with:
ValueError: Cannot find the config file for awq
Please advise how to run this model.
Why not try LMDeploy? https://lmdeploy.readthedocs.io/en/latest/llm/api_server.html
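For reference, here is a minimal LMDeploy serving sketch, assuming lmdeploy is installed and the default TurboMind backend is used; the port is illustrative:

# Serve the AWQ checkpoint through LMDeploy's OpenAI-compatible server.
pip install lmdeploy
lmdeploy serve api_server OpenGVLab/InternVL2-40B-AWQ \
  --backend turbomind \
  --model-format awq \
  --server-port 23333

Once the server is up, it exposes an OpenAI-compatible API, so a request along these lines should work (the prompt is illustrative):

curl http://localhost:23333/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "OpenGVLab/InternVL2-40B-AWQ",
        "messages": [{"role": "user", "content": "Hello"}]
      }'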
Our AWQ model was generated with LMDeploy. vLLM recently added support for InternVL; please try again.
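If you prefer to stay on vLLM, a hedged retry could keep the original command but make the quantization explicit and use a newer image; the tag below is an assumption, so check the vLLM release notes for the first version that loads this checkpoint:

# Assumed retry on a newer vLLM image; the KeyError above suggests
# v0.5.4 cannot load this LMDeploy-produced AWQ checkpoint.
docker run --rm -it --gpus '"device=0"' -p 8080:8000 \
  --mount type=bind,source=/ssd_2/huggingface,target=/root/.cache/huggingface \
  vllm/vllm-openai:latest \
  --model OpenGVLab/InternVL2-40B-AWQ \
  --quantization awq \
  --max-model-len 8192 \
  --trust-remote-code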
czczup changed discussion status to closed