always getting 0 in output

#3 opened by xubuild

The model always responds with token id 0 for any input, while the same prompt gets a correct response from https://huggingface.co/casperhansen/mixtral-instruct-awq

Tested with vLLM:
llm = LLM(model=model_path, quantization="awq", trust_remote_code=True, dtype="auto", enforce_eager=True, max_model_len=12288)
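
For completeness, a sketch of the generate call that reproduces it, using the llm object above (the prompt here is illustrative):

from vllm import SamplingParams

# Uses the llm object constructed above; the prompt is illustrative.
outputs = llm.generate(["[INST] Hello, who are you? [/INST]"], SamplingParams(temperature=0.0, max_tokens=64))
print(outputs[0].outputs[0].token_ids)  # reported as all zeros with this repo
print(outputs[0].outputs[0].text)       # empty string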

Same for me!

Is there any fix for that?

Also happening for me. It seems to be an issue with this model; other 8x7B AWQ models work perfectly fine, for example dolphin-2.7-8x7b.

@TheBloke Any update on this issue? I always get an empty output as well...

FYI, serving the model with:

python3 -m vllm.entrypoints.openai.api_server --model "$MODEL_NAME" --host 0.0.0.0 --port 8181 --quantization awq --dtype auto

returns an empty completion every time:

    "choices": [
        {
            "index": 0,
            "text": "",
            "logprobs": null,
            "finish_reason": "length"
        }
    ],
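
For context, a sketch of an equivalent client call that produces the empty completion above; it assumes $MODEL_NAME points at this repo, and the prompt and max_tokens are illustrative:

import requests

# Completions request against the server started above; the prompt and
# max_tokens are illustrative.
resp = requests.post(
    "http://localhost:8181/v1/completions",
    json={
        "model": "TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ",
        "prompt": "[INST] Write a haiku about GPUs. [/INST]",
        "max_tokens": 128,
    },
)
print(resp.json()["choices"][0]["text"])  # comes back as "" with this repo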

Same here; maybe it's a prompt template issue?

Any update?

Same for me.

Also seeing this in testing. Our vLLM setup:

from vllm import LLM, SamplingParams

sampling_params = SamplingParams(temperature=0.1, top_p=0.95)
model = 'TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ'

llm = LLM(
  model=model,
  gpu_memory_utilization=0.7,
  max_model_len=2048,
)

It's likely caused by the prompt; you can try changing it to f"""USER:{prompt}\nAssistant:"""
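
For reference, the prompt template on the Mixtral-8x7B-Instruct-v0.1 model card is [INST] ... [/INST]; a minimal sketch of applying it with the vLLM objects defined above (the example prompt is illustrative):

# Applies the Mixtral Instruct template before generation; llm and
# sampling_params are the objects from the setup above.
user_prompt = "Summarise the plot of Hamlet in two sentences."
formatted = f"[INST] {user_prompt} [/INST]"
outputs = llm.generate([formatted], sampling_params)
print(outputs[0].outputs[0].text)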

Same here...

I've heard that this version works fine with vLLM: https://huggingface.co/casperhansen/mixtral-instruct-awq

@umarbutler I can confirm that's what I resorted to using instead.

@jcole-laivly 👍🏻 I can confirm that it works for me as well.

Yes, I uploaded it because this repository's model is corrupted (somehow). Please requantize if you experience any problems.
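
For anyone who wants to requantize themselves, a minimal AutoAWQ sketch; the quant settings below are the common 4-bit GEMM defaults and the output path is illustrative, not taken from either repo:

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

# Requantization sketch with AutoAWQ; quant settings are the common 4-bit
# GEMM defaults and the output path is illustrative.
base_model = "mistralai/Mixtral-8x7B-Instruct-v0.1"
quant_path = "mixtral-instruct-awq"
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(base_model, low_cpu_mem_usage=True)
tokenizer = AutoTokenizer.from_pretrained(base_model)

model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)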
