Always getting token id 0 in the output
The model always responds with token id 0 for any input, while the same prompt gets a correct response from https://huggingface.co/casperhansen/mixtral-instruct-awq
Tested with vLLM:
llm = LLM(model=model_path, quantization="awq", trust_remote_code=True, dtype="auto", enforce_eager=True, max_model_len=12288)
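For reference, a minimal repro sketch of how the issue shows up (the model path, prompt text, and SamplingParams values are illustrative, not from the original report):

from vllm import LLM, SamplingParams

model_path = "TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ"  # assumed; original used a local path
llm = LLM(model=model_path, quantization="awq", trust_remote_code=True, dtype="auto",
          enforce_eager=True, max_model_len=12288)

prompt = "[INST] What is the capital of France? [/INST]"  # illustrative prompt
outputs = llm.generate([prompt], SamplingParams(temperature=0.0, max_tokens=32))

# With this repo, the generated token IDs come back as all zeros and the decoded text is empty
completion = outputs[0].outputs[0]
print(completion.token_ids)
print(repr(completion.text))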
same for me!
Is there any fix for that?
Also happening for me. Seems to be an issue with this model. Other 8x7b awq models work perfectly fine, for example dolphin-2.7-8x7b.
@TheBloke Any update on this issue? I always get an empty output as well...
FYI,
python3 -m vllm.entrypoints.openai.api_server --model "$MODEL_NAME" --host 0.0.0.0 --port 8181 --quantization awq --dtype auto
"choices": [
{
"index": 0,
"text": "",
"logprobs": null,
"finish_reason": "length"
}
],
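For completeness, the request that produces the empty completion above looks roughly like this (the prompt and max_tokens are illustrative; the endpoint is vLLM's OpenAI-compatible /v1/completions):

import requests

MODEL_NAME = "TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ"  # assumed value of $MODEL_NAME
resp = requests.post(
    "http://localhost:8181/v1/completions",
    json={"model": MODEL_NAME, "prompt": "[INST] Hello [/INST]", "max_tokens": 64},
)
print(resp.json()["choices"][0]["text"])  # prints "" with this repo

The "length" finish_reason is consistent with the token-id-0 reports above: generation never hits EOS, runs to max_tokens, and the emitted tokens decode to an empty string.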
Same here. Maybe a prompt template issue?
Any update?
Same for me.
Also seeing this in testing. Our vLLM setup:
from vllm import LLM, SamplingParams

sampling_params = SamplingParams(temperature=0.1, top_p=0.95)
model = 'TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ'
llm = LLM(
    model=model,
    gpu_memory_utilization=0.7,
    max_model_len=2048,
)
It may be caused by the prompt; you could try changing it to f"""USER:{prompt}\nAssistant:"""
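For reference, Mixtral-Instruct's official prompt format is [INST] ... [/INST] rather than USER:/Assistant:. A minimal sketch of building it with transformers' apply_chat_template (assuming this repo's tokenizer ships the standard Mixtral chat template; the message content is illustrative):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ")
messages = [{"role": "user", "content": "What is the capital of France?"}]  # illustrative
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # -> "<s>[INST] What is the capital of France? [/INST]"

That said, the reports above suggest the weights themselves are the problem, so a correct template alone may not fix the empty output.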
Same here...
same
I've heard that this version works fine with vLLM: https://huggingface.co/casperhansen/mixtral-instruct-awq
@umarbutler I can confirm that's what I resorted to using instead.
@jcole-laivly 👍🏻 I can confirm that it works for me as well.
Yes, I uploaded it since this repository has a corrupted model (somehow). Please requantize if you experience any problems.
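For anyone who wants to requantize themselves, a sketch with AutoAWQ (the paths and quant_config mirror AutoAWQ's documented defaults and are assumptions, not the exact settings used for either repo):

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # base model to quantize
quant_path = "mixtral-instruct-awq"                  # output directory (assumed)
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the fp16 model and its tokenizer
model = AutoAWQForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Quantize with AutoAWQ's default calibration set, then save the AWQ checkpoint
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)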