Model only outputs "!!!"
#1
by prats05 - opened
I don't know why, but the GPTQ-converted versions of the EfficientQAT models seem to only output "!!!" no matter the prompt. Is there an error somewhere? I tried it with this model and with Qwen2-VL, and hit the same issue.
Code:
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "ChenMnZ/Llama-3-8b-instruct-EfficientQAT-w2g128-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.to("cuda:0")
model.eval()

messages = [
    {
        "role": "user",
        "content": "Who are you",
    }
]

# Build the prompt with the chat template and tokenize it
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids)
print(tokenizer.decode(output_ids[0]))
# outputs "!!!!" no matter the prompt.
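For what it's worth, a wall of "!" usually means the model emits the same low token id at every step (in many BPE vocabularies, including Llama 3's, id 0 decodes to "!"), which is a classic symptom of NaN/Inf logits, e.g. from broken or overflowed dequantized weights. A quick sanity check you could run on the loaded model, sketched here on a toy module (the helper name `find_nonfinite_params` is my own, not a library function):

```python
import torch
import torch.nn as nn

def find_nonfinite_params(model: nn.Module) -> list[str]:
    """Return names of parameters that contain NaN or Inf values.

    If dequantized weights are corrupt, logits go non-finite and
    argmax/sampling collapses to a degenerate token like "!".
    """
    return [
        name for name, p in model.named_parameters()
        if not torch.isfinite(p).all()
    ]

# Toy module standing in for the loaded HF model:
toy = nn.Linear(4, 4)
with torch.no_grad():
    toy.weight[0, 0] = float("nan")  # inject a bad value
print(find_nonfinite_params(toy))  # ['weight']
```

If this flags layers on the real checkpoint, the problem is the converted weights (or the quantization backend used to load them), not your prompting code.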