AutoAWQ example code

#2
by Makertyewq - opened

Running the code example with "model = AutoAWQForCausalLM.from_quantized(model_name_or_path, fuse_layers=True, trust_remote_code=False, safetensors=True)" gave me an output consisting of the input tokens followed by token 185 repeated 512 times.

Setting fuse_layers=False seems to have fixed this.
Hope this is of value to someone.
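
For anyone who wants to try the same workaround, here is a minimal sketch. It assumes AutoAWQ is installed and that model_name_or_path points at the quantized model from the example. The two function names (load_unfused, is_degenerate) are made up for illustration and are not part of AutoAWQ. The second helper only checks for the symptom described above: everything after the prompt being a single repeated token.

```python
def load_unfused(model_name_or_path):
    """Load an AWQ-quantized model with layer fusion disabled (hypothetical helper)."""
    # Imported lazily so is_degenerate() below works without AutoAWQ installed.
    from awq import AutoAWQForCausalLM

    return AutoAWQForCausalLM.from_quantized(
        model_name_or_path,
        fuse_layers=False,  # the workaround: True produced the repeated token for me
        trust_remote_code=False,
        safetensors=True,
    )


def is_degenerate(output_ids, prompt_len):
    """Return True if all tokens after the prompt are one repeated token.

    Hypothetical helper. Expects a plain list of ints, e.g. output[0].tolist()
    for a tensor returned by generate().
    """
    generated = list(output_ids[prompt_len:])
    return len(generated) > 1 and len(set(generated)) == 1
```

If is_degenerate returns True with fuse_layers=True and False after reloading through load_unfused, you are probably seeing the same issue. I have not looked into why the fused path behaves this way.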

Setting fuse_layers=False works for me, thanks!
