AutoAWQ example code
#2, opened by Makertyewq
Running the code example with "model = AutoAWQForCausalLM.from_quantized(model_name_or_path, fuse_layers=True, trust_remote_code=False, safetensors=True)" gave me an output consisting of the input tokens followed by token 185 repeated 512 times.
Setting fuse_layers=False seems to have fixed this.
Hope this is of value to someone.
Setting fuse_layers=False works for me too, thanks!
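
For anyone who wants to check whether they are hitting the same symptom, here is a minimal sketch of a helper that looks for a long run of one repeated token at the end of the generated ids. The function names (trailing_run, looks_degenerate), the min_run threshold, and the prompt ids are made up for illustration and are not part of AutoAWQ; the sample data only imitates the output described above.

```python
from typing import Optional, Sequence, Tuple


def trailing_run(token_ids: Sequence[int]) -> Optional[Tuple[int, int]]:
    """Return (token_id, run_length) for the run of identical tokens at the end."""
    if not token_ids:
        return None
    last = token_ids[-1]
    run = 0
    # Walk backwards until the token changes.
    for tok in reversed(token_ids):
        if tok != last:
            break
        run += 1
    return last, run


def looks_degenerate(token_ids: Sequence[int], min_run: int = 32) -> bool:
    """True when the sequence ends in at least `min_run` copies of one token.

    The default threshold of 32 is an arbitrary assumption; tune it to taste.
    """
    result = trailing_run(token_ids)
    return result is not None and result[1] >= min_run


# Synthetic stand-in for the reported output: prompt tokens, then token 185 x 512.
prompt_ids = [1, 2042, 317, 9]  # hypothetical prompt token ids
bad_output = prompt_ids + [185] * 512

print(trailing_run(bad_output))      # (185, 512)
print(looks_degenerate(bad_output))  # True
```

With a real model you would pass the generated ids as a plain list of ints (for example the first row of the generate output converted with .tolist()). If the check fires with fuse_layers=True and not with fuse_layers=False, you are most likely seeing the same problem as in this thread.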