Error when using this

#1
by kcramp858 - opened

I have CUDA 11.8, and ooba textgen currently works with llama-30b-supercot-4g-128, but when I run this model I get the following error:

RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM:
size mismatch for model.layers.0.self_attn.k_proj.qzeros: copying a param with shape torch.Size([1, 832]) from checkpoint, the shape in current model is torch.Size([52, 832]).
size mismatch for model.layers.0.self_attn.k_proj.scales: copying a param with shape torch.Size([1, 6656]) from checkpoint, the shape in current model is torch.Size([52, 6656]).
size mismatch for model.layers.0.self_attn.o_proj.qzeros: copying a param with shape torch.Size([1, 832]) from checkpoint, the shape in current model is torch.Size([52, 832]).
size mismatch for model.layers.0.self_attn.o_proj.scales: copying a param with shape torch.Size([1, 6656]) from checkpoint, the shape in current model is

[truncated for visibility]

I believe it is because my GPTQ version does not match the one the model was quantized with.

How do we determine which GPTQ version to use?
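For anyone who hits this later: the arithmetic in the error actually points at groupsize rather than the GPTQ fork itself. Here is a rough sketch (my own illustration, not code from either project) of how the `qzeros`/`scales` shapes fall out of the groupsize for a LLaMA-30B projection:

```python
# Sketch (my own illustration): how GPTQ qzeros/scales shapes
# depend on groupsize for a LLaMA-30B projection layer.
hidden_size = 6656   # LLaMA-30B hidden dimension
wbits = 4            # 4-bit weights: 32 // 4 = 8 values packed per int32

def gptq_param_shapes(groupsize):
    # groupsize None (or -1) means a single group spanning each row
    n_groups = 1 if groupsize in (None, -1) else hidden_size // groupsize
    qzeros = (n_groups, hidden_size * wbits // 32)  # packed zero points
    scales = (n_groups, hidden_size)                # one scale per column
    return qzeros, scales

print(gptq_param_shapes(None))  # ((1, 832), (1, 6656))   <- the checkpoint
print(gptq_param_shapes(128))   # ((52, 832), (52, 6656)) <- what the loader built
```

In other words, the checkpoint was quantized without grouping (one set of zeros/scales per layer), while the loader was configured for groupsize 128, hence the expected 6656 / 128 = 52 groups.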

For what it's worth, I initially got this error too, and I had to edit my config-user.yaml file to change groupsize to None under the VicUnlocked-30B-LoRA-GPTQ section.

...that solved my problem; I am just an idiot. Set groupsize to None.
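For reference, the edited entry in text-generation-webui's config-user.yaml would look roughly like this (exact key format and available fields vary across webui versions, so treat this as a sketch rather than a verbatim copy):

```yaml
VicUnlocked-30B-LoRA-GPTQ:
  wbits: 4
  groupsize: None   # was 128; this checkpoint was quantized without grouping
  model_type: llama
```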

kcramp858 changed discussion status to closed
