Converting to GGUF creates shape error

#2
by hwarnecke - opened

I converted the model to GGUF format, both with the llama.cpp conversion script and with the HF gguf-my-repo tool, producing a quantized and a non-quantized version in each case.
The conversion completes and I can build a model from the resulting file via Ollama, but running the model always fails with a tensor dimension error:
```
Error: llama runner process has terminated: signal: aborted (core dumped)
error loading model: check_tensor_dims: tensor 'blk.0.attn_q.weight' has wrong shape; expected 5120, 5120, got 5120, 4096, 1, 1
llama_load_model_from_file: exception loading model
```
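In case it helps with debugging: the shapes the converter actually wrote can be inspected with the `gguf` Python package that ships with llama.cpp. A minimal sketch, assuming the package is installed (`pip install gguf`) and using a placeholder file name:

```python
from gguf import GGUFReader  # reader from llama.cpp's gguf package

# Placeholder path to the converted file
reader = GGUFReader("model.gguf")

# Print the stored shape of every attention Q projection so the
# dimensions can be compared against the check_tensor_dims error
for tensor in reader.tensors:
    if "attn_q" in tensor.name:
        print(tensor.name, tensor.shape)
```

If the Q tensors already show a 4096 dimension here, the file itself contains that shape and the complaint comes from the runtime's expectations rather than from a corrupted conversion.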

Does anyone have experience with this and can point me in the right direction, or has anyone already converted this model successfully?
EDIT:
I have seen the same issue reported for the original Mistral model, and after updating Ollama I can run the GGUF variant of the original model successfully. It still did not work with this fine-tune, though, which is why I am asking here. Are any additional steps necessary when converting this model?
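
One possible explanation, offered as a guess rather than a confirmed diagnosis: the 4096 is not necessarily a conversion bug. If the base model declares a head_dim in its config.json that differs from hidden_size / num_attention_heads, the Q projection is legitimately num_attention_heads * head_dim wide; Mistral NeMo, for instance, uses 32 heads with head_dim 128 (= 4096) against a hidden_size of 5120. Older llama.cpp and Ollama builds assumed head_dim == hidden_size / num_attention_heads and therefore expected 5120 x 5120. A quick sanity check against the fine-tune's config, sketched with a placeholder path:

```python
import json

# Placeholder path to the fine-tune's Hugging Face config
with open("path/to/model/config.json") as f:
    cfg = json.load(f)

hidden = cfg["hidden_size"]
n_heads = cfg["num_attention_heads"]
# head_dim is usually only listed when it differs from hidden // n_heads
head_dim = cfg.get("head_dim", hidden // n_heads)

# attn_q.weight maps hidden_size inputs to n_heads * head_dim outputs
print(f"hidden_size = {hidden}, heads = {n_heads}, head_dim = {head_dim}")
print(f"expected attn_q.weight: {hidden} x {n_heads * head_dim}")
```

If the computed width matches the "got" value in the error, the converted file is fine and updating llama.cpp/Ollama, rather than reconverting, is the fix; that would also be consistent with what the edits above describe.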

EDIT 2: Today it worked. Maybe llama.cpp needed some time to catch up; it is just odd that it worked for the original model a week earlier than it did for this one.

hwarnecke changed discussion status to closed