Qwen/Qwen2-57B-A14B-Instruct

#115
by yttria - opened

ggerganov said intermediate_size should not be set to 20480; instead, the convert-hf-to-gguf.py script has now been fixed to take the moe_intermediate_size and shared_expert_intermediate_size fields from config.json into account. Would using the new version, without changing intermediate_size, produce a better quant?
https://github.com/ggerganov/llama.cpp/issues/7816#issuecomment-2155898007
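For context, a minimal sketch of the idea behind that fix (this is not the actual convert-hf-to-gguf.py code; the config.json field names are the ones mentioned above, but the helper and output key names here are illustrative):

```python
import json

def moe_ffn_sizes(config_path: str) -> dict:
    """Read the FFN widths relevant to a Qwen2-MoE model from config.json."""
    with open(config_path) as f:
        cfg = json.load(f)
    # The per-expert FFN width and the shared-expert FFN width are
    # separate fields; overriding the top-level intermediate_size
    # (e.g. setting it to 20480) does not describe the expert tensors
    # correctly, which is why the converter now reads these directly.
    return {
        "expert_feed_forward_length": cfg["moe_intermediate_size"],
        "shared_expert_feed_forward_length": cfg["shared_expert_intermediate_size"],
        "feed_forward_length": cfg["intermediate_size"],
    }
```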

Sure, thanks for bringing this to my attention (especially for the link)!

It should be in the queue and done within the next few days at most.

mradermacher changed discussion status to closed

It failed again. I now also think that the 57B model is simply broken (see the bug report you linked).
