doing it wrong?

#1
by m9e - opened
bash-3.2$ ./llama-server -c 0 -m /var/tmp/models/mradermacher/Dracarys2-72B-Instruct-i1-GGUF/Dracarys2-72B-Instruct.i1-Q6_K-00001-of-00002.gguf
build: 3889 (b6d6c528) with Apple clang version 16.0.0 (clang-1600.0.26.3) for arm64-apple-darwin24.0.0
system info: n_threads = 12, n_threads_batch = 12, total_threads = 16

system_info: n_threads = 12 (n_threads_batch = 12) / 16 | AVX = 0 | AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 0 | NEON = 1 | SVE = 0 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | RISCV_VECT = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | MATMUL_INT8 = 1 | LLAMAFILE = 1 |

main: HTTP server is listening, hostname: 127.0.0.1, port: 8080, http threads: 15
main: loading model
llama_model_load: error loading model: tensor 'blk.40.ffn_down.weight' data is not within the file bounds, model is corrupted or incomplete

I had tried with the original file names, with this renaming of the weights, and with pointing at the folder - I kept getting this error about that layer. Unsure if this is a me thing, as I had also just rebuilt llama.cpp since mine was fairly old, and was using b6d6c5289f1c9c677657c380591201ddb210b649.

You probably only downloaded or specified the first part. Instead, you have to concatenate the parts - the model description links to one of TheBloke's descriptions that shows how to concatenate all parts into a single file.
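For reference, a minimal sketch of that concatenation step, assuming the second part follows the naming of the first (the -00002-of-00002 filename is inferred from the part shown above, so adjust to the actual file names you downloaded):

# concatenate both parts, in order, into a single GGUF
cat Dracarys2-72B-Instruct.i1-Q6_K-00001-of-00002.gguf \
    Dracarys2-72B-Instruct.i1-Q6_K-00002-of-00002.gguf \
    > Dracarys2-72B-Instruct.i1-Q6_K.gguf

Then point llama-server's -m at the combined .gguf instead of the individual parts.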

mradermacher changed discussion status to closed
