doing it wrong?

#1
by m9e - opened
bash-3.2$ ./llama-server -c 0 -m /var/tmp/models/mradermacher/Dracarys2-72B-Instruct-i1-GGUF/Dracarys2-72B-Instruct.i1-Q6_K-00001-of-00002.gguf
build: 3889 (b6d6c528) with Apple clang version 16.0.0 (clang-1600.0.26.3) for arm64-apple-darwin24.0.0
system info: n_threads = 12, n_threads_batch = 12, total_threads = 16

system_info: n_threads = 12 (n_threads_batch = 12) / 16 | AVX = 0 | AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 0 | NEON = 1 | SVE = 0 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | RISCV_VECT = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | MATMUL_INT8 = 1 | LLAMAFILE = 1 |

main: HTTP server is listening, hostname: 127.0.0.1, port: 8080, http threads: 15
main: loading model
llama_model_load: error loading model: tensor 'blk.40.ffn_down.weight' data is not within the file bounds, model is corrupted or incomplete

I had tried with the original file names, with this renaming of the weights, and with pointing at the folder - I kept getting this error about that layer. Unsure if this is a me thing, as I had also just rebuilt llama.cpp since mine was fairly old, and was using b6d6c5289f1c9c677657c380591201ddb210b649.

You probably only downloaded or specified the first part. Instead, you have to concatenate the parts - the model description links to one of TheBloke's descriptions that shows how to concatenate all parts into a single file.
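For reference, a minimal sketch of that concatenation step, assuming the second part follows the naming of the first (the -00002-of-00002 filename is inferred from the part shown above, so adjust to the actual file names you downloaded):

# concatenate both parts, in order, into a single GGUF
cat Dracarys2-72B-Instruct.i1-Q6_K-00001-of-00002.gguf \
    Dracarys2-72B-Instruct.i1-Q6_K-00002-of-00002.gguf \
    > Dracarys2-72B-Instruct.i1-Q6_K.gguf

Then point llama-server's -m at the combined .gguf instead of the individual parts.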

mradermacher changed discussion status to closed
