license: apache-2.0

Grok-1 GGUF Quantizations

As discovered by @DgDev91, there is a slight file-naming issue when using these quants with current llama.cpp.

A fix is already provided by @phymbert in #6192.

For ease of use, I've created a branch (Quick-Fix Branch) that incorporates these fixes.

This repository contains unofficial GGUF Quantizations of Grok-1, compatible with llama.cpp as of the PR "Add grok-1 support" (#6204).

Updates

The splits have been updated to use the improvements from the PR "llama_model_loader: support multiple split/shard GGUFs". As a result, manually merging the files with gguf-split is no longer required.

Just download all splits and run llama.cpp with the first split as the model file, as you would previously. It will detect the other splits and load them as well.
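Since loading depends on every split being present next to the first one, a quick pre-flight check can save a failed start. The following is a minimal sketch, not part of llama.cpp: it assumes the splits follow a `<prefix>-00001-of-00009.gguf` naming scheme (an assumption, the actual file names in this repository may differ), and the file name, the `./main` binary path, and the helper names are all hypothetical.

```python
import re
from pathlib import Path

# Assumed naming scheme: <prefix>-00001-of-00009.gguf.
# This is an assumption for illustration; adjust it to the real file names.
SPLIT_RE = re.compile(r"^(?P<prefix>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.gguf$")


def expected_splits(first_split: str) -> list[Path]:
    """Return the paths of all splits implied by the first split's name."""
    path = Path(first_split)
    match = SPLIT_RE.match(path.name)
    if match is None:
        raise ValueError(f"not a recognised split file name: {path.name}")
    if int(match["index"]) != 1:
        raise ValueError("pass the first split (index 1)")
    total = int(match["total"])
    prefix = match["prefix"]
    return [
        path.with_name(f"{prefix}-{i:05d}-of-{total:05d}.gguf")
        for i in range(1, total + 1)
    ]


def missing_splits(first_split: str) -> list[Path]:
    """Return the expected splits that do not exist on disk."""
    return [p for p in expected_splits(first_split) if not p.is_file()]


def build_command(binary: str, first_split: str, prompt: str) -> list[str]:
    """Build an argument list that passes only the first split as the model."""
    return [binary, "-m", first_split, "-p", prompt]


if __name__ == "__main__":
    first = "models/grok-1-Q2_K-00001-of-00009.gguf"  # hypothetical file name
    missing = missing_splits(first)
    if missing:
        print("Missing splits:")
        for p in missing:
            print(f"  {p}")
    else:
        # Print the command instead of running it; pass it to subprocess.run
        # once the binary path matches your llama.cpp build.
        print(" ".join(build_command("./main", first, "Hello")))
```

Only the first split is passed via `-m`; the remaining files are left for llama.cpp to discover on its own, as described above.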

Available Quantizations

The following Quantizations are currently available for download:

Quant	Split Files
Q2_K	split-1-of-9, split-2-of-9, split-3-of-9, split-4-of-9, split-5-of-9, split-6-of-9, split-7-of-9, split-8-of-9, split-9-of-9
Q4_K	split-1-of-9, split-2-of-9, split-3-of-9, split-4-of-9, split-5-of-9, split-6-of-9, split-7-of-9, split-8-of-9, split-9-of-9
Q6_K	split-1-of-9, split-2-of-9, split-3-of-9, split-4-of-9, split-5-of-9, split-6-of-9, split-7-of-9, split-8-of-9, split-9-of-9
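To fetch all nine splits of a single quant without pulling the whole repository, one option is a filtered download with `huggingface_hub`. This is only a sketch under assumptions: it assumes each file name contains its quant label (for example `Q2_K`), the repository id is a placeholder you must replace, and the helper names are hypothetical.

```python
import fnmatch

# Quant labels taken from the table above.
AVAILABLE_QUANTS = ("Q2_K", "Q4_K", "Q6_K")


def quant_patterns(quant: str) -> list[str]:
    """Glob patterns selecting the files of one quant.

    Assumes every split's file name contains the quant label.
    """
    if quant not in AVAILABLE_QUANTS:
        raise ValueError(f"unknown quant {quant!r}; choose from {AVAILABLE_QUANTS}")
    return [f"*{quant}*.gguf"]


def matches(filename: str, quant: str) -> bool:
    """Check locally whether a file name would be selected for a quant."""
    return any(fnmatch.fnmatch(filename, pat) for pat in quant_patterns(quant))


def download_quant(repo_id: str, quant: str, local_dir: str) -> str:
    """Download only the files of one quant and return the local folder."""
    # Imported here so the pattern helpers work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=quant_patterns(quant),
        local_dir=local_dir,
    )


if __name__ == "__main__":
    # "<user>/<repo>" is a placeholder, not the real repository id.
    print(download_quant("<user>/<repo>", "Q2_K", "models"))
```

Checking the patterns with `matches` against the actual file listing before downloading is a cheap way to confirm the naming assumption holds.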

More Quantizations will be uploaded soon. All current Quants were created without an importance matrix.