Llama 3.2 8B Instruct in GGUF format

This is the text-only part of the Llama 3.2 11B Vision Instruct model.

I removed the vision, multimodal, and cross-attention layers, and renamed the remaining weights to match the regular Llama models.
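
Below is a rough sketch of that conversion step. The key names and the single merged safetensors file are assumptions based on the transformers Mllama checkpoint layout; the real checkpoint is sharded, so the shards would need to be merged or iterated over first.

```python
import re
from safetensors.torch import load_file, save_file

# Hypothetical path to a merged copy of the 11B Vision Instruct weights.
state = load_file("mllama-11b-vision-instruct.safetensors")

# Decoder layers that contain a "cross_attn" module are the gated
# cross-attention layers; those layers get dropped entirely.
cross_layers = {
    int(m.group(1))
    for k in state
    if (m := re.match(r"language_model\.model\.layers\.(\d+)\.cross_attn", k))
}
all_layers = {
    int(m.group(1))
    for k in state
    if (m := re.match(r"language_model\.model\.layers\.(\d+)\.", k))
}
# Renumber the remaining self-attention layers 0..N-1.
remap = {old: new for new, old in enumerate(sorted(all_layers - cross_layers))}

new_state = {}
for key, tensor in state.items():
    # Skip the vision tower and the projector that feeds image features in.
    if key.startswith(("vision_model.", "multi_modal_projector.")):
        continue
    m = re.match(r"language_model\.model\.layers\.(\d+)\.(.*)", key)
    if m:
        idx = int(m.group(1))
        if idx in cross_layers:
            continue  # drop the whole cross-attention layer
        key = f"model.layers.{remap[idx]}.{m.group(2)}"
    else:
        # e.g. language_model.model.embed_tokens.weight -> model.embed_tokens.weight
        key = key.removeprefix("language_model.")
    new_state[key] = tensor

save_file(new_state, "llama-3.2-8b-instruct-text-only.safetensors")
```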

Lastly, I trimmed 8 rows from model.embed_tokens.weight, bringing it down to [128256, 4096], and removed the image token from the tokenizer.
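
Continuing the sketch above, the trim is just a slice of the first 128256 rows; that the 8 extra rows sit at the end of the embedding table is an assumption about this particular checkpoint.

```python
# [128264, 4096] -> [128256, 4096]: drop the trailing rows added for the image token.
embed = new_state["model.embed_tokens.weight"]
new_state["model.embed_tokens.weight"] = embed[:128256, :].contiguous()
```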

It has slightly lower perplexity than Llama 3.1 8B, so it may be useful to someone.
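
If you want a quick way to try the GGUF files, here is a minimal example using llama-cpp-python; the filename is a placeholder for whichever quantization you download from this repo.

```python
from llama_cpp import Llama

llm = Llama(model_path="llama-3.2-8b-instruct-q4_k_m.gguf", n_ctx=4096)
resp = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give a one-sentence summary of the GGUF format."}]
)
print(resp["choices"][0]["message"]["content"])
```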
