---
license: apache-2.0
---

# ggml versions of OpenLLaMa 7B

For use with [llama.cpp](https://github.com/ggerganov/llama.cpp).

- Version: 1T tokens, final version
- Project: [OpenLLaMA: An Open Reproduction of LLaMA](https://github.com/openlm-research/open_llama)
- Model: [openlm-research/open_llama_7b](https://huggingface.co/openlm-research/open_llama_7b)
- llama.cpp 4/5/8-bit quantization: build 567 (2d5db48) or later
- llama.cpp newer quantization formats: build 616 (99009e7) or later

## Perplexity

Calculated with llama.cpp, default settings (context 512, batch 512). Test data: [`wiki.test.raw` of WikiText-103](https://blog.salesforceairesearch.com/the-wikitext-long-term-dependency-language-modeling-dataset/):

| model                    |  score |
| ------------------------ | -----: |
| open-llama-7b-q2_K.bin   | 8.5152 |
| open-llama-7b-q3_K_S.bin | 7.6623 |
| open-llama-7b-q3_K.bin   | 7.3837 |
| open-llama-7b-q3_K_L.bin | 7.3043 |
| open-llama-7b-q4_0.bin   | 7.2116 |
| open-llama-7b-q4_1.bin   | 7.1609 |
| open-llama-7b-q4_K_S.bin | 7.1516 |
| open-llama-7b-q4_K.bin   | 7.1116 |
| open-llama-7b-q5_0.bin   | 7.0353 |
| open-llama-7b-q5_K_S.bin | 7.0325 |
| open-llama-7b-q5_1.bin   | 7.0318 |
| open-llama-7b-q5_K.bin   | 7.0272 |
| open-llama-7b-q6_K.bin   | 7.0050 |
| open-llama-7b-q8_0.bin   | 6.9968 |
| open-llama-7b-f16.bin    | 6.9966 |
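For context, the scores above are perplexities, so lower is better: perplexity is the exponential of the mean negative log-likelihood per token over the test set. A minimal sketch of that formula (not llama.cpp's actual implementation, just the definition the scores follow):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the average negative log-likelihood per token."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Example: a model that assigns every token probability 1/4
# has a per-token perplexity of 4 (within floating-point error).
lp = [math.log(0.25)] * 8
print(perplexity(lp))  # ≈ 4.0
```

To reproduce the table itself, llama.cpp ships a `perplexity` tool that is run roughly as `./perplexity -m open-llama-7b-q4_0.bin -f wiki.test.raw` (exact flags may differ between builds).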