---
license: apache-2.0
---
|
|
|
# ggml versions of OpenLLaMA 7B
|
|
|
For use with [llama.cpp](https://github.com/ggerganov/llama.cpp). |
|
|
|
- Version: final release (trained on 1T tokens)
|
- Project: [OpenLLaMA: An Open Reproduction of LLaMA](https://github.com/openlm-research/open_llama) |
|
- Model: [openlm-research/open_llama_7b](https://huggingface.co/openlm-research/open_llama_7b) |
|
- llama.cpp 4-, 5- and 8-bit quantization (q4_0, q4_1, q5_0, q5_1, q8_0): build 567 (commit 2d5db48) or later
|
- llama.cpp k-quant formats (q2_K through q6_K): build 616 (commit 99009e7) or later
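Once a model file is downloaded, it can be run with llama.cpp's `main` binary. A minimal sketch; the prompt, token count, and model filename are only examples:

```shell
# Run the 4-bit model with llama.cpp (built from a compatible commit, see above).
# -m selects the model file, -n the number of tokens to generate, -p the prompt.
./main -m open-llama-7b-q4_0.bin -n 128 -p "The capital of France is"
```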
|
|
|
## Perplexity |
|
|
|
Calculated with llama.cpp, default settings (context 512, batch 512). |
|
Test data: [`wiki.test.raw` of WikiText-103](https://blog.salesforceairesearch.com/the-wikitext-long-term-dependency-language-modeling-dataset/): |
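The scores below come from llama.cpp's `perplexity` tool; under the default settings mentioned above, the invocation looks roughly like this (file paths are illustrative):

```shell
# Evaluate perplexity of a quantized model over the WikiText test file.
# Context and batch size default to 512, matching the settings used here.
./perplexity -m open-llama-7b-q4_0.bin -f wiki.test.raw
```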
|
|
|
| Model | Perplexity |
|
| ------------------------- | -----: | |
|
| open-llama-7b-q2_K.bin | 8.5152 | |
|
| open-llama-7b-q3_K_S.bin | 7.6623 | |
|
| open-llama-7b-q3_K.bin | 7.3837 | |
|
| open-llama-7b-q3_K_L.bin | 7.3043 | |
|
| open-llama-7b-q4_0.bin | 7.2116 | |
|
| open-llama-7b-q4_1.bin | 7.1609 | |
|
| open-llama-7b-q4_K_S.bin | 7.1516 | |
|
| open-llama-7b-q4_K.bin | 7.1116 | |
|
| open-llama-7b-q5_0.bin | 7.0353 | |
|
| open-llama-7b-q5_K_S.bin | 7.0325 | |
|
| open-llama-7b-q5_1.bin | 7.0318 | |
|
| open-llama-7b-q5_K.bin | 7.0272 | |
|
| open-llama-7b-q6_K.bin | 7.0050 | |
|
| open-llama-7b-q8_0.bin | 6.9968 | |
|
| open-llama-7b-f16.bin | 6.9966 | |
|
|
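For reference, perplexity is the exponential of the average per-token negative log-likelihood, which is how scores like those above are aggregated. A minimal sketch with made-up token probabilities:

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(mean negative log-likelihood over the tokens)."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# Toy example: four tokens with made-up model probabilities.
# Lower perplexity means the model assigned higher probability to the text.
probs = [0.25, 0.5, 0.1, 0.05]
print(perplexity(probs))
```

This is why the differences in the table shrink as bit depth grows: each step toward f16 recovers a small amount of per-token log-likelihood.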