---
license: apache-2.0
---

# ggml versions of OpenLLaMa 7B

For use with [llama.cpp](https://github.com/ggerganov/llama.cpp).

- Version: 1T tokens, final version
- Project: [OpenLLaMA: An Open Reproduction of LLaMA](https://github.com/openlm-research/open_llama)
- Model: [openlm-research/open_llama_7b](https://huggingface.co/openlm-research/open_llama_7b)
- llama.cpp 4/5/8-bit quantization: build 567 (2d5db48) or later
- llama.cpp newer quantization formats: build 616 (99009e7) or later

## Perplexity

Calculated with llama.cpp, default settings (context 512, batch 512). Test data: [`wiki.test.raw` of WikiText-103](https://blog.salesforceairesearch.com/the-wikitext-long-term-dependency-language-modeling-dataset/):

| model                    |  score |
| ------------------------ | -----: |
| open-llama-7b-q2_K.bin   | 8.5152 |
| open-llama-7b-q3_K_S.bin | 7.6623 |
| open-llama-7b-q3_K.bin   | 7.3837 |
| open-llama-7b-q3_K_L.bin | 7.3043 |
| open-llama-7b-q4_0.bin   | 7.2116 |
| open-llama-7b-q4_1.bin   | 7.1609 |
| open-llama-7b-q4_K_S.bin | 7.1516 |
| open-llama-7b-q4_K.bin   | 7.1116 |
| open-llama-7b-q5_0.bin   | 7.0353 |
| open-llama-7b-q5_K_S.bin | 7.0325 |
| open-llama-7b-q5_1.bin   | 7.0318 |
| open-llama-7b-q5_K.bin   | 7.0272 |
| open-llama-7b-q6_K.bin   | 7.0050 |
| open-llama-7b-q8_0.bin   | 6.9968 |
| open-llama-7b-f16.bin    | 6.9966 |
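For context, the scores above are perplexities, so lower is better: perplexity is the exponential of the mean negative log-likelihood per token over the test set. A minimal sketch of that formula (not llama.cpp's actual implementation, just the definition the scores follow):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the average negative log-likelihood per token."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Example: a model that assigns every token probability 1/4
# has a per-token perplexity of 4 (within floating-point error).
lp = [math.log(0.25)] * 8
print(perplexity(lp))  # ≈ 4.0
```

To reproduce the table itself, llama.cpp ships a `perplexity` tool that is run roughly as `./perplexity -m open-llama-7b-q4_0.bin -f wiki.test.raw` (exact flags may differ between builds).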