Chinese-LLaMA-2-13B-GGUF
This repository contains the GGUF-v3 models (llama.cpp compatible) for Chinese-LLaMA-2-13B.
Performance
Metric: perplexity (PPL); lower is better.
Quant | original | imatrix (-im) |
---|---|---|
Q2_K | 14.4701 +/- 0.26107 | 17.4275 +/- 0.31909 |
Q3_K | 10.1620 +/- 0.18277 | 9.7486 +/- 0.17744 |
Q4_0 | 9.8633 +/- 0.17792 | - |
Q4_K | 9.2735 +/- 0.16793 | 9.2734 +/- 0.16792 |
Q5_0 | 9.3553 +/- 0.16945 | - |
Q5_K | 9.1767 +/- 0.16634 | 9.1594 +/- 0.16590 |
Q6_K | 9.1326 +/- 0.16546 | 9.1478 +/- 0.16583 |
Q8_0 | 9.1394 +/- 0.16574 | - |
F16 | 9.1050 +/- 0.16518 | - |
Models with the -im suffix are generated with an importance matrix, which generally gives better performance, though not always: in the table above, Q2_K and Q6_K score worse with it.
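To make the table easier to compare, here is a minimal sketch that computes how far each quant's PPL sits above the F16 baseline and whether the -im variant scored lower than the original. The numbers are copied from the table; the names `PPL`, `F16_PPL`, `pct_over_f16`, and `imatrix_helps` are illustrative and not part of this repository. The comparison ignores the +/- uncertainty, so small differences (for example Q4_K) should not be read as significant.

```python
# PPL values copied from the table above: quant -> (original, imatrix).
# None means no -im variant is listed for that quant.
PPL = {
    "Q2_K": (14.4701, 17.4275),
    "Q3_K": (10.1620, 9.7486),
    "Q4_0": (9.8633, None),
    "Q4_K": (9.2735, 9.2734),
    "Q5_0": (9.3553, None),
    "Q5_K": (9.1767, 9.1594),
    "Q6_K": (9.1326, 9.1478),
    "Q8_0": (9.1394, None),
}
F16_PPL = 9.1050  # unquantized baseline from the table


def pct_over_f16(ppl):
    """Relative PPL increase over the F16 baseline, in percent."""
    return (ppl - F16_PPL) / F16_PPL * 100.0


def imatrix_helps(quant):
    """True if the -im variant has lower PPL, None if no -im variant is listed."""
    original, imatrix = PPL[quant]
    if imatrix is None:
        return None
    return imatrix < original


if __name__ == "__main__":
    for quant, (original, imatrix) in PPL.items():
        line = f"{quant}: original +{pct_over_f16(original):.2f}% over F16"
        if imatrix is not None:
            line += f", -im +{pct_over_f16(imatrix):.2f}%"
        print(line)
```

On these figures the -im variant has the lower PPL for Q3_K, Q4_K, and Q5_K and the higher PPL for Q2_K and Q6_K.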
Others
For the Hugging Face version, please see: https://huggingface.co/hfl/chinese-llama-2-13b
Please refer to https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/ for more details.