Mozilla/Mixtral-8x22B-Instruct-v0.1-llamafile

jartine committed on Apr 25

Commit 2b85bc3 • 1 Parent(s): aad6c1e

Update README.md

Files changed (1)

README.md +2 -2

@@ -98,8 +98,8 @@ think correctly. A highly degraded quant like `Q2_K` may not make a
 great encyclopedia, but it's still capable of logical reasoning and
 the emergent capabilities LLMs exhibit.
-Good quants for reading (evaluation speed) are BF16, F16, Q4\_0, and
-Q8\_0 (ordered from fastest to slowest). Prompt evaluation is bounded by
+Good quants for reading (evaluation speed) are BF16, F16, Q8\_0, and
+Q4\_0 (ordered from fastest to slowest). Prompt evaluation is bounded by
 computation speed (flops) which means performance can be improved by
 software engineering, e.g. BLAS algorithms, in which case quantization
 starts hurting more than it helps, since it competes for CPU resources