Update README.md
README.md
CHANGED
@@ -7,7 +7,7 @@ base_model: [mattshumer/Reflection-Llama-3.1-70B]
 # This gets 99.96% perplexity at a 50 GB file size, whereas fp8 (not tested on this model) is known to be 97-98.8%


-
+Only posting one quant because it's really annoying to make these and I haven't automated it yet; it takes 30+ iterations of the model, since I have to recompile llama.cpp at every build/test step until the lowest-weight configs are found.

 >🐧 To download faster on Linux: `sudo apt install -y aria2`
 >🍎 On Mac: `brew install aria2`
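For reference, a minimal aria2 download sketch is below. The repo path and `.gguf` filename are placeholders, not the actual files in this repo; substitute the file listed under the Files tab.

```bash
# Hypothetical example: the URL and output name are placeholders, not the
# real filenames in this repo. -x/-s open up to 16 parallel connections/
# segments per download, which is what makes aria2 faster than a plain fetch.
aria2c -x 16 -s 16 \
  -o Reflection-Llama-3.1-70B-quant.gguf \
  "https://huggingface.co/<user>/<repo>/resolve/main/<file>.gguf"
```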