---
license: apache-2.0
tags:
- parameters guide
- samplers guide
- model generation
---

<h3>Maximizing Model Performance for All Quant Types and Full Precision using Samplers, Advanced Samplers and Parameters Guide</h3>

IE: Instead of using a q4KM, you might be able to run an IQ3_M and get close to ...

PRIMARY PARAMETERS:
------------------------------------------------------------------------------

--temp N

temperature (default: 0.8)

Primary factor to control the randomness of outputs. 0 = deterministic (only the most likely token is used). Higher value = more randomness.
Too much temp can affect instruction following in some cases and sometimes not e...

Newer model archs (L3, L3.1, L3.2, Mistral Nemo, Gemma2, etc.) often NEED more temp (1+) to get their best generations.
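The effect of --temp can be sketched as dividing the raw logits by the temperature before softmax; lower values sharpen the distribution, higher values flatten it. A minimal illustration (the function name is mine, not from any particular backend):

```python
import math

def softmax_with_temp(logits, temp):
    """Convert raw logits into token probabilities at a given temperature.

    temp < 1.0 sharpens the distribution (more deterministic),
    temp > 1.0 flattens it (more random); temp == 0 is treated as greedy.
    """
    if temp == 0:
        # Deterministic: all probability mass on the single most likely token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temp for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

probs_sharp = softmax_with_temp([2.0, 1.0, 0.1], 0.5)   # low temp
probs_flat = softmax_with_temp([2.0, 1.0, 0.1], 1.5)    # high temp
# The top token holds more probability mass at low temp than at high temp.
```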
--top-p N

top-p sampling (default: 0.9, 1.0 = disabled)

If not set to 1, select only the tokens whose probabilities add up to less than this number. Higher value = wider range of possible random results.

I use a default of: .95
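In other words, top-p (nucleus) sampling keeps the smallest set of most likely tokens whose cumulative probability reaches the threshold. A rough sketch of the filter (my own helper name, assuming probabilities are already normalized):

```python
def top_p_filter(probs, top_p=0.95):
    """Return the indices of tokens kept by top-p (nucleus) filtering:
    the smallest set of highest-probability tokens whose cumulative
    probability reaches top_p; all other tokens are excluded."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = set(), 0.0
    for i in order:
        kept.add(i)
        cum += probs[i]
        if cum >= top_p:
            break
    return kept

# With top_p=0.9 the low-probability tail token is cut from sampling.
survivors = top_p_filter([0.5, 0.3, 0.15, 0.05], 0.9)
```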
--min-p N

min-p sampling (default: 0.1, 0.0 = disabled)

Tokens with a probability smaller than (min_p) * (probability of the most likely token) are discarded.

I use a default of: .05
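The min-p rule above can be sketched as a cutoff relative to the top token's probability (helper name is mine for illustration):

```python
def min_p_filter(probs, min_p=0.05):
    """Return the indices of tokens kept by min-p filtering: tokens whose
    probability is below min_p times the top token's probability are
    discarded, so the cutoff scales with how confident the model is."""
    cutoff = min_p * max(probs)
    return [i for i, p in enumerate(probs) if p >= cutoff]

# Top token at 0.6 gives a cutoff of 0.05 * 0.6 = 0.03,
# so the 0.02 and 0.01 tail tokens are discarded here.
kept = min_p_filter([0.6, 0.3, 0.07, 0.02, 0.01])
```

Because the cutoff tracks the top token, min-p prunes aggressively when the model is confident and permissively when the distribution is flat.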
--top-k N

top-k sampling (default: 40, 0 = disabled)

Similar to top-p, but selects instead only the top_k most likely tokens. Higher value = wider range of possible random results.
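Top-k is the simplest of the filters: keep a fixed number of the most likely tokens regardless of their probabilities. A minimal sketch (helper name is mine):

```python
def top_k_filter(probs, top_k=40):
    """Return the indices of the top_k most likely tokens,
    in ascending index order; top_k <= 0 disables the filter."""
    if top_k <= 0:
        return list(range(len(probs)))
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return sorted(order[:top_k])

# Keeps only the two highest-probability tokens (indices 1 and 3).
best_two = top_k_filter([0.1, 0.4, 0.2, 0.3], top_k=2)
```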
Smaller quants may require STRONGER settings (all classes of models) due to comp...

This is also influenced by the parameter size of the model in relation to the quant size.

IE: an 8B model at Q2K will be far more unstable relative to a 20B model at Q2K, and as a result will require stronger settings.