DavidAU committed on
Commit 6b95a3c · verified · 1 Parent(s): 3f1e840

Update README.md

Files changed (1):
  1. README.md +27 -24

README.md CHANGED
@@ -17,7 +17,7 @@ tags:
 <h3>Maximizing Model Performance for All Quants Types And Full-Precision using Samplers, Advance Samplers and Parameters Guide</h3>
 
 This document includes detailed information, references, and notes for general parameters, samplers and
- advanced samplers to get the most out of your model's abilities including notes / settings for the most popular AI/LLM app in use.
 
 These settings / suggestions can be applied to all models including GGUF, EXL2, GPTQ, HQQ, AWQ and full source/precision.
 
@@ -40,13 +40,13 @@ The settings discussed in this document can also fix a number of model issues (<
 
 Likewise ALL the settings (parameters, samplers and advanced samplers) below can also improve model generation and/or general overall "smoothness" / "quality" of model operation:
 
- - all parameters and samplers available via LLAMACPP (and most apps that run / use LLAMACPP)
 - all parameters (including some not in Llamacpp), samplers and advanced samplers ("Dry", "Quadratic", "Mirostat") in oobabooga/text-generation-webui including the llamacpp_HF loader (allowing a lot more samplers)
- - all parameters (including some not in Lllamacpp), samplers and advanced samplers ("Dry", "Quadratic", "Microstat") in KoboldCPP (including Anti-slop filters)
 
- Even if you are not using my models, you may find this document useful for any model (any quant / full source) available online.
 
- If you are currently using model(s) that are difficult to "wrangle" then apply "Class 3" or "Class 4" settings to them.
 
 This document will be updated over time too and is subject to change without notice.
 
@@ -90,7 +90,7 @@ I do not set any other settings, parameters or have samplers activated when gene
 
 Everything else is "zeroed" / "disabled".
 
- These parameters/settings are considered both safe and default and in most cases available to all users in all apps.
 
 ---
 
@@ -106,7 +106,7 @@ You will need the config files to use "llamacpp_HF" loader ("text-generation-web
 
 You can also use the full source in "text-generation-webui" too.
 
- As an alternative you can use GGUFs directly in "KOBOLDCPP" without the "config files" and still use almost all the parameters, samplers and advanced samplers.
 
 <B>Parameters, Samplers and Advanced Samplers</B>
 
@@ -143,7 +143,9 @@ For CLASS3 and CLASS4 the most important setting is "SMOOTHING FACTOR" (Quadrati
 
 https://docs.sillytavern.app/usage/common-settings/
 
- NOTE: It appears that Silly Tavern also supports "DRY" and "XTC" too ; but it is not yet in the documentation at the time of writing.
 
 You may also want to check out how to connect SillyTavern to local AI "apps" running on your pc here:
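The hunk header above flags "SMOOTHING FACTOR" (Quadratic Sampling) as the key CLASS3/CLASS4 setting. As a rough sketch of the idea (my paraphrase of the quadratic transform from the original text-generation-webui feature, not the project's exact code): logits below the top logit are pushed down quadratically, so large factors concentrate probability on the leading tokens, while small factors (below 1) flatten the head of the distribution.

```python
import math

def softmax(logits):
    # Convert raw scores into probabilities.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def quadratic_smooth(logits, factor):
    # Rough form of quadratic sampling: keep the top logit fixed and
    # push every other logit down by factor * (distance from top)^2.
    m = max(logits)
    return [m - factor * (m - x) ** 2 for x in logits]

logits = [3.0, 2.0, 1.0, 0.0]  # hypothetical scores, sorted high to low

top_plain = softmax(logits)[0]
top_sharp = softmax(quadratic_smooth(logits, 2.0))[0]   # factor > 1: sharper
top_flat = softmax(quadratic_smooth(logits, 0.25))[0]   # factor < 1: flatter head
```

With these numbers the top token's probability rises for factor 2.0 and falls for factor 0.25, which matches the usual guidance that higher smoothing factors make output more deterministic.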
 
@@ -154,7 +156,7 @@ OTHER PROGRAMS:
 
 Other programs like https://www.LMStudio.ai allow access to most of the STANDARD samplers, whereas for others (llamacpp only here) you may need to add to the json file(s) for a model and/or template preset.
 
- In most cases all llama_cpp parameters/samplers are available when using API / headless / server mode in "text-generation-webui", "koboldcpp", "Olama", "backyard", and "lmstudio" (as well as other apps too).
 
 You can also use llama_cpp directly too (IE: llama-server.exe); see:
 
@@ -162,6 +164,12 @@ https://github.com/ggerganov/llama.cpp
 
 (scroll down on the main page for more apps/programs that use GGUFs and connect to / use the LLAMA-CPP package.)
 
 ---
 
 DETAILED NOTES ON PARAMETERS, SAMPLERS and ADVANCED SAMPLERS:
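Since llama-server (above) exposes its samplers over an HTTP API, a request is just a JSON body. A minimal sketch of a `/completion` payload follows; the field names match the llama.cpp server documentation at the time of writing, but sampler fields change between builds, so verify against your binary's `--help` output.

```python
import json

# Hypothetical payload for llama-server's /completion endpoint.
body = {
    "prompt": "Write a short scene:",
    "n_predict": 128,        # max tokens to generate
    "temperature": 0.8,
    "top_k": 40,
    "top_p": 0.95,
    "min_p": 0.05,
    "repeat_penalty": 1.1,
}

payload = json.dumps(body)
# Send with e.g.: curl http://localhost:8080/completion -d "$payload"
```

The port (8080) and prompt are placeholders; any sampler left out of the body falls back to the server's defaults.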
@@ -176,11 +184,10 @@ https://github.com/oobabooga/text-generation-webui/wiki/03-%E2%80%90-Parameters-
 
 Additional Links (on parameters, samplers and advanced samplers):
 
- DRY => https://github.com/oobabooga/text-generation-webui/pull/5677
-
- DRY => https://www.reddit.com/r/KoboldAI/comments/1e49vpt/dry_sampler_questionsthat_im_sure_most_of_us_are/
-
- DRY => https://www.reddit.com/r/KoboldAI/comments/1eo4r6q/dry_settings_questions/
 
 Samplers : https://gist.github.com/kalomaze/4473f3f975ff5e5fade06e632498f73e
 
@@ -267,7 +274,7 @@ This covers both Imatrix and regular quants.
 
 Imatrix can be applied to any quant - "Q" or "IQ" - however, IQ1s to IQ3_S REQUIRE an imatrix dataset / imatrixing process before quanting.
 
- This chart shows the order in terms of "BPW" for each quant (mapped below with relative "strength" to one another) with "IQ1_S" with the least, and "Q8_0" with the most:
 
 <small>
 <PRE>
@@ -316,11 +323,11 @@ Here are some Imatrix Neo Models:
 
 Suggestions for Imatrix NEO quants:
 
- - The LOWER the quant the STRONGER the Imatrix effect is, and therefore the stronger the horror "tint" so to speak
- - Due to the unique nature of this project, quants IQ1s to IQ4s are recommended for maximum horror effect with IQ4_XS the most balanced in terms of power and bits.
 - Secondaries are Q2s-Q4s. Imatrix effect is still strong in these quants.
 - Effects diminish quickly from Q5s and up.
- - Q8 there is no change (as the Imatrix process does not affect this quant), and therefore was not uploaded.
 
 ---
 
@@ -411,12 +418,6 @@ Please see sections below this for advanced usage, more details, settings notes
 
 </small>
 
- Special note:
-
- It appears "DRY" / "XTC" samplers has been added to LLAMACPP.
-
- It is available via "llama-server.exe". Likely this sampler will also become available "downstream" in applications that use LLAMACPP in due time.
-
 ---
 
 HOW TO TEST EACH PARAMETER(s), SAMPLER(s) and ADVANCED SAMPLER(s)
@@ -722,6 +723,8 @@ i.e. `--logit-bias 15043+1` to increase likelihood of token ' Hello', or `--log
 
 This may or may not be available. This requires a bit more work.
 
 IN "oobabooga/text-generation-webui" there is "TOKEN BANNING":
 
 This is a very powerful pruning method, which can drastically alter output generation.
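The `--logit-bias` flag in the hunk header above adds a flat offset to one token's raw score before the softmax. A small illustration of the arithmetic over a hypothetical 4-token vocabulary (plain Python, not llama.cpp's code):

```python
import math

def softmax(logits):
    # Raw scores -> probabilities; -inf entries get probability 0.
    m = max(x for x in logits if x != float("-inf"))
    exps = [math.exp(x - m) if x != float("-inf") else 0.0 for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5, 0.1]  # hypothetical scores; say token 0 is ' Hello'

p_base = softmax(logits)[0]

# "+1" bias on token 0, as in `--logit-bias 15043+1`: its share goes up.
p_boost = softmax([logits[0] + 1.0] + logits[1:])[0]

# A bias of negative infinity removes the token from consideration entirely.
p_banned = softmax([float("-inf")] + logits[1:])[0]
```

The same mechanism underlies token banning: push a token's score low enough and it can never be sampled.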
 
 <h3>Maximizing Model Performance for All Quants Types And Full-Precision using Samplers, Advance Samplers and Parameters Guide</h3>
 
 This document includes detailed information, references, and notes for general parameters, samplers and
+ advanced samplers to get the most out of your model's abilities, including notes / settings for the most popular AI/LLM apps in use (LLAMACPP, KoboldCPP, Text-Generation-WebUI, LMStudio, SillyTavern, Ollama and others).
 
 These settings / suggestions can be applied to all models including GGUF, EXL2, GPTQ, HQQ, AWQ and full source/precision.
 
 
 Likewise ALL the settings (parameters, samplers and advanced samplers) below can also improve model generation and/or general overall "smoothness" / "quality" of model operation:
 
+ - all parameters and samplers available via LLAMACPP (and most apps that run / use LLAMACPP - including LMStudio, Ollama, SillyTavern and others)
 - all parameters (including some not in Llamacpp), samplers and advanced samplers ("Dry", "Quadratic", "Mirostat") in oobabooga/text-generation-webui including the llamacpp_HF loader (allowing a lot more samplers)
+ - all parameters (including some not in Llamacpp), samplers and advanced samplers ("Dry", "Quadratic", "Mirostat") in SillyTavern / KoboldCPP (including Anti-slop filters)
 
+ Even if you are not using my models, you may find this document <u>useful for any model (any quant / full source / any repo) available online.</u>
 
+ If you are currently using model(s) - from my repo and/or others - that are difficult to "wrangle", then you can apply "Class 3" or "Class 4" settings to them.
 
 This document will be updated over time too and is subject to change without notice.
 
 
 Everything else is "zeroed" / "disabled".
 
+ These parameters/settings are considered both safe and default and in most cases available to all users in all AI/LLM apps.
 
 ---
 
 
 You can also use the full source in "text-generation-webui" too.
 
+ As an alternative you can use GGUFs directly in "KOBOLDCPP" / "SillyTavern" without the "config files" and still use almost all the parameters, samplers and advanced samplers.
 
 <B>Parameters, Samplers and Advanced Samplers</B>
 
 
 https://docs.sillytavern.app/usage/common-settings/
 
+ NOTE:
+
+ It appears that SillyTavern also supports "DRY" and "XTC" too; but it is not yet in the documentation at the time of writing.
 
 You may also want to check out how to connect SillyTavern to local AI "apps" running on your pc here:
 
 
 Other programs like https://www.LMStudio.ai allow access to most of the STANDARD samplers, whereas for others (llamacpp only here) you may need to add to the json file(s) for a model and/or template preset.
 
+ In most cases all llama_cpp parameters/samplers are available when using API / headless / server mode in "text-generation-webui", "koboldcpp", "SillyTavern", "Ollama", "Backyard", and "LMStudio" (as well as other apps too).
 
 You can also use llama_cpp directly too (IE: llama-server.exe); see:
 
 
 (scroll down on the main page for more apps/programs that use GGUFs and connect to / use the LLAMA-CPP package.)
 
+ Special note:
+
+ It appears the "DRY" / "XTC" samplers have been added to LLAMACPP and SILLYTAVERN.
+
+ They are available via "llama-server.exe". Likely these samplers will also become available "downstream" in applications that use LLAMACPP in due time.
 
 ---
 
 DETAILED NOTES ON PARAMETERS, SAMPLERS and ADVANCED SAMPLERS:
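For the XTC ("Exclude Top Choices") sampler noted above, the core idea, per its author's description, is: with some probability per step, drop every candidate above a probability threshold except the least likely of them, forcing more varied picks. A simplified sketch (real implementations renormalize and handle ordering and ties more carefully):

```python
import random

def xtc(probs, threshold=0.1, probability=0.5, rng=random.random):
    # `probs` is assumed sorted from most to least likely.
    if rng() >= probability:          # the sampler only fires part of the time
        return probs
    above = [i for i, p in enumerate(probs) if p >= threshold]
    if len(above) < 2:                # nothing to exclude
        return probs
    keep = above[-1]                  # least likely candidate above threshold survives
    return [0.0 if (i in above and i != keep) else p
            for i, p in enumerate(probs)]

probs = [0.5, 0.3, 0.15, 0.05]  # hypothetical candidate distribution
culled = xtc(probs, threshold=0.1, probability=1.0, rng=lambda: 0.0)
```

The threshold and probability defaults here are placeholders, not recommended values; the `rng` parameter is injected only so the behavior is reproducible.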
 
 Additional Links (on parameters, samplers and advanced samplers):
 
+ DRY
+ - https://github.com/oobabooga/text-generation-webui/pull/5677
+ - https://www.reddit.com/r/KoboldAI/comments/1e49vpt/dry_sampler_questionsthat_im_sure_most_of_us_are/
+ - https://www.reddit.com/r/KoboldAI/comments/1eo4r6q/dry_settings_questions/
 
 Samplers : https://gist.github.com/kalomaze/4473f3f975ff5e5fade06e632498f73e
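The DRY sampler linked above penalizes tokens that would extend a sequence already present in the context. A rough sketch of the penalty curve described in the pull request, where the penalty grows exponentially once a would-be repeat reaches the allowed length (the default values shown are commonly suggested starting points, not canonical):

```python
def dry_penalty(match_len, multiplier=0.8, base=1.75, allowed_length=2):
    # No penalty until a would-be repeat reaches `allowed_length` tokens;
    # beyond that it grows as multiplier * base^(match_len - allowed_length).
    if match_len < allowed_length:
        return 0.0
    return multiplier * base ** (match_len - allowed_length)

# Penalty for repeats of length 1 through 5:
curve = [dry_penalty(n) for n in range(1, 6)]
```

The penalty is subtracted from the candidate token's logit, so longer verbatim repeats become sharply less likely while short, natural reuse of words is untouched.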
 
 
 Imatrix can be applied to any quant - "Q" or "IQ" - however, IQ1s to IQ3_S REQUIRE an imatrix dataset / imatrixing process before quanting.
 
+ This chart shows the order in terms of "BPW" for each quant (mapped below with relative "strength" to one another), with "IQ1_S" having the least and "Q8_0" (F16 is full precision) the most:
 
 <small>
 <PRE>
 
 Suggestions for Imatrix NEO quants:
 
+ - The LOWER the quant the STRONGER the Imatrix effect is, and therefore the stronger the "tint" so to speak
+ - Due to the unique nature of this project, quants IQ1s to IQ4s are recommended for maximum effect, with IQ4_XS the most balanced in terms of power and bits.
 - Secondaries are Q2s-Q4s. Imatrix effect is still strong in these quants.
 - Effects diminish quickly from Q5s and up.
+ - At Q8/F16 there is no change (as the Imatrix process does not affect these), and therefore they are not included.
 
 ---
 
 
 </small>
 
 ---
 
 HOW TO TEST EACH PARAMETER(s), SAMPLER(s) and ADVANCED SAMPLER(s)
 
 This may or may not be available. This requires a bit more work.
 
+ Note: the +/- bias range is 0 to 100.
 
 IN "oobabooga/text-generation-webui" there is "TOKEN BANNING":
 
 This is a very powerful pruning method, which can drastically alter output generation.
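Token banning, unlike a graded logit bias, prunes candidates outright before any sampler runs, so a banned token can never appear in the output. A toy sketch with a hypothetical vocabulary and scores (illustrative only, not the webui's implementation):

```python
# Hypothetical tiny vocabulary with raw scores for the next token.
scores = {"the": 1.2, "realm": 2.0, "shivers": 0.4, "whispers": 1.9}

def pick_greedy(scores, banned=()):
    # Drop banned tokens from the candidate set, then take the
    # highest-scoring survivor (greedy decoding for simplicity).
    allowed = {tok: s for tok, s in scores.items() if tok not in banned}
    return max(allowed, key=allowed.get)

unbanned_pick = pick_greedy(scores)
banned_pick = pick_greedy(scores, banned={"realm"})
```

Banning the top candidate re-routes generation to the next-best token, which is why the document calls this a powerful (and potentially drastic) control.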