Lewdiculous committed
Commit: d460a1a
Parent: ff0ad29

att --quantkv mention

Files changed (1): README.md (+11 −5)
README.md CHANGED
```diff
@@ -1,8 +1,11 @@
 ---
+base_model: NeverSleep/Lumimaid-v0.2-8B
+quantized_by: Lewdiculous
+library_name: transformers
 license: cc-by-nc-4.0
+inference: false
 language:
 - en
-inference: false
 tags:
 - roleplay
 - llama3
@@ -23,13 +26,16 @@ I recommend checking their page for feedback and support.
 > If you noticed any issues let me know in the discussions.
 
 > [!NOTE]
-> **General usage:** <br>
-> For **8GB VRAM** GPUs, I recommend the **Q4_K_M-imat** (4.89 BPW) quant for up to 12288 context sizes. <br>
->
 > **Presets:** <br>
 > Some compatible SillyTavern presets can be found [**here (Virt's Roleplay Presets - v1.9)**](https://huggingface.co/Virt-io/SillyTavern-Presets). <br>
+> Check [**discussions such as this one**](https://huggingface.co/Virt-io/SillyTavern-Presets/discussions/5#664d6fb87c563d4d95151baa) and [**this one**](https://www.reddit.com/r/SillyTavernAI/comments/1dff2tl/my_personal_llama3_stheno_presets/) for other presets and samplers recommendations. <br>
 > Lower temperatures are recommended by the authors, so make sure to experiment. <br>
-> Check [**discussions such as this one**](https://huggingface.co/Virt-io/SillyTavern-Presets/discussions/5#664d6fb87c563d4d95151baa) and [**this one**](https://www.reddit.com/r/SillyTavernAI/comments/1dff2tl/my_personal_llama3_stheno_presets/) for other presets and samplers recommendations.
+>
+> **General usage with KoboldCpp:** <br>
+> For **8GB VRAM** GPUs, I recommend the **Q4_K_M-imat** (4.89 BPW) quant for up to 12288 context sizes without the use of `--quantkv`. <br>
+> Using `--quantkv 1` (≈Q8) or even `--quantkv 2` (≈Q4) can get you to 32K context sizes with the caveat of not being compatible with Context Shifting, only relevant if you can manage to fill up that much context. <br>
+> [**Read more about it in the release here**](https://github.com/LostRuins/koboldcpp/releases/tag/v1.67).
+
 
 <details>
 <summary>⇲ Click here to expand/hide information – General chart with relative quant parformances.</summary>
```
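For reference, the `--quantkv` advice added in this commit translates to a KoboldCpp launch along these lines. The model filename is illustrative, not from the commit; note that in KoboldCpp, `--quantkv` requires `--flashattention` to be enabled:

```shell
# Illustrative KoboldCpp launch: Q4_K_M imatrix quant with a ~Q8-quantized KV cache.
# --quantkv requires --flashattention, and disables Context Shifting.
./koboldcpp --model Lumimaid-v0.2-8B-Q4_K_M-imat.gguf \
  --contextsize 32768 --flashattention --quantkv 1
```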
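The 8GB-VRAM recommendation can be sanity-checked with back-of-the-envelope arithmetic on bits per weight; the sketch below assumes an ~8-billion-parameter model (the Llama-3-8B class this quant targets):

```python
def quant_size_gb(n_params: float, bpw: float) -> float:
    """Approximate weight size in GB: parameters * bits-per-weight / 8 bits per byte."""
    return n_params * bpw / 8 / 1e9

# Q4_K_M-imat at 4.89 BPW on an ~8B-parameter model:
weights_gb = quant_size_gb(8e9, 4.89)  # ≈ 4.9 GB of weights, leaving room
# on an 8GB GPU for the KV cache and compute buffers at 12288 context.
```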