This model uses a context window of 8k. I recommend using it with the Mistral Instruct chat template (works perfectly with LM Studio).
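For reference, the Mistral Instruct format wraps each user turn in `[INST]` tags. A minimal sketch of the single-turn shape (the chat template bundled with the tokenizer is authoritative; this helper is only illustrative):

```python
# Minimal sketch of the Mistral Instruct prompt format.
# The tokenizer's bundled chat template is authoritative; this helper
# only illustrates the single-turn shape.
def mistral_instruct_prompt(user_message: str) -> str:
    """Wrap a single user turn in the Mistral Instruct [INST] tags."""
    return f"<s>[INST] {user_message} [/INST]"

print(mistral_instruct_prompt("Write a short story about a knight."))
# <s>[INST] Write a short story about a knight. [/INST]
```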
If you use SillyTavern, you might want to tweak the inference parameters. Here's what LM Studio uses as a reference: `temp` 0.8, `top_k` 40, `top_p` 0.95, `min_p` 0.05, `repeat_penalty` 1.1.
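Collected as keyword arguments, those settings can be passed straight to a generation call. The keyword names below match llama-cpp-python's `Llama.__call__` (an assumption about your inference stack; adapt the names if your backend differs):

```python
# The reference sampling settings above, collected once so every
# generation call stays consistent. Keyword names follow
# llama-cpp-python's Llama.__call__; other backends may differ.
SAMPLING_PARAMS = {
    "temperature": 0.8,     # "temp" in LM Studio
    "top_k": 40,
    "top_p": 0.95,
    "min_p": 0.05,
    "repeat_penalty": 1.1,
}

# Usage sketch (needs a local GGUF file, so not executed here;
# the file name is an assumption — pick any quant from the GGUF repo):
# from llama_cpp import Llama
# llm = Llama(model_path="alphamonarch-7b.Q4_K_M.gguf", n_ctx=8192)  # 8k context
# out = llm("[INST] Hello! [/INST]", max_tokens=256, **SAMPLING_PARAMS)
```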
It is one of the very best 7B models in terms of instruction following and reasoning abilities and can be used for conversations, RP, and storytelling. Note that it tends to have a quite formal and sophisticated style, but this can be changed by modifying the prompt.
## ⚡ Quantized models
Thanks to [LoneStriker](https://huggingface.co/LoneStriker) for the GPTQ, AWQ, and EXL2 quants.
* **GGUF**: https://huggingface.co/mlabonne/AlphaMonarch-7B-GGUF
* **GPTQ**: https://huggingface.co/LoneStriker/AlphaMonarch-7B-GPTQ
* **AWQ**: https://huggingface.co/LoneStriker/AlphaMonarch-7B-AWQ