---
license: cc-by-nc-4.0
tags:
- not-for-all-audiences
- nsfw
---
<!-- description start -->
Exllamav2 4.65bpw quantization of CausalLM-RP-34B from [NeverSleep](https://huggingface.co/NeverSleep/CausalLM-RP-34B), quantized with the default calibration dataset.

> [!IMPORTANT]
> This bpw is the perfect size for 24GB GPUs and can fit 32k+ context. Make sure to enable the 4-bit cache option, or you will run into OOM errors.

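For instance, with a TabbyAPI/exllamav2 backend, the 4-bit KV cache is usually selected through the cache-mode setting. The fragment below is an illustrative sketch of a `config.yml`, not taken from this card; key names and accepted values may differ across TabbyAPI versions, so check your installation's documentation.

```yaml
# Illustrative TabbyAPI config.yml fragment (assumed keys, verify against your version)
model:
  model_name: CausalLM-RP-34B-exl2-4.65bpw  # hypothetical local folder name
  max_seq_len: 32768                        # 32k context, as the note above suggests
  cache_mode: Q4                            # 4-bit KV cache to avoid OOM on 24GB GPUs
```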
---
## Original Card
## Description

This repo contains fp16 files of CausalLM-RP-34B, a finetune of CausalLM-34B Beta on multiple RP datasets.

<!-- description end -->
## Model used

- [CausalLM/34b-beta](https://huggingface.co/CausalLM/34b-beta)

### Prompt template: ChatML

```
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
{output}<|im_end|>
```
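As a minimal sketch, the ChatML template above can be assembled with a small helper. The function name is hypothetical (not part of the card); only the `<|im_start|>`/`<|im_end|>` tokens come from the template. Note that for generation the prompt ends with an open assistant turn, and `{output}` is what the model produces.

```python
def format_chatml(system_prompt: str, prompt: str) -> str:
    """Build a ChatML prompt per the template above, leaving the
    assistant turn open for the model to complete."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )
```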