ExLlamaV2 quant (exl2 / 8.0 bpw) made with ExLlamaV2 v0.1.1

Other EXL2 quants:

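| Quant | Model Size | lm_head |
|-------|------------|---------|
| 2.2   | 3588 MB    | 6       |
| 2.5   | 3990 MB    | 6       |
| 3.0   | 4718 MB    | 6       |
| 3.5   | 5443 MB    | 6       |
| 3.75  | 5809 MB    | 6       |
| 4.0   | 6166 MB    | 6       |
| 4.25  | 6537 MB    | 6       |
| 5.0   | 7625 MB    | 6       |
| 6.0   | 9111 MB    | 8       |
| 6.5   | 9831 MB    | 8       |
| 8.0   | 11277 MB   | 8       |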
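To choose between these quants, one rough approach is to pick the largest one whose file size fits in your VRAM after reserving some headroom for the cache and activations. The sketch below uses the sizes listed above; the helper name `largest_quant_that_fits` and the 2048 MB default headroom are illustrative assumptions, not measured requirements, and actual memory use depends on context length and backend.

```python
from typing import Optional

# (bits per weight, model size in MB), copied from the list of EXL2 quants above
EXL2_QUANTS = [
    (2.2, 3588), (2.5, 3990), (3.0, 4718), (3.5, 5443), (3.75, 5809),
    (4.0, 6166), (4.25, 6537), (5.0, 7625), (6.0, 9111), (6.5, 9831),
    (8.0, 11277),
]


def largest_quant_that_fits(vram_mb: int, headroom_mb: int = 2048) -> Optional[float]:
    """Return the highest bpw whose size fits in vram_mb minus headroom_mb.

    headroom_mb is an assumed reserve for cache and activations;
    tune it for your context length. Returns None if nothing fits.
    """
    budget = vram_mb - headroom_mb
    fitting = [bpw for bpw, size_mb in EXL2_QUANTS if size_mb <= budget]
    return max(fitting) if fitting else None


# Example: candidate quants for a few common VRAM sizes
for vram in (8192, 12288, 24576):
    print(vram, "MB ->", largest_quant_that_fits(vram), "bpw")
```

Treat the result as a starting point: long contexts (this model advertises 128k) can need far more headroom than the default used here.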

LuminRP-13B-128k-v0.5

LuminRP-13B-128k-v0.5 is the 13B-parameter version of the LuminRP-7B v0.5 model. It specializes in RP/ERP and was built by merging several models that excel at it.


  • This model can and will output X-rated content.

SillyTavern

Template: Alpaca, ChatML, and Mistral should be okay.

Instruct Mode: On
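For reference, here is a minimal sketch of what a single user turn looks like in each of the three templates named above. These are the commonly used forms of Alpaca, ChatML, and Mistral instruct prompts; the function names are hypothetical, and SillyTavern's own presets may differ in whitespace, system prompt, or stop strings.

```python
def alpaca_prompt(user_message: str) -> str:
    """Single-turn prompt in the common Alpaca instruct layout."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{user_message}\n\n### Response:\n"
    )


def chatml_prompt(user_message: str) -> str:
    """Single-turn prompt in ChatML layout, ending at the assistant turn."""
    return (
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def mistral_prompt(user_message: str) -> str:
    """Single-turn prompt in the Mistral [INST] layout."""
    return f"[INST] {user_message} [/INST]"


# Print each variant so the differences are easy to compare
for build in (alpaca_prompt, chatml_prompt, mistral_prompt):
    print(f"--- {build.__name__} ---")
    print(build("Introduce yourself in character."))
```

If responses look cut off or bleed into the next turn, the template in use may not match what the merged models respond to best, so it is worth trying the other two.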


💻 Usage

# Notebook cell: install dependencies (from a shell, drop the leading "!")
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "Ppoyaa/LuminRP-13B-128k-v0.5"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the conversation with the model's own chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in half precision and let accelerate place it on the available devices
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample up to 256 new tokens and print the prompt plus completion
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])

Merge Details Below:

See Merge Config
ChaoticNeutrals/BuRP_7B
Endevor/InfinityRP-v1-7B
Nitral-AI/Kunocchini-7b-128k-test
core-3/kuno-royale-v2-7b
KatyTheCutie/LemonadeRP-4.5.3
grimjim/kukulemon-7B
MaziyarPanahi/Calme-7B-Instruct-v0.9
icefog72/WestIceLemonTeaRP-32k-7b
crestf411/daybreak-kunoichi-2dpo-7b
Undi95/Mistral-RP-0.1-7B
