This is a sharded fp16 version of Tap-M/Luna-AI-Llama2-Uncensored-FP16

Model Description

“Luna AI Llama2 Uncensored” is a Llama2-based chat model
fine-tuned on over 40,000 long-form chat discussions.
This model was fine-tuned by Tap, the creator of Luna AI.

Model Training

The fine-tuning process was performed on an 8x A100 80GB machine.
The model was trained on synthetic outputs that include multi-round chats between Human and AI.

4-bit GPTQ version provided by @TheBloke, for GPU inference
GGML version provided by @TheBloke, for CPU inference

Prompt Format

The model follows the Vicuna 1.1/OpenChat format:

USER: I have difficulties in making friends, and I really need someone to talk to. Would you be my friend?

ASSISTANT: Of course! Friends are always here for each other. What do you like to do?
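
The format above can be assembled programmatically. The following is a minimal sketch, not the author's own tooling: the helper name build_prompt, the history argument, and the single-newline separator between turns are assumptions made for illustration. The only parts taken from this card are the "USER:" and "ASSISTANT:" role prefixes.

```python
from typing import List, Optional, Tuple


def build_prompt(
    user_message: str,
    history: Optional[List[Tuple[str, str]]] = None,
    sep: str = "\n",  # assumption: the card does not state the exact turn separator
) -> str:
    """Assemble a USER:/ASSISTANT: prompt (hypothetical helper, not part of any library)."""
    lines = []
    # Replay earlier (user, assistant) turns in order.
    for past_user, past_assistant in history or []:
        lines.append(f"USER: {past_user}")
        lines.append(f"ASSISTANT: {past_assistant}")
    # Add the new user turn and leave the assistant turn open for generation.
    lines.append(f"USER: {user_message}")
    lines.append("ASSISTANT:")
    return sep.join(lines)


if __name__ == "__main__":
    print(build_prompt("Would you be my friend?"))
```

The returned string ends with an open "ASSISTANT:" turn, so the model's generated text is read as the assistant's reply. If the model responds better with blank lines between turns, pass sep="\n\n".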

Benchmark Results

Task           Version  Metric    Value    Stderr
arc_challenge  0        acc_norm  0.5512   0.0146
hellaswag      0
mmlu           1        acc_norm  0.46521  0.036
truthfulqa_mc  1        mc2       0.4716   0.0155
Average        -        -         0.5114   0.0150