TheBloke committed
Commit 975b392 · 1 Parent(s): 26a4a74

Upload README.md

Files changed (1): README.md (+67 -2)
README.md CHANGED
@@ -1,11 +1,20 @@
  ---
+ datasets:
+ - PygmalionAI/PIPPA
  inference: false
+ language:
+ - en
  license: llama2
  model_creator: PygmalionAI
  model_link: https://huggingface.co/PygmalionAI/pygmalion-2-7b
  model_name: Pygmalion 2 7B
  model_type: llama
+ pipeline_tag: text-generation
  quantized_by: TheBloke
+ tags:
+ - text generation
+ - instruct
+ thumbnail: null
  ---

  <!-- header start -->
@@ -67,6 +76,15 @@ The model has been trained on prompts using three different roles, which are den
  The `<|system|>` prompt can be used to inject out-of-channel information behind the scenes, while the `<|user|>` prompt should be used to indicate user input.
  The `<|model|>` token should then be used to indicate that the model should generate a response. These tokens can appear multiple times and be chained to form a conversation history.

+ The system prompt has been designed to allow the model to "enter" various modes and dictate the reply length. Here's an example:
+
+ ```
+ <|system|>Enter RP mode. Pretend to be {{char}} whose persona follows:
+ {{persona}}
+
+ You shall reply to the user while staying in character, and generate long responses.
+ ```
+

  <!-- prompt-template end -->
  <!-- compatibility_gguf start -->
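
To make the prompt template in this hunk concrete, here is a minimal Python sketch of how the three role tokens chain into a conversation history. The `build_prompt` helper is hypothetical and for illustration only; it is not part of this README or of any Pygmalion tooling:

```
# Hypothetical helper illustrating the prompt template described above.
def build_prompt(system, history, user_msg):
    """Chain <|system|>, <|user|> and <|model|> tokens into one prompt string."""
    prompt = f"<|system|>{system}"
    for user_turn, model_turn in history:  # completed exchanges, oldest first
        prompt += f"<|user|>{user_turn}<|model|>{model_turn}"
    # A trailing <|model|> token asks the model to generate the next reply.
    prompt += f"<|user|>{user_msg}<|model|>"
    return prompt

print(build_prompt(
    "Enter RP mode. Pretend to be {{char}} whose persona follows:\n{{persona}}",
    [("Hello!", "Well met, traveller.")],
    "What brings you to these parts?",
))
```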
@@ -123,7 +141,7 @@ Make sure you are using `llama.cpp` from commit [6381d4e110bd0ec02843a60bbeb8b6f
  For compatibility with older versions of llama.cpp, or for any third-party libraries or clients that haven't yet updated for GGUF, please use GGML files instead.

  ```
- ./main -t 10 -ngl 32 -m pygmalion-2-7b.q4_K_M.gguf --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "<|system|>Enter RP mode. Pretend to be {{char}} whose persona follows:\n{{persona}}"
+ ./main -t 10 -ngl 32 -m pygmalion-2-7b.q4_K_M.gguf --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "<|system|>Enter RP mode. Pretend to be {{char}} whose persona follows:\n{{persona}}\n\nYou shall reply to the user while staying in character, and generate long responses."
  ```
  Change `-t 10` to the number of physical CPU cores you have. For example, if your system has 8 cores/16 threads, use `-t 8`. If offloading all layers to GPU, set `-t 1`.
 
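The same invocation can also be sketched from Python using the llama-cpp-python bindings. This is an assumption for illustration only, as that library is not mentioned in this README; install it with `pip install llama-cpp-python`. The parameters mirror the `./main` flags above:

```
# Sketch using llama-cpp-python (assumed available); mirrors the ./main flags above.
from llama_cpp import Llama

llm = Llama(
    model_path="pygmalion-2-7b.q4_K_M.gguf",  # path to the downloaded GGUF file
    n_ctx=4096,       # context length, as with -c 4096
    n_gpu_layers=32,  # layers offloaded to GPU, as with -ngl 32 (0 for CPU-only)
)

prompt = (
    "<|system|>Enter RP mode. Pretend to be {{char}} whose persona follows:\n"
    "{{persona}}\n\n"
    "You shall reply to the user while staying in character, and generate long responses."
    "<|user|>Hello!<|model|>"
)

out = llm(prompt, max_tokens=256, temperature=0.7, repeat_penalty=1.1, stop=["<|user|>"])
print(out["choices"][0]["text"])
```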
@@ -213,6 +231,53 @@ And thank you again to a16z for their generous grant.
  <!-- original-model-card start -->
  # Original model card: PygmalionAI's Pygmalion 2 7B

- No original model card was available.
+ <h1 style="text-align: center">Pygmalion-2 7B</h1>
+ <h2 style="text-align: center">An instruction-tuned Llama-2 biased towards fiction writing and conversation.</h2>
+
+ ## Model Details
+
+ The long-awaited release of our new models based on Llama-2 is finally here. Pygmalion-2 7B (formerly known as Metharme) is based on
+ [Llama-2 7B](https://huggingface.co/meta-llama/llama-2-7b-hf) released by Meta AI.
+
+ The Metharme models were an experiment to try to get a model that is usable for conversation, roleplaying and storywriting,
+ but which can be guided using natural language like other instruct models. After much deliberation, we reached the conclusion
+ that the Metharme prompting format is superior to (and easier to use than) the classic Pygmalion format.
+
+ This model was trained by doing supervised fine-tuning over a mixture of regular instruction data alongside roleplay, fictional stories
+ and conversations with synthetically generated instructions attached.
+
+ This model is freely available for both commercial and non-commercial use, as per the Llama-2 license.
+
+
+ ## Prompting
+
+ The model has been trained on prompts using three different roles, which are denoted by the following tokens: `<|system|>`, `<|user|>` and `<|model|>`.
+
+ The `<|system|>` prompt can be used to inject out-of-channel information behind the scenes, while the `<|user|>` prompt should be used to indicate user input.
+ The `<|model|>` token should then be used to indicate that the model should generate a response. These tokens can appear multiple times and be chained to
+ form a conversation history.
+
+ ### Prompting example
+
+ The system prompt has been designed to allow the model to "enter" various modes and dictate the reply length. Here's an example:
+
+ ```
+ <|system|>Enter RP mode. Pretend to be {{char}} whose persona follows:
+ {{persona}}
+
+ You shall reply to the user while staying in character, and generate long responses.
+ ```
+
+ ## Dataset
+
+ The dataset used to fine-tune this model includes our own [PIPPA](https://huggingface.co/datasets/PygmalionAI/PIPPA), along with several other instruction
+ datasets, and datasets acquired from various RP forums.
+
+ ## Limitations and biases
+
+ The intended use case for this model is fictional writing for entertainment purposes. Any other sort of usage is out of scope.
+
+ As such, it was **not** fine-tuned to be safe and harmless: the base model _and_ this fine-tune have been trained on data known to contain profanity and texts that
+ are lewd or otherwise offensive. It may produce socially unacceptable or undesirable text, even if the prompt itself does not include anything explicitly offensive.
+ Outputs might often be factually wrong or misleading.

  <!-- original-model-card end -->