TehVenom committed
Commit eb010f1 · 1 Parent(s): 6f24e8f

Update README.md

Files changed (1):
  1. README.md +81 -5

README.md CHANGED
@@ -1,10 +1,86 @@
- #TODO card. Mix of (GPT-J-6B-Janeway + PPO_HH_GPT-J) + Pygmalion-6b
-
- At a ratio of
-
-
- GPT-J-6B-Janeway - 20%
-
- PPO_HH_GPT-J - 20%
-
- Pygmalion-6b - 60%
+ ---
+ language: en
+ license: apache-2.0
+ commercial: 'no'
+ inference: false
+ ---
+ # GPT-J 6B - PPO_Pygway Mix
+ ## Model description
+ This is a merged model, created with an averaged-weights strategy at a (20:20:60) ratio between the models:
+
+ - [20%] - KoboldAI/GPT-J-6B-Janeway: https://huggingface.co/KoboldAI/GPT-J-6B-Janeway
+ - [20%] - reciprocate/ppo_hh_gpt-j: https://huggingface.co/reciprocate/ppo_hh_gpt-j
+ - [60%] - Pygmalion/Pygmalion-6b: https://huggingface.co/Pygmalion/Pygmalion-6b
+
+ By their respective authors.
+
+ **Warning: Pygmalion may generate NSFW or inappropriate content, as it was trained on general user logs and internet archives.**
+
+ ### Intended Use:
+
+ Research purposes only; intended for responsible use.
+ Express a conversation in natural language, and PPO_Pygway will pick up on the conversational format.
+ Try starting with a two-line prompt such as:
+ ```
+ Bot: "Hello, how are you?"
+ You: "I am doing just fine, thank you."
+ ```
+ Or any other topic, and the model will carry on in this back-and-forth style.
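
As an illustration, here is a minimal sketch of loading the merged checkpoint and continuing such a prompt with the `transformers` library. The repo id below is a placeholder assumption (substitute the actual path this card is hosted under), and the sampling settings are arbitrary:

```python
# Minimal sketch: prompt the merged model in the Bot:/You: format.
# NOTE: "TehVenom/PPO_Pygway-6b" is a hypothetical repo id, not confirmed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TehVenom/PPO_Pygway-6b"  # assumption: replace with the real path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

prompt = (
    'Bot: "Hello, how are you?"\n'
    'You: "I am doing just fine, thank you."\n'
    'Bot:'
)
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    top_p=0.9,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,  # GPT-J has no pad token by default
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```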
+ ## Information:
+ For more details, check out the related source models, especially [Pygmalion-6b](https://huggingface.co/Pygmalion/Pygmalion-6b), for more information on how to utilize the chat-bot formatting it expects.
+
+ In a similar manner to fine-tuning, merging weights does not add information but transforms it; it is therefore important to consider the trade-offs.
+ PPO_Pygway combines `ppo_hh_gpt-j`, `Janeway-6b` and `Pygmalion-6b`; all three models were blended in a two-step process using a simple weighted-parameter method:
+ ```
+ (X*A + Y*B)
+ ```
+ with X and Y being the model weights, and A and B being how strongly each is represented within the final value. (For example, a 50:50 blend of the first two models followed by a 40:60 blend of that result with Pygmalion-6b yields the overall 20:20:60 ratio.)
+ The intent of this is to elevate the end model by borrowing the strongly represented aspects of each base model.
+
+ The blend was done in FP32 and the output saved in FP16 for reduced storage needs.
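
For concreteness, here is a minimal sketch of what such a two-step FP32 merge could look like in PyTorch. The step ordering (50:50, then 40:60) and the output path are assumptions for illustration; this is not the actual script (which is credited to Concedo below):

```python
# Hypothetical sketch of a two-step averaged-weights merge of GPT-J models.
# Repo ids come from this card; the step ordering and paths are assumptions.
import torch
from transformers import AutoModelForCausalLM

def weighted_merge(state_a, state_b, a, b):
    """Blend two state dicts with scalar coefficients, computed in FP32."""
    return {k: a * state_a[k].float() + b * state_b[k].float() for k in state_a}

# Step 1: 50:50 blend of the two 20% components.
janeway = AutoModelForCausalLM.from_pretrained("KoboldAI/GPT-J-6B-Janeway").state_dict()
ppo_hh = AutoModelForCausalLM.from_pretrained("reciprocate/ppo_hh_gpt-j").state_dict()
step1 = weighted_merge(janeway, ppo_hh, 0.5, 0.5)
del janeway, ppo_hh  # each FP32 6B checkpoint is ~24 GB; free what we can

# Step 2: 40:60 blend with Pygmalion-6b -> overall 20:20:60 ratio.
pygmalion = AutoModelForCausalLM.from_pretrained("Pygmalion/Pygmalion-6b")
final = weighted_merge(step1, pygmalion.state_dict(), 0.4, 0.6)

# Load the merged FP32 weights, then cast to FP16 for reduced storage.
pygmalion.load_state_dict(final)
pygmalion.half()
pygmalion.save_pretrained("PPO_Pygway-6b")  # hypothetical output directory
```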
+
+ ## Limitations and biases
+ Based on known problems with NLP technology, potential relevant factors include bias (gender, profession, race and religion).
+ **Warning: This model has a very strong NSFW bias!**
+
+ ### License
+ GPT-J-6b is licensed by EleutherAI under the apache-2.0 license.
+
+ ### BibTeX entry and citation info
+ ```
+ @misc{gpt-j,
+   author = {Wang, Ben and Komatsuzaki, Aran},
+   title = {{GPT-J-6B: A 6 Billion Parameter Autoregressive Language Model}},
+   howpublished = {\url{https://github.com/kingoflolz/mesh-transformer-jax}},
+   year = 2021,
+   month = May
+ }
+ ```
+
+ ### Credits To:
+
+ Models involved:
+ - https://huggingface.co/EleutherAI/gpt-j-6B
+ - https://huggingface.co/Pygmalion/Pygmalion-6b
+ - https://huggingface.co/reciprocate/ppo_hh_gpt-j
+ - https://huggingface.co/KoboldAI/GPT-J-6B-Janeway
+
+ Averaged-weights merging script credit to Concedo:
+ - https://huggingface.co/concedo
+
+ ### Related datasets and articles:
+
+ PPO_HH-GPT-J-6b was trained with Proximal Policy Optimization on a variant of the Helpful and Harmless (HH) assistant-themed dataset; the specific datasets used are unknown, but the datasets listed on the repo include:
+ - https://huggingface.co/datasets/reciprocate/summarize_eval_ilql
+ - https://huggingface.co/datasets/reciprocate/hh_eval_ilql
+
+ PPO explained:
+ - https://paperswithcode.com/method/ppo
+
+ Potential HH-type datasets utilized:
+ - https://huggingface.co/HuggingFaceH4
+ - https://huggingface.co/datasets/Anthropic/hh-rlhf