Update README.md
README.md
@@ -6,7 +6,7 @@ inference: false
 ---
 # GPT-J 6B - PPO_Pygway Mix
 ## Model description
-This is a a merged model, using an
+This is a merged model, using a weighted parameter blend strategy at a (20:20:60) ratio between the models:
 
 - [20%] - KoboldAI/GPT-J-6B-Janeway: https://huggingface.co/KoboldAI/GPT-J-6B-Janeway
 - [20%] - reciprocate/ppo_hh_gpt-j: https://huggingface.co/reciprocate/ppo_hh_gpt-j
@@ -36,7 +36,8 @@ PPO_Pygway combines `ppo_hh_gpt-j`, `Janeway-6b` and `Pygmalion-6b`; all three m
 (X*A + Y*B)
 ```
 With X & Y being the model weights, and A/B being how strongly they are represented within the final value.
-The intent of this is to elevate the end-model by borrowing the strongly represented aspects out of each base model
+The intent of this is to elevate the end-model by borrowing the strongly represented aspects of each base model,
+but it may also weaken other parts of each model, which can be desirable if the base models have problematic traits that need to be worked on.
 
 Blend was done in FP32 and output saved in FP16 for reduced storage needs.
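The `(X*A + Y*B)` blend described in the diff can be sketched as follows. This is a minimal illustration, not the merge script actually used: real GPT-J checkpoints are dicts of tensors (blended here in FP32 per the card, then saved as FP16), while plain Python lists of floats stand in for parameters below. The `blend` helper and all model names/values are hypothetical; the (20:20:60) three-way ratio is built from two pairwise `(X*A + Y*B)` blends.

```python
# Hypothetical sketch of the (X*A + Y*B) weighted parameter blend.
# Lists of floats stand in for model parameter tensors.

def blend(a, b, x, y):
    """Element-wise x*a + y*b for two equally sized parameter lists."""
    assert len(a) == len(b), "models must share a parameter layout"
    return [x * pa + y * pb for pa, pb in zip(a, b)]

# A (20:20:60) three-way mix via two pairwise blends:
# first Janeway and ppo_hh at 50:50, then that result into
# Pygmalion at 40:60 -- equivalent to 0.2/0.2/0.6 overall.
janeway = [1.0, 2.0]      # toy stand-in values
ppo_hh = [3.0, 4.0]
pygmalion = [5.0, 6.0]

half = blend(janeway, ppo_hh, 0.5, 0.5)   # 0.5*J + 0.5*P
mix = blend(half, pygmalion, 0.4, 0.6)    # 0.2*J + 0.2*P + 0.6*Py
print(mix)
```

Chaining pairwise blends like this keeps peak memory at two models' worth of weights, which is one plausible reason the card expresses the merge in the two-operand `(X*A + Y*B)` form.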