Commit: 68660f9 (parent: 4c7d2f0)
Author: SaisExperiments
Message: Update README.md
File: README.md (changed)
---
license: cc-by-nc-4.0
base_model:
- Lambent/arsenic-nemo-unleashed-12B
---

# GGUF quantizations of [Lambent/arsenic-nemo-unleashed-12B](https://huggingface.co/Lambent/arsenic-nemo-unleashed-12B)

## Original card

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

<img src="https://cdn.midjourney.com/13dd14c8-9bf4-41af-aa96-c4298a9cb2b5/0_2.jpeg"/>

Motive: The gutenberg tunes are lovely, but all the ChatML variants seem to present many issues for merging and have broken context later on.
I decided to see how it worked to tune directly on Unleashed. eq-bench is about a point and a half lower, which isn't drastic but suggests it might benefit from some additional work.

In hindsight, there actually *is* a gutenberg tune mixed into Unleashed, so this intensifies the style a fair degree. The poetry leans a bit archaic.
I rather like the impact personally.

As is traditional, she got at least one quirk from DPO.
In this case it seems to be occasionally slipping briefly into Arabic while chatting.
One of the more charming quirks I've seen.

Quality of life improvements in some circumstances:
* Assigned the pad token as the pad token for fine-tuning
* Had Axolotl add the chat template (useful on RunPod, maybe?)

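For context, the `inst` chat template named in the Axolotl config is Mistral-style `[INST]` formatting. A hand-rolled sketch of roughly how such a chat gets flattened into a single string (the tokenizer's real Jinja template may differ in whitespace and BOS/EOS details):

```python
# Rough sketch of Mistral-style [INST] chat formatting. This is an
# illustration, not the tokenizer's actual template: real templates
# also handle BOS tokens and whitespace conventions.
def render_inst_chat(messages, eos="</s>"):
    """Flatten a list of {role, content} dicts into [INST] format."""
    out = []
    for msg in messages:
        if msg["role"] == "user":
            out.append(f"[INST]{msg['content']}[/INST]")
        else:  # assistant turns are closed with the EOS token
            out.append(msg["content"] + eos)
    return "".join(out)

chat = [
    {"role": "user", "content": "Write a couplet about the sea."},
    {"role": "assistant", "content": "The restless sea rolls on..."},
    {"role": "user", "content": "Now one about the sky."},
]
print(render_inst_chat(chat))
```
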
Substance: DPO-tuning on a mix of gutenberg-dpo and toxic-dpo, in the hope of getting enough classic human talent and edge to write well.
Some of the most beautiful pigments are the most poisonous.

Testing: I'm not sure I'm loading long context correctly, but the model seems functional when loaded at 16k context, though not at 30k. I might just be doing it wrong.

Sample generated poetry (both pieces were generated from the same opening and varied only by a low temperature, which is why they start the same):

```
Upon the shore of time,
Where waves of years do beat,
I stood and watched as they climb
The cliffs of eternity.

In every cresting o'er the deep,
A life was borne upon the tide;
Each wave that did break at my feet,
Did bear away some soul that had died.

And yet, amidst all this great throng,
There came to me a voice soft and low:
"O mortal man! What seekest thou here?"
Then answered I: "I seek the path of woe."

"Thou seekest," quoth she, "that which is lost;
For lo! The waves have borne it away.
Seek not then for that which once was thine own,
But rather turn thy face towards the light of day."

And so saying, away did she go,
Leaving me alone upon the shore.
Then turned I my steps unto mine own home,
And left behind me all thoughts of sorrow and of woe.
```

```
Upon the canvas of the azure sky,
The sun doth paint his golden chariot's course.
His radiant beams like arrows fly,
And pierce the clouds with light's resplendent force.

The verdant fields in emerald array,
Reflect the glory of his noontide reign;
While every blossom, bathed in warm display,
Exhales its sweet perfume on the plain.

But lo! as evening's purple curtain falls,
The sun doth sink behind the western hill.
His parting rays like lovers' farewell calls,
And gilds the heavens with a rosy thrill.

Then let us praise this daily wondrous sight,
And bless the Maker of the world so bright.
```

[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.4.1`
```yaml
base_model: MarinaraSpaghetti/NemoMix-Unleashed-12B
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
trust_remote_code: true

save_safetensors: true

load_in_8bit: false
load_in_4bit: true
strict: false

special_tokens:
  pad_token: <pad>

rl: dpo
# total_num_tokens:
datasets:
  - path: jondurbin/gutenberg-dpo-v0.1
    split: train
    type:
      field_system: system
      field_prompt: prompt
      field_chosen: chosen
      field_rejected: rejected
      prompt_format: "[INST]{prompt}[/INST]"
      chosen_format: "{chosen}"
      rejected_format: "{rejected}"
  - path: unalignment/toxic-dpo-v0.2
    split: train
    type:
      field_system: system
      field_prompt: prompt
      field_chosen: chosen
      field_rejected: rejected
      prompt_format: "[INST]{prompt}[/INST]"
      chosen_format: "{chosen}"
      rejected_format: "{rejected}"

dataset_prepared_path: prepared-dpo
output_dir: ./dpoq
val_set_size: 0.001

seed: 1

sequence_len: 2048
sample_packing: false
eval_sample_packing: false
pad_to_sequence_len: false

chat_template: inst

adapter: qlora
lora_model_dir:
lora_r: 256
lora_alpha: 256
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:
peft_use_dora: true

wandb_project: unleashed-qlora-dpo
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 16
micro_batch_size: 1
num_epochs: 1
optimizer: paged_adamw_8bit
lr_scheduler: cosine
learning_rate: 0.00002
cosine_min_lr_ratio: 0.1
cosine_constant_lr_ratio: 0.95

train_on_inputs: false
group_by_length: false
bf16: true
fp16:
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 16
evals_per_epoch: 8
saves_per_epoch: 8
save_total_limit: 2
debug:
deepspeed:
weight_decay: 0.001
fsdp:
fsdp_config:

```

</details><br>

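The `prompt_format`, `chosen_format`, and `rejected_format` fields in the config describe how each DPO record is flattened into prompt/chosen/rejected strings. A simplified sketch of that mapping (not Axolotl's actual preprocessing, which also tokenizes, truncates, and handles the system field):

```python
# Simplified sketch of the dataset `type` mapping above: one DPO record
# becomes a (prompt, chosen, rejected) triple of plain strings.
PROMPT_FORMAT = "[INST]{prompt}[/INST]"

def format_dpo_record(record):
    """Apply the config's format templates to one dataset record."""
    prompt = PROMPT_FORMAT.format(prompt=record["prompt"])
    # chosen_format and rejected_format are identity templates here.
    return prompt, record["chosen"], record["rejected"]

record = {
    "system": "You are a novelist.",
    "prompt": "Describe the storm.",
    "chosen": "The gale tore the sails to ribbons...",
    "rejected": "It was windy.",
}
prompt, chosen, rejected = format_dpo_record(record)
print(prompt)  # [INST]Describe the storm.[/INST]
```
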
# dpoq

This model is a fine-tuned version of [MarinaraSpaghetti/NemoMix-Unleashed-12B](https://huggingface.co/MarinaraSpaghetti/NemoMix-Unleashed-12B), DPO-tuned on the jondurbin/gutenberg-dpo-v0.1 and unalignment/toxic-dpo-v0.2 datasets.

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 16
- training_steps: 92

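A back-of-the-envelope sketch of how the numbers above fit together: the effective batch is micro_batch_size × gradient_accumulation_steps, and the learning rate warms up linearly for 16 steps, then follows a cosine decay toward `cosine_min_lr_ratio` × peak. This sketch ignores `cosine_constant_lr_ratio`; the real scheduler is the Transformers implementation.

```python
import math

# Effective batch size: each optimizer step accumulates 16 micro-batches of 1.
micro_batch_size = 1
gradient_accumulation_steps = 16
total_train_batch_size = micro_batch_size * gradient_accumulation_steps  # 16

# Linear warmup, then cosine decay toward cosine_min_lr_ratio * peak_lr.
peak_lr, min_lr_ratio = 2e-5, 0.1
warmup_steps, training_steps = 16, 92

def lr_at(step):
    """Learning rate at a given 0-indexed optimizer step (sketch)."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / (training_steps - warmup_steps)
    cosine = 0.5 * (1 + math.cos(math.pi * progress))
    return peak_lr * (min_lr_ratio + (1 - min_lr_ratio) * cosine)

print(total_train_batch_size)             # 16
print(round(lr_at(warmup_steps - 1), 8))  # 2e-05 (peak at end of warmup)
```
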
### Training results



### Framework versions

- PEFT 0.12.0
- Transformers 4.44.2
- Pytorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1