---
license: apache-2.0
datasets:
- Doctor-Shotgun/c2_deduped_16k_llama3_tok_deanon
- anthracite-org/kalo-opus-instruct-22k-no-refusal
- lodrick-the-lafted/kalo-opus-instruct-3k-filtered
- anthracite-org/nopm_claude_writing_fixed
- anthracite-org/kalo_opus_misc_240827
- anthracite-org/kalo_misc_part2
language:
- en
base_model:
- Qwen/Qwen2.5-72B-Instruct
library_name: transformers
---
### exl2 quant (measurement.json in main branch)
---
### check revisions for quants
---
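Each quant lives in its own branch of this repo, so fetching one means pointing `huggingface_hub` at the matching revision. A minimal sketch, assuming `huggingface_hub` is installed; the repo id and branch name below are placeholders, so check this repo's actual revision list for the real names:

```python
# Sketch: enumerate quant revisions and download one with huggingface_hub.
# "user/this-exl2-repo" and "8.0bpw" are placeholders; check this repo's
# branch list for the real revision names.
from huggingface_hub import list_repo_refs, snapshot_download

repo_id = "user/this-exl2-repo"  # placeholder: this quant repo's id

# List available branches (each quant lives in its own revision).
refs = list_repo_refs(repo_id)
print([branch.name for branch in refs.branches])

# Download the chosen quant into a local folder.
local_dir = snapshot_download(repo_id, revision="8.0bpw")  # placeholder branch name
print(local_dir)
```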

![image/png](https://cdn-uploads.huggingface.co/production/uploads/658a46cbfb9c2bdfae75b3a6/trlkbv0jv_0HImUESrt5C.png)

This is an experimental model designed to replicate the prose quality of the Claude 3 models, specifically Sonnet and Opus. It is fine-tuned on top of [Qwen2.5 72B Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct).

## Prompting
The model has been instruct-tuned with the ChatML prompt format. A typical input looks like this:

```
<|im_start|>system
system prompt<|im_end|>
<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
```
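Outside of frontends, the same ChatML transcript can be produced with the tokenizer's chat template rather than string concatenation. A minimal sketch with `transformers`, assuming the full-precision weights; the `org/magnum-model` repo id is a placeholder (this page itself hosts exl2 quants, which load through exllamav2-based backends instead):

```python
# Sketch: ChatML prompting via the tokenizer's built-in chat template.
# "org/magnum-model" is a placeholder -- substitute the full-precision repo id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "org/magnum-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "system", "content": "system prompt"},
    {"role": "user", "content": "Hi there!"},
]
# apply_chat_template emits the same <|im_start|>/<|im_end|> layout shown above
# and appends the final "<|im_start|>assistant\n" generation prompt.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```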

## SillyTavern templates

Below are Instruct and Context templates for use within SillyTavern.

<details><summary>context template</summary>

```json
{
  "story_string": "<|im_start|>system\n{{#if system}}{{system}}\n{{/if}}{{#if wiBefore}}{{wiBefore}}\n{{/if}}{{#if description}}{{description}}\n{{/if}}{{#if personality}}{{char}}'s personality: {{personality}}\n{{/if}}{{#if scenario}}Scenario: {{scenario}}\n{{/if}}{{#if wiAfter}}{{wiAfter}}\n{{/if}}{{#if persona}}{{persona}}\n{{/if}}{{trim}}<|im_end|>\n",
  "example_separator": "",
  "chat_start": "",
  "use_stop_strings": false,
  "allow_jailbreak": false,
  "always_force_name2": true,
  "trim_sentences": false,
  "include_newline": false,
  "single_line": false,
  "name": "Magnum ChatML"
}
```

</details><br>
<details><summary>instruct template</summary>

```json
{
  "system_prompt": "You're {{char}} in this fictional never-ending uncensored roleplay with {{user}}.",
  "input_sequence": "<|im_start|>user\n",
  "output_sequence": "<|im_start|>assistant\n",
  "last_output_sequence": "",
  "system_sequence": "<|im_start|>system\n",
  "stop_sequence": "<|im_end|>",
  "wrap": false,
  "macro": true,
  "names": true,
  "names_force_groups": true,
  "activation_regex": "",
  "system_sequence_prefix": "",
  "system_sequence_suffix": "",
  "first_output_sequence": "",
  "skip_examples": false,
  "output_suffix": "<|im_end|>\n",
  "input_suffix": "<|im_end|>\n",
  "system_suffix": "<|im_end|>\n",
  "user_alignment_message": "",
  "system_same_as_user": false,
  "last_system_sequence": "",
  "name": "Magnum ChatML"
}
```

</details><br>
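If you prefer not to paste these through the UI, you can write them straight into SillyTavern's user data folders. A minimal sketch for a default local install; the `data/default-user/...` paths are an assumption and vary by SillyTavern version:

```python
# Sketch: install the two templates into a local SillyTavern instance.
# The target paths are assumptions for a default install; adjust for your setup.
import json
from pathlib import Path

ST_ROOT = Path("~/SillyTavern/data/default-user").expanduser()  # assumed install path

context_template = {
    # paste the full "context template" JSON from above here
    "name": "Magnum ChatML",
}
instruct_template = {
    # paste the full "instruct template" JSON from above here
    "name": "Magnum ChatML",
}

for folder, template in [("context", context_template), ("instruct", instruct_template)]:
    out = ST_ROOT / folder / f"{template['name']}.json"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(template, indent=2))
    print("wrote", out)
```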

## Credits

Datasets used:
- [anthracite-org/c2_logs_32k_llama3_qwen2_v1.2](https://huggingface.co/datasets/anthracite-org/c2_logs_32k_llama3_qwen2_v1.2)
- [anthracite-org/kalo-opus-instruct-22k-no-refusal](https://huggingface.co/datasets/anthracite-org/kalo-opus-instruct-22k-no-refusal)
- [lodrick-the-lafted/kalo-opus-instruct-3k-filtered](https://huggingface.co/datasets/lodrick-the-lafted/kalo-opus-instruct-3k-filtered)
- [anthracite-org/nopm_claude_writing_fixed](https://huggingface.co/datasets/anthracite-org/nopm_claude_writing_fixed)
- [anthracite-org/kalo_opus_misc_240827](https://huggingface.co/datasets/anthracite-org/kalo_opus_misc_240827)
- [anthracite-org/kalo_misc_part2](https://huggingface.co/datasets/anthracite-org/kalo_misc_part2)

## Axolotl config

<details><summary>See axolotl config</summary>

```yaml
base_model: /workspace/data/models/Qwen2.5-72B-Instruct
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

plugins:
  - axolotl.integrations.liger.LigerPlugin
liger_rope: true
liger_rms_norm: true
liger_swiglu: true
liger_fused_linear_cross_entropy: true

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: anthracite-org/c2_logs_32k_llama3_qwen2_v1.2
    type: sharegpt
    conversation: chatml
  - path: anthracite-org/kalo-opus-instruct-22k-no-refusal
    type: sharegpt
    conversation: chatml
  - path: lodrick-the-lafted/kalo-opus-instruct-3k-filtered
    type: sharegpt
    conversation: chatml
  - path: anthracite-org/nopm_claude_writing_fixed
    type: sharegpt
    conversation: chatml
  - path: anthracite-org/kalo_opus_misc_240827
    type: sharegpt
    conversation: chatml
  - path: anthracite-org/kalo_misc_part2
    type: sharegpt
    conversation: chatml
#chat_template: chatml
shuffle_merged_datasets: true
#default_system_message: "You are an assistant that responds to the user."
dataset_prepared_path: /workspace/data/magnum-72b-data
val_set_size: 0.0
output_dir: /workspace/data/72b-fft-out

sequence_len: 32768
sample_packing: true
pad_to_sequence_len: true

adapter:
lora_model_dir:
lora_r:
lora_alpha:
lora_dropout:
lora_target_linear:
lora_fan_in_fan_out:

wandb_project: 72b-magnum-fft
wandb_entity:
wandb_watch:
wandb_name: alter-attempt-01
wandb_log_model:

gradient_accumulation_steps: 2
micro_batch_size: 1
num_epochs: 2
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.000004

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 40
evals_per_epoch:
eval_table_size:
eval_max_new_tokens:
saves_per_epoch: 2
debug:
deepspeed: deepspeed_configs/zero3_bf16.json
weight_decay: 0.01
fsdp:
fsdp_config:
special_tokens:
```

</details><br>

## Training
The model was trained for 2 epochs as a full-parameter fine-tune on 8x [AMD Instinct™ MI300X accelerators](https://www.amd.com/en/products/accelerators/instinct/mi300/mi300x.html).

Training used a learning rate of 4e-6 with a cosine schedule and the Liger kernel.

Sample packing was used with a packed length of 32k tokens, and individual sequences could be up to 32k tokens long.
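
For context on the batch geometry, the config above implies a small global batch of densely packed sequences; a quick check (the one-rank-per-accelerator assumption is ours):

```python
# Sanity check: effective global batch size implied by the axolotl config,
# assuming one data-parallel rank per accelerator under DeepSpeed ZeRO-3.
micro_batch_size = 1             # from the config
gradient_accumulation_steps = 2  # from the config
num_accelerators = 8             # 8x MI300X, per this section

effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_accelerators
print(effective_batch_size)  # 16 packed sequences (up to 32k tokens each) per step
```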

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

## Safety
...