AiAF committed on
Commit f2026f6
1 Parent(s): eb762d9

Psycheswings Flux1 LoRA repo update. Contains runtime samples, the config.yaml file, and the optimizer.pt file; the ai-toolkit notebooks directory is also included for later use and as a backup of the trainers (Dev and Schnell).

Files changed (47)
  1. config.yaml +99 -0
  2. notebooks/FLUX_1_dev_LoRA_Training.ipynb +291 -0
  3. notebooks/FLUX_1_schnell_LoRA_Training.ipynb +296 -0
  4. notebooks/SliderTraining.ipynb +339 -0
  5. optimizer.pt +3 -0
  6. samples/1733773241950__000000000_0.jpg +3 -0
  7. samples/1733773260524__000000000_1.jpg +3 -0
  8. samples/1733773279091__000000000_2.jpg +3 -0
  9. samples/1733773297647__000000000_3.jpg +3 -0
  10. samples/1733773316204__000000000_4.jpg +3 -0
  11. samples/1733773334761__000000000_5.jpg +3 -0
  12. samples/1733773854236__000000250_0.jpg +3 -0
  13. samples/1733773872815__000000250_1.jpg +3 -0
  14. samples/1733773891384__000000250_2.jpg +3 -0
  15. samples/1733773909950__000000250_3.jpg +3 -0
  16. samples/1733773928516__000000250_4.jpg +3 -0
  17. samples/1733773947082__000000250_5.jpg +3 -0
  18. samples/1733774468334__000000500_0.jpg +3 -0
  19. samples/1733774486925__000000500_1.jpg +3 -0
  20. samples/1733774505504__000000500_2.jpg +3 -0
  21. samples/1733774524087__000000500_3.jpg +3 -0
  22. samples/1733774542663__000000500_4.jpg +3 -0
  23. samples/1733774561257__000000500_5.jpg +3 -0
  24. samples/1733775084642__000000750_0.jpg +3 -0
  25. samples/1733775103231__000000750_1.jpg +3 -0
  26. samples/1733775121813__000000750_2.jpg +3 -0
  27. samples/1733775140394__000000750_3.jpg +3 -0
  28. samples/1733775158977__000000750_4.jpg +3 -0
  29. samples/1733775177573__000000750_5.jpg +3 -0
  30. samples/1733775697559__000001000_0.jpg +3 -0
  31. samples/1733775716149__000001000_1.jpg +3 -0
  32. samples/1733775734740__000001000_2.jpg +3 -0
  33. samples/1733775753330__000001000_3.jpg +3 -0
  34. samples/1733775771915__000001000_4.jpg +3 -0
  35. samples/1733775790516__000001000_5.jpg +3 -0
  36. samples/1733776314928__000001250_0.jpg +3 -0
  37. samples/1733776333525__000001250_1.jpg +3 -0
  38. samples/1733776352107__000001250_2.jpg +3 -0
  39. samples/1733776370700__000001250_3.jpg +3 -0
  40. samples/1733776389287__000001250_4.jpg +3 -0
  41. samples/1733776407876__000001250_5.jpg +3 -0
  42. samples/1733776931659__000001500_0.jpg +3 -0
  43. samples/1733776950250__000001500_1.jpg +3 -0
  44. samples/1733776968841__000001500_2.jpg +3 -0
  45. samples/1733776987424__000001500_3.jpg +3 -0
  46. samples/1733777006010__000001500_4.jpg +3 -0
  47. samples/1733777024585__000001500_5.jpg +3 -0
config.yaml ADDED
@@ -0,0 +1,99 @@
1
+ job: extension
2
+ config:
3
+ name: Psycheswings-Flux1
4
+ process:
5
+ - type: sd_trainer
6
+ training_folder: /content/output
7
+ performance_log_every: 100
8
+ device: cuda:0
9
+ network:
10
+ type: lora
11
+ linear: 16
12
+ linear_alpha: 16
13
+ save:
14
+ dtype: float16
15
+ save_every: 250
16
+ max_step_saves_to_keep: 10
17
+ datasets:
18
+ - folder_path: /content/dataset
19
+ caption_ext: txt
20
+ caption_dropout_rate: 0.05
21
+ shuffle_tokens: false
22
+ cache_latents_to_disk: true
23
+ resolution:
24
+ - 512
25
+ - 768
26
+ - 1024
27
+ train:
28
+ batch_size: 1
29
+ steps: 4000
30
+ gradient_accumulation_steps: 1
31
+ train_unet: true
32
+ train_text_encoder: false
33
+ content_or_style: balanced
34
+ gradient_checkpointing: true
35
+ noise_scheduler: flowmatch
36
+ optimizer: adamw8bit
37
+ lr: 0.0004
38
+ ema_config:
39
+ use_ema: true
40
+ ema_decay: 0.99
41
+ dtype: bf16
42
+ model:
43
+ name_or_path: black-forest-labs/FLUX.1-dev
44
+ is_flux: true
45
+ quantize: true
46
+ sample:
47
+ sampler: flowmatch
48
+ sample_every: 250
49
+ width: 1024
50
+ height: 1024
51
+ prompts:
52
+ - psyche\(Person\), Psycheswings \(Person\), @psychedwings, Photo of a young
53
+ woman with curly blonde hair, smiling at the camera. She is wearing a red
54
+ bikini top that accentuates her large breasts and cleavage. The background
55
+ is a serene beach scene with gentle waves crashing onto the shore. The lighting
56
+ is soft and natural, casting gentle shadows on her face and body. The overall
57
+ mood is cheerful and inviting.
58
+ - psyche\(Person\), Psycheswings \(Person\), @psychedwings, A young woman with
59
+ curly blonde hair, wearing a blue and white striped shirt, smiling at the
60
+ camera. She is standing in the middle of the image, with a bright blue sky
61
+ and a few people in the background. The building behind her has arches and
62
+ intricate carvings, and the image is taken from a low angle, giving a clear
63
+ view of her upper body. The lighting is bright and natural, highlighting her
64
+ curly hair and the blue sky.
65
+ - psyche\(Person\), Psycheswings \(Person\), @psychedwings, Photo of a young
66
+ woman with curly blonde hair, wearing a pink ribbed crop top and blue jeans,
67
+ standing in a modern living room with a beige carpet, white walls, and shelves
68
+ with various items. She is smiling at the camera, looking directly at the
69
+ viewer. The lighting is bright and natural, coming from the left side of the
70
+ image. The woman has fair skin and a slender physique. The image has a high-quality,
71
+ professional feel.
72
+ - psyche\(Person\), Psycheswings \(Person\), @psychedwings, Photo of a young
73
+ woman with curly blonde hair, wearing a green dress with white daisies, sitting
74
+ on a grassy field at sunset. She has a fair complexion and is looking directly
75
+ at the camera with a slight smile. The background features a row of houses
76
+ and a tree. The lighting is warm and golden, casting gentle shadows on her
77
+ face and body. The overall mood is peaceful and serene.
78
+ - psyche\(Person\), Psycheswings \(Person\), @psychedwings, Photo of a young
79
+ woman with curly blonde hair, wearing a blue sweater vest over a white shirt
80
+ and a red plaid skirt, standing in a classroom setting with a desk and chair
81
+ in the background. She has a neutral expression and is looking directly at
82
+ the camera. The lighting is soft and natural, casting gentle shadows on her
83
+ face. The image has a high-quality, professional feel.
84
+ - psyche\(Person\), Psycheswings \(Person\), @psychedwings, Photo of a young
85
+ woman with curly blonde hair, wearing a yellow raincoat over a grey t-shirt,
86
+ standing on a rainy street in a city. She has a neutral expression and is
87
+ looking directly at the camera. She is holding a sign that says "Psyche".
88
+ The background features tall buildings and a cloudy sky. The image is taken
89
+ from a low angle, focusing on the woman's face and upper body. The lighting
90
+ is soft and natural, highlighting her features. The overall mood is moody
91
+ and rainy.
92
+ neg: ''
93
+ seed: 79200
94
+ walk_seed: true
95
+ guidance_scale: 4
96
+ sample_steps: 20
97
+ meta:
98
+ name: Psycheswings-Flux1
99
+ version: '1.0'
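
For reference, this file is the same configuration the Colab notebooks below assemble as an OrderedDict and hand to run_job. A minimal sketch of running the committed config.yaml straight from Python, assuming the ostris/ai-toolkit repo is cloned to /content/ai-toolkit as in the notebooks and that run_job also accepts a plain dict parsed from YAML (the repo's run.py script is the usual entry point for file-based configs):

import sys
import yaml  # PyYAML

# assume the ostris/ai-toolkit repo is cloned here, as the notebooks below do
sys.path.append('/content/ai-toolkit')
from toolkit.job import run_job

# parse the training config committed in this repo
with open('config.yaml', 'r') as f:
    config = yaml.safe_load(f)

# the notebooks pass an OrderedDict; a plain dict with the same keys is
# assumed to be accepted here as well
run_job(config)
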
notebooks/FLUX_1_dev_LoRA_Training.ipynb ADDED
@@ -0,0 +1,291 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "metadata": {
6
+ "collapsed": false,
7
+ "id": "zl-S0m3pkQC5"
8
+ },
9
+ "source": [
10
+ "# AI Toolkit by Ostris\n",
11
+ "## FLUX.1-dev Training\n"
12
+ ]
13
+ },
14
+ {
15
+ "cell_type": "code",
16
+ "execution_count": null,
17
+ "metadata": {},
18
+ "outputs": [],
19
+ "source": [
20
+ "!nvidia-smi"
21
+ ]
22
+ },
23
+ {
24
+ "cell_type": "code",
25
+ "execution_count": null,
26
+ "metadata": {
27
+ "id": "BvAG0GKAh59G"
28
+ },
29
+ "outputs": [],
30
+ "source": [
31
+ "!git clone https://github.com/ostris/ai-toolkit\n",
32
+ "!mkdir -p /content/dataset"
33
+ ]
34
+ },
35
+ {
36
+ "cell_type": "markdown",
37
+ "metadata": {
38
+ "id": "UFUW4ZMmnp1V"
39
+ },
40
+ "source": [
41
+ "Put your image dataset in the `/content/dataset` folder"
42
+ ]
43
+ },
44
+ {
45
+ "cell_type": "code",
46
+ "execution_count": null,
47
+ "metadata": {
48
+ "id": "XGZqVER_aQJW"
49
+ },
50
+ "outputs": [],
51
+ "source": [
52
+ "!cd ai-toolkit && git submodule update --init --recursive && pip install -r requirements.txt\n"
53
+ ]
54
+ },
55
+ {
56
+ "cell_type": "markdown",
57
+ "metadata": {
58
+ "id": "OV0HnOI6o8V6"
59
+ },
60
+ "source": [
61
+ "## Model License\n",
62
+ "Training currently only works with FLUX.1-dev. Which means anything you train will inherit the non-commercial license. It is also a gated model, so you need to accept the license on HF before using it. Otherwise, this will fail. Here are the required steps to setup a license.\n",
63
+ "\n",
64
+ "Sign into HF and accept the model access here [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev)\n",
65
+ "\n",
66
+ "[Get a READ key from huggingface](https://huggingface.co/settings/tokens/new?) and place it in the next cell after running it."
67
+ ]
68
+ },
69
+ {
70
+ "cell_type": "code",
71
+ "execution_count": null,
72
+ "metadata": {
73
+ "id": "3yZZdhFRoj2m"
74
+ },
75
+ "outputs": [],
76
+ "source": [
77
+ "import getpass\n",
78
+ "import os\n",
79
+ "\n",
80
+ "# Prompt for the token\n",
81
+ "hf_token = getpass.getpass('Enter your HF access token and press enter: ')\n",
82
+ "\n",
83
+ "# Set the environment variable\n",
84
+ "os.environ['HF_TOKEN'] = hf_token\n",
85
+ "\n",
86
+ "print(\"HF_TOKEN environment variable has been set.\")"
87
+ ]
88
+ },
89
+ {
90
+ "cell_type": "code",
91
+ "execution_count": null,
92
+ "metadata": {
93
+ "id": "9gO2EzQ1kQC8"
94
+ },
95
+ "outputs": [],
96
+ "source": [
97
+ "import os\n",
98
+ "import sys\n",
99
+ "sys.path.append('/content/ai-toolkit')\n",
100
+ "from toolkit.job import run_job\n",
101
+ "from collections import OrderedDict\n",
102
+ "from PIL import Image\n",
103
+ "import os\n",
104
+ "os.environ[\"HF_HUB_ENABLE_HF_TRANSFER\"] = \"1\""
105
+ ]
106
+ },
107
+ {
108
+ "cell_type": "markdown",
109
+ "metadata": {
110
+ "id": "N8UUFzVRigbC"
111
+ },
112
+ "source": [
113
+ "## Setup\n",
114
+ "\n",
115
+ "This is your config. It is documented pretty well. Normally you would do this as a yaml file, but for colab, this will work. This will run as is without modification, but feel free to edit as you want."
116
+ ]
117
+ },
118
+ {
119
+ "cell_type": "code",
120
+ "execution_count": null,
121
+ "metadata": {
122
+ "id": "_t28QURYjRQO"
123
+ },
124
+ "outputs": [],
125
+ "source": [
126
+ "from collections import OrderedDict\n",
127
+ "\n",
128
+ "job_to_run = OrderedDict([\n",
129
+ " ('job', 'extension'),\n",
130
+ " ('config', OrderedDict([\n",
131
+ " # this name will be the folder and filename name\n",
132
+ " ('name', 'my_first_flux_lora_v1'),\n",
133
+ " ('process', [\n",
134
+ " OrderedDict([\n",
135
+ " ('type', 'sd_trainer'),\n",
136
+ " # root folder to save training sessions/samples/weights\n",
137
+ " ('training_folder', '/content/output'),\n",
138
+ " # uncomment to see performance stats in the terminal every N steps\n",
139
+ " #('performance_log_every', 1000),\n",
140
+ " ('device', 'cuda:0'),\n",
141
+ " # if a trigger word is specified, it will be added to captions of training data if it does not already exist\n",
142
+ " # alternatively, in your captions you can add [trigger] and it will be replaced with the trigger word\n",
143
+ " # ('trigger_word', 'image'),\n",
144
+ " ('network', OrderedDict([\n",
145
+ " ('type', 'lora'),\n",
146
+ " ('linear', 16),\n",
147
+ " ('linear_alpha', 16)\n",
148
+ " ])),\n",
149
+ " ('save', OrderedDict([\n",
150
+ " ('dtype', 'float16'), # precision to save\n",
151
+ " ('save_every', 250), # save every this many steps\n",
152
+ " ('max_step_saves_to_keep', 4) # how many intermittent saves to keep\n",
153
+ " ])),\n",
154
+ " ('datasets', [\n",
155
+ " # datasets are a folder of images. captions need to be txt files with the same name as the image\n",
156
+ " # for instance image2.jpg and image2.txt. Only jpg, jpeg, and png are supported currently\n",
157
+ " # images will automatically be resized and bucketed into the resolution specified\n",
158
+ " OrderedDict([\n",
159
+ " ('folder_path', '/content/dataset'),\n",
160
+ " ('caption_ext', 'txt'),\n",
161
+ " ('caption_dropout_rate', 0.05), # will drop out the caption 5% of time\n",
162
+ " ('shuffle_tokens', False), # shuffle caption order, split by commas\n",
163
+ " ('cache_latents_to_disk', True), # leave this true unless you know what you're doing\n",
164
+ " ('resolution', [512, 768, 1024]) # flux enjoys multiple resolutions\n",
165
+ " ])\n",
166
+ " ]),\n",
167
+ " ('train', OrderedDict([\n",
168
+ " ('batch_size', 1),\n",
169
+ " ('steps', 2000), # total number of steps to train 500 - 4000 is a good range\n",
170
+ " ('gradient_accumulation_steps', 1),\n",
171
+ " ('train_unet', True),\n",
172
+ " ('train_text_encoder', False), # probably won't work with flux\n",
173
+ " ('content_or_style', 'balanced'), # content, style, balanced\n",
174
+ " ('gradient_checkpointing', True), # need the on unless you have a ton of vram\n",
175
+ " ('noise_scheduler', 'flowmatch'), # for training only\n",
176
+ " ('optimizer', 'adamw8bit'),\n",
177
+ " ('lr', 1e-4),\n",
178
+ "\n",
179
+ " # uncomment this to skip the pre training sample\n",
180
+ " # ('skip_first_sample', True),\n",
181
+ "\n",
182
+ " # uncomment to completely disable sampling\n",
183
+ " # ('disable_sampling', True),\n",
184
+ "\n",
185
+ " # uncomment to use new vell curved weighting. Experimental but may produce better results\n",
186
+ " # ('linear_timesteps', True),\n",
187
+ "\n",
188
+ " # ema will smooth out learning, but could slow it down. Recommended to leave on.\n",
189
+ " ('ema_config', OrderedDict([\n",
190
+ " ('use_ema', True),\n",
191
+ " ('ema_decay', 0.99)\n",
192
+ " ])),\n",
193
+ "\n",
194
+ " # will probably need this if gpu supports it for flux, other dtypes may not work correctly\n",
195
+ " ('dtype', 'bf16')\n",
196
+ " ])),\n",
197
+ " ('model', OrderedDict([\n",
198
+ " # huggingface model name or path\n",
199
+ " ('name_or_path', 'black-forest-labs/FLUX.1-dev'),\n",
200
+ " ('is_flux', True),\n",
201
+ " ('quantize', True), # run 8bit mixed precision\n",
202
+ " #('low_vram', True), # uncomment this if the GPU is connected to your monitors. It will use less vram to quantize, but is slower.\n",
203
+ " ])),\n",
204
+ " ('sample', OrderedDict([\n",
205
+ " ('sampler', 'flowmatch'), # must match train.noise_scheduler\n",
206
+ " ('sample_every', 250), # sample every this many steps\n",
207
+ " ('width', 1024),\n",
208
+ " ('height', 1024),\n",
209
+ " ('prompts', [\n",
210
+ " # you can add [trigger] to the prompts here and it will be replaced with the trigger word\n",
211
+ " #'[trigger] holding a sign that says \\'I LOVE PROMPTS!\\'',\n",
212
+ " 'woman with red hair, playing chess at the park, bomb going off in the background',\n",
213
+ " 'a woman holding a coffee cup, in a beanie, sitting at a cafe',\n",
214
+ " 'a horse is a DJ at a night club, fish eye lens, smoke machine, lazer lights, holding a martini',\n",
215
+ " 'a man showing off his cool new t shirt at the beach, a shark is jumping out of the water in the background',\n",
216
+ " 'a bear building a log cabin in the snow covered mountains',\n",
217
+ " 'woman playing the guitar, on stage, singing a song, laser lights, punk rocker',\n",
218
+ " 'hipster man with a beard, building a chair, in a wood shop',\n",
219
+ " 'photo of a man, white background, medium shot, modeling clothing, studio lighting, white backdrop',\n",
220
+ " 'a man holding a sign that says, \\'this is a sign\\'',\n",
221
+ " 'a bulldog, in a post apocalyptic world, with a shotgun, in a leather jacket, in a desert, with a motorcycle'\n",
222
+ " ]),\n",
223
+ " ('neg', ''), # not used on flux\n",
224
+ " ('seed', 42),\n",
225
+ " ('walk_seed', True),\n",
226
+ " ('guidance_scale', 4),\n",
227
+ " ('sample_steps', 20)\n",
228
+ " ]))\n",
229
+ " ])\n",
230
+ " ])\n",
231
+ " ])),\n",
232
+ " # you can add any additional meta info here. [name] is replaced with config name at top\n",
233
+ " ('meta', OrderedDict([\n",
234
+ " ('name', '[name]'),\n",
235
+ " ('version', '1.0')\n",
236
+ " ]))\n",
237
+ "])\n"
238
+ ]
239
+ },
240
+ {
241
+ "cell_type": "markdown",
242
+ "metadata": {
243
+ "id": "h6F1FlM2Wb3l"
244
+ },
245
+ "source": [
246
+ "## Run it\n",
247
+ "\n",
248
+ "Below does all the magic. Check your folders to the left. Items will be in output/LoRA/your_name_v1 In the samples folder, there are preiodic sampled. This doesnt work great with colab. They will be in /content/output"
249
+ ]
250
+ },
251
+ {
252
+ "cell_type": "code",
253
+ "execution_count": null,
254
+ "metadata": {
255
+ "id": "HkajwI8gteOh"
256
+ },
257
+ "outputs": [],
258
+ "source": [
259
+ "run_job(job_to_run)\n"
260
+ ]
261
+ },
262
+ {
263
+ "cell_type": "markdown",
264
+ "metadata": {
265
+ "id": "Hblgb5uwW5SD"
266
+ },
267
+ "source": [
268
+ "## Done\n",
269
+ "\n",
270
+ "Check your ourput dir and get your slider\n"
271
+ ]
272
+ }
273
+ ],
274
+ "metadata": {
275
+ "accelerator": "GPU",
276
+ "colab": {
277
+ "gpuType": "A100",
278
+ "machine_shape": "hm",
279
+ "provenance": []
280
+ },
281
+ "kernelspec": {
282
+ "display_name": "Python 3",
283
+ "name": "python3"
284
+ },
285
+ "language_info": {
286
+ "name": "python"
287
+ }
288
+ },
289
+ "nbformat": 4,
290
+ "nbformat_minor": 0
291
+ }
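
The dataset comments in the config cell above expect every image in /content/dataset to have a caption .txt file with the same basename (image2.jpg alongside image2.txt), with only jpg, jpeg, and png supported. A small standard-library sketch for verifying that pairing before launching training; the folder path matches the notebook, everything else is illustrative:

from pathlib import Path

DATASET_DIR = Path('/content/dataset')   # folder the notebook creates with mkdir -p
IMAGE_EXTS = {'.jpg', '.jpeg', '.png'}   # extensions the trainer supports

missing = []
for image_path in sorted(DATASET_DIR.iterdir()):
    if image_path.suffix.lower() not in IMAGE_EXTS:
        continue
    caption_path = image_path.with_suffix('.txt')
    if not caption_path.exists():
        missing.append(image_path.name)

print(f'{len(missing)} image(s) without a caption file:', missing)
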
notebooks/FLUX_1_schnell_LoRA_Training.ipynb ADDED
@@ -0,0 +1,296 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "metadata": {
6
+ "collapsed": false,
7
+ "id": "zl-S0m3pkQC5"
8
+ },
9
+ "source": [
10
+ "# AI Toolkit by Ostris\n",
11
+ "## FLUX.1-schnell Training\n"
12
+ ]
13
+ },
14
+ {
15
+ "cell_type": "code",
16
+ "execution_count": null,
17
+ "metadata": {
18
+ "id": "3cokMT-WC6rG"
19
+ },
20
+ "outputs": [],
21
+ "source": [
22
+ "!nvidia-smi"
23
+ ]
24
+ },
25
+ {
26
+ "cell_type": "code",
27
+ "execution_count": null,
28
+ "metadata": {
29
+ "collapsed": true,
30
+ "id": "BvAG0GKAh59G"
31
+ },
32
+ "outputs": [],
33
+ "source": [
34
+ "!git clone https://github.com/ostris/ai-toolkit\n",
35
+ "!mkdir -p /content/dataset"
36
+ ]
37
+ },
38
+ {
39
+ "cell_type": "markdown",
40
+ "metadata": {
41
+ "id": "UFUW4ZMmnp1V"
42
+ },
43
+ "source": [
44
+ "Put your image dataset in the `/content/dataset` folder"
45
+ ]
46
+ },
47
+ {
48
+ "cell_type": "code",
49
+ "execution_count": null,
50
+ "metadata": {
51
+ "collapsed": true,
52
+ "id": "XGZqVER_aQJW"
53
+ },
54
+ "outputs": [],
55
+ "source": [
56
+ "!cd ai-toolkit && git submodule update --init --recursive && pip install -r requirements.txt\n"
57
+ ]
58
+ },
59
+ {
60
+ "cell_type": "markdown",
61
+ "metadata": {
62
+ "id": "OV0HnOI6o8V6"
63
+ },
64
+ "source": [
65
+ "## Model License\n",
66
+ "Training currently only works with FLUX.1-dev. Which means anything you train will inherit the non-commercial license. It is also a gated model, so you need to accept the license on HF before using it. Otherwise, this will fail. Here are the required steps to setup a license.\n",
67
+ "\n",
68
+ "Sign into HF and accept the model access here [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev)\n",
69
+ "\n",
70
+ "[Get a READ key from huggingface](https://huggingface.co/settings/tokens/new?) and place it in the next cell after running it."
71
+ ]
72
+ },
73
+ {
74
+ "cell_type": "code",
75
+ "execution_count": null,
76
+ "metadata": {
77
+ "id": "3yZZdhFRoj2m"
78
+ },
79
+ "outputs": [],
80
+ "source": [
81
+ "import getpass\n",
82
+ "import os\n",
83
+ "\n",
84
+ "# Prompt for the token\n",
85
+ "hf_token = getpass.getpass('Enter your HF access token and press enter: ')\n",
86
+ "\n",
87
+ "# Set the environment variable\n",
88
+ "os.environ['HF_TOKEN'] = hf_token\n",
89
+ "\n",
90
+ "print(\"HF_TOKEN environment variable has been set.\")"
91
+ ]
92
+ },
93
+ {
94
+ "cell_type": "code",
95
+ "execution_count": 5,
96
+ "metadata": {
97
+ "id": "9gO2EzQ1kQC8"
98
+ },
99
+ "outputs": [],
100
+ "source": [
101
+ "import os\n",
102
+ "import sys\n",
103
+ "sys.path.append('/content/ai-toolkit')\n",
104
+ "from toolkit.job import run_job\n",
105
+ "from collections import OrderedDict\n",
106
+ "from PIL import Image\n",
107
+ "import os\n",
108
+ "os.environ[\"HF_HUB_ENABLE_HF_TRANSFER\"] = \"1\""
109
+ ]
110
+ },
111
+ {
112
+ "cell_type": "markdown",
113
+ "metadata": {
114
+ "id": "N8UUFzVRigbC"
115
+ },
116
+ "source": [
117
+ "## Setup\n",
118
+ "\n",
119
+ "This is your config. It is documented pretty well. Normally you would do this as a yaml file, but for colab, this will work. This will run as is without modification, but feel free to edit as you want."
120
+ ]
121
+ },
122
+ {
123
+ "cell_type": "code",
124
+ "execution_count": 6,
125
+ "metadata": {
126
+ "id": "_t28QURYjRQO"
127
+ },
128
+ "outputs": [],
129
+ "source": [
130
+ "from collections import OrderedDict\n",
131
+ "\n",
132
+ "job_to_run = OrderedDict([\n",
133
+ " ('job', 'extension'),\n",
134
+ " ('config', OrderedDict([\n",
135
+ " # this name will be the folder and filename name\n",
136
+ " ('name', 'my_first_flux_lora_v1'),\n",
137
+ " ('process', [\n",
138
+ " OrderedDict([\n",
139
+ " ('type', 'sd_trainer'),\n",
140
+ " # root folder to save training sessions/samples/weights\n",
141
+ " ('training_folder', '/content/output'),\n",
142
+ " # uncomment to see performance stats in the terminal every N steps\n",
143
+ " #('performance_log_every', 1000),\n",
144
+ " ('device', 'cuda:0'),\n",
145
+ " # if a trigger word is specified, it will be added to captions of training data if it does not already exist\n",
146
+ " # alternatively, in your captions you can add [trigger] and it will be replaced with the trigger word\n",
147
+ " # ('trigger_word', 'image'),\n",
148
+ " ('network', OrderedDict([\n",
149
+ " ('type', 'lora'),\n",
150
+ " ('linear', 16),\n",
151
+ " ('linear_alpha', 16)\n",
152
+ " ])),\n",
153
+ " ('save', OrderedDict([\n",
154
+ " ('dtype', 'float16'), # precision to save\n",
155
+ " ('save_every', 250), # save every this many steps\n",
156
+ " ('max_step_saves_to_keep', 4) # how many intermittent saves to keep\n",
157
+ " ])),\n",
158
+ " ('datasets', [\n",
159
+ " # datasets are a folder of images. captions need to be txt files with the same name as the image\n",
160
+ " # for instance image2.jpg and image2.txt. Only jpg, jpeg, and png are supported currently\n",
161
+ " # images will automatically be resized and bucketed into the resolution specified\n",
162
+ " OrderedDict([\n",
163
+ " ('folder_path', '/content/dataset'),\n",
164
+ " ('caption_ext', 'txt'),\n",
165
+ " ('caption_dropout_rate', 0.05), # will drop out the caption 5% of time\n",
166
+ " ('shuffle_tokens', False), # shuffle caption order, split by commas\n",
167
+ " ('cache_latents_to_disk', True), # leave this true unless you know what you're doing\n",
168
+ " ('resolution', [512, 768, 1024]) # flux enjoys multiple resolutions\n",
169
+ " ])\n",
170
+ " ]),\n",
171
+ " ('train', OrderedDict([\n",
172
+ " ('batch_size', 1),\n",
173
+ " ('steps', 2000), # total number of steps to train 500 - 4000 is a good range\n",
174
+ " ('gradient_accumulation_steps', 1),\n",
175
+ " ('train_unet', True),\n",
176
+ " ('train_text_encoder', False), # probably won't work with flux\n",
177
+ " ('gradient_checkpointing', True), # need the on unless you have a ton of vram\n",
178
+ " ('noise_scheduler', 'flowmatch'), # for training only\n",
179
+ " ('optimizer', 'adamw8bit'),\n",
180
+ " ('lr', 1e-4),\n",
181
+ "\n",
182
+ " # uncomment this to skip the pre training sample\n",
183
+ " # ('skip_first_sample', True),\n",
184
+ "\n",
185
+ " # uncomment to completely disable sampling\n",
186
+ " # ('disable_sampling', True),\n",
187
+ "\n",
188
+ " # uncomment to use new vell curved weighting. Experimental but may produce better results\n",
189
+ " # ('linear_timesteps', True),\n",
190
+ "\n",
191
+ " # ema will smooth out learning, but could slow it down. Recommended to leave on.\n",
192
+ " ('ema_config', OrderedDict([\n",
193
+ " ('use_ema', True),\n",
194
+ " ('ema_decay', 0.99)\n",
195
+ " ])),\n",
196
+ "\n",
197
+ " # will probably need this if gpu supports it for flux, other dtypes may not work correctly\n",
198
+ " ('dtype', 'bf16')\n",
199
+ " ])),\n",
200
+ " ('model', OrderedDict([\n",
201
+ " # huggingface model name or path\n",
202
+ " ('name_or_path', 'black-forest-labs/FLUX.1-schnell'),\n",
203
+ " ('assistant_lora_path', 'ostris/FLUX.1-schnell-training-adapter'), # Required for flux schnell training\n",
204
+ " ('is_flux', True),\n",
205
+ " ('quantize', True), # run 8bit mixed precision\n",
206
+ " # low_vram is painfully slow to fuse in the adapter avoid it unless absolutely necessary\n",
207
+ " #('low_vram', True), # uncomment this if the GPU is connected to your monitors. It will use less vram to quantize, but is slower.\n",
208
+ " ])),\n",
209
+ " ('sample', OrderedDict([\n",
210
+ " ('sampler', 'flowmatch'), # must match train.noise_scheduler\n",
211
+ " ('sample_every', 250), # sample every this many steps\n",
212
+ " ('width', 1024),\n",
213
+ " ('height', 1024),\n",
214
+ " ('prompts', [\n",
215
+ " # you can add [trigger] to the prompts here and it will be replaced with the trigger word\n",
216
+ " #'[trigger] holding a sign that says \\'I LOVE PROMPTS!\\'',\n",
217
+ " 'woman with red hair, playing chess at the park, bomb going off in the background',\n",
218
+ " 'a woman holding a coffee cup, in a beanie, sitting at a cafe',\n",
219
+ " 'a horse is a DJ at a night club, fish eye lens, smoke machine, lazer lights, holding a martini',\n",
220
+ " 'a man showing off his cool new t shirt at the beach, a shark is jumping out of the water in the background',\n",
221
+ " 'a bear building a log cabin in the snow covered mountains',\n",
222
+ " 'woman playing the guitar, on stage, singing a song, laser lights, punk rocker',\n",
223
+ " 'hipster man with a beard, building a chair, in a wood shop',\n",
224
+ " 'photo of a man, white background, medium shot, modeling clothing, studio lighting, white backdrop',\n",
225
+ " 'a man holding a sign that says, \\'this is a sign\\'',\n",
226
+ " 'a bulldog, in a post apocalyptic world, with a shotgun, in a leather jacket, in a desert, with a motorcycle'\n",
227
+ " ]),\n",
228
+ " ('neg', ''), # not used on flux\n",
229
+ " ('seed', 42),\n",
230
+ " ('walk_seed', True),\n",
231
+ " ('guidance_scale', 1), # schnell does not do guidance\n",
232
+ " ('sample_steps', 4) # 1 - 4 works well\n",
233
+ " ]))\n",
234
+ " ])\n",
235
+ " ])\n",
236
+ " ])),\n",
237
+ " # you can add any additional meta info here. [name] is replaced with config name at top\n",
238
+ " ('meta', OrderedDict([\n",
239
+ " ('name', '[name]'),\n",
240
+ " ('version', '1.0')\n",
241
+ " ]))\n",
242
+ "])\n"
243
+ ]
244
+ },
245
+ {
246
+ "cell_type": "markdown",
247
+ "metadata": {
248
+ "id": "h6F1FlM2Wb3l"
249
+ },
250
+ "source": [
251
+ "## Run it\n",
252
+ "\n",
253
+ "Below does all the magic. Check your folders to the left. Items will be in output/LoRA/your_name_v1 In the samples folder, there are preiodic sampled. This doesnt work great with colab. They will be in /content/output"
254
+ ]
255
+ },
256
+ {
257
+ "cell_type": "code",
258
+ "execution_count": null,
259
+ "metadata": {
260
+ "id": "HkajwI8gteOh"
261
+ },
262
+ "outputs": [],
263
+ "source": [
264
+ "run_job(job_to_run)\n"
265
+ ]
266
+ },
267
+ {
268
+ "cell_type": "markdown",
269
+ "metadata": {
270
+ "id": "Hblgb5uwW5SD"
271
+ },
272
+ "source": [
273
+ "## Done\n",
274
+ "\n",
275
+ "Check your ourput dir and get your slider\n"
276
+ ]
277
+ }
278
+ ],
279
+ "metadata": {
280
+ "accelerator": "GPU",
281
+ "colab": {
282
+ "gpuType": "A100",
283
+ "machine_shape": "hm",
284
+ "provenance": []
285
+ },
286
+ "kernelspec": {
287
+ "display_name": "Python 3",
288
+ "name": "python3"
289
+ },
290
+ "language_info": {
291
+ "name": "python"
292
+ }
293
+ },
294
+ "nbformat": 4,
295
+ "nbformat_minor": 0
296
+ }
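
Neither training notebook covers testing the result, so here is a hedged inference sketch using diffusers' FluxPipeline, assuming a recent diffusers release with Flux support, enough VRAM, and that the safetensors file written under /content/output/my_first_flux_lora_v1/ (path and filename are placeholders) loads via load_lora_weights:

import torch
from diffusers import FluxPipeline

# load the schnell base model; bf16 matches the training dtype above
pipe = FluxPipeline.from_pretrained(
    'black-forest-labs/FLUX.1-schnell',
    torch_dtype=torch.bfloat16,
).to('cuda')

# attach the trained LoRA; this path is a placeholder for the actual output file
pipe.load_lora_weights('/content/output/my_first_flux_lora_v1/my_first_flux_lora_v1.safetensors')

image = pipe(
    'woman with red hair, playing chess at the park',
    guidance_scale=0.0,      # schnell does not do guidance (see the sample config above)
    num_inference_steps=4,   # 1-4 steps works well, per the notebook comments
).images[0]
image.save('lora_test.jpg')
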
notebooks/SliderTraining.ipynb ADDED
@@ -0,0 +1,339 @@
1
+ {
2
+ "nbformat": 4,
3
+ "nbformat_minor": 0,
4
+ "metadata": {
5
+ "colab": {
6
+ "provenance": [],
7
+ "machine_shape": "hm",
8
+ "gpuType": "V100"
9
+ },
10
+ "kernelspec": {
11
+ "name": "python3",
12
+ "display_name": "Python 3"
13
+ },
14
+ "language_info": {
15
+ "name": "python"
16
+ },
17
+ "accelerator": "GPU"
18
+ },
19
+ "cells": [
20
+ {
21
+ "cell_type": "markdown",
22
+ "source": [
23
+ "# AI Toolkit by Ostris\n",
24
+ "## Slider Training\n",
25
+ "\n",
26
+ "This is a quick colab demo for training sliders like can be found in my CivitAI profile https://civitai.com/user/Ostris/models . I will work on making it more user friendly, but for now, it will get you started."
27
+ ],
28
+ "metadata": {
29
+ "collapsed": false
30
+ }
31
+ },
32
+ {
33
+ "cell_type": "code",
34
+ "source": [
35
+ "!git clone https://github.com/ostris/ai-toolkit"
36
+ ],
37
+ "metadata": {
38
+ "id": "BvAG0GKAh59G"
39
+ },
40
+ "execution_count": null,
41
+ "outputs": []
42
+ },
43
+ {
44
+ "cell_type": "code",
45
+ "execution_count": null,
46
+ "metadata": {
47
+ "id": "XGZqVER_aQJW"
48
+ },
49
+ "outputs": [],
50
+ "source": [
51
+ "!cd ai-toolkit && git submodule update --init --recursive && pip install -r requirements.txt\n"
52
+ ]
53
+ },
54
+ {
55
+ "cell_type": "code",
56
+ "source": [
57
+ "import os\n",
58
+ "import sys\n",
59
+ "sys.path.append('/content/ai-toolkit')\n",
60
+ "from toolkit.job import run_job\n",
61
+ "from collections import OrderedDict\n",
62
+ "from PIL import Image"
63
+ ],
64
+ "metadata": {
65
+ "collapsed": false
66
+ },
67
+ "outputs": []
68
+ },
69
+ {
70
+ "cell_type": "markdown",
71
+ "source": [
72
+ "## Setup\n",
73
+ "\n",
74
+ "This is your config. It is documented pretty well. Normally you would do this as a yaml file, but for colab, this will work. This will run as is without modification, but feel free to edit as you want."
75
+ ],
76
+ "metadata": {
77
+ "id": "N8UUFzVRigbC"
78
+ }
79
+ },
80
+ {
81
+ "cell_type": "code",
82
+ "source": [
83
+ "from collections import OrderedDict\n",
84
+ "\n",
85
+ "job_to_run = OrderedDict({\n",
86
+ " # This is the config I use on my sliders, It is solid and tested\n",
87
+ " 'job': 'train',\n",
88
+ " 'config': {\n",
89
+ " # the name will be used to create a folder in the output folder\n",
90
+ " # it will also replace any [name] token in the rest of this config\n",
91
+ " 'name': 'detail_slider_v1',\n",
92
+ " # folder will be created with name above in folder below\n",
93
+ " # it can be relative to the project root or absolute\n",
94
+ " 'training_folder': \"output/LoRA\",\n",
95
+ " 'device': 'cuda', # cpu, cuda:0, etc\n",
96
+ " # for tensorboard logging, we will make a subfolder for this job\n",
97
+ " 'log_dir': \"output/.tensorboard\",\n",
98
+ " # you can stack processes for other jobs, It is not tested with sliders though\n",
99
+ " # just use one for now\n",
100
+ " 'process': [\n",
101
+ " {\n",
102
+ " 'type': 'slider', # tells runner to run the slider process\n",
103
+ " # network is the LoRA network for a slider, I recommend to leave this be\n",
104
+ " 'network': {\n",
105
+ " 'type': \"lora\",\n",
106
+ " # rank / dim of the network. Bigger is not always better. Especially for sliders. 8 is good\n",
107
+ " 'linear': 8, # \"rank\" or \"dim\"\n",
108
+ " 'linear_alpha': 4, # Do about half of rank \"alpha\"\n",
109
+ " # 'conv': 4, # for convolutional layers \"locon\"\n",
110
+ " # 'conv_alpha': 4, # Do about half of conv \"alpha\"\n",
111
+ " },\n",
112
+ " # training config\n",
113
+ " 'train': {\n",
114
+ " # this is also used in sampling. Stick with ddpm unless you know what you are doing\n",
115
+ " 'noise_scheduler': \"ddpm\", # or \"ddpm\", \"lms\", \"euler_a\"\n",
116
+ " # how many steps to train. More is not always better. I rarely go over 1000\n",
117
+ " 'steps': 100,\n",
118
+ " # I have had good results with 4e-4 to 1e-4 at 500 steps\n",
119
+ " 'lr': 2e-4,\n",
120
+ " # enables gradient checkpoint, saves vram, leave it on\n",
121
+ " 'gradient_checkpointing': True,\n",
122
+ " # train the unet. I recommend leaving this true\n",
123
+ " 'train_unet': True,\n",
124
+ " # train the text encoder. I don't recommend this unless you have a special use case\n",
125
+ " # for sliders we are adjusting representation of the concept (unet),\n",
126
+ " # not the description of it (text encoder)\n",
127
+ " 'train_text_encoder': False,\n",
128
+ "\n",
129
+ " # just leave unless you know what you are doing\n",
130
+ " # also supports \"dadaptation\" but set lr to 1 if you use that,\n",
131
+ " # but it learns too fast and I don't recommend it\n",
132
+ " 'optimizer': \"adamw\",\n",
133
+ " # only constant for now\n",
134
+ " 'lr_scheduler': \"constant\",\n",
135
+ " # we randomly denoise random num of steps form 1 to this number\n",
136
+ " # while training. Just leave it\n",
137
+ " 'max_denoising_steps': 40,\n",
138
+ " # works great at 1. I do 1 even with my 4090.\n",
139
+ " # higher may not work right with newer single batch stacking code anyway\n",
140
+ " 'batch_size': 1,\n",
141
+ " # bf16 works best if your GPU supports it (modern)\n",
142
+ " 'dtype': 'bf16', # fp32, bf16, fp16\n",
143
+ " # I don't recommend using unless you are trying to make a darker lora. Then do 0.1 MAX\n",
144
+ " # although, the way we train sliders is comparative, so it probably won't work anyway\n",
145
+ " 'noise_offset': 0.0,\n",
146
+ " },\n",
147
+ "\n",
148
+ " # the model to train the LoRA network on\n",
149
+ " 'model': {\n",
150
+ " # name_or_path can be a hugging face name, local path or url to model\n",
151
+ " # on civit ai with or without modelVersionId. They will be cached in /model folder\n",
152
+ " # epicRealisim v5\n",
153
+ " 'name_or_path': \"https://civitai.com/models/25694?modelVersionId=134065\",\n",
154
+ " 'is_v2': False, # for v2 models\n",
155
+ " 'is_v_pred': False, # for v-prediction models (most v2 models)\n",
156
+ " # has some issues with the dual text encoder and the way we train sliders\n",
157
+ " # it works bit weights need to probably be higher to see it.\n",
158
+ " 'is_xl': False, # for SDXL models\n",
159
+ " },\n",
160
+ "\n",
161
+ " # saving config\n",
162
+ " 'save': {\n",
163
+ " 'dtype': 'float16', # precision to save. I recommend float16\n",
164
+ " 'save_every': 50, # save every this many steps\n",
165
+ " # this will remove step counts more than this number\n",
166
+ " # allows you to save more often in case of a crash without filling up your drive\n",
167
+ " 'max_step_saves_to_keep': 2,\n",
168
+ " },\n",
169
+ "\n",
170
+ " # sampling config\n",
171
+ " 'sample': {\n",
172
+ " # must match train.noise_scheduler, this is not used here\n",
173
+ " # but may be in future and in other processes\n",
174
+ " 'sampler': \"ddpm\",\n",
175
+ " # sample every this many steps\n",
176
+ " 'sample_every': 20,\n",
177
+ " # image size\n",
178
+ " 'width': 512,\n",
179
+ " 'height': 512,\n",
180
+ " # prompts to use for sampling. Do as many as you want, but it slows down training\n",
181
+ " # pick ones that will best represent the concept you are trying to adjust\n",
182
+ " # allows some flags after the prompt\n",
183
+ " # --m [number] # network multiplier. LoRA weight. -3 for the negative slide, 3 for the positive\n",
184
+ " # slide are good tests. will inherit sample.network_multiplier if not set\n",
185
+ " # --n [string] # negative prompt, will inherit sample.neg if not set\n",
186
+ " # Only 75 tokens allowed currently\n",
187
+ " # I like to do a wide positive and negative spread so I can see a good range and stop\n",
188
+ " # early if the network is braking down\n",
189
+ " 'prompts': [\n",
190
+ " \"a woman in a coffee shop, black hat, blonde hair, blue jacket --m -5\",\n",
191
+ " \"a woman in a coffee shop, black hat, blonde hair, blue jacket --m -3\",\n",
192
+ " \"a woman in a coffee shop, black hat, blonde hair, blue jacket --m 3\",\n",
193
+ " \"a woman in a coffee shop, black hat, blonde hair, blue jacket --m 5\",\n",
194
+ " \"a golden retriever sitting on a leather couch, --m -5\",\n",
195
+ " \"a golden retriever sitting on a leather couch --m -3\",\n",
196
+ " \"a golden retriever sitting on a leather couch --m 3\",\n",
197
+ " \"a golden retriever sitting on a leather couch --m 5\",\n",
198
+ " \"a man with a beard and red flannel shirt, wearing vr goggles, walking into traffic --m -5\",\n",
199
+ " \"a man with a beard and red flannel shirt, wearing vr goggles, walking into traffic --m -3\",\n",
200
+ " \"a man with a beard and red flannel shirt, wearing vr goggles, walking into traffic --m 3\",\n",
201
+ " \"a man with a beard and red flannel shirt, wearing vr goggles, walking into traffic --m 5\",\n",
202
+ " ],\n",
203
+ " # negative prompt used on all prompts above as default if they don't have one\n",
204
+ " 'neg': \"cartoon, fake, drawing, illustration, cgi, animated, anime, monochrome\",\n",
205
+ " # seed for sampling. 42 is the answer for everything\n",
206
+ " 'seed': 42,\n",
207
+ " # walks the seed so s1 is 42, s2 is 43, s3 is 44, etc\n",
208
+ " # will start over on next sample_every so s1 is always seed\n",
209
+ " # works well if you use same prompt but want different results\n",
210
+ " 'walk_seed': False,\n",
211
+ " # cfg scale (4 to 10 is good)\n",
212
+ " 'guidance_scale': 7,\n",
213
+ " # sampler steps (20 to 30 is good)\n",
214
+ " 'sample_steps': 20,\n",
215
+ " # default network multiplier for all prompts\n",
216
+ " # since we are training a slider, I recommend overriding this with --m [number]\n",
217
+ " # in the prompts above to get both sides of the slider\n",
218
+ " 'network_multiplier': 1.0,\n",
219
+ " },\n",
220
+ "\n",
221
+ " # logging information\n",
222
+ " 'logging': {\n",
223
+ " 'log_every': 10, # log every this many steps\n",
224
+ " 'use_wandb': False, # not supported yet\n",
225
+ " 'verbose': False, # probably done need unless you are debugging\n",
226
+ " },\n",
227
+ "\n",
228
+ " # slider training config, best for last\n",
229
+ " 'slider': {\n",
230
+ " # resolutions to train on. [ width, height ]. This is less important for sliders\n",
231
+ " # as we are not teaching the model anything it doesn't already know\n",
232
+ " # but must be a size it understands [ 512, 512 ] for sd_v1.5 and [ 768, 768 ] for sd_v2.1\n",
233
+ " # and [ 1024, 1024 ] for sd_xl\n",
234
+ " # you can do as many as you want here\n",
235
+ " 'resolutions': [\n",
236
+ " [512, 512],\n",
237
+ " # [ 512, 768 ]\n",
238
+ " # [ 768, 768 ]\n",
239
+ " ],\n",
240
+ " # slider training uses 4 combined steps for a single round. This will do it in one gradient\n",
241
+ " # step. It is highly optimized and shouldn't take anymore vram than doing without it,\n",
242
+ " # since we break down batches for gradient accumulation now. so just leave it on.\n",
243
+ " 'batch_full_slide': True,\n",
244
+ " # These are the concepts to train on. You can do as many as you want here,\n",
245
+ " # but they can conflict outweigh each other. Other than experimenting, I recommend\n",
246
+ " # just doing one for good results\n",
247
+ " 'targets': [\n",
248
+ " # target_class is the base concept we are adjusting the representation of\n",
249
+ " # for example, if we are adjusting the representation of a person, we would use \"person\"\n",
250
+ " # if we are adjusting the representation of a cat, we would use \"cat\" It is not\n",
251
+ " # a keyword necessarily but what the model understands the concept to represent.\n",
252
+ " # \"person\" will affect men, women, children, etc but will not affect cats, dogs, etc\n",
253
+ " # it is the models base general understanding of the concept and everything it represents\n",
254
+ " # you can leave it blank to affect everything. In this example, we are adjusting\n",
255
+ " # detail, so we will leave it blank to affect everything\n",
256
+ " {\n",
257
+ " 'target_class': \"\",\n",
258
+ " # positive is the prompt for the positive side of the slider.\n",
259
+ " # It is the concept that will be excited and amplified in the model when we slide the slider\n",
260
+ " # to the positive side and forgotten / inverted when we slide\n",
261
+ " # the slider to the negative side. It is generally best to include the target_class in\n",
262
+ " # the prompt. You want it to be the extreme of what you want to train on. For example,\n",
263
+ " # if you want to train on fat people, you would use \"an extremely fat, morbidly obese person\"\n",
264
+ " # as the prompt. Not just \"fat person\"\n",
265
+ " # max 75 tokens for now\n",
266
+ " 'positive': \"high detail, 8k, intricate, detailed, high resolution, high res, high quality\",\n",
267
+ " # negative is the prompt for the negative side of the slider and works the same as positive\n",
268
+ " # it does not necessarily work the same as a negative prompt when generating images\n",
269
+ " # these need to be polar opposites.\n",
270
+ " # max 76 tokens for now\n",
271
+ " 'negative': \"blurry, boring, fuzzy, low detail, low resolution, low res, low quality\",\n",
272
+ " # the loss for this target is multiplied by this number.\n",
273
+ " # if you are doing more than one target it may be good to set less important ones\n",
274
+ " # to a lower number like 0.1 so they don't outweigh the primary target\n",
275
+ " 'weight': 1.0,\n",
276
+ " },\n",
277
+ " ],\n",
278
+ " },\n",
279
+ " },\n",
280
+ " ]\n",
281
+ " },\n",
282
+ "\n",
283
+ " # You can put any information you want here, and it will be saved in the model.\n",
284
+ " # The below is an example, but you can put your grocery list in it if you want.\n",
285
+ " # It is saved in the model so be aware of that. The software will include this\n",
286
+ " # plus some other information for you automatically\n",
287
+ " 'meta': {\n",
288
+ " # [name] gets replaced with the name above\n",
289
+ " 'name': \"[name]\",\n",
290
+ " 'version': '1.0',\n",
291
+ " # 'creator': {\n",
292
+ " # 'name': 'your name',\n",
293
+ " # 'email': '[email protected]',\n",
294
+ " # 'website': 'https://your.website'\n",
295
+ " # }\n",
296
+ " }\n",
297
+ "})\n"
298
+ ],
299
+ "metadata": {
300
+ "id": "_t28QURYjRQO"
301
+ },
302
+ "execution_count": null,
303
+ "outputs": []
304
+ },
305
+ {
306
+ "cell_type": "markdown",
307
+ "source": [
308
+ "## Run it\n",
309
+ "\n",
310
+ "Below does all the magic. Check your folders to the left. Items will be in output/LoRA/your_name_v1 In the samples folder, there are preiodic sampled. This doesnt work great with colab. Ill update soon."
311
+ ],
312
+ "metadata": {
313
+ "id": "h6F1FlM2Wb3l"
314
+ }
315
+ },
316
+ {
317
+ "cell_type": "code",
318
+ "source": [
319
+ "run_job(job_to_run)\n"
320
+ ],
321
+ "metadata": {
322
+ "id": "HkajwI8gteOh"
323
+ },
324
+ "execution_count": null,
325
+ "outputs": []
326
+ },
327
+ {
328
+ "cell_type": "markdown",
329
+ "source": [
330
+ "## Done\n",
331
+ "\n",
332
+ "Check your ourput dir and get your slider\n"
333
+ ],
334
+ "metadata": {
335
+ "id": "Hblgb5uwW5SD"
336
+ }
337
+ }
338
+ ]
339
+ }
optimizer.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3f2a6938c50ea1e94ab65f006c34141c55a64d5f2de742ba8a2366f54170065b
3
+ size 175676612
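
optimizer.pt and the sample JPEGs below are committed as Git LFS pointers, so a clone without LFS only fetches the small pointer files shown here (git lfs pull materializes them). As an alternative, a sketch for grabbing a single artifact through huggingface_hub; the repo_id is a hypothetical placeholder for wherever this repository lives on the Hub:

from huggingface_hub import hf_hub_download

# repo_id is hypothetical; substitute the actual Hub repository for this commit
local_path = hf_hub_download(
    repo_id='AiAF/Psycheswings-Flux1',
    filename='optimizer.pt',
)
print('optimizer state downloaded to', local_path)
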
samples/1733773241950__000000000_0.jpg ADDED

Git LFS Details

  • SHA256: 8f3fe7b1cdf5ac1c04a89d2cd7e577b227e2a17b72b3489bbce7bd06dbdee8df
  • Pointer size: 130 Bytes
  • Size of remote file: 60.1 kB
samples/1733773260524__000000000_1.jpg ADDED

Git LFS Details

  • SHA256: 61602bac161fc467feb5284b35b03057498044755beceb694ffbe1fab079648e
  • Pointer size: 131 Bytes
  • Size of remote file: 118 kB
samples/1733773279091__000000000_2.jpg ADDED

Git LFS Details

  • SHA256: dc20d53c9f28d9c0d399b80428902fac4a1515fb1a5676b6ffd29d91da8ff9ec
  • Pointer size: 130 Bytes
  • Size of remote file: 89.6 kB
samples/1733773297647__000000000_3.jpg ADDED

Git LFS Details

  • SHA256: 435d03454449f71823914b198ba006eedf43fb0ea311fb48e7f463e72e1e873f
  • Pointer size: 130 Bytes
  • Size of remote file: 87.1 kB
samples/1733773316204__000000000_4.jpg ADDED

Git LFS Details

  • SHA256: 84e0264352d5e946b5ba3bc406b8febd5e7225cae8b5e5d11a27870625ddf442
  • Pointer size: 130 Bytes
  • Size of remote file: 74.7 kB
samples/1733773334761__000000000_5.jpg ADDED

Git LFS Details

  • SHA256: e71a4a6bce827691842ae60ddb47b585cadcb5eea6c13fe204c95880cb671876
  • Pointer size: 130 Bytes
  • Size of remote file: 80.9 kB
samples/1733773854236__000000250_0.jpg ADDED

Git LFS Details

  • SHA256: abe989665717a6f5d37cd8d87952233f43d3a6a346f336eb872f794b0f5d3f0e
  • Pointer size: 130 Bytes
  • Size of remote file: 49 kB
samples/1733773872815__000000250_1.jpg ADDED

Git LFS Details

  • SHA256: c2e402f61495c86d6c195d166542121980f5756162f19190f893e7c7701cd2d3
  • Pointer size: 131 Bytes
  • Size of remote file: 129 kB
samples/1733773891384__000000250_2.jpg ADDED

Git LFS Details

  • SHA256: 962e251f27d9343f542038fb6b4e4937777d5fce8dcc6f1515c7471154def307
  • Pointer size: 130 Bytes
  • Size of remote file: 85.2 kB
samples/1733773909950__000000250_3.jpg ADDED

Git LFS Details

  • SHA256: 721b65cd42ac405049655553a2529c5af734bd0da984db71bea78e6da7bf95c3
  • Pointer size: 130 Bytes
  • Size of remote file: 91.2 kB
samples/1733773928516__000000250_4.jpg ADDED

Git LFS Details

  • SHA256: d0d61638c383d1fc1146333985e5f6e5959124bb4664c36aaaff13a528cc30d5
  • Pointer size: 130 Bytes
  • Size of remote file: 79.5 kB
samples/1733773947082__000000250_5.jpg ADDED

Git LFS Details

  • SHA256: f15f916d96ee88e2321fcfa9cc4fd9423155488c0c483c532a87be71623259ca
  • Pointer size: 130 Bytes
  • Size of remote file: 83.7 kB
samples/1733774468334__000000500_0.jpg ADDED

Git LFS Details

  • SHA256: 3874fbb0892d71141d0e95beb0d780ab0f63ba74a980f0e9d92af2f1dd3944b9
  • Pointer size: 130 Bytes
  • Size of remote file: 52.7 kB
samples/1733774486925__000000500_1.jpg ADDED

Git LFS Details

  • SHA256: c4159ee37c7ede7440ae5523ee497c2be4abd55f1b7ca89bb8cd814e4a32acf7
  • Pointer size: 131 Bytes
  • Size of remote file: 107 kB
samples/1733774505504__000000500_2.jpg ADDED

Git LFS Details

  • SHA256: bc3da5bb5ec3d926bfc98e580541af39da070f4342898b53690215d91bf68c2f
  • Pointer size: 130 Bytes
  • Size of remote file: 78.4 kB
samples/1733774524087__000000500_3.jpg ADDED

Git LFS Details

  • SHA256: ec83197dadef40cae0311f068ce93fae0ba9e4a74f6551e736ee6fa7d8c8b5f9
  • Pointer size: 130 Bytes
  • Size of remote file: 87.2 kB
samples/1733774542663__000000500_4.jpg ADDED

Git LFS Details

  • SHA256: 7c30e1c322a05223ea6f69707cd436d87a3c05c59469b6ed9e642cc3a71440ef
  • Pointer size: 130 Bytes
  • Size of remote file: 55.3 kB
samples/1733774561257__000000500_5.jpg ADDED

Git LFS Details

  • SHA256: 7654f7ca9d2576d6355e305a64803791ca583036b7507bb43d76122aac785b88
  • Pointer size: 130 Bytes
  • Size of remote file: 80.7 kB
samples/1733775084642__000000750_0.jpg ADDED

Git LFS Details

  • SHA256: acf98a1c4dcbdd687d2f852350232bd3676e4de13143082352131191dda4c6c4
  • Pointer size: 130 Bytes
  • Size of remote file: 57.4 kB
samples/1733775103231__000000750_1.jpg ADDED

Git LFS Details

  • SHA256: 660eb5f1a40da42783d8d3013db45ce1fd8a429d231fc82aeecbe7ef45e6f99f
  • Pointer size: 131 Bytes
  • Size of remote file: 120 kB
samples/1733775121813__000000750_2.jpg ADDED

Git LFS Details

  • SHA256: 1ebb9889984f39a2f00a22ed4f11af0f02efea46d50562407e0c376e237aa6c8
  • Pointer size: 130 Bytes
  • Size of remote file: 84.4 kB
samples/1733775140394__000000750_3.jpg ADDED

Git LFS Details

  • SHA256: a9dabd17f05f97570e225abb44fcca0acba515f50526041110ac6555a85867d0
  • Pointer size: 130 Bytes
  • Size of remote file: 95.2 kB
samples/1733775158977__000000750_4.jpg ADDED

Git LFS Details

  • SHA256: 4d2db4d7a765336fabdb12a811b19d58486cd2ac2992f557d160959da9af5ec6
  • Pointer size: 130 Bytes
  • Size of remote file: 65.3 kB
samples/1733775177573__000000750_5.jpg ADDED

Git LFS Details

  • SHA256: ed12b218f26d9ef67deec8e0df91b9ac3cc638236d81650712ba05bbe7de3435
  • Pointer size: 130 Bytes
  • Size of remote file: 83.2 kB
samples/1733775697559__000001000_0.jpg ADDED

Git LFS Details

  • SHA256: aa7a4c3a2bf6b9dc0f525d143dd41162f03ed57a01986c3862a3fb4985c6c327
  • Pointer size: 130 Bytes
  • Size of remote file: 57.1 kB
samples/1733775716149__000001000_1.jpg ADDED

Git LFS Details

  • SHA256: d5cdf3227001168c77dc432a19024fb983fa4a0f9ac0b0a1cc069ada032d2aa4
  • Pointer size: 131 Bytes
  • Size of remote file: 104 kB
samples/1733775734740__000001000_2.jpg ADDED

Git LFS Details

  • SHA256: 7f466d7e8021d72dd0bf1df599e57df3823f14da195c2f2811f10ae1f174b90e
  • Pointer size: 130 Bytes
  • Size of remote file: 77.7 kB
samples/1733775753330__000001000_3.jpg ADDED

Git LFS Details

  • SHA256: 82ac2251553d11c2f2ceb361189cbc80fba030eb1584a296a6f40d99b1a55d32
  • Pointer size: 131 Bytes
  • Size of remote file: 100 kB
samples/1733775771915__000001000_4.jpg ADDED

Git LFS Details

  • SHA256: db062a47ccfca40cd76a31c8f75d576209d48798625311962ab1536ef74ad27b
  • Pointer size: 130 Bytes
  • Size of remote file: 64 kB
samples/1733775790516__000001000_5.jpg ADDED

Git LFS Details

  • SHA256: 7fdcd9a7e7a3a7336efe7309d7740798c477f72f06a7a9385d259bbb7e3dd316
  • Pointer size: 130 Bytes
  • Size of remote file: 85.5 kB
samples/1733776314928__000001250_0.jpg ADDED

Git LFS Details

  • SHA256: ae9a228b46581633f2f3e98fc97fda76466f388f85b33d39ecb8e86abc9ddb91
  • Pointer size: 130 Bytes
  • Size of remote file: 55.7 kB
samples/1733776333525__000001250_1.jpg ADDED

Git LFS Details

  • SHA256: fa77c20b0ce2e64478050f48b60225562520a3b9e6a3fcf88fff01aa7041268c
  • Pointer size: 131 Bytes
  • Size of remote file: 103 kB
samples/1733776352107__000001250_2.jpg ADDED

Git LFS Details

  • SHA256: 5272c0e35ec0192b7f6ccc1f3bf8b6d19ad20d794fe139030f2f5c8a25f89f66
  • Pointer size: 130 Bytes
  • Size of remote file: 78.8 kB
samples/1733776370700__000001250_3.jpg ADDED

Git LFS Details

  • SHA256: 136db8036d870828312cd2f58dd9b9dab1aad21410d564407a4e5b9eb6a7b8c3
  • Pointer size: 130 Bytes
  • Size of remote file: 90.1 kB
samples/1733776389287__000001250_4.jpg ADDED

Git LFS Details

  • SHA256: b6b228953d5b73745e481c5a194068af3aa2ad662518567388591df3e0b052e5
  • Pointer size: 130 Bytes
  • Size of remote file: 57 kB
samples/1733776407876__000001250_5.jpg ADDED

Git LFS Details

  • SHA256: 9977feb855c88252e3bc17941e6882237bc4090e0546b8927c1a38bbac94369c
  • Pointer size: 130 Bytes
  • Size of remote file: 78.6 kB
samples/1733776931659__000001500_0.jpg ADDED

Git LFS Details

  • SHA256: 2d0d18688f4d864b8e8fa8f72c733947d118064744b8b2428c57bcfc28cee190
  • Pointer size: 130 Bytes
  • Size of remote file: 60.1 kB
samples/1733776950250__000001500_1.jpg ADDED

Git LFS Details

  • SHA256: e91e76dd93fe973392a688355e1e9b1aa2e9e933c58f4ec49ebd16a348127ddd
  • Pointer size: 131 Bytes
  • Size of remote file: 110 kB
samples/1733776968841__000001500_2.jpg ADDED

Git LFS Details

  • SHA256: 903b8dc9cb8eefda2c89e56cde960358c5c044de2f217053bb93023b7377d067
  • Pointer size: 130 Bytes
  • Size of remote file: 84.3 kB
samples/1733776987424__000001500_3.jpg ADDED

Git LFS Details

  • SHA256: ff9467ac0cc87e2f44318ceb25d50172f9d21b69055b67b73a2fd6a7225158ef
  • Pointer size: 130 Bytes
  • Size of remote file: 87.4 kB
samples/1733777006010__000001500_4.jpg ADDED

Git LFS Details

  • SHA256: d60d5032a4c937d03332b958e48f1b57714957d9f217316cf41be77d05d5a599
  • Pointer size: 130 Bytes
  • Size of remote file: 61.8 kB
samples/1733777024585__000001500_5.jpg ADDED

Git LFS Details

  • SHA256: 61fff0d335edcd7aea8ebfcb0330424e5eeb02f7c3a3257a3aad36d6c145ef67
  • Pointer size: 130 Bytes
  • Size of remote file: 77.6 kB