vcm07 committed on
Commit
a5a2c99
·
1 Parent(s): 29bf365

Create README.md

  1. README.md +297 -0
README.md ADDED
@@ -0,0 +1,297 @@
1
+ How to set up the model
2
+ =======================
3
+
4
+ Thank you for downloading and trying the Tokkyu Model. You need to do a quick setup to get the best results. Here is a quick visual guide with minimal text. The most important part of the process is configuring the Aesthetic Gradients correctly.
5
+
6
+ Load the model and try a few blank gens
7
+ ---------------------------------------
8
+
9
+ ![](images/How_to_setup_the_model/media_1669521054021.png)
10
+
11
+ 1 - Set the Sampling Steps slider to 25. You won't need to change it again for the rest of the guide.
12
+ 2 - Press Generate after loading the model, leaving the prompt blank.
13
+ 3 - A bad gen should appear. That is expected at this stage.
14
+
15
+ Use the token (tokki girl)
16
+ --------------------------
17
+
18
+ ![](images/How_to_setup_the_model/media_1669521599750.png)
19
+
20
+ 1 - Write (tokki girl) just like in the picture. There is no need to increase the strength, but if you want to, I recommend (tokki girl:1.2). After a lot of testing, I recommend that you keep the token in parentheses. Without the parentheses, a photograph of a girl will show up, because the model confuses the token with a "photo of a girl" type of prompt.
21
+ 2 - Press Generate
22
+ 3 - A chibi type of generation might appear. If a photograph shows up instead, no problem: just generate another 3-12 times.
23
+
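As a rough illustration of what the parentheses do, here is a minimal sketch of the emphasis arithmetic, assuming the usual Auto1111 convention that each pair of parentheses multiplies attention by 1.1 and that an explicit `:weight` replaces the innermost factor. The helper name `emphasis_weight` is hypothetical and only handles a single, fully wrapped token.

```python
import re

def emphasis_weight(token: str) -> float:
    """Approximate attention multiplier for one parenthesised prompt token."""
    depth = 0
    # Peel off matching outer parentheses, counting how many layers wrap the text.
    while token.startswith("(") and token.endswith(")"):
        token = token[1:-1]
        depth += 1
    # An explicit weight such as "tokki girl:1.2" replaces the innermost 1.1 factor.
    match = re.fullmatch(r"(.+):([0-9]*\.?[0-9]+)", token)
    if match and depth > 0:
        return float(match.group(2)) * 1.1 ** (depth - 1)
    return 1.1 ** depth

for t in ["tokki girl", "(tokki girl)", "(tokki girl:1.2)", "((ugly))"]:
    print(f"{t!r}: {emphasis_weight(t):.3f}")
```

Under these assumptions, (tokki girl) is a mild boost and (tokki girl:1.2) a slightly stronger one, which matches the recommendation above to increase the strength only a little.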
24
+ Add the negative prompt
25
+ -----------------------
26
+
27
+ ![](images/How_to_setup_the_model/media_1669522767753.png)
28
+
29
+ 1 - Add the negative prompt.
30
+ 2 - Press Generate (try 3-12 generations to start getting other types of images)
31
+ 3 - The picture got a lot better.
32
+
33
+ You will notice that the quality of the chibi image gets much better. You can use your own negative prompts or pick one from a list. I recommend the following negative prompt:
34
+
35
+ ((Keith Haring)), (((3d))), lowres, (((((photography))))), (((((((photorealism:1.09))))))), (((((((photo))))))), bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, ((poorly\_drawn\_face)), ((poorly drawn hands)), ((poorly drawn feet)), fat, (disfigured), ((out of frame)), (((long neck))), (big ears), (((poo art))), ((((tiling)))), ((bad hands)), (bad art), (((penis))), (((mutation))), (((deformed))), ((ugly)), cloned face, (missing lips), ((ugly face)), blurry, undefined, rough, extra limbs, mutated hands, bad anatomy, bad proportions, weird hands, disproportionate, disfigured, face out of frame, poorly drawn face, ((morbid)), ((mutilated)), out of frame, extra fingers, mutated hands, ((poorly drawn hands)), ((poorly drawn face)), (((mutation))), (((deformed))), ((ugly)), blurry, ((bad anatomy)), (((bad proportions))), ((extra limbs)), cloned face, (((disfigured))), (((more than 2 nipples))), out of frame, ugly, extra limbs, (bad anatomy), gross proportions, (malformed limbs), extra arms, extra legs, mutated hands, (fused fingers), (too many fingers), lipstick, nipples, (((blush))), (((long neck))), ((flat chest)), ((poorly drawn eyes))
36
+
37
+ Try a few more generations (3-12) and you will get more types of portraits.
38
+
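If you assemble your own negative prompt from several lists, repeated and empty entries creep in easily. The following is a minimal sketch of a cleanup helper; `dedupe_prompt` is a hypothetical name, and the comma split is naive (it treats differently weighted variants such as `bad anatomy` and `(bad anatomy)` as distinct tokens).

```python
def dedupe_prompt(prompt: str) -> str:
    """Drop empty and repeated comma-separated tokens, keeping the first occurrence."""
    seen = set()
    kept = []
    for token in prompt.split(","):
        token = token.strip()
        key = token.lower()
        if not token or key in seen:
            continue
        seen.add(key)
        kept.append(token)
    return ", ".join(kept)

sample = "blurry, bad anatomy, ((ugly)), blurry, , bad anatomy, (bad anatomy)"
print(dedupe_prompt(sample))
```

Whether removing repeats changes the results is something to test for yourself; the sketch only shortens the text.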
39
+ If you want to download negative prompts or make your own, try:
40
+ [here](https://pastes.io/x9crpin0pq) and [here](https://www.reddit.com/r/WaifuDiffusion/comments/yrpovu/img2img_from_my_own_loose_sketch/), or just use the Negative prompt section of ptsearch.info/, as we will see in the next sections.
41
+
42
+ Adapting and using a proper prompt
43
+ ----------------------------------
44
+
45
+ ![](images/How_to_setup_the_model/media_1669524609240.png)
46
+
47
+ Go to [https://www.ptsearch.info/home/](https://www.ptsearch.info/home/)
48
+
49
+ ![](images/How_to_setup_the_model/media_1669524768803.png)
50
+
51
+ Select the prompt that you want. For the purpose of this guide we will select the one from: [https://www.ptsearch.info/articles/detail/150140](https://www.ptsearch.info/articles/detail/150140)
52
+ 1 - Select the prompt
53
+ 2 - Click on WEBUI
54
+
55
+ ![](images/How_to_setup_the_model/media_1669525021708.png)
56
+
57
+ 1 - Paste the prompt and start adapting it. Remove any artists so that you get only the Tokkyu style. Replace 1girl, woman, girl, etc. with (tokki girl) or (tokki girl:1.2), and place it at the beginning if possible, because the order of the tokens matters. Remove anything else you think is not worth keeping. The shorter the prompt, the better.
58
+
59
+ My final result is:
60
+
61
+ (tokki girl), (Masterpiece:1.1), (best quality:1.1), ((chromatic aberration)), ((caustic)), dynamic angle, ((ultra-detailed)), (illustration:1.1), ((disheveled hair)), beautiful detailed glow, detailed, Cinematic light, intricate detail, highres, ((lomography, full body portrait photo of women like reol holding dressed like witches in a library, open books, wearing an ornate robe and choker, moody, realistic, dark, skin tinted a warm tone, light blue filter, hdr, rounded eyes, detailed facial features, gold black, Wizard, (floating palaces:1.2), magic, black lightning magic, )), trending on ArtStation Pixiv, high detail, sharp focus, smooth, aesthetic, rule of thirds,
62
+
63
+ Some notes about the adaptation:
64
+
65
+ \-I removed (1girl:1.2), as (tokki girl) is at the beginning.
66
+ \-I removed anime, as the model already pulls anime art. The anime token can ruin the style and bring back the standard SD anime style.
67
+ \-I removed "by alphonse mucha", as we want the pure Tokkyu style.
68
+
69
+ You could also remove "trending on ArtStation Pixiv" if you want a purer Tokkyu style.
70
+
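The adaptation steps above can be sketched in code. This is only an illustration under stated assumptions: the helper names (`bare`, `adapt_prompt`) and the token sets are hypothetical, the comma split is naive, and nested groups that contain commas are passed through untouched rather than parsed.

```python
import re

SUBJECT_TOKENS = {"1girl", "girl", "woman"}  # generic subject tags to swap out
STYLE_TOKENS = {"anime"}                      # style tags the guide removes

def bare(token: str) -> str:
    """Strip emphasis parentheses and an optional :weight suffix."""
    text = token.strip().strip("()").strip()
    return re.sub(r":[0-9.]+$", "", text).strip().lower()

def adapt_prompt(prompt: str, trigger: str = "(tokki girl)") -> str:
    """Put the trigger token first and drop subject, style and artist tokens."""
    kept = []
    for token in prompt.split(","):
        if not token.strip():
            continue  # skip empty entries
        name = bare(token)
        if name in SUBJECT_TOKENS or name in STYLE_TOKENS:
            continue
        if name.startswith("by "):  # artist credit such as "by alphonse mucha"
            continue
        kept.append(token.strip())
    return ", ".join([trigger] + kept)

original = "(1girl:1.2), anime, (best quality:1.1), by alphonse mucha, highres"
print(adapt_prompt(original))
```

The final pruning ("remove anything else you think is not worth it") is a judgment call and is best done by hand.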
71
+ ![](images/How_to_setup_the_model/media_1669526399014.png)
72
+
73
+ You should be getting this type of generation already.
74
+
75
+ ![](images/How_to_setup_the_model/media_1669526607659.png)
76
+
77
+ 1 - Scroll down
78
+ 2 - Set the Width and Height sliders to 768 x 768. The model was trained natively at 768 x 768 with variable aspect ratio buckets and a batch size of 6 on an A100, so you should use 768 x 768 for all generations. The quality will improve a lot.
79
+
80
+ Generate a few more images to test the quality.
81
+
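If you prefer to script your generations instead of using the sliders, the settings used so far (25 steps, 768 x 768) can be collected in one place. This is a minimal sketch that only builds the settings as a dictionary; the function name is hypothetical, and the field names assume the JSON body accepted by the Auto1111 `/sdapi/v1/txt2img` endpoint when the web UI is started with the `--api` flag.

```python
import json

def build_txt2img_payload(prompt: str, negative_prompt: str = "") -> dict:
    """Collect the guide's recommended settings into one request body."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": 25,    # Sampling Steps from the first section
        "width": 768,   # native training resolution
        "height": 768,
    }

payload = build_txt2img_payload("(tokki girl), highres", "lowres, blurry")
print(json.dumps(payload, indent=2))
```

The sketch does not send anything. Posting the payload to a local web UI is left out on purpose, since the host, the port and the flags you launched with are specific to your setup.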
82
+ ![](images/How_to_setup_the_model/img_guide.png)
83
+
84
+ Just by changing the slider to 768 x 768 we went from this
85
+
86
+ ![](images/How_to_setup_the_model/download__30_.png)
87
+
88
+ ![](images/How_to_setup_the_model/download__28_.png)
89
+
90
+ ![](images/How_to_setup_the_model/download__29_.png)
91
+
92
+ to this. It gets the style. You could stop here and have fun with your quality generations, but let's make it even better.
93
+
94
+ You don't need any VAEs or Hypernetworks, but if you want to add them, I recommend that you do it now. A VAE won't improve the eyes much, and it will pull the style a little towards the VAE's own eye style. If you want one, I recommend kl-f8-anime.vae from WD 1.4, which makes the eyes a little better. Among Hypernetworks, the NAI group makes little difference, although it moves the results closer to the NAI anime types. In my testing neither improves much. The Aesthetic Gradient is more important here.
95
+
96
+ Download and install the Tokkyu Aesthetic Gradient
97
+ --------------------------------------------------
98
+
99
+ ![](images/How_to_setup_the_model/media_1669529653626.png)
100
+
101
+ 1 - Install the Aesthetic Gradients extension in Auto1111 (use Install from URL in the Extensions tab), download the .pt file from HF, and put it in the /extensions/stable-diffusion-webui-aesthetic-gradients directory.
102
+ 2 - Check if it is installed.
103
+ 3 - Go back to the txt2img tab
104
+
105
+ The Tokkyu Aesthetic Gradient was trained on the same images as the model itself, and it is meant to reinforce the style even more. With it, direct txt2img generations get almost on par with the originals used for training the model.
106
+
107
+ ![](images/How_to_setup_the_model/media_1669531063424.png)
108
+
109
+ Scroll down to the bottom
110
+
111
+ ![](images/How_to_setup_the_model/media_1669531267050.png)
112
+
113
+ 1 - Click on the "Open for Clip Aesthetic" tab
114
+
115
+ ![](images/How_to_setup_the_model/media_1669531511340.png)
116
+
117
+ 1 - Set the steps to 25
118
+ 2 - Select the tokkiu\_pure Aesthetic Gradient. If it doesn't show up in the list, click on the square button.
119
+ 3 - Add 66 zeros to the left of the number 1. Or just paste this into the box: 0.0000000000000000000000000000000000000000000000000000000000000000000001
120
+ 4 - Generate!
121
+
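Typing that many zeros by hand is error-prone, so here is a minimal sketch that builds the value instead. It assumes the learning rate box starts at `0.0001` (an assumption about the extension's default, not something stated in this guide); adding 66 zeros to the left of the 1 then gives a value equal to 1e-70. The helper name `add_zeros` is hypothetical.

```python
from decimal import Decimal

def add_zeros(value: str, extra_zeros: int) -> str:
    """Insert extra zeros right after the decimal point of a value like '0.0001'."""
    whole, frac = value.split(".")
    return f"{whole}.{'0' * extra_zeros}{frac}"

default_lr = "0.0001"  # assumed starting value in the box
lr = add_zeros(default_lr, 66)
print(lr)
print(Decimal(lr) == Decimal("1e-70"))
```

If your box starts from a different value, compare the printed string with the picture before generating.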
122
+ And BOOM! You go from this:
123
+ ---------------------------
124
+
125
+ ![](images/How_to_setup_the_model/download__34_.png)
126
+
127
+ ![](images/How_to_setup_the_model/download__33_.png)
128
+
129
+ From this, without the Aesthetic Gradient
130
+
131
+ To this:
132
+ --------
133
+
134
+ ![](images/How_to_setup_the_model/download__35_.png)
135
+
136
+ ![](images/How_to_setup_the_model/download__36_.png)
137
+
138
+ ![](images/How_to_setup_the_model/download__37_.png)
139
+
140
+ ![](images/How_to_setup_the_model/download__38_.png)
141
+
142
+ ![](images/How_to_setup_the_model/download__40_.png)
143
+
144
+ ![](images/How_to_setup_the_model/download__41_.png)
145
+
146
+ ...with the Aesthetic Gradients. You get the style almost as if the artist himself had painted your generation. A few things are worth noting:
147
+
148
+ \-It nailed the style. I compared it with the training images and it's almost identical.
149
+ \-It got creative. She has a golden witch hat, the magic light comes out of the book and goes out of the window, and the hair and clothes have ornaments that easily rival the quality you see in Midjourney. And this is txt2img, in a model that was trained on plain SD 1.5. The lighting truly is cinematic. Tokkyu's style tends toward realism, and the AG got it.
150
+ \-It improved prompt recognition. She is wearing the hat and the black/golden choker, which the non-AG generations didn't get.
151
+
152
+ ![](images/How_to_setup_the_model/media_1669533838379.png)
153
+
154
+ If you want to know how the Aesthetic Gradient settings should look for your generations, look at the bottom of the page in Auto1111. They should match this picture. It is really important that the Aesthetic LR is exactly as in the picture, 1 epoch at 70 repeats. You can change it if you want to run more tests, but it is not necessary.
155
+
156
+ Poses
157
+ -----
158
+
159
+ The model was trained on some poses that mimic the originals. The main two are standing and sitting. Here are some pose modifiers that were trained:
160
+
161
+ \-sitting on the ground
162
+ \-sitting on the floor
163
+ \-sits on a finger
164
+ \-sitting on a chair
165
+ \-sitting at a table
166
+ \-sitting on a staircase
167
+ \-sitting on a couch
168
+ \-sitting on a bench
169
+ \-sitting on a chair
170
+ \-sitting on top of a couch
171
+ \-sitting at a piano
172
+ \-sitting on a broom
173
+ \-sitting on top of a wooden chair
174
+ \-sitting on a window sill
175
+ \-kneeling on the ground
176
+ \-standing/standing in the
177
+ \-standing on a pier
178
+ \-standing in a field
179
+ \-standing in the rain
180
+ \-standing in front of a window
181
+ \-walking barefoot
182
+ \-walking in
183
+ \-laying on a bed
184
+
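To try several of these poses against the same prompt, you can generate the variants programmatically. This is a minimal sketch: the helper name `with_pose` is hypothetical, and placing the pose right after the (tokki girl) token is an assumption based on the earlier note that token order matters.

```python
POSES = [
    "sitting on the ground",
    "sitting on a chair",
    "standing in a field",
    "laying on a bed",
]

def with_pose(base_prompt: str, pose: str, trigger: str = "(tokki girl)") -> str:
    """Return the prompt with the pose modifier placed right after the trigger token."""
    rest = base_prompt.strip()
    if rest.startswith(trigger):
        # Remove the trigger so it is not repeated in the result.
        rest = rest[len(trigger):].lstrip(", ")
    parts = [trigger, pose] + ([rest] if rest else [])
    return ", ".join(parts)

for pose in POSES:
    print(with_pose("(tokki girl), highres, sharp focus", pose))
```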
185
+ Sitting on the ground examples:
186
+ -------------------------------
187
+
188
+ ![](images/How_to_setup_the_model/download__60_.png)
189
+
190
+ In the prompt I just wrote "sitting on the ground", and then I changed it to "books on the floor".
191
+
192
+ ![](images/How_to_setup_the_model/download__61_.png)
193
+
194
+ ![](images/How_to_setup_the_model/download__69_.png)
195
+
196
+ Artists
197
+ -------
198
+
199
+ ![](images/How_to_setup_the_model/tmpz_32s934.png)
200
+
201
+ If you wish to combine the Tokkyu style with other artists, you can do it directly in the model, without the need for merges or TI.
202
+ The generation above was made using wlop in the prompt. WLOP and Tokkyu share a preference for realism in their art, and the two go really well together. You can get wonderful results just by adding the name of an artist that the model was trained on. Here is a list of the artists that significantly modify the style just by being put in the prompt:
203
+
204
+ by Muqi (second-most weight in the model)
205
+ by Lü Ji (third-most weight in the model)
206
+ by Pu Hua (most weight in the model; better for backgrounds)
207
+ by Kaburagi Kiyokata
208
+ by Sengai
209
+ by Kobayashi Kiyochika
210
+ featured on pixiv
211
+ by Yuumei
212
+ junko enoshima (generates ponytail girls)
213
+ narumi kakinouchi
214
+ cgsociety
215
+ by Jin Homura
216
+ as a tarot card
217
+ pixiv contest winner
218
+ by WLOP
219
+ featured on pixiv
220
+ by Takeuchi Seihō
221
+
222
+ Use each entry exactly as shown. For WLOP, for example, put this in the prompt: by WLOP
223
+
224
+ Merges and TI
225
+ -------------
226
+
227
+ As of this writing, I haven't tested merging the model with other models yet. For merging, I recommend the inverse sigmoid interpolation for precision. It was removed from Auto1111, so you have to calculate it yourself. For an inverse sigmoid of 0.8:
228
+
229
+ Formula: 0.5 - sin(arcsin(1 - 2 \* 0.8) / 3)
230
+
231
+ Result: 0.71285927
232
+
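If you would rather compute this locally than in an online calculator, the formula above translates directly into a few lines of Python. This is a minimal sketch; the function name `inverse_sigmoid` is my own label for the formula, and the input is the merge amount between 0 and 1.

```python
import math

def inverse_sigmoid(alpha: float) -> float:
    """Evaluate 0.5 - sin(arcsin(1 - 2 * alpha) / 3) for a merge amount alpha."""
    return 0.5 - math.sin(math.asin(1.0 - 2.0 * alpha) / 3.0)

# For alpha = 0.8 this should agree with the result quoted above.
print(inverse_sigmoid(0.8))
```

You would then enter the printed value as the multiplier in a plain weighted-sum merge.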
233
+ You can paste the formula above into this calculator: https://www.mathway.com/Calculus
234
+
235
+ More instructions at:
236
+ www.reddit.com/r/StableDiffusion/comments/y0rt9m/checkpoint\_merger\_comparison\_automatic1111/ and https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/c250cb289c97fe303cef69064bf45899406f6a40
237
+
238
+ I haven't tested TI. You gotta test it yourself.
239
+
240
+ Troubleshooting
241
+ ---------------
242
+
243
+ ![](images/How_to_setup_the_model/download__70_.png)
244
+
245
+ Sometimes you will get images with distortions, such as several legs or arms. The solution for most of these extra limbs is still to inpaint them out.
246
+
247
+ ![](images/How_to_setup_the_model/download__52_.png)
248
+
249
+ This was an excellent gen with a difficult pose. It is worth inpainting.
250
+
251
+ Chibi Distortion
252
+ ----------------
253
+
254
+ ![](images/How_to_setup_the_model/tmpyaagiuor.png)
255
+
256
+ Remember the chibi at the beginning? Sometimes the model will create a high-quality chibi, distorting the style but keeping some of its elements. It wasn't supposed to do that, but it happens. An adorable accident. It's probably related to some of the aspect ratio variables.
257
+
258
+ High Resolution Distortions
259
+ ---------------------------
260
+
261
+ ![](images/How_to_setup_the_model/download__71_.png)
262
+
263
+ A solution for cloning and other distortions caused by high resolution is to use the highresfix, especially if you use a height of 1024 or 1216. I avoid using highresfix, and you won't need it at 768 x 768, but sometimes these distortions happen.
264
+
265
+ Photographs
266
+ -----------
267
+
268
+ Sometimes, depending on your prompt, photos will appear. They usually appear when the prompt has a lot of photorealistic tokens. The solution is to use just the (tokki girl) token, which should remove the photos after a few generations. Sometimes the sampler seems to saturate and needs to be reset, or it will keep showing photographs. In that case, change the sampler and try several generations on the other sampler before going back to the previous one. Also change your prompt, since a prompt can always be biased towards photo generation.
269
+
270
+ Comparison to the Original
271
+ --------------------------
272
+
273
+ ![](images/How_to_setup_the_model/comp_01.png)
274
+
275
+ 1 - Generated by the model
276
+ 2 - Original by Tokkyu (image used for training the model)
277
+
278
+ ![](images/How_to_setup_the_model/comp_02.png)
279
+
280
+ 1 - Generated by the model
281
+ 2 - Original by Tokkyu (image used for training the model)
282
+
283
+ Specs and Credits
284
+ -----------------
285
+
286
+ ![](images/How_to_setup_the_model/comp_03.png)
287
+
288
+ The specs are:
289
+ I trained this DreamBooth model using EveryDream. The model was trained on 57 images of art by the artist Tokkyu, for 3 epochs and a total of 1440 steps. I trained it on an A100 (80 GB VRAM) at a native size of 768x768 with variable buckets and a batch size of 6, using around 68 GB of VRAM per epoch, with very high resolution, uncropped training images. EveryDream handled the aspect ratios correctly. Each image was captioned manually, with a little help from pharmapsychotic's excellent new 2.1 CLIP Interrogator.
290
+
291
+ Credits:
292
+
293
+ Model/AG/Guide created by: vcm07. I can be found at the official SD #anime channel.
294
+
295
+ Art by Tokkyu. Check his work at: https://www.pixiv.net/users/23098486
296
+
297
+ Special thanks to Freon for the wonderful EveryDream.