vcm07
/

Tokkyu_Model

vcm07 committed on Nov 28, 2022

Commit

a5a2c99

1 Parent(s): 29bf365

Create README.md

Files changed (1)

README.md +297 -0

README.md ADDED

	@@ -0,0 +1,297 @@

+How to set up the model
+=======================
+Thank you for downloading and trying the Tokkyu Model. You need to do a quick setup to get the best results. Here is a quick visual guide with minimal text. The most important part of the process is configuring the Aesthetic Gradients correctly.
+Load the model and try a few blank gens
+---------------------------------------
+![](images/How_to_setup_the_model/media_1669521054021.png)
+1 - Set the Sampling Steps slider to 25. You won't need to change it again for the rest of the guide.
+2 - Just press generate after loading the model.
+3 - A bad gen should appear.
+Use the token (tokki girl)
+--------------------------
+![](images/How_to_setup_the_model/media_1669521599750.png)
+1 - Write (tokki girl) just like in the picture. There is no need to increase the strength, but if you want to, I recommend (tokki girl:1.2). After a lot of testing, I recommend keeping it in parentheses. Without the parentheses, the model confuses it with a "photo of a girl" type of prompt and a photograph of a girl will show up.
+2 - Press Generate
+3 - A chibi-style generation might appear. If a photograph shows up instead, no problem: just keep generating, anywhere from 3 to 12 times.
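+As a rough illustration of what the parentheses do: in the Auto1111 web UI, each pair of plain parentheses multiplies a token's attention by 1.1, while a form like (tokki girl:1.2) sets the weight explicitly. The helper name below is hypothetical, and the sketch only does the arithmetic; it does not parse prompts.

```python
def paren_weight(depth: int, explicit: float = 1.0) -> float:
    """Effective attention weight for a token wrapped in `depth` pairs of
    plain parentheses, optionally starting from an explicit weight."""
    return explicit * (1.1 ** depth)


print(round(paren_weight(1), 3))       # (tokki girl)      -> 1.1
print(round(paren_weight(0, 1.2), 3))  # (tokki girl:1.2)  -> 1.2
print(round(paren_weight(3), 3))       # (((3d)))          -> 1.331
```

+So (tokki girl) is a mild boost, and (tokki girl:1.2) is a slightly stronger one.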
+Add the negative prompt
+-----------------------
+![](images/How_to_setup_the_model/media_1669522767753.png)
+1 - Add the negative prompt.
+2 - Press Generate (try 3-12 generations to start getting other types of images)
+3 - The picture got a lot better.
+You will notice that the quality of the chibi image gets much better. You can use your own negative prompts or pick one from a list. I recommend the following negative prompt:
+((Keith Haring)), (((3d))), lowres, (((((photography))))), (((((((photorealism:1.09))))))), (((((((photo)))))))), bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, ((poorly\_drawn\_face)), ((poorly drawn hands)), ((poorly drawn feet)), fat, (disfigured), ((out of frame)), (((long neck))), (big ears), (((poo art))), ((((tiling)))), ((bad hands)), (bad art), (((penis))), (((mutation))), (((deformed))), ((ugly)), cloned face, (missing lips), ((ugly face)), blurry, undefined, rough, extra limbs, mutated hands, bad anatomy, bad proportions, weird hands, disproportionate, disfigured, face out of frame, poorly drawn face, ((morbid)), ((mutilated)), , out of frame, extra fingers, mutated hands, ((poorly drawn hands)), ((poorly drawn face)), (((mutation))), (((deformed))), ((ugly)), blurry, ((bad anatomy)), (((bad proportions))), ((extra limbs)), cloned face, (((disfigured))). (((more than 2 nipples))), out of frame, ugly, extra limbs, (bad anatomy), gross proportions, (malformed limbs), extra arms, extra legs, mutated hands, (fused fingers), (too many fingers), lipstick, nipples, (((blush))), (((long neck))), ((flat chest)), ((poorly drawn eyes))
+Try a few more generations (3-12) and you will get more types of portraits.
+If you want to download negative prompts or make your own, try:
+[Here](https://pastes.io/x9crpin0pq) and [here](https://www.reddit.com/r/WaifuDiffusion/comments/yrpovu/img2img_from_my_own_loose_sketch/) or just use the Negative prompt section of ptsearch.info/ as we will see in the next sections.
+Adapting and using a proper prompt
+----------------------------------
+![](images/How_to_setup_the_model/media_1669524609240.png)
+Go to [https://www.ptsearch.info/home/](https://www.ptsearch.info/home/)
+![](images/How_to_setup_the_model/media_1669524768803.png)
+Select the prompt that you want. For the purpose of this guide we will select the one from: [https://www.ptsearch.info/articles/detail/150140](https://www.ptsearch.info/articles/detail/150140)
+1 - Select the prompt
+2 - Click on WEBUI
+![](images/How_to_setup_the_model/media_1669525021708.png)
+1 - Paste the prompt and start adapting it. Remove any artists so that you get only the Tokkyu style. Replace 1girl, woman, girl, etc. with (tokki girl) or (tokki girl:1.2), and place it at the beginning if possible, as the order of the tokens is important. Then remove anything else you think is not needed. The shorter the prompt, the better.
+My final result is:
+(tokki girl), (Masterpiece:1.1), (best quality:1.1), ((chromatic aberration)), ((caustic)), dynamic angle, ((ultra-detailed)), (illustration:1.1), ((disheveled hair)), beautiful detailed glow, detailed, Cinematic light, intricate detail, highres, ((lomography, full body portrait photo of women like reol holding dressed like witches in a library, open books, wearing an ornate robe and choker, moody, realistic, dark, skin tinted a warm tone, light blue filter, hdr, rounded eyes, detailed facial features, gold black, Wizard, (floating palaces:1.2), magic, black lightning magic, )), trending on ArtStation Pixiv, high detail, sharp focus, smooth, aesthetic, rule of thirds,
+Some notes about the adaptation:
+\-I removed (1girl:1.2), as (tokki girl) is at the beginning.
+\-I removed anime, as the model already pulls anime art. The anime token can ruin the style and bring in the standard SD anime style.
+\-I removed "by alphonse mucha", as we want the pure Tokkyu style.
+You could also remove "trending on ArtStation Pixiv" if you want a purer Tokkyu style.
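+The adaptation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the token sets, and the short sample prompt are hypothetical, and splitting on commas is too naive for nested groups such as ((a, b, c)), which still need editing by hand.

```python
import re

# Assumed token lists, taken from the notes above; not an exhaustive vocabulary.
SUBJECT_TOKENS = {"1girl", "girl", "woman"}
DROP_TOKENS = {"anime"}


def adapt_prompt(prompt: str, trigger: str = "(tokki girl)") -> str:
    """Drop subject, anime, and artist tokens, then put the trigger first."""
    kept = []
    for raw in prompt.split(","):
        token = raw.strip()
        if not token:
            continue
        # Strip wrapping parentheses and a trailing ":1.2"-style weight
        # so that "(1girl:1.2)" is compared as "1girl".
        bare = re.sub(r":[\d.]+$", "", token.strip("()")).strip().lower()
        if bare in SUBJECT_TOKENS or bare in DROP_TOKENS:
            continue
        if bare.startswith("by "):
            continue  # artist tag such as "by alphonse mucha"
        kept.append(token)
    return ", ".join([trigger] + kept)


# Shortened sample prompt for illustration only.
original = "(Masterpiece:1.1), (1girl:1.2), anime, dynamic angle, by alphonse mucha, highres"
print(adapt_prompt(original))
# -> (tokki girl), (Masterpiece:1.1), dynamic angle, highres
```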
+![](images/How_to_setup_the_model/media_1669526399014.png)
+You should be getting this type of generation already.
+![](images/How_to_setup_the_model/media_1669526607659.png)
+1 - Scroll down
+2 - Move the width and height sliders to 768 x 768. The model was trained natively at 768 x 768 with variable aspect ratio buckets and a batch size of 6 on an A100, so you should use 768 x 768 for all generations. The quality will improve a lot.
+Generate a few more images to test the quality.
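+If you prefer scripting to clicking, the settings used so far (25 steps, 768 x 768, a prompt and a negative prompt) map onto a request body for the web UI's txt2img API. This is a sketch under assumptions: it presumes the web UI was started with the --api flag, the prompts are shortened placeholders, and it only builds the JSON without sending it.

```python
import json

# Assumption: the web UI exposes POST /sdapi/v1/txt2img (requires --api).
# The prompts below are shortened placeholders, not the full ones from this guide.
payload = {
    "prompt": "(tokki girl), (best quality:1.1), highres",
    "negative_prompt": "lowres, bad anatomy, bad hands",
    "steps": 25,    # Sampling Steps slider
    "width": 768,   # native training resolution
    "height": 768,
}

body = json.dumps(payload)
print(body)
```

+To actually generate an image you would POST this body to the txt2img endpoint of your local web UI instance.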
+![](images/How_to_setup_the_model/img_guide.png)
+Just by changing the slider to 768 x 768 we went from this
+![](images/How_to_setup_the_model/download__30_.png)
+![](images/How_to_setup_the_model/download__28_.png)
+![](images/How_to_setup_the_model/download__29_.png)
+to this. It gets the style. You could stop here and have fun with your quality generations, but let's make it even better.
+You don't need any VAEs or Hypernetworks, but if you want to add them, I recommend doing it now. A VAE won't improve the eyes much, and it will shift the style a little toward the VAE's own eye style. I recommend the kl-f8-anime.vae from WD 1.4; it makes the eyes a little better, but you can skip it. As for Hypernetworks, the NAI group makes little difference, although it moves the results closer to the NAI anime types. In my testing neither improves much. The Aesthetic Gradient is more important here.
+Download and install the Tokkyu Aesthetic Gradient
+--------------------------------------------------
+![](images/How_to_setup_the_model/media_1669529653626.png)
+1 - Install the Aesthetic Gradients extension in your Auto1111 (use Install from URL in the Extensions tab), then download the .pt file from HF and put it in the /extensions/stable-diffusion-webui-aesthetic-gradients directory.
+2 - Check if it is installed.
+3 - Go back to the txt2img tab
+The Tokkyu Aesthetic Gradient was trained on the same images as the model itself and is meant to reinforce the style even more. This results in direct txt2img generations that are almost on par with the originals used for training the model.
+![](images/How_to_setup_the_model/media_1669531063424.png)
+Scroll down to the bottom
+![](images/How_to_setup_the_model/media_1669531267050.png)
+1 - Click on "Open for Clip Aesthetic"
+![](images/How_to_setup_the_model/media_1669531511340.png)
+1 - Set the steps to 25
+2 - Select the tokkiu\_pure Aesthetic Gradient. If it doesn't show up in the list, click on the square
+3 - Add 66 zeros to the left of the number 1. Or just paste this into the box: 0.0000000000000000000000000000000000000000000000000000000000000000000001
+4 - Generate!
+And BOOM! You go from this:
+---------------------------
+![](images/How_to_setup_the_model/download__34_.png)
+![](images/How_to_setup_the_model/download__33_.png)
+From this, without the Aesthetic Gradient
+To this:
+--------
+![](images/How_to_setup_the_model/download__35_.png)
+![](images/How_to_setup_the_model/download__36_.png)
+![](images/How_to_setup_the_model/download__37_.png)
+![](images/How_to_setup_the_model/download__38_.png)
+![](images/How_to_setup_the_model/download__40_.png)
+![](images/How_to_setup_the_model/download__41_.png)
+...with the Aesthetic Gradients. You get the style almost as if the artist himself had painted your generation. A few things are worth noting:
+\-It nailed the style. I compared it with the training images and it's almost identical.
+\-It got creative. You can see that she has a golden witch hat, and that the magic light comes out of the book and out of the window. The hair and the clothes have ornaments that easily rival the quality of those you see in Midjourney, and this is txt2img in a model that was trained on plain SD 1.5. The lighting truly is cinematic. Tokkyu's style tends toward realism, and the AG got it.
+\-It improved prompt recognition. You can see that she is wearing the hat and the black/golden choker, which the non-AG generations missed.
+![](images/How_to_setup_the_model/media_1669533838379.png)
+If you want to know how the Aesthetic Gradients should look in your generations, look at the bottom of the page in Auto1111. It should look like this picture. It is really important that the Aesthetic LR is exactly as in the picture, with 1 epoch at 70 repeats. You can change it if you want to run more tests, but it is not necessary.
+Poses
+-----
+The model has some poses that it was trained on that mimic the originals. The main two are standing and sitting. Here are some modifiers for pose that were trained:
+\-sitting on the ground
+\-sitting on the floor
+\-sits on a finger
+\-sitting on a chair
+\-sitting at a table
+\-sitting on a staircase
+\-sitting on a couch
+\-sitting on a bench
+\-sitting on top of a couch
+\-sitting at a piano
+\-sitting on a broom
+\-sitting on top of a wooden chair
+\-sitting on a window sill
+\-kneeling on the ground
+\-standing/standing in the
+\-standing on a pier
+\-standing in a field
+\-standing in the rain
+\-standing in front of a window
+\-walking barefoot
+\-walking in
+\-laying on a bed
+Sitting on the ground examples:
+-------------------------------
+![](images/How_to_setup_the_model/download__60_.png)
+In the prompt I just wrote "sitting on the ground", and then I changed it to "books on the floor".
+![](images/How_to_setup_the_model/download__61_.png)
+![](images/How_to_setup_the_model/download__69_.png)
+Artists
+-------
+![](images/How_to_setup_the_model/tmpz_32s934.png)
+If you wish to combine the Tokkyu style with other artists, you can do so directly in the model, without the need for merges or TI.
+The generation above was made using wlop in the prompt. Wlop and Tokkyu share a preference for realism in their art, and the two go really well together. You can get wonderful results just by adding the name of an artist the model was trained on. Here is a list of the artists that significantly modify the style just by being put in the prompt:
+by Muqi (second-highest weight in the model)
+by Lü Ji (third-highest weight in the model)
+by Pu Hua (highest weight in the model; better for backgrounds)
+by Kaburagi Kiyokata
+by Sengai
+by Kobayashi Kiyochika
+featured on pixiv
+by Yuumei
+junko enoshima (generates ponytail girls)
+narumi kakinouchi
+cgsociety
+by Jin Homura
+as a tarot card
+pixiv contest winner
+by WLOP
+by Takeuchi Seihō
+Use it exactly as shown. So for wlop, put in the prompt: by WLOP
+Merges and TI
+-------------
+As of this writing I haven't tested merging the model with other models yet. For merging, I recommend the inverse sigmoid for precision. It was removed from Auto1111, so you have to calculate it yourself. For example, for an inverse sigmoid of 0.8:
+Formula: 0.5 - sin(arcsin(1 - 2 \* 0.8) / 3)
+Result: 0.71285927
+Calculator where you can paste the formula above: https://www.mathway.com/Calculus
+More instructions at:
+www.reddit.com/r/StableDiffusion/comments/y0rt9m/checkpoint\_merger\_comparison\_automatic1111/ and https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/c250cb289c97fe303cef69064bf45899406f6a40
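+If you would rather not use an online calculator, the same inverse sigmoid value can be computed locally. A minimal sketch, assuming the formula quoted above; the function name is hypothetical.

```python
import math


def inv_sigmoid_alpha(alpha: float) -> float:
    """Remap a merge ratio in [0, 1] with the inverse sigmoid curve:
    0.5 - sin(arcsin(1 - 2 * alpha) / 3)."""
    return 0.5 - math.sin(math.asin(1.0 - 2.0 * alpha) / 3.0)


print(inv_sigmoid_alpha(0.8))
```

+For 0.8 this should agree with the 0.71285927 result quoted above. The endpoints stay fixed (0 maps to 0 and 1 maps to 1), and 0.5 maps to 0.5.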
+I haven't tested TI. You gotta test it yourself.
+Troubleshooting
+---------------
+![](images/How_to_setup_the_model/download__70_.png)
+Sometimes you will get images with distortions, such as extra legs or arms. The solution for most extra legs and arms is still to inpaint them out.
+![](images/How_to_setup_the_model/download__52_.png)
+This was an excellent gen with a difficult pose. It is worth inpainting.
+Chibi Distortion
+----------------
+![](images/How_to_setup_the_model/tmpyaagiuor.png)
+Remember the chibi at the beginning? Sometimes the model will create a high-quality chibi, distorting the style but keeping some of its elements. It wasn't supposed to do that, but it happens. An adorable accident. It's probably related to some of the aspect ratio variables.
+High Resolution Distortions
+---------------------------
+![](images/How_to_setup_the_model/download__71_.png)
+A solution for cloning or other distortions caused by high resolution is to use highresfix, especially if you use a height of 1024 or 1216. I avoid using highresfix, and you won't need it at 768 x 768, but distortions do sometimes happen.
+Photographs
+-----------
+Sometimes, depending on your prompt, photos will appear. They usually appear when there are a lot of photorealistic-type tokens. The solution is to use just the (tokki girl) token, which should remove the photos after a few generations. Sometimes the sampler saturates and needs to be reset; otherwise it will start to show photographs. Just change the sampler and try several generations on the other sampler before going back to the previous one. Also change your prompt, since a prompt can always be biased toward photo generation.
+Comparison to the Original
+--------------------------
+![](images/How_to_setup_the_model/comp_01.png)
+1 - Generated by the model
+2 - Original by Tokkyu (image used for training the model)
+![](images/How_to_setup_the_model/comp_02.png)
+1 - Generated by the model
+2 - Original by Tokkyu (image used for training the model)
+Specs and Credits
+-----------------
+![](images/How_to_setup_the_model/comp_03.png)
+The specs are:
+I trained this DreamBooth model using EveryDream. The model was trained on 57 images of art by the artist Tokkyu, for 3 epochs and a total of 1440 steps. I trained it on an A100 (80 GB VRAM) at a native size of 768x768 with variable buckets and batch size 6, using around 68 GB of VRAM per epoch and very high resolution, uncropped training images. EveryDream handled the aspect ratios correctly. Each image was captioned manually, with a little help from pharmapsychotic's excellent new 2.1 CLIP Interrogator.
+Credits:
+Model/AG/Guide created by: vcm07. I can be found at the official SD #anime channel.
+Art by Tokkyu. Check his work at: https://www.pixiv.net/users/23098486
+Special thanks to Freon for the wonderful EveryDream.