This document explains training with [Textual Inversion](https://textual-inversion.github.io/).

Please also see the [common training documentation](./train_README-ja.md).

The implementation draws heavily on https://github.com/huggingface/diffusers/tree/main/examples/textual_inversion.

Trained models can be used as-is in the Web UI.
# Training procedure

Refer to this repository's README beforehand and set up the environment.

## Preparing the data

See [Preparing the training data](./train_README-ja.md).

## Running the training

Use ``train_textual_inversion.py``. The following is a command-line example (DreamBooth method).
```
accelerate launch --num_cpu_threads_per_process 1 train_textual_inversion.py \
    --dataset_config=<.toml file created during data preparation> \
    --output_dir=<output folder for the trained model> \
    --output_name=<file name for the trained model output> \
    --save_model_as=safetensors \
    --prior_loss_weight=1.0 \
    --max_train_steps=1600 \
    --learning_rate=1e-6 \
    --optimizer_type="AdamW8bit" \
    --xformers \
    --mixed_precision="fp16" \
    --cache_latents \
    --gradient_checkpointing \
    --token_string=mychar4 --init_word=cute --num_vectors_per_token=4
```
Specify the token string used during training with ``--token_string``. __Make sure the training prompts contain this string (for example, ``mychar4 1girl`` if token_string is mychar4)__. This part of the prompt is replaced with the new Textual Inversion tokens and trained. The simplest and most reliable setup is a DreamBooth, class+identifier format dataset that uses the `token_string` value as the token string.

To check whether a prompt contains the token string, run with ``--debug_dataset``, which displays the token ids after replacement. As shown below, the prompt contains the token string if tokens with id ``49408`` or higher are present.
```
input ids: tensor([[49406, 49408, 49409, 49410, 49411, 49412, 49413, 49414, 49415, 49407,
         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
         49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407, 49407,
         49407, 49407, 49407, 49407, 49407, 49407, 49407]])
```
Words that the tokenizer already has (common words) cannot be used as the token string.
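
The check described above can also be scripted. The following is a minimal sketch, assuming the `input ids` printed by ``--debug_dataset`` have been copied into a plain Python list. The names `FIRST_NEW_TOKEN_ID` and `new_token_ids` are hypothetical and are not part of the repository.

```python
# Token ids at or above this value are treated as newly added Textual Inversion
# tokens, following the "49408 or higher" rule described in the text.
FIRST_NEW_TOKEN_ID = 49408


def new_token_ids(input_ids):
    """Return the ids in input_ids that belong to newly added tokens."""
    return [token_id for token_id in input_ids if token_id >= FIRST_NEW_TOKEN_ID]


# Start of the example output shown above (trailing padding shortened).
example_ids = [49406, 49408, 49409, 49410, 49411, 49412, 49413, 49414, 49415, 49407, 49407]

found = new_token_ids(example_ids)
if found:
    print(f"prompt contains {len(found)} new token id(s): {found}")
else:
    print("token string not found in this prompt")
```

An empty result means the prompt did not contain the token string, so that caption would not contribute to training the new embeddings.
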
Specify with ``--init_word`` the string of the source token that is copied when initializing the embeddings. Choosing a word close to the concept you want to teach seems to work well. A string that is split into two or more tokens cannot be specified.

Specify with ``--num_vectors_per_token`` how many tokens to use for this training. More tokens give more expressive power but consume more of the prompt. For example, with num_vectors_per_token=8, the specified token string consumes 8 tokens (out of the usual 77-token prompt limit).
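
The token budget can be worked out with simple arithmetic. The sketch below uses the 77-token limit quoted above. The helper name `remaining_prompt_tokens` is hypothetical. It also assumes that two of the 77 positions are taken by a start token and an end token, which is an assumption about the tokenizer rather than something stated here.

```python
PROMPT_TOKEN_LIMIT = 77  # limit quoted in the text above
SPECIAL_TOKENS = 2       # assumption: one start token and one end token


def remaining_prompt_tokens(num_vectors_per_token, occurrences=1):
    """Tokens left for the rest of the prompt after the token string is expanded."""
    used = num_vectors_per_token * occurrences
    remaining = PROMPT_TOKEN_LIMIT - SPECIAL_TOKENS - used
    if remaining < 0:
        raise ValueError("the token string alone exceeds the prompt limit")
    return remaining


for n in (1, 4, 8):
    print(f"num_vectors_per_token={n}: {remaining_prompt_tokens(n)} tokens left")
```

Under these assumptions, a larger `num_vectors_per_token` trades prompt space for expressive power, which is the trade-off described above.
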
These are the main options for Textual Inversion. The remaining options are the same as in the other training scripts.

For `num_cpu_threads_per_process`, 1 is usually a good value.

Specify in `pretrained_model_name_or_path` the base model for additional training. You can specify a Stable Diffusion checkpoint file (.ckpt or .safetensors), a Diffusers model directory on local disk, or a Diffusers model ID (such as "stabilityai/stable-diffusion-2").

Specify in `output_dir` the folder where the trained model is saved. Specify in `output_name` the model file name without the extension. `save_model_as` specifies saving in safetensors format.

Specify the `.toml` file in `dataset_config`. Initially, set the batch size in the file to `1` to keep memory consumption low.
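
As an illustration of where the batch size lives, the sketch below renders a small dataset config from Python. The key names (`[general]`, `[[datasets]]`, `resolution`, `batch_size`, `[[datasets.subsets]]`, `image_dir`, `class_tokens`, `num_repeats`) are assumptions based on the repository's dataset configuration documentation and should be checked against it. The function name `render_dataset_config` and the folder name `train_images` are hypothetical.

```python
def render_dataset_config(image_dir, class_tokens, batch_size=1, resolution=512, num_repeats=10):
    """Return the text of a minimal class+identifier dataset config (assumed key names)."""
    lines = [
        "[general]",
        "enable_bucket = true",
        "",
        "[[datasets]]",
        f"resolution = {resolution}",
        f"batch_size = {batch_size}",  # start at 1 to keep memory consumption low
        "",
        "  [[datasets.subsets]]",
        f"  image_dir = '{image_dir}'",
        f"  class_tokens = '{class_tokens}'",  # must contain the token string
        f"  num_repeats = {num_repeats}",
    ]
    return "\n".join(lines) + "\n"


config_text = render_dataset_config("train_images", "mychar4 1girl")
print(config_text)
```

Saving the printed text to a `.toml` file and passing its path to `dataset_config` would be the intended use. Raising `batch_size` later only requires changing one argument.
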
The number of training steps, `max_train_steps`, is set to 1600. The learning rate, `learning_rate`, is set to 1e-6 here.

To save memory, specify `mixed_precision="fp16"` (on RTX 30 series and later, `bf16` can also be specified; match the setting you made for accelerate during environment setup). Also specify `gradient_checkpointing`.

To use the memory-efficient 8-bit AdamW as the optimizer (the class that optimizes, that is trains, the model to fit the training data), specify `optimizer_type="AdamW8bit"`.

Specify the `xformers` option to use xformers' CrossAttention. If xformers is not installed or causes an error (depending on the environment, for example when `mixed_precision="no"`), specify the `mem_eff_attn` option instead to use the memory-efficient CrossAttention (this is slower).

If you have a reasonable amount of memory, edit the `.toml` file and increase the batch size to, for example, around `8` (this may improve speed and accuracy).
### Commonly used options

Refer to the documentation on options in the following cases.

- Training Stable Diffusion 2.x or a model derived from it
- Training a model that assumes a clip skip of 2 or more
- Training with captions longer than 75 tokens
### Batch size in Textual Inversion

Because memory usage is lower than with DreamBooth or fine tuning, which train the whole model, the batch size can be made larger.
# Other main options for Textual Inversion

See the separate document for all options.

* `--weights`
  * Loads trained embeddings before training and continues training from them.
* `--use_object_template`
  * Trains with the default object template strings (such as ``a photo of a {}``) instead of the captions. This matches the official implementation. Captions are ignored.
* `--use_style_template`
  * Trains with the default style template strings (such as ``a painting in the style of {}``) instead of the captions. This matches the official implementation. Captions are ignored.
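
To make the template options concrete, the sketch below fills the two template strings quoted above with a token string. It only illustrates the substitution. The scripts define their own default templates, of which these are just the two quoted examples, and the function name `fill_template` is hypothetical.

```python
OBJECT_TEMPLATE = "a photo of a {}"
STYLE_TEMPLATE = "a painting in the style of {}"


def fill_template(template, token_string):
    """Build the training prompt that is used in place of the caption."""
    return template.format(token_string)


print(fill_template(OBJECT_TEMPLATE, "mychar4"))  # a photo of a mychar4
print(fill_template(STYLE_TEMPLATE, "mychar4"))   # a painting in the style of mychar4
```

Because the caption is ignored in both modes, the token string reaches the prompt only through the template placeholder.
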
## Generating with the image generation script in this repository

Pass the trained embeddings file to gen_img_diffusers.py with the ``--textual_inversion_embeddings`` option (multiple files can be specified). Using the file name of an embeddings file (without the extension) in a prompt applies those embeddings.
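
The keyword to use in the prompt can be derived from the file name. The following is a minimal sketch, assuming a hypothetical embeddings file `output/mychar4.safetensors`. The function name `prompt_keyword` and the rest of the prompt are also hypothetical.

```python
from pathlib import Path


def prompt_keyword(embeddings_path):
    """Return the embeddings file name without its extension."""
    return Path(embeddings_path).stem


keyword = prompt_keyword("output/mychar4.safetensors")
prompt = f"{keyword} 1girl, standing"
print(prompt)  # mychar4 1girl, standing
```
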