The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
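If the migration is interrupted, it can be resumed by calling the function named in the message above; a minimal sketch:

    from transformers.utils import move_cache

    move_cache()  # resumes/finishes the one-time cache layout migration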
Running 1 job
0it [00:00, ?it/s]
/usr/local/lib/python3.10/dist-packages/controlnet_aux/mediapipe_face/mediapipe_face_common.py:7: UserWarning: The module 'mediapipe' is not installed. The package will have limited functionality. Please install it using the command: pip install 'mediapipe'
warnings.warn(
/usr/local/lib/python3.10/dist-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_5m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_5m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
/usr/local/lib/python3.10/dist-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_11m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_11m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
/usr/local/lib/python3.10/dist-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
/usr/local/lib/python3.10/dist-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_384 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_384. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
/usr/local/lib/python3.10/dist-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_512 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_512. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
{
  "type": "sd_trainer",
  "training_folder": "output",
  "device": "cuda:0",
  "network": {
    "type": "lora",
    "linear": 16,
    "linear_alpha": 16
  },
  "save": {
    "dtype": "float16",
    "save_every": 500,
    "max_step_saves_to_keep": 4,
    "push_to_hub": false
  },
  "datasets": [
    {
      "folder_path": "/workspace/ai-toolkit/images",
      "caption_ext": "txt",
      "caption_dropout_rate": 0.05,
      "shuffle_tokens": false,
      "cache_latents_to_disk": true,
      "resolution": [
        512,
        768,
        1024
      ]
    }
  ],
  "train": {
    "batch_size": 1,
    "steps": 3000,
    "gradient_accumulation_steps": 1,
    "train_unet": true,
    "train_text_encoder": false,
    "gradient_checkpointing": true,
    "noise_scheduler": "flowmatch",
    "optimizer": "adamw8bit",
    "lr": 0.0001,
    "ema_config": {
      "use_ema": true,
      "ema_decay": 0.99
    },
    "dtype": "bf16"
  },
  "model": {
    "name_or_path": "black-forest-labs/FLUX.1-dev",
    "is_flux": true,
    "quantize": true
  },
  "sample": {
    "sampler": "flowmatch",
    "sample_every": 500,
    "width": 1024,
    "height": 1024,
    "prompts": [
      "woman with red hair, playing chess at the park, bomb going off in the background",
      "a woman holding a coffee cup, in a beanie, sitting at a cafe",
      "a horse is a DJ at a night club, fish eye lens, smoke machine, lazer lights, holding a martini",
      "a man showing off his cool new t shirt at the beach, a shark is jumping out of the water in the background",
      "a bear building a log cabin in the snow covered mountains",
      "woman playing the guitar, on stage, singing a song, laser lights, punk rocker",
      "hipster man with a beard, building a chair, in a wood shop",
      "photo of a man, white background, medium shot, modeling clothing, studio lighting, white backdrop",
      "a man holding a sign that says, 'this is a sign'",
      "a bulldog, in a post apocalyptic world, with a shotgun, in a leather jacket, in a desert, with a motorcycle"
    ],
    "neg": "",
    "seed": 42,
    "walk_seed": true,
    "guidance_scale": 4,
    "sample_steps": 20
  },
  "trigger_word": "p3r5on"
}
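The block above is the job's resolved configuration. A quick way to sanity-check edits before relaunching is to load it back as JSON (a sketch; the file path is hypothetical):

    import json

    with open("my_first_flux_lora_v1.json") as f:  # hypothetical path to the config dump
        cfg = json.load(f)
    assert cfg["network"]["type"] == "lora"
    assert cfg["train"]["steps"] == 3000 and cfg["train"]["lr"] == 1e-4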
Using EMA
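"Using EMA" refers to the ema_config above: a shadow copy of the trainable weights is kept as an exponential moving average with decay 0.99. A generic sketch of the update (not ai-toolkit's exact code):

    import torch

    @torch.no_grad()
    def ema_update(shadow, live, decay=0.99):
        # shadow <- decay * shadow + (1 - decay) * live, applied every step
        for s, p in zip(shadow, live):
            s.mul_(decay).add_(p, alpha=1.0 - decay)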
#############################################
# Running job: my_first_flux_lora_v1
#############################################
Running 1 process
Loading Flux model
Loading transformer
Quantizing transformer
Loading vae
Loading t5
Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]
Downloading shards:  50%|█████     | 1/2 [00:26<00:26, 26.58s/it]
Downloading shards: 100%|██████████| 2/2 [00:48<00:00, 23.78s/it]
Downloading shards: 100%|██████████| 2/2 [00:48<00:00, 24.20s/it]
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards:  50%|█████     | 1/2 [00:00<00:00, 5.41it/s]
Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 6.00it/s]
Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 5.90it/s]
Quantizing T5
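"quantize": true shrinks the frozen base weights (the transformer and T5) to 8-bit so FLUX.1-dev fits in VRAM, while the LoRA weights train in higher precision. A sketch using optimum-quanto, which is an assumption about the library in use:

    from optimum.quanto import quantize, freeze, qfloat8

    quantize(model, weights=qfloat8)  # swap Linear weights for float8 versions
    freeze(model)                     # materialize quantized weights, free the originals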
Loading clip
making pipe
preparing
create LoRA network. base dim (rank): 16, alpha: 16
neuron dropout: p=None, rank dropout: p=None, module dropout: p=None
create LoRA for Text Encoder: 0 modules.
create LoRA for U-Net: 494 modules.
enable LoRA for U-Net
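Rank 16 with alpha 16 means each of the 494 targeted linears gets a low-rank residual scaled by alpha/rank = 1.0. A minimal sketch of the idea (not the exact module the toolkit builds):

    import torch.nn as nn

    class LoRALinear(nn.Module):
        def __init__(self, base: nn.Linear, rank=16, alpha=16):
            super().__init__()
            self.base = base  # frozen pretrained linear
            self.down = nn.Linear(base.in_features, rank, bias=False)
            self.up = nn.Linear(rank, base.out_features, bias=False)
            nn.init.zeros_(self.up.weight)  # delta starts at zero: training begins from the base model
            self.scale = alpha / rank       # 16/16 = 1.0 here

        def forward(self, x):
            return self.base(x) + self.scale * self.up(self.down(x))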
Dataset: /workspace/ai-toolkit/images
- Preprocessing image dimensions
  0%|          | 0/11 [00:00<?, ?it/s]
100%|██████████| 11/11 [00:00<00:00, 414.98it/s]
- Found 11 images
Bucket sizes for /workspace/ai-toolkit/images:
448x576: 11 files
1 buckets made
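Each entry in the config's resolution list ([512, 768, 1024]) produces one pass like this: images are grouped into aspect-ratio buckets whose sides are multiples of 64 and whose area roughly matches the target resolution squared, which is why 512 becomes 448x576 for these portrait images. A simplified sketch of the idea (the toolkit selects from a predefined bucket table, so its exact sides can differ from this formula):

    def nearest_bucket(w, h, target=512, step=64):
        # scale so area ~= target^2, then snap each side to a multiple of `step`
        scale = (target * target / (w * h)) ** 0.5
        return (max(step, round(w * scale / step) * step),
                max(step, round(h * scale / step) * step))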
Caching latents for /workspace/ai-toolkit/images
- Saving latents to disk
Caching latents to disk:   0%|          | 0/11 [00:00<?, ?it/s]
Caching latents to disk:   9%|█         | 1/11 [00:00<00:03, 2.65it/s]
Caching latents to disk:  27%|███       | 3/11 [00:00<00:01, 6.49it/s]
Caching latents to disk:  45%|█████     | 5/11 [00:00<00:00, 8.75it/s]
Caching latents to disk:  64%|███████   | 7/11 [00:00<00:00, 10.24it/s]
Caching latents to disk:  82%|█████████ | 9/11 [00:00<00:00, 11.33it/s]
Caching latents to disk: 100%|██████████| 11/11 [00:01<00:00, 12.11it/s]
Caching latents to disk: 100%|██████████| 11/11 [00:01<00:00, 9.80it/s]
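With cache_latents_to_disk enabled, every image is run through the frozen VAE encoder once per bucket resolution and the latents are saved, so later training steps skip pixel-space encoding entirely. A generic sketch (the on-disk layout is an assumption):

    import torch

    @torch.no_grad()
    def cache_latent(vae, image, out_path):
        # image: (1, 3, H, W) tensor in [-1, 1] on the VAE's device
        latent = vae.encode(image).latent_dist.sample()
        torch.save(latent.cpu(), out_path)  # e.g. next to the source image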
Dataset: /workspace/ai-toolkit/images
- Preprocessing image dimensions
  0%|          | 0/11 [00:00<?, ?it/s]
100%|██████████| 11/11 [00:00<00:00, 42719.76it/s]
- Found 11 images
Bucket sizes for /workspace/ai-toolkit/images:
640x832: 11 files
1 buckets made
Caching latents for /workspace/ai-toolkit/images
- Saving latents to disk
Caching latents to disk:   0%|          | 0/11 [00:00<?, ?it/s]
Caching latents to disk:   9%|█         | 1/11 [00:00<00:01, 6.85it/s]
Caching latents to disk:  18%|██        | 2/11 [00:00<00:01, 7.43it/s]
Caching latents to disk:  27%|███       | 3/11 [00:00<00:00, 8.13it/s]
Caching latents to disk:  36%|████      | 4/11 [00:00<00:00, 8.52it/s]
Caching latents to disk:  45%|█████     | 5/11 [00:00<00:00, 8.78it/s]
Caching latents to disk:  55%|██████    | 6/11 [00:00<00:00, 8.90it/s]
Caching latents to disk:  64%|███████   | 7/11 [00:00<00:00, 9.01it/s]
Caching latents to disk:  73%|████████  | 8/11 [00:00<00:00, 9.03it/s]
Caching latents to disk:  82%|█████████ | 9/11 [00:01<00:00, 9.02it/s]
Caching latents to disk:  91%|█████████ | 10/11 [00:01<00:00, 9.11it/s]
Caching latents to disk: 100%|██████████| 11/11 [00:01<00:00, 8.66it/s]
Caching latents to disk: 100%|██████████| 11/11 [00:01<00:00, 8.64it/s]
Dataset: /workspace/ai-toolkit/images
- Preprocessing image dimensions
  0%|          | 0/11 [00:00<?, ?it/s]
100%|██████████| 11/11 [00:00<00:00, 35710.02it/s]
- Found 11 images
Bucket sizes for /workspace/ai-toolkit/images:
832x1152: 11 files
1 buckets made
Caching latents for /workspace/ai-toolkit/images
- Saving latents to disk
Caching latents to disk:   0%|          | 0/11 [00:00<?, ?it/s]
Caching latents to disk:   9%|█         | 1/11 [00:00<00:01, 5.11it/s]
Caching latents to disk:  18%|██        | 2/11 [00:00<00:01, 5.29it/s]
Caching latents to disk:  27%|███       | 3/11 [00:00<00:01, 5.41it/s]
Caching latents to disk:  36%|████      | 4/11 [00:00<00:01, 5.48it/s]
Caching latents to disk:  45%|█████     | 5/11 [00:00<00:01, 5.29it/s]
Caching latents to disk:  55%|██████    | 6/11 [00:01<00:00, 5.42it/s]
Caching latents to disk:  64%|███████   | 7/11 [00:01<00:00, 5.53it/s]
Caching latents to disk:  73%|████████  | 8/11 [00:01<00:00, 5.57it/s]
Caching latents to disk:  82%|█████████ | 9/11 [00:01<00:00, 5.52it/s]
Caching latents to disk:  91%|█████████ | 10/11 [00:01<00:00, 5.51it/s]
Caching latents to disk: 100%|██████████| 11/11 [00:02<00:00, 5.56it/s]
Caching latents to disk: 100%|██████████| 11/11 [00:02<00:00, 5.48it/s]
Generating baseline samples before training
Generating Images:   0%|          | 0/10 [00:00<?, ?it/s]
Generating Images:  10%|█         | 1/10 [01:26<12:56, 86.27s/it]
Generating Images:  20%|██        | 2/10 [02:04<07:44, 58.03s/it]
Generating Images:  30%|███       | 3/10 [02:43<05:44, 49.28s/it]
Generating Images:  40%|████      | 4/10 [03:22<04:31, 45.27s/it]
Generating Images:  50%|█████     | 5/10 [04:01<03:35, 43.08s/it]
Generating Images:  60%|██████    | 6/10 [04:40<02:46, 41.73s/it]
Generating Images:  70%|███████   | 7/10 [05:19<02:02, 40.85s/it]
Generating Images:  80%|████████  | 8/10 [05:58<01:20, 40.23s/it]
Generating Images:  90%|█████████ | 9/10 [06:37<00:39, 39.82s/it]
Generating Images: 100%|██████████| 10/10 [07:16<00:00, 39.53s/it]
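These ten images are the pre-training baseline for the ten prompts in the config, rendered at 1024x1024 with guidance_scale 4 and 20 sampling steps. With "seed": 42 and walk_seed enabled, each prompt gets its own deterministic seed so the same prompt can be compared across checkpoints; a sketch of the seeding idea (the exact walk scheme is an assumption):

    import torch

    def seeded_generator(base_seed=42, prompt_index=0, device="cuda"):
        # walk_seed: advance the seed per prompt so images differ but stay reproducible
        return torch.Generator(device=device).manual_seed(base_seed + prompt_index)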
my_first_flux_lora_v1:   0%|          | 0/3000 [00:00<?, ?it/s]
my_first_flux_lora_v1:   0%|          | 0/3000 [00:04<?, ?it/s, lr: 1.0e-04 loss: 4.015e-01]
my_first_flux_lora_v1:   0%|          | 0/3000 [00:10<?, ?it/s, lr: 1.0e-04 loss: 4.988e-01]
my_first_flux_lora_v1:   0%|          | 1/3000 [00:10<4:56:11, 5.93s/it, lr: 1.0e-04 loss: 4.988e-01]
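The loss in the progress bar comes from the flow-matching objective selected by "noise_scheduler": "flowmatch": the model learns the velocity along a straight path between clean latents and noise. A schematic sketch of that loss (the model signature is an assumption, not ai-toolkit's code):

    import torch
    import torch.nn.functional as F

    def flow_match_loss(model, x0, cond):
        noise = torch.randn_like(x0)                   # pure-noise endpoint of the path
        t = torch.rand(x0.shape[0], device=x0.device)  # random timestep in [0, 1]
        t_ = t.view(-1, 1, 1, 1)
        xt = (1 - t_) * x0 + t_ * noise                # point on the straight path
        v_target = noise - x0                          # constant path velocity
        v_pred = model(xt, t, cond)
        return F.mse_loss(v_pred, v_target)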