Gerold Meisinger committed
Commit 47a2196 · 1 Parent(s): 940a19a
Files changed (2):
  1. README.md +33 -0
  2. eval.zip +3 -0
README.md CHANGED
@@ -1,3 +1,36 @@
  ---
  license: cc-by-nc-sa-4.0
  ---
+
+ **Convert color images to grayscale**
+
+ See the corresponding discussion at https://github.com/lllyasviel/ControlNet/discussions/561!
+
+ I have trained a ControlNet (214244a32 drop=0.5 mp=fp16 lr=1e-5) for 1.25 epochs, using a pointwise function to convert RGB to grayscale... which effectively makes it a pointless ControlNet 🤣
+
+ I wanted to see how fast it converges on a simple linear transformation. To emphasize again: it doesn't colorize grayscale images, it desaturates color images... which you might as well do in an image editor. It's the most ineffective way to make grayscale images. But it lets us evaluate the model very easily, and we can peer into the inner workings of ControlNet a bit. It's also a good baseline for inpainting (assuming 0% masking) and tells us which artefacts to expect in the unmasked area. I chose drop=0.5 because I assumed the ControlNet would pick up the "ignore the prompt" task very quickly, similar to the desaturation task; it also lets us compare the influence of prompts, and it keeps the results comparable with inpainting. I don't think it would have converged faster without any prompts.
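+
+ Concretely, the "pointwise function" is the standard Rec. 601 luma weighting that `cv2` applies per pixel when converting RGB to grayscale (the formula is added here for reference, not part of the original README):
+
+ Y = 0.299·R + 0.587·G + 0.114·B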
+
+ # Training
+
+ ```
+ accelerate launch train_controlnet.py \
+ --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
+ --train_batch_size=4 \
+ --gradient_accumulation_steps=8 \
+ --proportion_empty_prompts=0.5 \
+ --mixed_precision="fp16" \
+ --learning_rate=1e-5 \
+ --enable_xformers_memory_efficient_attention \
+ --use_8bit_adam \
+ --set_grads_to_none \
+ --seed=0
+ ```
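+
+ Note that with these settings the effective batch size is train_batch_size × gradient_accumulation_steps = 4 × 8 = 32. To try the resulting checkpoint, here is a minimal `diffusers` inference sketch; the checkpoint path, file names and prompt are placeholders, not part of the original README:
+
+ ```
+ import torch
+ from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
+ from diffusers.utils import load_image
+
+ # placeholder path to the trained grayscale ControlNet checkpoint
+ controlnet = ControlNetModel.from_pretrained("./controlnet-grayscale", torch_dtype=torch.float16)
+ pipe = StableDiffusionControlNetPipeline.from_pretrained(
+     "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
+ ).to("cuda")
+
+ condition = load_image("input.png")  # a color image as conditioning input
+ image = pipe("a photo", image=condition, num_inference_steps=20).images[0]
+ image.save("output.png")  # should come out as the desaturated input
+ ```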
+
+ # Image dataset
+
+ * laion2B-en aesthetics>=6.5 dataset
+ * --min_image_size 512 --max_aspect_ratio 2 --resize_mode="center_crop" --image_size 512
+ * Cleaned with `fastdup` default settings
+ * Data augmented with right-left flipped images
+ * Resulting in 214244 images
+ * Converted to grayscale with `cv2` (see the sketch below)
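+
+ A minimal sketch of the per-image preprocessing described above; the exact `cv2` calls and file names are assumptions, since the README only names the tools:
+
+ ```
+ import cv2
+
+ # load one center-cropped 512x512 training image (cv2 reads BGR)
+ img = cv2.imread("image.png")
+
+ # conditioning image: pointwise grayscale conversion
+ gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
+ cv2.imwrite("image_gray.png", gray)
+
+ # augmentation: right-left (horizontal) flip doubles the dataset
+ cv2.imwrite("image_flipped.png", cv2.flip(img, 1))
+ cv2.imwrite("image_gray_flipped.png", cv2.flip(gray, 1))
+ ```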
eval.zip ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2d834d2d9ed03a15e9be690359b7d4c337c7ff18e46a55572d45793871902002
+ size 262594427