---
license: openrail
base_model: runwayml/stable-diffusion-v1-5
tags:
- art
- controlnet
- stable-diffusion
---

# ControlNet

ControlNet is an auxiliary model that augments pre-trained diffusion models with additional conditioning.

ControlNet comes with multiple auxiliary models, each of which enables a different type of conditioning.

ControlNet's auxiliary models were trained with Stable Diffusion 1.5. Experimentally, the auxiliary models can also be used with other diffusion models, such as a DreamBoothed Stable Diffusion model.

The auxiliary conditioning is passed directly to the diffusers pipeline. If you want to process an image to create the auxiliary conditioning, external dependencies are required.

Some of the additional conditionings can be extracted from images via additional models. We extracted these additional models from the original ControlNet repo into a separate package that can be found on [GitHub](https://github.com/patrickvonplaten/human_pose.git).
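
All of the examples below follow the same pattern: prepare a conditioning image, load the matching auxiliary model into a `ControlNetModel`, and pass both the prompt and the conditioning image to a `StableDiffusionControlNetPipeline`. A minimal sketch of that pattern (the conditioning image path and the canny checkpoint are stand-ins; each section shows how to produce the real conditioning):

```py
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel

# Stand-in conditioning image; the sections below show how to create one.
conditioning = Image.open('images/conditioning.png')

# Pick the auxiliary model that matches the conditioning type.
controlnet = ControlNetModel.from_pretrained("fusing/stable-diffusion-v1-5-controlnet-canny")

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("a prompt", conditioning).images[0]
```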

## Canny edge detection

Install opencv:

```sh
$ pip install opencv-contrib-python
```

```python
import cv2
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
import numpy as np

image = Image.open('images/bird.png')
image = np.array(image)

low_threshold = 100
high_threshold = 200

# Detect Canny edges, then replicate the single channel to a 3-channel image.
image = cv2.Canny(image, low_threshold, high_threshold)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
image = Image.fromarray(image)

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-canny",
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("bird", image).images[0]

image.save('images/bird_canny_out.png')
```
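
The same loading code works for every checkpoint below. If GPU memory is tight, the standard diffusers options apply here as well; a minimal sketch (half precision and attention slicing are generic diffusers features, shown as an optional variant rather than a requirement of these checkpoints):

```py
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    safety_checker=None, torch_dtype=torch.float16
)
pipe.to('cuda')
# Trade a little speed for a smaller peak memory footprint.
pipe.enable_attention_slicing()
```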

![bird](./images/bird.png)

![bird_canny](./images/bird_canny.png)

![bird_canny_out](./images/bird_canny_out.png)

## M-LSD straight line detection

Install the additional controlnet models package.

```sh
$ pip install git+https://github.com/patrickvonplaten/human_pose.git
```

```py
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from human_pose import MLSDdetector

mlsd = MLSDdetector.from_pretrained('lllyasviel/ControlNet')

image = Image.open('images/room.png')

# Detect straight line segments to use as the conditioning image.
image = mlsd(image)

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-mlsd",
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("room", image).images[0]

image.save('images/room_mlsd_out.png')
```

![room](./images/room.png)

![room_mlsd](./images/room_mlsd.png)

![room_mlsd_out](./images/room_mlsd_out.png)

## Pose estimation

Install the additional controlnet models package.

```sh
$ pip install git+https://github.com/patrickvonplaten/human_pose.git
```

```py
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from human_pose import OpenposeDetector

openpose = OpenposeDetector.from_pretrained('lllyasviel/ControlNet')

image = Image.open('images/pose.png')

# Extract the human pose skeleton to use as the conditioning image.
image = openpose(image)

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-openpose",
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("chef in the kitchen", image).images[0]

image.save('images/chef_pose_out.png')
```

![pose](./images/pose.png)

![openpose](./images/openpose.png)

![chef_pose_out](./images/chef_pose_out.png)

## Semantic segmentation

Semantic segmentation relies on transformers. Transformers is a dependency of diffusers for running ControlNet, so you should have it installed already.
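
The example below imports `ade_palette` from a local `controlnet_utils` module. That helper is assumed to return the standard ADE20K color palette, i.e. one `[R, G, B]` triple for each of the 150 ADE20K classes. A sketch of the assumed shape (illustrative first entries only, not the full palette):

```py
def ade_palette():
    # Assumed helper: one RGB color per ADE20K class (150 entries in total).
    # The first entries here follow the common ADE20K convention; the real
    # helper in controlnet_utils should return the full 150-color palette.
    return [
        [120, 120, 120], [180, 120, 120], [6, 230, 230], [80, 50, 50],
        [4, 200, 3],
        # ... remaining ADE20K class colors ...
    ]
```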

```py
from transformers import AutoImageProcessor, UperNetForSemanticSegmentation
from PIL import Image
import numpy as np
from controlnet_utils import ade_palette
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel

image_processor = AutoImageProcessor.from_pretrained("openmmlab/upernet-convnext-small")
image_segmentor = UperNetForSemanticSegmentation.from_pretrained("openmmlab/upernet-convnext-small")

image = Image.open("./images/house.png").convert('RGB')

pixel_values = image_processor(image, return_tensors="pt").pixel_values

with torch.no_grad():
    outputs = image_segmentor(pixel_values)

# Resolve per-pixel class labels at the original image resolution.
seg = image_processor.post_process_semantic_segmentation(outputs, target_sizes=[image.size[::-1]])[0]

# Color each class with its ADE20K palette color.
color_seg = np.zeros((seg.shape[0], seg.shape[1], 3), dtype=np.uint8)  # height, width, 3
palette = np.array(ade_palette())

for label, color in enumerate(palette):
    color_seg[seg == label, :] = color

image = Image.fromarray(color_seg)

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-seg",
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("house", image).images[0]

image.save('./images/house_seg_out.png')
```

![house](images/house.png)

![house_seg](images/house_seg.png)

![house_seg_out](images/house_seg_out.png)

## Depth control

Depth control relies on transformers. Transformers is a dependency of diffusers for running ControlNet, so you should have it installed already.

```py
from transformers import pipeline
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from PIL import Image
import numpy as np

depth_estimator = pipeline('depth-estimation')

image = Image.open('./images/stormtrooper.png')

# Estimate depth, then replicate the single channel to a 3-channel image.
image = depth_estimator(image)['depth']
image = np.array(image)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
image = Image.fromarray(image)

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-depth",
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("Stormtrooper's lecture", image).images[0]

image.save('./images/stormtrooper_depth_out.png')
```

![stormtrooper](./images/stormtrooper.png)

![stormtrooper_depth](./images/stormtrooper_depth.png)

![stormtrooper_depth_out](./images/stormtrooper_depth_out.png)

## Normal map

The normal map is approximated from a monocular depth estimate: Sobel gradients of the depth map provide the x and y components of the surface normal, a constant fills the z component, and the resulting vectors are normalized and mapped into RGB.

```py
from PIL import Image
from transformers import pipeline
import numpy as np
import cv2
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel

image = Image.open("images/toy.png").convert("RGB")

depth_estimator = pipeline("depth-estimation", model="Intel/dpt-hybrid-midas")

image = depth_estimator(image)['predicted_depth'][0]

image = image.numpy()

image_depth = image.copy()
image_depth -= np.min(image_depth)
image_depth /= np.max(image_depth)

# Pixels whose normalized depth falls below this threshold are treated as background.
bg_threshold = 0.4

x = cv2.Sobel(image, cv2.CV_32F, 1, 0, ksize=3)
x[image_depth < bg_threshold] = 0

y = cv2.Sobel(image, cv2.CV_32F, 0, 1, ksize=3)
y[image_depth < bg_threshold] = 0

z = np.ones_like(x) * np.pi * 2.0

# Normalize the (x, y, z) vectors and map them into the 0-255 RGB range.
image = np.stack([x, y, z], axis=2)
image /= np.sum(image ** 2.0, axis=2, keepdims=True) ** 0.5
image = (image * 127.5 + 127.5).clip(0, 255).astype(np.uint8)
image = Image.fromarray(image)

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-normal",
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("cute toy", image).images[0]

image.save('images/toy_normal_out.png')
```
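
Written out, the conversion above builds a pseudo-normal from the Sobel gradients `g_x`, `g_y` of the estimated depth, with a constant `2π` as the z component, and maps it into the 0-255 RGB range:

$$
\mathbf{n} = \frac{(g_x,\; g_y,\; 2\pi)}{\lVert (g_x,\; g_y,\; 2\pi) \rVert}, \qquad
\text{RGB} = \operatorname{clip}(127.5\,\mathbf{n} + 127.5,\ 0,\ 255)
$$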

![toy](./images/toy.png)

![toy_depth](./images/toy_depth.png)

![toy_normal_out](./images/toy_normal_out.png)

## Scribble

Install the additional controlnet models package.

```sh
$ pip install git+https://github.com/patrickvonplaten/human_pose.git
```

```py
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from human_pose import HEDdetector

hed = HEDdetector.from_pretrained('lllyasviel/ControlNet')

image = Image.open('images/bag.png')

# scribble=True post-processes the HED boundaries into a scribble-style map.
image = hed(image, scribble=True)

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-scribble",
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("bag", image).images[0]

image.save('images/bag_scribble_out.png')
```

![bag](./images/bag.png)

![bag_scribble](./images/bag_scribble.png)

![bag_scribble_out](./images/bag_scribble_out.png)

## HED boundary

Install the additional controlnet models package.

```sh
$ pip install git+https://github.com/patrickvonplaten/human_pose.git
```

```py
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from human_pose import HEDdetector

hed = HEDdetector.from_pretrained('lllyasviel/ControlNet')

image = Image.open('images/man.png')

# Extract soft HED boundaries to use as the conditioning image.
image = hed(image)

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-hed",
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("oil painting of handsome old man, masterpiece", image).images[0]

image.save('images/man_hed_out.png')
```

![man](./images/man.png)

![man_hed](./images/man_hed.png)

![man_hed_out](./images/man_hed_out.png)