---

# Qwen-Image-ControlNet-Union

This repository provides a unified ControlNet that supports 4 common control types (canny, soft edge, depth, pose) for [Qwen-Image](https://github.com/QwenLM/Qwen-Image).

# Model Cards

```python
import torch
from diffusers.utils import load_image

# ControlNet support for Qwen-Image requires a recent diffusers build:
# https://github.com/huggingface/diffusers/pull/12215
# pip install git+https://github.com/huggingface/diffusers
from diffusers import QwenImageControlNetPipeline, QwenImageControlNetModel

base_model = "Qwen/Qwen-Image"
controlnet_model = "InstantX/Qwen-Image-ControlNet-Union"

controlnet = QwenImageControlNetModel.from_pretrained(controlnet_model, torch_dtype=torch.bfloat16)

pipe = QwenImageControlNetPipeline.from_pretrained(
    base_model, controlnet=controlnet, torch_dtype=torch.bfloat16
)
pipe.to("cuda")

# canny
control_image = load_image("conds/canny.png")
prompt = "Aesthetics art, traditional asian pagoda, elaborate golden accents, sky blue and white color palette, swirling cloud pattern, digital illustration, east asian architecture, ornamental rooftop, intricate detailing on building, cultural representation."
controlnet_conditioning_scale = 1.0

# soft edge
# control_image = load_image("conds/soft_edge.png")
# prompt = "Photograph of a young man with light brown hair jumping mid-air off a large, reddish-brown rock. He's wearing a navy blue sweater, light blue shirt, gray pants, and brown shoes. His arms are outstretched, and he has a slight smile on his face. The background features a cloudy sky and a distant, leafless tree line. The grass around the rock is patchy."
# controlnet_conditioning_scale = 1.0

# depth
# control_image = load_image("conds/depth.png")
# prompt = "A swanky, minimalist living room with a huge floor-to-ceiling window letting in loads of natural light. A beige couch with white cushions sits on a wooden floor, with a matching coffee table in front. The walls are a soft, warm beige, decorated with two framed botanical prints. A potted plant chills in the corner near the window. Sunlight pours through the leaves outside, casting cool shadows on the floor."
# controlnet_conditioning_scale = 1.0

# pose
# control_image = load_image("conds/pose.png")

image = pipe(
    ...  # generation arguments elided
)

image.save("qwenimage_cn_union_result.png")
```

# Inference Setting

You can adjust control strength via `controlnet_conditioning_scale`.
- Canny: use cv2.Canny, set `controlnet_conditioning_scale` in [0.8, 1.0]
- Soft Edge: use [AnylineDetector](https://github.com/huggingface/controlnet_aux), set `controlnet_conditioning_scale` in [0.8, 1.0]

We strongly recommend using detailed prompts, especially when they include text elements. For example, use "a poster with text 'InstantX Team' on the top" instead of "a poster".

For inference with multiple conditions, please refer to this [PR](https://github.com/huggingface/diffusers/pull/12215).

# ComfyUI Support

[ComfyUI](https://www.comfy.org/) offers native support for Qwen-Image-ControlNet-Union. [Visit](https://github.com/comfyanonymous/ComfyUI/pull/9488) for more details.

# Community Support

[Liblib AI](https://www.liblib.art/) offers native support for Qwen-Image-ControlNet-Union. [Visit](https://www.liblib.art/modelinfo/4d3f51c2bf1e4c51ae8dedd8c19da827?from=personal_page&versionUuid=5b5f21d2b80445598db19e924bd3a409) for more details.

# Limitations

We find that the model may fail to preserve some details, such as small-font text, unless the text is spelled out explicitly in the prompt.

# Acknowledgements

This model is developed by the InstantX Team. All rights reserved.