Commit 0a46e3b (verified) by wanghaofan · 1 Parent(s): 43f94b2

Update README.md

  1. README.md +16 -13
README.md CHANGED
@@ -14,7 +14,7 @@ base_model: Qwen/Qwen-Image
  ---

  # Qwen-Image-ControlNet-Union
- This repository provides a unified ControlNet that supports 4 control types (canny, soft edge, depth, pose) for [Qwen-Image](https://huggingface.co/Qwen/Qwen-Image).


  # Model Cards
@@ -48,19 +48,17 @@ This repository provides a unified ControlNet that supports 4 control types (can
  import torch
  from diffusers.utils import load_image

- # before merging, please import via local path
- from controlnet_qwenimage import QwenImageControlNetModel
- from transformer_qwenimage import QwenImageTransformer2DModel
- from pipeline_qwenimage_controlnet import QwenImageControlNetPipeline

  base_model = "Qwen/Qwen-Image"
  controlnet_model = "InstantX/Qwen-Image-ControlNet-Union"

  controlnet = QwenImageControlNetModel.from_pretrained(controlnet_model, torch_dtype=torch.bfloat16)
- transformer = QwenImageTransformer2DModel.from_pretrained(base_model, subfolder="transformer", torch_dtype=torch.bfloat16)

  pipe = QwenImageControlNetPipeline.from_pretrained(
-     base_model, controlnet=controlnet, transformer=transformer, torch_dtype=torch.bfloat16
  )
  pipe.to("cuda")

@@ -70,15 +68,15 @@ control_image = load_image("conds/canny.png")
  prompt = "Aesthetics art, traditional asian pagoda, elaborate golden accents, sky blue and white color palette, swirling cloud pattern, digital illustration, east asian architecture, ornamental rooftop, intricate detailing on building, cultural representation."
  controlnet_conditioning_scale = 1.0

- # soft edge, recommended scale: 0.8 - 1.0
  # control_image = load_image("conds/soft_edge.png")
  # prompt = "Photograph of a young man with light brown hair jumping mid-air off a large, reddish-brown rock. He's wearing a navy blue sweater, light blue shirt, gray pants, and brown shoes. His arms are outstretched, and he has a slight smile on his face. The background features a cloudy sky and a distant, leafless tree line. The grass around the rock is patchy."
- # controlnet_conditioning_scale = 0.9

  # depth
  # control_image = load_image("conds/depth.png")
  # prompt = "A swanky, minimalist living room with a huge floor-to-ceiling window letting in loads of natural light. A beige couch with white cushions sits on a wooden floor, with a matching coffee table in front. The walls are a soft, warm beige, decorated with two framed botanical prints. A potted plant chills in the corner near the window. Sunlight pours through the leaves outside, casting cool shadows on the floor."
- # controlnet_conditioning_scale = 0.9

  # pose
  # control_image = load_image("conds/pose.png")
@@ -99,7 +97,7 @@ image = pipe(
  image.save("qwenimage_cn_union_result.png")
  ```

- # Recommended Parameters
  You can adjust control strength via controlnet_conditioning_scale.
  - Canny: use cv2.Canny, set controlnet_conditioning_scale in [0.8, 1.0]
  - Soft Edge: use [AnylineDetector](https://github.com/huggingface/controlnet_aux), set controlnet_conditioning_scale in [0.8, 1.0]
@@ -108,11 +106,16 @@ You can adjust control strength via controlnet_conditioning_scale.

  We strongly recommend using detailed prompts, especially when including text elements. For example, use "a poster with text 'InstantX Team' on the top" instead of "a poster".

  # Community Support
- [Liblib AI](https://www.liblib.art/) offers native support for Qwen-Image-ControlNet-Union. [Visit](https://www.liblib.art/) for more details.

  # Limitations
- We find that the model was unable to preserve some details, such as small font text.

  # Acknowledgements
  This model is developed by the InstantX Team. All rights reserved.
 
  ---

  # Qwen-Image-ControlNet-Union
+ This repository provides a unified ControlNet that supports 4 common control types (canny, soft edge, depth, pose) for [Qwen-Image](https://github.com/QwenLM/Qwen-Image).


  # Model Cards
 
  import torch
  from diffusers.utils import load_image

+ # https://github.com/huggingface/diffusers/pull/12215
+ # pip install git+https://github.com/huggingface/diffusers
+ from diffusers import QwenImageControlNetPipeline, QwenImageControlNetModel

  base_model = "Qwen/Qwen-Image"
  controlnet_model = "InstantX/Qwen-Image-ControlNet-Union"

  controlnet = QwenImageControlNetModel.from_pretrained(controlnet_model, torch_dtype=torch.bfloat16)

  pipe = QwenImageControlNetPipeline.from_pretrained(
+     base_model, controlnet=controlnet, torch_dtype=torch.bfloat16
  )
  pipe.to("cuda")

 
  prompt = "Aesthetics art, traditional asian pagoda, elaborate golden accents, sky blue and white color palette, swirling cloud pattern, digital illustration, east asian architecture, ornamental rooftop, intricate detailing on building, cultural representation."
  controlnet_conditioning_scale = 1.0

+ # soft edge
  # control_image = load_image("conds/soft_edge.png")
  # prompt = "Photograph of a young man with light brown hair jumping mid-air off a large, reddish-brown rock. He's wearing a navy blue sweater, light blue shirt, gray pants, and brown shoes. His arms are outstretched, and he has a slight smile on his face. The background features a cloudy sky and a distant, leafless tree line. The grass around the rock is patchy."
+ # controlnet_conditioning_scale = 1.0

  # depth
  # control_image = load_image("conds/depth.png")
  # prompt = "A swanky, minimalist living room with a huge floor-to-ceiling window letting in loads of natural light. A beige couch with white cushions sits on a wooden floor, with a matching coffee table in front. The walls are a soft, warm beige, decorated with two framed botanical prints. A potted plant chills in the corner near the window. Sunlight pours through the leaves outside, casting cool shadows on the floor."
+ # controlnet_conditioning_scale = 1.0

  # pose
  # control_image = load_image("conds/pose.png")
 
  image.save("qwenimage_cn_union_result.png")
  ```

+ # Inference Setting
  You can adjust control strength via controlnet_conditioning_scale.
  - Canny: use cv2.Canny, set controlnet_conditioning_scale in [0.8, 1.0]
  - Soft Edge: use [AnylineDetector](https://github.com/huggingface/controlnet_aux), set controlnet_conditioning_scale in [0.8, 1.0]
 

  We strongly recommend using detailed prompts, especially when including text elements. For example, use "a poster with text 'InstantX Team' on the top" instead of "a poster".

+ For multi-condition inference, please refer to this [PR](https://github.com/huggingface/diffusers/pull/12215).
+
+ # ComfyUI Support
+ [ComfyUI](https://www.comfy.org/) offers native support for Qwen-Image-ControlNet-Union. [Visit](https://github.com/comfyanonymous/ComfyUI/pull/9488) for more details.
+
  # Community Support
+ [Liblib AI](https://www.liblib.art/) offers native support for Qwen-Image-ControlNet-Union. [Visit](https://www.liblib.art/modelinfo/4d3f51c2bf1e4c51ae8dedd8c19da827?from=personal_page&versionUuid=5b5f21d2b80445598db19e924bd3a409) for more details.

  # Limitations
+ We find that the model is unable to preserve some details, such as small-font text, unless the prompt includes an explicit 'TEXT'.

  # Acknowledgements
  This model is developed by the InstantX Team. All rights reserved.
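
The scale guidance in the README (canny and soft edge both in [0.8, 1.0]) could be encoded as a small guard before calling the pipeline. The sketch below is only an illustration: `RECOMMENDED_SCALE` and `clamp_conditioning_scale` are hypothetical names, not part of diffusers or this repository, and the depth and pose ranges are left out because they are not visible in the hunks of this diff.

```python
# Recommended controlnet_conditioning_scale ranges, copied from the README
# lines shown in this diff. Depth and pose are intentionally omitted because
# their ranges do not appear in the changed hunks.
RECOMMENDED_SCALE = {
    "canny": (0.8, 1.0),
    "soft_edge": (0.8, 1.0),
}


def clamp_conditioning_scale(control_type: str, scale: float) -> float:
    """Clamp `scale` into the recommended range for `control_type`.

    Hypothetical helper for illustration; raises ValueError when no range
    is recorded for the requested control type.
    """
    try:
        low, high = RECOMMENDED_SCALE[control_type]
    except KeyError:
        raise ValueError(
            f"no recommended range recorded for {control_type!r}"
        ) from None
    return min(max(scale, low), high)


if __name__ == "__main__":
    print(clamp_conditioning_scale("canny", 1.3))      # prints 1.0
    print(clamp_conditioning_scale("soft_edge", 0.5))  # prints 0.8
```

The returned value would then be passed as `controlnet_conditioning_scale` in the README example. Clamping rather than rejecting out-of-range values is a design choice made here for convenience; the README only states recommendations, not hard limits.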