Qwen-Image Image Structure Control Model - Depth ControlNet

Model Introduction

This model is an image structure control model trained on top of Qwen-Image. Its architecture is ControlNet, which lets it control the structure of the generated image according to a depth map. The training framework is built on DiffSynth-Studio, and the dataset used is BLIP3o.
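To make the idea of a depth control image concrete, here is a minimal sketch of turning raw depth values into an 8-bit grayscale grid. The function name `depth_to_gray` and the `near_is_bright` flag are hypothetical, and the "near is bright" convention is an assumption that may not match the depth maps this model was trained on; check it against the example depth image before relying on it.

```python
def depth_to_gray(depth, near_is_bright=True):
    """Scale a 2D grid of depth values to 8-bit gray levels (0-255).

    `depth` is a non-empty list of rows of numbers. The brightness
    convention is an assumption, not something the model card specifies.
    """
    flat = [d for row in depth for d in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A perfectly flat scene carries no structure; return mid-gray.
        return [[128 for _ in row] for row in depth]
    out = []
    for row in depth:
        gray_row = []
        for d in row:
            t = (d - lo) / span  # 0.0 at the nearest point, 1.0 at the farthest
            if near_is_bright:
                t = 1.0 - t
            gray_row.append(round(t * 255))
        out.append(gray_row)
    return out


gray = depth_to_gray([[1.0, 2.0], [4.0, 5.0]])
print(gray)
```

In practice a depth estimator would produce the depth values, and the resulting grid would be converted to an image before being used as the control input.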

Effect Demonstration

[Image table: Structure Map | Generated Image 1 | Generated Image 2]

Inference Code

Install DiffSynth-Studio from source:

git clone https://github.com/modelscope/DiffSynth-Studio.git
cd DiffSynth-Studio
pip install -e .

Then run the following Python script:

from diffsynth.pipelines.qwen_image import QwenImagePipeline, ModelConfig, ControlNetInput
from PIL import Image
import torch
from modelscope import dataset_snapshot_download


# Load the Qwen-Image base components plus the depth ControlNet weights.
pipe = QwenImagePipeline.from_pretrained(
    torch_dtype=torch.bfloat16,
    device="cuda",
    model_configs=[
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="transformer/diffusion_pytorch_model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="text_encoder/model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="vae/diffusion_pytorch_model.safetensors"),
        ModelConfig(model_id="DiffSynth-Studio/Qwen-Image-Blockwise-ControlNet-Depth", origin_file_pattern="model.safetensors"),
    ],
    tokenizer_config=ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="tokenizer/"),
)

# Download the example depth map that serves as the control image.
dataset_snapshot_download(
    dataset_id="DiffSynth-Studio/example_image_dataset",
    local_dir="./data/example_image_dataset",
    allow_file_pattern="depth/image_1.jpg"
)

controlnet_image = Image.open("data/example_image_dataset/depth/image_1.jpg").resize((1328, 1328))

prompt = "Exquisite portrait of an underwater girl with flowing blue dress and fluttering hair. Transparent light and shadow, surrounded by bubbles. Her face is serene, with exquisite details and dreamy beauty."
# Generate an image whose structure follows the depth map.
image = pipe(
    prompt, seed=0,
    blockwise_controlnet_inputs=[ControlNetInput(image=controlnet_image)]
)
image.save("image.jpg")
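The script above resizes the depth map to a fixed 1328x1328 square, which stretches non-square inputs. The helper below is a sketch of one way to pick a size that keeps the aspect ratio while staying near the same pixel count. The name `fit_size` is hypothetical, and rounding each side to a multiple of 16 is an assumption about what the pipeline expects rather than a documented requirement of this model.

```python
import math


def fit_size(width, height, target_area=1328 * 1328, multiple=16):
    """Pick an output size near `target_area` that keeps the aspect ratio.

    Both sides are rounded to the nearest `multiple` (assumed to be 16).
    """
    scale = math.sqrt(target_area / (width * height))
    new_w = max(multiple, round(width * scale / multiple) * multiple)
    new_h = max(multiple, round(height * scale / multiple) * multiple)
    return new_w, new_h


print(fit_size(1024, 1024))
print(fit_size(1920, 1080))
```

The returned pair could be passed to `Image.resize` for the control image. If the pipeline call accepts `height` and `width` keyword arguments, the same values would be passed there so that the generated image matches the control image.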

license: apache-2.0

Model size: 1.13B params (Safetensors), tensor type BF16

Model tree for SahilCarterr/Qwen-Image-Blockwise-ControlNet-Depth

Base model: Qwen/Qwen-Image
Adapters: 40, including this model