Text-guided depth-to-image generation

[StableDiffusionDepth2ImgPipeline] lets you pass a text prompt and an initial image to condition the generation of a new image. You can also pass a depth_map to preserve the structure of the image. If no depth_map is provided, the pipeline predicts the depth automatically with its integrated depth-estimation model.
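
Passing your own depth_map could look roughly like the sketch below. It assumes the pipeline accepts depth_map as a torch.Tensor laid out as (batch, height, width) and rescales it internally; the helper names synthetic_depth_rows and generate_with_depth, and the left-to-right ramp itself, are hypothetical and serve only to show the shape of the argument. In practice the depth values would come from a sensor or a separate depth estimator.

```python
def synthetic_depth_rows(height, width):
    """Hypothetical helper: a left-to-right ramp of relative depth values in [0, 1]."""
    if width < 2:
        raise ValueError("width must be at least 2")
    row = [x / (width - 1) for x in range(width)]
    # Every row gets its own copy of the same ramp.
    return [list(row) for _ in range(height)]


def generate_with_depth(pipe, init_image, prompt, depth_rows, **kwargs):
    """Call the pipeline with an explicit depth map instead of a predicted one."""
    import torch  # imported lazily so the helper above works without torch

    # Assumed layout: (batch, height, width), here a batch of one.
    depth_map = torch.tensor(depth_rows, dtype=torch.float32).unsqueeze(0)
    result = pipe(prompt=prompt, image=init_image, depth_map=depth_map, **kwargs)
    return result.images[0]
```

Once pipe and init_image exist (both are created in the steps that follow), a call might be generate_with_depth(pipe, init_image, "two tigers", synthetic_depth_rows(64, 64), strength=0.7). A completely flat depth map is best avoided, since a min-max rescale of constant values is undefined.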

Start by creating an instance of [StableDiffusionDepth2ImgPipeline]:

import torch
import requests
from PIL import Image

from diffusers import StableDiffusionDepth2ImgPipeline

# Load the depth-conditioned checkpoint in half precision and move it to the GPU
pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth",
    torch_dtype=torch.float16,
).to("cuda")

Now pass your prompt to the pipeline. You can also pass a negative_prompt to keep certain words from guiding how the image is generated:

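# Download the initial image that conditions the generation
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
init_image = Image.open(requests.get(url, stream=True).raw)
prompt = "two tigers"
# Terms the generation should steer away from
n_prompt = "bad, deformed, ugly, bad anatomy"
# strength sets how far the result may depart from the initial image
image = pipe(prompt=prompt, image=init_image, negative_prompt=n_prompt, strength=0.7).images[0]
image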

[Figure: the input image and the generated output, side by side]
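
The strength=0.7 argument in the call above is not explained in the text. As a rough illustration, assuming the pipeline follows the usual diffusers image-to-image convention, strength decides what fraction of the denoising schedule is actually run on the noised initial image. The function name denoising_steps_run is hypothetical, and the step count of 50 is an assumed default rather than a value taken from this page.

```python
def denoising_steps_run(num_inference_steps, strength):
    """Estimate how many denoising steps are executed for a given strength."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    # Assumption: the initial image is noised up to `strength` of the schedule,
    # so only that fraction of the steps is run afterwards.
    return min(int(num_inference_steps * strength), num_inference_steps)


# Print the estimate for a few strengths with an assumed 50-step schedule.
for s in (0.3, 0.7, 1.0):
    print(f"strength={s}: {denoising_steps_run(50, s)} of 50 steps")
```

Under this reading, lower values stay closer to the initial image, while a strength of 1.0 runs the full schedule and keeps little of the initial image beyond the depth conditioning.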

Play around with the Spaces below and see whether you notice a difference between images generated with and without a depth map!