Update README.md
README.md
CHANGED
@@ -15,8 +15,10 @@ pipeline_tag: image-to-image

 This model pipeline demonstrates an advanced workflow for restoring grayscale images, performing inpainting, and converting them to RGB. The pipeline leverages two models based on the Stable Diffusion 2 architecture:

-1. **Gray-Inpainting Model**: Fills missing regions of a grayscale image using a masked inpainting process.
-

 ---

@@ -39,41 +41,43 @@ This model pipeline demonstrates an advanced workflow for restoring grayscale im
 ```python
 import torch
 import numpy as np
 from PIL import Image
 from diffusers.utils import load_image
-from transformers import AutoModel

-# Load and preprocess images
 img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
 mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"

-image_gray = load_image(img_url).resize((512, 512)).convert('L').convert('RGB')
 mask_image = load_image(mask_url).resize((512, 512))
-mask = (np.array(mask_image)
-image_gray_masked = Image.fromarray(((1

-# Load
 gray_inpaintor = AutoModel.from_pretrained(
     'jwengr/stable-diffusion-2-gray-inpaint-to-rgb',
     subfolder='gray-inpaint',
-    trust_remote_code=True
 )
 gray2rgb = AutoModel.from_pretrained(
     'jwengr/stable-diffusion-2-gray-inpaint-to-rgb',
     subfolder='gray2rgb',
-    trust_remote_code=True
 )

-
 gray_inpaintor.to('cuda')
 gray2rgb.to('cuda')

-#
 # gray2rgb.unet.enable_xformers_memory_efficient_attention()
 # gray_inpaintor.unet.enable_xformers_memory_efficient_attention()

-
-with torch.autocast('cuda', dtype=torch.bfloat16):
     with torch.no_grad():
-
         image_restored = gray2rgb(image_gray_restored.convert('RGB'))


 This model pipeline demonstrates an advanced workflow for restoring grayscale images, performing inpainting, and converting them to RGB. The pipeline leverages two models based on the Stable Diffusion 2 architecture:

+1. **Gray-Inpainting Model**: Fills missing regions of a grayscale image using a masked inpainting process based on an **autoencoder (AE)** instead of a variational autoencoder (VAE). This simplifies the model while retaining high-quality reconstruction for the inpainted areas.
+
+2. **Gray-to-RGB Conversion Model**: Converts the grayscale image (or inpainted output) into a full-color RGB image by introducing a **residual path in the autoencoder (AE)**. Instead of utilizing a diffusion process, the model directly predicts the latent representation of the color image, enabling efficient and accurate conversion.
+
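The masked input that the gray-inpainting model is asked to fill can be sketched with plain NumPy. The snippet below is a minimal illustration on synthetic arrays, assuming (as in the usage code of this README) that mask values above 128 mark the region to fill. The helper names `binarize_mask` and `apply_mask` are hypothetical and are not part of the repository.

```python
import numpy as np


def binarize_mask(mask_rgb: np.ndarray, threshold: int = 128) -> np.ndarray:
    """Return 1 where the mask is brighter than `threshold` (region to fill), else 0."""
    return (mask_rgb > threshold).astype(np.uint8)


def apply_mask(image_rgb: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out the pixels selected by the binary mask; keep all others unchanged."""
    return ((1 - mask) * image_rgb).astype(np.uint8)


# Synthetic 4x4 grayscale image replicated to 3 channels (hypothetical data).
gray = np.full((4, 4), 200, dtype=np.uint8)
image_rgb = np.stack([gray] * 3, axis=-1)

# Synthetic mask: white (255) in the top-left 2x2 corner marks the hole.
mask_rgb = np.zeros((4, 4, 3), dtype=np.uint8)
mask_rgb[:2, :2] = 255

mask = binarize_mask(mask_rgb)
masked = apply_mask(image_rgb, mask)
```

Here `masked` plays the role of `image_gray_masked` in the README usage code: the hole is black and the remaining pixels keep their gray value in all three channels.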

 ---

 ```python
 import torch
 import numpy as np
+
 from PIL import Image
 from diffusers.utils import load_image
+from transformers import AutoConfig, AutoModel, ModelCard

 img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
 mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"

+image_gray = load_image(img_url).resize((512, 512)).convert('L').convert('RGB')  # the image must have 3 channels
 mask_image = load_image(mask_url).resize((512, 512))
+mask = (np.array(mask_image)>128)*1
+image_gray_masked = Image.fromarray(((1-mask) * np.array(image_gray)).astype(np.uint8))

+# Load the gray-inpaint model
 gray_inpaintor = AutoModel.from_pretrained(
     'jwengr/stable-diffusion-2-gray-inpaint-to-rgb',
     subfolder='gray-inpaint',
+    trust_remote_code=True,
 )
+
+# Load the gray2rgb model
 gray2rgb = AutoModel.from_pretrained(
     'jwengr/stable-diffusion-2-gray-inpaint-to-rgb',
     subfolder='gray2rgb',
+    trust_remote_code=True,
 )

+# Move models to GPU
 gray_inpaintor.to('cuda')
 gray2rgb.to('cuda')

+# Optionally enable memory-efficient attention (requires xformers)
 # gray2rgb.unet.enable_xformers_memory_efficient_attention()
 # gray_inpaintor.unet.enable_xformers_memory_efficient_attention()

+with torch.autocast('cuda',dtype=torch.bfloat16):
     with torch.no_grad():
+        # Each model accepts a PIL.Image, a List[PIL.Image], or a preprocessed tensor of shape (B, 3, H, W); images must have 3 channels.
+        image_gray_restored = gray_inpaintor(image_gray_masked, num_inference_steps=250, seed=10)[0].convert('L')  # optionally pass the 'mask' argument explicitly as a Tensor of shape (B, 1, 512, 512)
         image_restored = gray2rgb(image_gray_restored.convert('RGB'))