Upload folder using huggingface_hub
.gitattributes
CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33 |
*.zip filter=lfs diff=lfs merge=lfs -text
|
34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
33 |
*.zip filter=lfs diff=lfs merge=lfs -text
|
34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
36 |
+
inpaint-examples-min.png filter=lfs diff=lfs merge=lfs -text
|
37 |
+
*.gguf filter=lfs diff=lfs merge=lfs -text
|
README.md
ADDED
@@ -0,0 +1,116 @@
1 |
+
|
2 |
+
---
|
3 |
+
license: openrail++
|
4 |
+
base_model: stabilityai/stable-diffusion-xl-base-1.0
|
5 |
+
tags:
|
6 |
+
- stable-diffusion-xl
|
7 |
+
- stable-diffusion-xl-diffusers
|
8 |
+
- text-to-image
|
9 |
+
- diffusers
|
10 |
+
- inpainting
|
11 |
+
inference: false
|
12 |
+
---
|
13 |
+
|
14 |
+
# stable-diffusion-xl-inpainting-1.0-GGUF
|
15 |
+
|
16 |
+
!!! Experimental: supported only by [gpustack/llama-box v0.0.98+](https://github.com/gpustack/llama-box) !!!
|
17 |
+
|
18 |
+
**Model creator**: [Diffusers](https://huggingface.co/diffusers)<br/>
|
19 |
+
**Original model**: [stable-diffusion-xl-1.0-inpainting-0.1](https://huggingface.co/diffusers/stable-diffusion-xl-1.0-inpainting-0.1)<br/>
|
20 |
+
**GGUF quantization**: based on stable-diffusion.cpp [ac54e](https://github.com/leejet/stable-diffusion.cpp/commit/ac54e0076052a196b7df961eb1f792c9ff4d7f22), as patched by llama-box.<br/>
|
21 |
+
|
22 |
+
| Quantization | OpenAI CLIP ViT-L/14 Quantization | OpenCLIP ViT-G/14 Quantization | VAE Quantization |
|
23 |
+
| --- | --- | --- | --- |
|
24 |
+
| FP16 | FP16 | FP16 | FP16 |
|
25 |
+
| Q8_0 | FP16 | FP16 | FP16 |
|
26 |
+
| Q4_1 | FP16 | FP16 | FP16 |
|
27 |
+
| Q4_0 | FP16 | FP16 | FP16 |
|
28 |
+
|
29 |
+
|
30 |
+
# SD-XL Inpainting 0.1 Model Card
|
31 |
+
|
32 |
+
![inpaint-example](inpaint-examples-min.png)
|
33 |
+
|
34 |
+
SD-XL Inpainting 0.1 is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask.
|
35 |
+
|
36 |
+
SD-XL Inpainting 0.1 was initialized with the `stable-diffusion-xl-base-1.0` weights. The model was trained for 40k steps at resolution 1024x1024, with 5% dropping of the text-conditioning to improve classifier-free guidance sampling. For inpainting, the UNet has 5 additional input channels (4 for the encoded masked image and 1 for the mask itself) whose weights were zero-initialized after restoring the non-inpainting checkpoint. During training, we generate synthetic masks and, in 25% of cases, mask everything.
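
The effect of zero-initializing the extra input channels can be illustrated with a small, dependency-free sketch. It is not the actual UNet code: `conv1x1`, `extend_input_channels`, the toy weights, and the channel order used in the concatenation are all assumptions made for illustration. The point it shows is that, right after the base checkpoint is restored, the 9-channel inpainting input produces the same output as the 4-channel base input, because the 5 new channels contribute nothing until training updates their weights.

```python
# Toy illustration only: a 1x1 "convolution" evaluated at a single pixel.
# All names and values are hypothetical, not taken from the model code.

def conv1x1(weights, x):
    """weights: [out_channels][in_channels]; x: [in_channels] -> [out_channels]."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def extend_input_channels(weights, extra):
    """Append `extra` zero-initialized input channels to every output row."""
    return [row + [0.0] * extra for row in weights]


# Stand-in for the restored base weights of the first layer (4 input channels).
base_weights = [
    [0.5, -1.0, 2.0, 0.25],
    [1.0, 0.0, -0.5, 3.0],
]
# 5 extra channels: 1 for the mask and 4 for the encoded masked image.
inpaint_weights = extend_input_channels(base_weights, extra=5)

noisy_latents = [0.1, 0.2, 0.3, 0.4]          # 4 channels
mask = [1.0]                                  # 1 channel
masked_image_latents = [0.9, 0.8, 0.7, 0.6]   # 4 channels

# Assumed concatenation order; what matters here is the total of 9 channels.
unet_input = noisy_latents + mask + masked_image_latents

base_out = conv1x1(base_weights, noisy_latents)
inpaint_out = conv1x1(inpaint_weights, unet_input)

print(len(unet_input), base_out == inpaint_out)
```

Starting from an identical function means fine-tuning begins exactly where the base model left off, rather than from a layer perturbed by randomly initialized weights.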
|
37 |
+
|
38 |
+
|
39 |
+
## How to use
|
40 |
+
|
41 |
+
```py
|
42 |
+
from diffusers import AutoPipelineForInpainting
|
43 |
+
from diffusers.utils import load_image
|
44 |
+
import torch
|
45 |
+
|
46 |
+
pipe = AutoPipelineForInpainting.from_pretrained("diffusers/stable-diffusion-xl-1.0-inpainting-0.1", torch_dtype=torch.float16, variant="fp16").to("cuda")
|
47 |
+
|
48 |
+
img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
|
49 |
+
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"
|
50 |
+
|
51 |
+
image = load_image(img_url).resize((1024, 1024))
|
52 |
+
mask_image = load_image(mask_url).resize((1024, 1024))
|
53 |
+
|
54 |
+
prompt = "a tiger sitting on a park bench"
|
55 |
+
generator = torch.Generator(device="cuda").manual_seed(0)
|
56 |
+
|
57 |
+
image = pipe(
|
58 |
+
prompt=prompt,
|
59 |
+
image=image,
|
60 |
+
mask_image=mask_image,
|
61 |
+
guidance_scale=8.0,
|
62 |
+
num_inference_steps=20, # steps between 15 and 30 work well for us
|
63 |
+
strength=0.99, # make sure to use `strength` below 1.0
|
64 |
+
generator=generator,
|
65 |
+
).images[0]
|
66 |
+
```
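
As a rough guide to how `strength` interacts with `num_inference_steps`, the sketch below mirrors the step-skipping arithmetic commonly used by diffusers-style image-to-image and inpainting pipelines. The helper name `denoising_steps_run` is hypothetical, and the exact behaviour of a given pipeline version is an assumption to verify against its source.

```python
def denoising_steps_run(num_inference_steps: int, strength: float) -> int:
    """Estimate how many scheduler steps actually run for a given strength.

    Assumed rule: the first (1 - strength) share of the schedule is skipped,
    so the input image is only partially noised before denoising starts.
    """
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    return num_inference_steps - t_start


for strength in (0.5, 0.99, 1.0):
    print(strength, denoising_steps_run(20, strength))
```

Under this assumption, `strength=0.99` with 20 steps skips only the first step, so almost the full schedule runs while the masked region still starts from the (heavily noised) input rather than from pure noise.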
|
67 |
+
|
68 |
+
**How it works:**
|
69 |
+
`image` | `mask_image`
|
70 |
+
:-------------------------:|:-------------------------:|
|
71 |
+
<img src="https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png" alt="drawing" width="300"/> | <img src="https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png" alt="drawing" width="300"/>
|
72 |
+
|
73 |
+
|
74 |
+
`prompt` | `Output`
|
75 |
+
:-------------------------:|:-------------------------:|
|
76 |
+
<span style="position: relative;bottom: 150px;">a tiger sitting on a park bench</span> | <img src="https://huggingface.co/datasets/valhalla/images/resolve/main/tiger.png" alt="drawing" width="300"/>
|
77 |
+
|
78 |
+
## Model Description
|
79 |
+
|
80 |
+
- **Developed by:** The Diffusers team
|
81 |
+
- **Model type:** Diffusion-based text-to-image generative model
|
82 |
+
- **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md)
|
83 |
+
- **Model Description:** This is a model that can be used to generate and modify images based on text prompts. It is a [Latent Diffusion Model](https://arxiv.org/abs/2112.10752) that uses two fixed, pretrained text encoders ([OpenCLIP-ViT/G](https://github.com/mlfoundations/open_clip) and [CLIP-ViT/L](https://github.com/openai/CLIP/tree/main)).
|
84 |
+
|
85 |
+
|
86 |
+
## Uses
|
87 |
+
|
88 |
+
### Direct Use
|
89 |
+
|
90 |
+
The model is intended for research purposes only. Possible research areas and tasks include
|
91 |
+
|
92 |
+
- Generation of artworks and use in design and other artistic processes.
|
93 |
+
- Applications in educational or creative tools.
|
94 |
+
- Research on generative models.
|
95 |
+
- Safe deployment of models which have the potential to generate harmful content.
|
96 |
+
- Probing and understanding the limitations and biases of generative models.
|
97 |
+
|
98 |
+
Excluded uses are described below.
|
99 |
+
|
100 |
+
### Out-of-Scope Use
|
101 |
+
|
102 |
+
The model was not trained to produce factual or true representations of people or events, and therefore using the model to generate such content is out-of-scope for the abilities of this model.
|
103 |
+
|
104 |
+
## Limitations and Bias
|
105 |
+
|
106 |
+
### Limitations
|
107 |
+
|
108 |
+
- The model does not achieve perfect photorealism.
|
109 |
+
- The model cannot render legible text.
|
110 |
+
- The model struggles with more difficult tasks which involve compositionality, such as rendering an image corresponding to “A red cube on top of a blue sphere”.
|
111 |
+
- Faces and people in general may not be generated properly.
|
112 |
+
- The autoencoding part of the model is lossy.
|
113 |
+
- When the strength parameter is set to 1 (i.e. starting in-painting from a fully masked image), the quality of the image is degraded. The model retains the non-masked contents of the image, but images look less sharp. We're investigating this and working on the next version.
|
114 |
+
|
115 |
+
### Bias
|
116 |
+
While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.
|
inpaint-examples-min.png
ADDED
stable-diffusion-xl-inpainting-1.0-FP16.gguf
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:49381c731b1cfef58340a64d647a7be45938f805b4493f8e354efa6358bbf676
|
3 |
+
size 6937989920
|
stable-diffusion-xl-inpainting-1.0-Q4_0.gguf
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:608375abc7d6abce2aca0512719e701c3ce8ddee306dcb1a67b4cfcc1911e386
|
3 |
+
size 3772771520
|
stable-diffusion-xl-inpainting-1.0-Q4_1.gguf
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:1eba9873f3f353be1621a12e7b564f27e062fd280eb8ec93970ec00868701bac
|
3 |
+
size 3910386528
|
stable-diffusion-xl-inpainting-1.0-Q8_0.gguf
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:4a08b13f91933e06625a7026c787408100e2116326aeda961c9a428e61c2d9c0
|
3 |
+
size 4873718560
|
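
The four `.gguf` entries above are Git LFS pointer files, not the model weights themselves. A minimal sketch for reading them and comparing file sizes follows; `parse_lfs_pointer` and the `POINTERS` dictionary are hypothetical helpers written for this illustration, with the pointer text copied from this commit.

```python
# Pointer contents copied from the LFS pointer files added in this commit.
POINTERS = {
    "FP16": (
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:49381c731b1cfef58340a64d647a7be45938f805b4493f8e354efa6358bbf676\n"
        "size 6937989920\n"
    ),
    "Q8_0": (
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:4a08b13f91933e06625a7026c787408100e2116326aeda961c9a428e61c2d9c0\n"
        "size 4873718560\n"
    ),
    "Q4_1": (
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:1eba9873f3f353be1621a12e7b564f27e062fd280eb8ec93970ec00868701bac\n"
        "size 3910386528\n"
    ),
    "Q4_0": (
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:608375abc7d6abce2aca0512719e701c3ce8ddee306dcb1a67b4cfcc1911e386\n"
        "size 3772771520\n"
    ),
}


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file, in bytes
    }


parsed = {name: parse_lfs_pointer(text) for name, text in POINTERS.items()}
fp16_size = parsed["FP16"]["size"]
ratios = {name: p["size"] / fp16_size for name, p in parsed.items()}

for name, p in parsed.items():
    print(f"{name}: {p['size'] / 2**30:.2f} GiB, {ratios[name]:.1%} of FP16")
```

The 4-bit files come out at a little over half the FP16 size rather than about a quarter. A plausible reason, given the quantization table in the README, is that both text encoders and the VAE stay in FP16 in every variant.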