GabeIsHaxkee committed on
Commit
fb8dafa
·
1 Parent(s): f165a3f

Upload 9 files

Files changed (9)
  1. CHANGELOG.md +12 -0
  2. LICENSE +26 -0
  3. LICENSE-MODEL +175 -0
  4. README.md +346 -11
  5. requirements-dev.txt +2 -0
  6. requirements-test.txt +3 -0
  7. requirements.txt +15 -0
  8. setup.cfg +13 -0
  9. setup.py +57 -0
CHANGELOG.md ADDED
@@ -0,0 +1,12 @@
1
+
2
+ v1.0.1
3
+ ------
4
+
5
+ - renamed main model `IF-I-IF` --> `IF-I-XL`
6
+ - moved dir `notebooks` to HF storage https://huggingface.co/DeepFloyd/IF-I-XL-v1.0/tree/main/notebooks; let's keep new notebooks there;
7
+ - added additional kaggle notebook (more free GPU resources) how to generate pictures 1k: [![Kaggle](https://kaggle.com/static/images/open-in-kaggle.svg)](https://www.kaggle.com/code/shonenkov/deepfloyd-if-4-3b-generator-of-pictures)
8
+
9
+ v1.0.0
10
+ ------
11
+
12
+ - initial version
LICENSE ADDED
@@ -0,0 +1,26 @@
1
+ Copyright (c) 2023 DeepFloyd, StabilityAI
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining a copy
4
+ of this software and associated documentation files (the "Software"), to deal
5
+ in the Software without restriction, including without limitation the rights
6
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
7
+ copies of the Software, and to permit persons to whom the Software is
8
+ furnished to do so, subject to the following conditions:
9
+
10
+ 1. The above copyright notice and this permission notice shall be included in all
11
+ copies or substantial portions of the Software.
12
+
13
+ 2. All persons obtaining a copy or substantial portion of the Software,
14
+ a modified version of the Software (or substantial portion thereof), or
15
+ a derivative work based upon this Software (or substantial portion thereof)
16
+ must not delete, remove, disable, diminish, or circumvent any inference filters or
17
+ inference filter mechanisms in the Software, or any portion of the Software that
18
+ implements any such filters or filter mechanisms.
19
+
20
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
21
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
22
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
23
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
24
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
25
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
26
+ SOFTWARE.
LICENSE-MODEL ADDED
@@ -0,0 +1,175 @@
1
+ DEEPFLOYD IF LICENSE AGREEMENT
2
+
3
+ This License Agreement (as may be amended in accordance with this License Agreement, “License”),
4
+ between you, or your employer or other entity (if you are entering into this agreement on behalf
5
+ of your employer or other entity) (“Licensee” or “you”) and Stability AI Ltd. (“Stability AI” or “we”)
6
+ applies to your use of any computer program, algorithm, source code, object code, or software that is made
7
+ available by Stability AI under this License (“Software”) and any specifications, manuals, documentation,
8
+ and other written information provided by Stability AI related to the Software (“Documentation”).
9
+ By clicking “I Accept” below or by using the Software, you agree to the terms of this License.
10
+ If you do not agree to this License, then you do not have any rights to use the Software or
11
+ Documentation (collectively, the “Software Products”), and you must immediately cease using
12
+ the Software Products. If you are agreeing to be bound by the terms of this License on behalf
13
+ of your employer or other entity, you represent and warrant to Stability AI that you have full legal
14
+ authority to bind your employer or such entity to this License. If you do not have the requisite authority,
15
+ you may not accept the License or access the Software Products on behalf of your employer or other entity.
16
+
17
+ 1. LICENSE GRANT
18
+
19
+ a. Subject to your compliance with the Documentation and Sections 2, 3, and 5, Stability AI grants
20
+ you a non-exclusive, worldwide, non-transferable, non-sublicensable, revocable, royalty free and limited
21
+ license under Stability AI’s copyright interests to reproduce, distribute, and create derivative works of
22
+ the Software solely for your non-commercial research purposes. The foregoing license is personal to you,
23
+ and you may not assign or sublicense this License or any other rights or obligations under this License
24
+ without Stability AI’s prior written consent; any such assignment or sublicense will be void and will
25
+ automatically and immediately terminate this License.
26
+
27
+ b. You may make a reasonable number of copies of the Documentation solely for use in connection with
28
+ the license to the Software granted above.
29
+
30
+ c. The grant of rights expressly set forth in this Section 1 (License Grant) are the complete
31
+ grant of rights to you in the Software Products, and no other licenses are granted, whether by waiver,
32
+ estoppel, implication, equity or otherwise. Stability AI and its licensors reserve all rights
33
+ not expressly granted by this License.
34
+
35
+
36
+ 2. RESTRICTIONS
37
+
38
+ You will not, and will not permit, assist or cause any third party to:
39
+
40
+ a. use, modify, copy, reproduce, create derivative works of, or distribute the Software Products
41
+ (or any derivative works thereof, works incorporating the Software Products, or any data produced
42
+ by the Software), in whole or in part, for (i) any commercial or production purposes,
43
+ (ii) military purposes or in the service of nuclear technology, (iii) purposes of surveillance,
44
+ including any research or development relating to surveillance, (iv) biometric processing,
45
+ (v) in any manner that infringes, misappropriates, or otherwise violates any third-party rights,
46
+ or (vi) in any manner that violates any applicable law and violating any privacy or security laws,
47
+ rules, regulations, directives, or governmental requirements (including the General Data Privacy
48
+ Regulation (Regulation (EU) 2016/679), the California Consumer Privacy Act, and any and all laws
49
+ governing the processing of biometric information), as well as all amendments and successor laws
50
+ to any of the foregoing;
51
+
52
+ b. alter or remove copyright and other proprietary notices which appear on or in the Software Products;
53
+
54
+ c. utilize any equipment, device, software, or other means to circumvent or remove any security or
55
+ protection used by Stability AI in connection with the Software, or to circumvent or remove any
56
+ usage restrictions, or to enable functionality disabled by Stability AI; or
57
+
58
+ d. offer or impose any terms on the Software Products that alter, restrict, or are inconsistent
59
+ with the terms of this License.
60
+
61
+ e. 1) violate any applicable U.S. and non-U.S. export control and trade sanctions laws
62
+ (“Export Laws”); 2) directly or indirectly export, re-export, provide, or otherwise
63
+ transfer Software Products: (a) to any individual, entity, or country prohibited by Export Laws; (b)
64
+ to anyone on U.S. or non-U.S. government restricted parties lists; or (c) for any purpose prohibited
65
+ by Export Laws, including nuclear, chemical or biological weapons, or missile technology applications;
66
+ 3) use or download Software Products if you or they are: (a) located in a comprehensively sanctioned
67
+ jurisdiction, (b) currently listed on any U.S. or non-U.S. restricted parties list, or (c) for any
68
+ purpose prohibited by Export Laws; and (4) will not disguise your location through IP proxying or other methods.
69
+
70
+
71
+ 3. ATTRIBUTION
72
+
73
+ Together with any copies of the Software Products (as well as derivative works thereof or works
74
+ incorporating the Software Products) that you distribute, you must provide (i) a copy of this License,
75
+ and (ii) the following attribution notice: “DeepFloyd is licensed under the DeepFloyd License,
76
+ Copyright (c) Stability AI Ltd. All Rights Reserved.”
77
+
78
+
79
+ 4. DISCLAIMERS
80
+
81
+ THE SOFTWARE PRODUCTS ARE PROVIDED “AS IS” and “WITH ALL FAULTS” WITH NO WARRANTY OF ANY KIND,
82
+ EXPRESS OR IMPLIED. STABILITY AI EXPRESSLY DISCLAIMS ALL REPRESENTATIONS AND WARRANTIES, EXPRESS OR IMPLIED,
83
+ WHETHER BY STATUTE, CUSTOM, USAGE OR OTHERWISE AS TO ANY MATTERS RELATED TO THE SOFTWARE PRODUCTS,
84
+ INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE,
85
+ TITLE, SATISFACTORY QUALITY, OR NON-INFRINGEMENT. STABILITY AI MAKES NO WARRANTIES OR REPRESENTATIONS
86
+ THAT THE SOFTWARE PRODUCTS WILL BE ERROR FREE OR FREE OF VIRUSES OR OTHER HARMFUL COMPONENTS,
87
+ OR PRODUCE ANY PARTICULAR RESULTS.
88
+
89
+
90
+ 5. LIMITATION OF LIABILITY
91
+
92
+ TO THE FULLEST EXTENT PERMITTED BY LAW, IN NO EVENT WILL STABILITY AI BE LIABLE TO YOU (A) UNDER
93
+ ANY THEORY OF LIABILITY, WHETHER BASED IN CONTRACT, TORT, NEGLIGENCE, STRICT LIABILITY, WARRANTY,
94
+ OR OTHERWISE UNDER THIS LICENSE, OR (B) FOR ANY INDIRECT, CONSEQUENTIAL, EXEMPLARY, INCIDENTAL,
95
+ PUNITIVE OR SPECIAL DAMAGES OR LOST PROFITS, EVEN IF STABILITY AI HAS BEEN ADVISED OF THE POSSIBILITY
96
+ OF SUCH DAMAGES. THE SOFTWARE PRODUCTS, THEIR CONSTITUENT COMPONENTS, AND ANY OUTPUT
97
+ (COLLECTIVELY, “SOFTWARE MATERIALS”) ARE NOT DESIGNED OR INTENDED FOR USE IN ANY APPLICATION OR
98
+ SITUATION WHERE FAILURE OR FAULT OF THE SOFTWARE MATERIALS COULD REASONABLY BE ANTICIPATED TO LEAD
99
+ TO SERIOUS INJURY OF ANY PERSON, INCLUDING POTENTIAL DISCRIMINATION OR VIOLATION OF AN INDIVIDUAL’S
100
+ PRIVACY RIGHTS, OR TO SEVERE PHYSICAL, PROPERTY, OR ENVIRONMENTAL DAMAGE (EACH, A “HIGH-RISK USE”).
101
+ IF YOU ELECT TO USE ANY OF THE SOFTWARE MATERIALS FOR A HIGH-RISK USE, YOU DO SO AT YOUR OWN RISK.
102
+ YOU AGREE TO DESIGN AND IMPLEMENT APPROPRIATE DECISION-MAKING AND RISK-MITIGATION PROCEDURES AND
103
+ POLICIES IN CONNECTION WITH A HIGH-RISK USE SUCH THAT EVEN IF THERE IS A FAILURE OR FAULT IN ANY
104
+ OF THE SOFTWARE MATERIALS, THE SAFETY OF PERSONS OR PROPERTY AFFECTED BY THE ACTIVITY STAYS AT A LEVEL
105
+ THAT IS REASONABLE, APPROPRIATE, AND LAWFUL FOR THE FIELD OF THE HIGH-RISK USE.
106
+
107
+
108
+ 6. INDEMNIFICATION
109
+
110
+ You will indemnify, defend and hold harmless Stability AI and our subsidiaries and affiliates,
111
+ and each of our respective shareholders, directors, officers, employees, agents, successors,
112
+ and assigns (collectively, the “Stability AI Parties”) from and against any losses, liabilities,
113
+ damages, fines, penalties, and expenses (including reasonable attorneys’ fees) incurred by any
114
+ Stability AI Party in connection with any claim, demand, allegation, lawsuit, proceeding, or
115
+ investigation (collectively, “Claims”) arising out of or related to: (a) your access to or
116
+ use of the Software Products (as well as any results or data generated from such access or use),
117
+ including any High-Risk Use (defined below); (b) your violation of this License; or (c)
118
+ your violation, misappropriation or infringement of any rights of another (including intellectual
119
+ property or other proprietary rights and privacy rights). You will promptly notify the Stability AI
120
+ Parties of any such Claims, and cooperate with Stability AI Parties in defending such Claims.
121
+ You will also grant the Stability AI Parties sole control of the defense or settlement,
122
+ at Stability AI’s sole option, of any Claims. This indemnity is in addition to, and not in lieu of,
123
+ any other indemnities or remedies set forth in a written agreement between you and
124
+ Stability AI or the other Stability AI Parties.
125
+
126
+
127
+ 7. TERMINATION; SURVIVAL
128
+
129
+ a. This License will automatically terminate upon any breach by you of the terms of this License.
130
+
131
+ b. We may terminate this License, in whole or in part, at any time upon notice (including electronic) to you.
132
+
133
+ c. The following sections survive termination of this License: 2 (Restrictions), 3 (Attribution),
134
+ 4 (Disclaimers), 5 (Limitation on Liability), 6 (Indemnification) 7 (Termination; Survival),
135
+ 8 (Third Party Materials), 9 (Trademarks), 10 (Applicable Law; Dispute Resolution), and 11 (Miscellaneous).
136
+
137
+
138
+ 8. THIRD PARTY MATERIALS
139
+
140
+ The Software Products may contain third-party software or other components (including free and
141
+ open source software) (all of the foregoing, “Third Party Materials”), which are subject to
142
+ the license terms of the respective third-party licensors. Your dealings or correspondence
143
+ with third parties and your use of or interaction with any Third Party Materials are solely
144
+ between you and the third party. Stability AI does not control or endorse, and makes
145
+ no representations or warranties regarding, any Third Party Materials, and your access
146
+ to and use of such Third Party Materials are at your own risk.
147
+
148
+
149
+ 9. TRADEMARKS
150
+
151
+ Licensee has not been granted any trademark license as part of this License and may not use any name
152
+ or mark associated with Stability AI without the prior written permission of Stability AI, except to
153
+ the extent necessary to make the reference required by the “ATTRIBUTION” section of this Agreement.
154
+
155
+
156
+ 10. APPLICABLE LAW; DISPUTE RESOLUTION
157
+
158
+ This License will be governed and construed under the laws of the State of California without regard
159
+ to conflicts of law provisions. Any suit or proceeding arising out of or relating to this License
160
+ will be brought in the federal or state courts, as applicable, in San Mateo County, California,
161
+ and each party irrevocably submits to the jurisdiction and venue of such courts.
162
+
163
+
164
+ 11. MISCELLANEOUS
165
+
166
+ If any provision or part of a provision of this License is unlawful, void or unenforceable,
167
+ that provision or part of the provision is deemed severed from this License, and will not affect
168
+ the validity and enforceability of any remaining provisions. The failure of Stability AI to exercise
169
+ or enforce any right or provision of this License will not operate as a waiver of such right or provision.
170
+ This License does not confer any third-party beneficiary rights upon any other person or entity.
171
+ This License, together with the Documentation, contains the entire understanding between you and
172
+ Stability AI regarding the subject matter of this License, and supersedes all other written or
173
+ oral agreements and understandings between you and Stability AI regarding such subject matter.
174
+ No change or addition to any provision of this License will be binding unless it is in writing and
175
+ signed by an authorized representative of both you and Stability AI.
README.md CHANGED
@@ -1,11 +1,346 @@
1
- ---
2
- title: Ass
3
- emoji: 👁
4
- colorFrom: red
5
- colorTo: blue
6
- sdk: docker
7
- pinned: false
8
- license: deepfloyd-if-license
9
- ---
10
-
11
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
+ [![License](https://img.shields.io/badge/Code_License-Modified_MIT-blue.svg)](LICENSE)
2
+ [![License](https://img.shields.io/badge/Weights_License-DeepFloyd_IF-orange.svg)](LICENSE-MODEL)
3
+ [![Downloads](https://pepy.tech/badge/deepfloyd_if)](https://pepy.tech/project/deepfloyd_if)
4
+
5
+ # IF by DeepFloyd Lab at [StabilityAI](https://stability.ai/)
6
+
7
+ <p align="center">
8
+ <img src="./pics/nabla.jpg" width="100%">
9
+ </p>
10
+
11
+ We introduce DeepFloyd IF, a novel state-of-the-art open-source text-to-image model with a high degree of photorealism and language understanding. DeepFloyd IF is a modular model composed of a frozen text encoder and three cascaded pixel diffusion modules: a base model that generates a 64x64 px image from a text prompt, and two super-resolution models that upscale it to 256x256 px and then to 1024x1024 px. All stages of the model utilize a frozen text encoder based on the T5 transformer to extract text embeddings, which are then fed into a UNet architecture enhanced with cross-attention and attention pooling. The result is a highly efficient model that outperforms current state-of-the-art models, achieving a zero-shot FID score of 6.66 on the COCO dataset. Our work underscores the potential of larger UNet architectures in the first stage of cascaded diffusion models and depicts a promising future for text-to-image synthesis.
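The resolutions named in the paragraph above follow from a 64 px base and two x4 upscaling steps. The sketch below only illustrates that arithmetic; the function name `cascade_resolutions` is hypothetical and is not part of `deepfloyd_if`.

```python
def cascade_resolutions(base: int = 64, factor: int = 4, upscalers: int = 2) -> list[int]:
    """Return the square side length (in px) produced by each stage of the cascade."""
    sizes = [base]
    for _ in range(upscalers):
        # Each super-resolution stage multiplies the side length by `factor`.
        sizes.append(sizes[-1] * factor)
    return sizes


if __name__ == "__main__":
    for stage, side in enumerate(cascade_resolutions(), start=1):
        print(f"stage {stage}: {side}x{side} px")
```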
12
+
13
+ <p align="center">
14
+ <img src="./pics/deepfloyd_if_scheme.jpg" width="100%">
15
+ </p>
16
+
17
+ *Inspired by* [*Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding*](https://arxiv.org/pdf/2205.11487.pdf)
18
+
19
+ ## Minimum requirements to use all IF models:
20
+ - 16GB vRAM for IF-I-XL (4.3B text to 64x64 base module) & IF-II-L (1.2B to 256x256 upscaler module)
21
+ - 24GB vRAM for IF-I-XL (4.3B text to 64x64 base module) & IF-II-L (1.2B to 256x256 upscaler module) & Stable x4 (to 1024x1024 upscaler)
22
+ - `xformers` installed and the environment variable `FORCE_MEM_EFFICIENT_ATTN=1` set
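The requirements above can be turned into a small pre-flight check. This is a sketch under stated assumptions: `VRAM_TIERS` and `modules_for_vram` are hypothetical names, the thresholds are copied from the list above, and setting `FORCE_MEM_EFFICIENT_ATTN` before importing `deepfloyd_if` is assumed to be early enough (the source does not say when the variable is read).

```python
import os

# Thresholds copied from the requirements list (minimum GB of VRAM -> modules that fit).
VRAM_TIERS = [
    (24, ["IF-I-XL", "IF-II-L", "Stable x4"]),
    (16, ["IF-I-XL", "IF-II-L"]),
]


def modules_for_vram(vram_gb: float) -> list[str]:
    """Return the largest documented module set that fits into `vram_gb`."""
    for minimum_gb, modules in VRAM_TIERS:
        if vram_gb >= minimum_gb:
            return modules
    return []  # below the documented minimum


# Assumption: the flag is read when the library is imported, so set it first.
os.environ["FORCE_MEM_EFFICIENT_ATTN"] = "1"
```

For example, `modules_for_vram(16)` returns the base model plus the 256x256 upscaler, while a 24 GB card also fits the Stable x4 upscaler.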
23
+
24
+
25
+ ## Quick Start
26
+ [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/deepfloyd_if_free_tier_google_colab.ipynb)
27
+ [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/DeepFloyd/IF)
28
+
29
+ ```shell
+ pip install deepfloyd_if==1.0.1
+ pip install xformers==0.0.16
+ pip install git+https://github.com/openai/CLIP.git --no-deps
+ ```
34
+
35
+ ## Local notebooks
36
+ [![Jupyter Notebook](https://img.shields.io/badge/jupyter_notebook-%23FF7A01.svg?logo=jupyter&logoColor=white)](https://huggingface.co/DeepFloyd/IF-I-XL-v1.0/blob/main/notebooks/pipes-DeepFloyd-IF-v1.0.ipynb)
37
+ [![Kaggle](https://kaggle.com/static/images/open-in-kaggle.svg)](https://www.kaggle.com/code/shonenkov/deepfloyd-if-4-3b-generator-of-pictures)
38
+
39
+ The Dream, Style Transfer, Super Resolution and Inpainting modes are available in a Jupyter Notebook [here](https://huggingface.co/DeepFloyd/IF-I-XL-v1.0/blob/main/notebooks/pipes-DeepFloyd-IF-v1.0.ipynb).
40
+
41
+
42
+
43
+ ## Integration with 🤗 Diffusers
44
+
45
+ IF is also integrated with the 🤗 Hugging Face [Diffusers library](https://github.com/huggingface/diffusers/).
46
+
47
+ Diffusers runs each stage individually, allowing the user to customize the image generation process and to inspect intermediate results easily.
48
+
49
+ ### Example
50
+
51
+ Before you can use IF, you need to accept its usage conditions. To do so:
52
+ 1. Make sure to have a [Hugging Face account](https://huggingface.co/join) and be logged in
53
+ 2. Accept the license on the model card of [DeepFloyd/IF-I-XL-v1.0](https://huggingface.co/DeepFloyd/IF-I-XL-v1.0)
54
+ 3. Make sure to log in locally. Install `huggingface_hub`:
55
+ ```sh
+ pip install huggingface_hub --upgrade
+ ```
58
+
59
+ Then run the login function in a Python shell:
60
+
61
+ ```py
+ from huggingface_hub import login
+
+ login()
+ ```
66
+
67
+ and enter your [Hugging Face Hub access token](https://huggingface.co/docs/hub/security-tokens#what-are-user-access-tokens).
68
+
69
+ Next we install `diffusers` and dependencies:
70
+
71
+ ```sh
+ pip install diffusers accelerate transformers safetensors
+ ```
74
+
75
+ And we can now run the model locally.
76
+
77
+ By default `diffusers` makes use of [model cpu offloading](https://huggingface.co/docs/diffusers/optimization/fp16#model-offloading-for-fast-inference-and-memory-savings) to run the whole IF pipeline with as little as 14 GB of VRAM.
78
+
79
+ If you are using `torch>=2.0.0`, make sure to **remove all** `enable_xformers_memory_efficient_attention()`
80
+ calls.
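Rather than editing the calls by hand, the version check can be made explicit. The helpers below are a hypothetical sketch (`needs_xformers` and `maybe_enable_xformers` are not part of `diffusers`); they only rely on `enable_xformers_memory_efficient_attention()`, the pipeline method already mentioned above, and on the major version number in a string such as `torch.__version__`.

```python
def needs_xformers(torch_version: str) -> bool:
    """True when the given torch version string is older than 2.0.0."""
    major = int(torch_version.split(".")[0])
    return major < 2


def maybe_enable_xformers(pipeline, torch_version: str) -> bool:
    """Enable xformers attention on `pipeline` only for torch < 2.0.0.

    Returns True if the call was made, False if it was skipped.
    """
    if needs_xformers(torch_version):
        pipeline.enable_xformers_memory_efficient_attention()
        return True
    return False
```

A typical call would be `maybe_enable_xformers(stage_1, torch.__version__)` right after loading each stage.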
81
+
82
+ ```py
+ from diffusers import DiffusionPipeline
+ from diffusers.utils import pt_to_pil
+ import torch
+
+ # stage 1
+ stage_1 = DiffusionPipeline.from_pretrained("DeepFloyd/IF-I-XL-v1.0", variant="fp16", torch_dtype=torch.float16)
+ stage_1.enable_xformers_memory_efficient_attention()  # remove line if torch.__version__ >= 2.0.0
+ stage_1.enable_model_cpu_offload()
+
+ # stage 2
+ stage_2 = DiffusionPipeline.from_pretrained(
+     "DeepFloyd/IF-II-L-v1.0", text_encoder=None, variant="fp16", torch_dtype=torch.float16
+ )
+ stage_2.enable_xformers_memory_efficient_attention()  # remove line if torch.__version__ >= 2.0.0
+ stage_2.enable_model_cpu_offload()
+
+ # stage 3
+ safety_modules = {"feature_extractor": stage_1.feature_extractor, "safety_checker": stage_1.safety_checker, "watermarker": stage_1.watermarker}
+ stage_3 = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-x4-upscaler", **safety_modules, torch_dtype=torch.float16)
+ stage_3.enable_xformers_memory_efficient_attention()  # remove line if torch.__version__ >= 2.0.0
+ stage_3.enable_model_cpu_offload()
+
+ prompt = 'a photo of a kangaroo wearing an orange hoodie and blue sunglasses standing in front of the eiffel tower holding a sign that says "very deep learning"'
+
+ # text embeds
+ prompt_embeds, negative_embeds = stage_1.encode_prompt(prompt)
+
+ generator = torch.manual_seed(0)
+
+ # stage 1
+ image = stage_1(prompt_embeds=prompt_embeds, negative_prompt_embeds=negative_embeds, generator=generator, output_type="pt").images
+ pt_to_pil(image)[0].save("./if_stage_I.png")
+
+ # stage 2
+ image = stage_2(
+     image=image, prompt_embeds=prompt_embeds, negative_prompt_embeds=negative_embeds, generator=generator, output_type="pt"
+ ).images
+ pt_to_pil(image)[0].save("./if_stage_II.png")
+
+ # stage 3
+ image = stage_3(prompt=prompt, image=image, generator=generator, noise_level=100).images
+ image[0].save("./if_stage_III.png")
+ ```
126
+
127
+ There are multiple ways to speed up the inference time and lower the memory consumption even more with `diffusers`. To do so, please have a look at the Diffusers docs:
128
+
129
+ - 🚀 [Optimizing for inference time](https://huggingface.co/docs/diffusers/api/pipelines/if#optimizing-for-speed)
130
+ - ⚙️ [Optimizing for low memory during inference](https://huggingface.co/docs/diffusers/api/pipelines/if#optimizing-for-memory)
131
+
132
+ For more detailed information about how to use IF, please have a look at [the IF blog post](https://huggingface.co/blog/if) and [the documentation](https://huggingface.co/docs/diffusers/main/en/api/pipelines/if) 📖.
133
+
134
+ ## Run the code locally
135
+
136
+ ### Loading the models into VRAM
137
+
138
+ ```python
+ from deepfloyd_if.modules import IFStageI, IFStageII, StableStageIII
+ from deepfloyd_if.modules.t5 import T5Embedder
+
+ device = 'cuda:0'
+ if_I = IFStageI('IF-I-XL-v1.0', device=device)
+ if_II = IFStageII('IF-II-L-v1.0', device=device)
+ if_III = StableStageIII('stable-diffusion-x4-upscaler', device=device)
+ t5 = T5Embedder(device="cpu")
+ ```
148
+
149
+ ### I. Dream
150
+ Dream is the text-to-image mode of the IF model.
151
+
152
+ ```python
+ from deepfloyd_if.pipelines import dream
+
+ prompt = 'ultra close-up color photo portrait of rainbow owl with deer horns in the woods'
+ count = 4
+
+ result = dream(
+     t5=t5, if_I=if_I, if_II=if_II, if_III=if_III,
+     prompt=[prompt]*count,
+     seed=42,
+     if_I_kwargs={
+         "guidance_scale": 7.0,
+         "sample_timestep_respacing": "smart100",
+     },
+     if_II_kwargs={
+         "guidance_scale": 4.0,
+         "sample_timestep_respacing": "smart50",
+     },
+     if_III_kwargs={
+         "guidance_scale": 9.0,
+         "noise_level": 20,
+         "sample_timestep_respacing": "75",
+     },
+ )
+
+ if_III.show(result['III'], size=14)
+ ```
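The `guidance_scale` values above control how strongly sampling is pushed toward the prompt. The sketch below shows the commonly used classifier-free guidance combination on plain Python lists. It is an assumption that IF combines the conditional and unconditional predictions in exactly this way, and `apply_guidance` is a hypothetical name, not a `deepfloyd_if` function.

```python
def apply_guidance(uncond: list[float], cond: list[float], guidance_scale: float) -> list[float]:
    """Classifier-free guidance: move from the unconditional prediction toward
    the conditional one, scaled by `guidance_scale`."""
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]


if __name__ == "__main__":
    # With scale 1.0 the result is the conditional prediction itself;
    # larger scales exaggerate the difference between the two predictions.
    print(apply_guidance([0.0, 1.0], [1.0, 1.0], 1.0))
    print(apply_guidance([0.0, 1.0], [1.0, 1.0], 7.0))
```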
179
+ ![](./pics/dream-III.jpg)
180
+
181
+ ### II. Zero-shot Image-to-Image Translation
182
+
183
+ ![](./pics/img_to_img_scheme.jpeg)
184
+
185
+ In Style Transfer mode, the output of your prompt comes out in the style of the `support_pil_img`:
186
+ ```python
+ from deepfloyd_if.pipelines import style_transfer
+
+ # raw_pil_image is a PIL image loaded beforehand; it is not defined in this snippet
+ result = style_transfer(
+     t5=t5, if_I=if_I, if_II=if_II,
+     support_pil_img=raw_pil_image,
+     style_prompt=[
+         'in style of professional origami',
+         'in style of oil art, Tate modern',
+         'in style of plastic building bricks',
+         'in style of classic anime from 1990',
+     ],
+     seed=42,
+     if_I_kwargs={
+         "guidance_scale": 10.0,
+         "sample_timestep_respacing": "10,10,10,10,10,10,10,10,0,0",
+         "support_noise_less_qsample_steps": 5,
+     },
+     if_II_kwargs={
+         "guidance_scale": 4.0,
+         "sample_timestep_respacing": "smart50",
+         "support_noise_less_qsample_steps": 5,
+     },
+ )
+ if_I.show(result['II'], 1, 20)
+ ```
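The comma-separated `sample_timestep_respacing` string used for stage I above can be read as "how many steps to sample from each equal slice of the schedule". The sketch below assumes the convention known from OpenAI's improved-diffusion code and a 1000-step training schedule; both are assumptions, and this standalone `space_timesteps` is a simplified illustration, not the library's implementation. Named presets such as `smart50` are not handled.

```python
def space_timesteps(num_timesteps: int, section_counts: str) -> list[int]:
    """Pick evenly spaced timesteps from each equal-sized section of the schedule.

    `section_counts` is a comma-separated string; entry i is the number of
    steps to keep from the i-th section (0 skips the section entirely).
    """
    counts = [int(x) for x in section_counts.split(",")]
    size_per, extra = divmod(num_timesteps, len(counts))
    steps: list[int] = []
    start = 0
    for i, count in enumerate(counts):
        size = size_per + (1 if i < extra else 0)
        if count > size:
            raise ValueError(f"cannot take {count} steps from a section of {size}")
        # Spread `count` steps so the first and last land on the section edges.
        stride = 1.0 if count <= 1 else (size - 1) / (count - 1)
        steps.extend(start + round(j * stride) for j in range(count))
        start += size
    return sorted(set(steps))


if __name__ == "__main__":
    steps = space_timesteps(1000, "10,10,10,10,10,10,10,10,0,0")
    print(len(steps), min(steps), max(steps))
```

Under these assumptions, the two trailing zeros mean the highest-numbered (noisiest) fifth of the schedule is never sampled, which fits a mode that starts from a partially noised support image rather than from pure noise.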
212
+
213
+ ![Alternative Text](./pics/deep_floyd_if_image_2_image.gif)
214
+
215
+
216
+ ### III. Super Resolution
217
+ For super-resolution, users can run `IF-II` and `IF-III` or 'Stable x4' on an image that was not necessarily generated by IF (two cascades):
218
+
219
+ ```python
+ from deepfloyd_if.pipelines import super_resolution
+
+ # first cascade: 64 px -> 256 px with IF-II (passed through the `if_III` argument)
+ middle_res = super_resolution(
+     t5,
+     if_III=if_II,
+     prompt=['woman with a blue headscarf and a blue sweater, detailed picture, 4k dslr, best quality'],
+     support_pil_img=raw_pil_image,
+     img_scale=4.,
+     img_size=64,
+     if_III_kwargs={
+         'sample_timestep_respacing': 'smart100',
+         'aug_level': 0.5,
+         'guidance_scale': 6.0,
+     },
+ )
+ # second cascade: 256 px -> 1024 px with the Stable x4 upscaler
+ high_res = super_resolution(
+     t5,
+     if_III=if_III,
+     prompt=[''],
+     support_pil_img=middle_res['III'][0],
+     img_scale=4.,
+     img_size=256,
+     if_III_kwargs={
+         "guidance_scale": 9.0,
+         "noise_level": 20,
+         "sample_timestep_respacing": "75",
+     },
+ )
+ # show_superres is a display helper that is not defined in this snippet
+ show_superres(raw_pil_image, high_res['III'][0])
+ ```
250
+
251
+ ![](./pics/if_as_upscaler.jpg)
252
+
253
+
254
+ ### IV. Zero-shot Inpainting
255
+
256
+ ```python
+ from deepfloyd_if.pipelines import inpainting
+
+ result = inpainting(
+     t5=t5, if_I=if_I,
+     if_II=if_II,
+     if_III=if_III,
+     support_pil_img=raw_pil_image,
+     inpainting_mask=inpainting_mask,
+     prompt=[
+         'oil art, a man in a hat',
+     ],
+     seed=42,
+     if_I_kwargs={
+         "guidance_scale": 7.0,
+         "sample_timestep_respacing": "10,10,10,10,10,0,0,0,0,0",
+         "support_noise_less_qsample_steps": 0,
+     },
+     if_II_kwargs={
+         "guidance_scale": 4.0,
+         "aug_level": 0.0,
+         "sample_timestep_respacing": "100",
+     },
+     if_III_kwargs={
+         "guidance_scale": 9.0,
+         "noise_level": 20,
+         "sample_timestep_respacing": "75",
+     },
+ )
+ if_I.show(result['I'], 2, 3)
+ if_I.show(result['II'], 2, 6)
+ if_I.show(result['III'], 2, 14)
+ ```
289
+ ![](./pics/deep_floyd_if_inpainting.gif)
290
+
291
+ ### 🤗 Model Zoo 🤗
292
+ Links to download the weights, along with the model cards, will be available soon for each model in the model zoo.
293
+
294
+ #### Original
295
+
296
+ | Name | Cascade | Params | FID | Batch size | Steps |
+ |:----------------------------------------------------------|:-------:|:------:|:----:|:----------:|:-----:|
+ | [IF-I-M](https://huggingface.co/DeepFloyd/IF-I-M-v1.0) | I | 400M | 8.86 | 3072 | 2.5M |
+ | [IF-I-L](https://huggingface.co/DeepFloyd/IF-I-L-v1.0) | I | 900M | 8.06 | 3200 | 3.0M |
+ | [IF-I-XL](https://huggingface.co/DeepFloyd/IF-I-XL-v1.0)* | I | 4.3B | 6.66 | 3072 | 2.42M |
+ | [IF-II-M](https://huggingface.co/DeepFloyd/IF-II-M-v1.0) | II | 450M | - | 1536 | 2.5M |
+ | [IF-II-L](https://huggingface.co/DeepFloyd/IF-II-L-v1.0)* | II | 1.2B | - | 1536 | 2.5M |
+ | IF-III-L* _(soon)_ | III | 700M | - | 3072 | 1.25M |
304
+
305
+ *best modules
306
+
307
+ ### Quantitative Evaluation
308
+
309
+ `FID = 6.66`
310
+
311
+ ![](./pics/fid30k_if.jpg)
312
+
313
+ ## License
314
+
315
+ The code in this repository is released under a bespoke license (see the added [point two](https://github.com/deep-floyd/IF/blob/main/LICENSE#L13)).
316
+
317
+ The weights will be available soon via [the DeepFloyd organization at Hugging Face](https://huggingface.co/DeepFloyd) and have their own LICENSE.
318
+
319
+ **Disclaimer:** *The initial release of the IF model is under a restricted research-purposes-only license temporarily to gather feedback, and after that we intend to release a fully open-source model in line with other Stability AI models.*
320
+
321
+ ## Limitations and Biases
322
+
323
+ The models available in this codebase have known limitations and biases. Please refer to [the model card](https://huggingface.co/DeepFloyd/IF-I-L-v1.0) for more information.
324
+
325
+
326
+ ## 🎓 DeepFloyd IF creators:
327
+ - [Alex Shonenkov](https://github.com/shonenkov)
328
+ - [Misha Konstantinov](https://github.com/zeroshot-ai)
329
+ - [Daria Bakshandaeva](https://github.com/Gugutse)
330
+ - [Christoph Schuhmann](https://github.com/christophschuhmann)
331
+ - [Ksenia Ivanova](https://github.com/ivksu)
332
+ - [Nadiia Klokova](https://github.com/vauimpuls)
333
+
334
+ ## 📄 Research Paper (Soon)
335
+
336
+ ## Acknowledgements
337
+
338
+ Special thanks to [StabilityAI](http://stability.ai) and its CEO [Emad Mostaque](https://twitter.com/emostaque) for invaluable support, providing GPU compute and infrastructure to train the models (our gratitude goes to [Richard Vencu](https://github.com/rvencu)); thanks to [LAION](https://laion.ai) and [Christoph Schuhmann](https://github.com/christophschuhmann) in particular for contribution to the project and well-prepared datasets; thanks to [Huggingface](https://huggingface.co) teams for optimizing models' speed and memory consumption during inference, creating demos and giving cool advice!
339
+
340
+ ## 🚀 External Contributors 🚀
341
+ - The Biggest Thanks [@Apolinário](https://github.com/apolinario), for ideas, consultations, help and support on all stages to make IF available in open-source; for writing a lot of documentation and instructions; for creating a friendly atmosphere in difficult moments 🦉;
342
+ - Thanks, [@patrickvonplaten](https://github.com/patrickvonplaten), for improving loading time of unet models by 80%;
343
+ for integrating Stable-Diffusion-x4 as a native pipeline 💪;
344
+ - Thanks, [@williamberman](https://github.com/williamberman) and [@patrickvonplaten](https://github.com/patrickvonplaten) for diffusers integration 🙌;
345
+ - Thanks, [@hysts](https://github.com/hysts) and [@Apolinário](https://github.com/apolinario) for creating [the best gradio demo with IF](https://huggingface.co/spaces/DeepFloyd/IF) 🚀;
346
+ - Thanks, [@Dango233](https://github.com/Dango233), for adapting IF with xformers memory efficient attention 💪;
requirements-dev.txt ADDED
@@ -0,0 +1,2 @@
1
+ -r requirements-test.txt
+ pre-commit
requirements-test.txt ADDED
@@ -0,0 +1,3 @@
1
+ -r requirements.txt
+ pytest
+ pytest-cov
requirements.txt ADDED
@@ -0,0 +1,15 @@
1
+ tqdm
+ numpy
+ torch<2.0.0
+ torchvision
+ omegaconf
+ matplotlib
+ Pillow>=9.2.0
+ huggingface_hub>=0.13.2
+ transformers~=4.25.1
+ accelerate~=0.15.0
+ diffusers~=0.16.0
+ tokenizers~=0.13.2
+ sentencepiece~=0.1.97
+ ftfy~=6.1.1
+ beautifulsoup4~=4.11.1
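Several pins above use the `~=` "compatible release" operator. As a rough illustration of what such a pin allows, the sketch below expands it into the equivalent pair of clauses described by PEP 440. `expand_compatible_release` is a hypothetical helper written for this note; it handles only plain `name~=X.Y[.Z...]` lines, not extras, markers or pre-release versions.

```python
def expand_compatible_release(requirement: str) -> str:
    """Rewrite `name~=X.Y.Z` as the equivalent `name>=X.Y.Z,==X.Y.*`."""
    name, version = (part.strip() for part in requirement.split("~="))
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError("~= needs at least two version components")
    # Drop the last component and allow anything in its place.
    prefix = ".".join(parts[:-1])
    return f"{name}>={version},=={prefix}.*"


if __name__ == "__main__":
    for line in ["transformers~=4.25.1", "diffusers~=0.16.0"]:
        print(expand_compatible_release(line))
```

So `transformers~=4.25.1` accepts later 4.25.x patch releases but not 4.26.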
setup.cfg ADDED
@@ -0,0 +1,13 @@
1
+ [pep8]
+ max-line-length = 120
+ exclude = .tox,*migrations*,.json
+
+ [flake8]
+ max-line-length = 120
+ exclude = .tox,*migrations*,.json
+
+ [autopep8-wrapper]
+ exclude = .tox,*migrations*,.json
+
+ [check-docstring-first]
+ exclude = .tox,*migrations*,.json
setup.py ADDED
@@ -0,0 +1,57 @@
1
+ # -*- coding: utf-8 -*-
+ import os
+ import re
+ from setuptools import setup
+
+
+ def read(filename):
+     # Read a file that sits next to setup.py.
+     with open(os.path.join(os.path.dirname(__file__), filename)) as f:
+         file_content = f.read()
+     return file_content
+
+
+ def get_requirements():
+     # Build install_requires; VCS links with an #egg= fragment become `name==version` pins.
+     requirements = []
+     for requirement in read('requirements.txt').splitlines():
+         if requirement.startswith('git+') or requirement.startswith('svn+') or requirement.startswith('hg+'):
+             parsed_requires = re.findall(r'#egg=([\w\d\.]+)-([\d\.]+)$', requirement)
+             if parsed_requires:
+                 package, version = parsed_requires[0]
+                 requirements.append(f'{package}=={version}')
+             else:
+                 print('WARNING! For correct matching dependency links need to specify package name and version '
+                       'such as <dependency url>#egg=<package_name>-<version>')
+         else:
+             requirements.append(requirement)
+     return requirements
+
+
+ def get_links():
+     # Collect the VCS links themselves for dependency_links.
+     return [
+         requirement for requirement in read('requirements.txt').splitlines()
+         if requirement.startswith('git+') or requirement.startswith('svn+') or requirement.startswith('hg+')
+     ]
+
+
+ def get_version():
+     """ Get version from the package without actually importing it. """
+     init = read('deepfloyd_if/__init__.py')
+     for line in init.split('\n'):
+         if line.startswith('__version__'):
+             return eval(line.split('=')[1])
+
+
+ setup(
+     name='deepfloyd_if',
+     version=get_version(),
+     author='DeepFloyd, StabilityAI',
+     author_email='[email protected]',
+     description='DeepFloyd-IF (Imagen Free)',
+     packages=['deepfloyd_if', 'deepfloyd_if/model', 'deepfloyd_if/modules', 'deepfloyd_if/pipelines',
+               'deepfloyd_if/resources'],
+     package_data={'deepfloyd_if/resources': ['*.png', '*.npy', '*.npz']},
+     install_requires=get_requirements(),
+     dependency_links=get_links(),
+     long_description=read('README.md'),
+     long_description_content_type='text/markdown',
+ )
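The two parsing steps in `setup.py` can be tried in isolation. The sketch below reuses the same `#egg=` regular expression and shows one possible alternative to `eval` for reading `__version__`, namely `ast.literal_eval`. The helper names `pin_from_vcs_url` and `parse_version` and the example URL are hypothetical and are not part of the package.

```python
import ast
import re
from typing import Optional

# Same pattern as in get_requirements(): "<name>-<version>" at the end of the URL.
EGG_PATTERN = re.compile(r'#egg=([\w\d\.]+)-([\d\.]+)$')


def pin_from_vcs_url(url: str) -> Optional[str]:
    """Turn '<vcs url>#egg=<name>-<version>' into 'name==version', or None if no egg fragment."""
    match = EGG_PATTERN.search(url)
    if match is None:
        return None
    package, version = match.groups()
    return f"{package}=={version}"


def parse_version(init_source: str) -> Optional[str]:
    """Read __version__ from module source text without importing or eval-ing it."""
    for line in init_source.splitlines():
        if line.startswith("__version__"):
            # literal_eval only accepts Python literals, unlike eval.
            return ast.literal_eval(line.split("=", 1)[1].strip())
    return None


if __name__ == "__main__":
    print(pin_from_vcs_url("git+https://example.com/repo.git#egg=mypkg-1.2.3"))
    print(parse_version("__version__ = '1.0.1'\n"))
```

A URL without an `#egg=` fragment yields `None`, which corresponds to the warning branch in `get_requirements()`.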