---
language:
- en
license: apache-2.0
library_name: diffusers
pipeline_tag: text-to-image
tags:
- text-to-image
- image-generation
- shuttle
---

# Shuttle 3 Diffusion

## Model Variants
These variants provide different precision levels and formats, optimized for a range of hardware capabilities and use cases:
- [bfloat16](https://huggingface.co/shuttleai/shuttle-3-diffusion)
- [GGUF](https://huggingface.co/shuttleai/shuttle-3-diffusion-GGUF) (a loading sketch appears in the Diffusers section below)
- [fp8](https://huggingface.co/shuttleai/shuttle-3-diffusion-fp8)

Shuttle 3 Diffusion is a text-to-image model designed to create detailed, diverse images from textual prompts in just four steps, offering strong image quality, typography, complex-prompt understanding, and resource efficiency.

![image/png](https://huggingface.co/shuttleai/shuttle-3-diffusion/resolve/main/demo.png)

You can try the model on the web at https://chat.shuttleai.com/images

## Using the model via API
You can use Shuttle 3 Diffusion via API through ShuttleAI:
- [ShuttleAI](https://shuttleai.com/)
- [ShuttleAI Docs](https://docs.shuttleai.com/)

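For a quick orientation, here is a minimal sketch of an API call. The endpoint path, model identifier, and response shape below are assumptions, not confirmed details; consult the [ShuttleAI Docs](https://docs.shuttleai.com/) for the actual interface.

```python
import os

import requests

# Hypothetical sketch: the endpoint path, payload fields, and response shape
# are assumptions; check the ShuttleAI docs for the real API.
resp = requests.post(
    "https://api.shuttleai.com/v1/images/generations",  # assumed endpoint
    headers={"Authorization": f"Bearer {os.environ['SHUTTLEAI_API_KEY']}"},
    json={
        "model": "shuttle-3-diffusion",  # assumed model identifier
        "prompt": "A cat holding a sign that says hello world",
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json())  # inspect for an image URL or base64 payload
```
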
## Using the model with 🧨 Diffusers
Install or upgrade diffusers:
```shell
pip install -U diffusers
# Flux-based pipelines typically also need these companion packages:
pip install -U transformers accelerate sentencepiece protobuf
```
Then you can use `DiffusionPipeline` to run the model:
```python
import torch
from diffusers import DiffusionPipeline

# Load the diffusion pipeline from the pretrained model, using bfloat16 weights.
pipe = DiffusionPipeline.from_pretrained(
    "shuttleai/shuttle-3-diffusion", torch_dtype=torch.bfloat16
).to("cuda")

# Uncomment the following line to save VRAM by offloading the model to CPU if needed.
# pipe.enable_model_cpu_offload()

# Uncomment the lines below to enable torch.compile for potential performance boosts on compatible GPUs.
# Note that this can increase loading times considerably.
# pipe.transformer.to(memory_format=torch.channels_last)
# pipe.transformer = torch.compile(
#     pipe.transformer, mode="max-autotune", fullgraph=True
# )

# Set your prompt for image generation.
prompt = "A cat holding a sign that says hello world"

# Generate the image using the diffusion pipeline.
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=4,
    max_sequence_length=256,
    # Uncomment the line below to use a manual seed for reproducible results.
    # generator=torch.Generator("cpu").manual_seed(0),
).images[0]

# Save the generated image.
image.save("shuttle.png")
```
To learn more, check out the [diffusers](https://huggingface.co/docs/diffusers/main/en/api/pipelines/flux) documentation.

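If you are using the [GGUF](https://huggingface.co/shuttleai/shuttle-3-diffusion-GGUF) variant, the minimal sketch below shows one way to load it, assuming a recent diffusers release with GGUF support and the `gguf` package installed (`pip install -U gguf`). The exact `.gguf` filename is illustrative; substitute the quantization file you actually downloaded.

```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

# The filename below is illustrative; pick the quantization level you
# actually downloaded from the shuttle-3-diffusion-GGUF repository.
ckpt_path = "https://huggingface.co/shuttleai/shuttle-3-diffusion-GGUF/blob/main/shuttle-3-diffusion-Q8_0.gguf"

# Load the GGUF-quantized transformer, dequantizing to bfloat16 for compute.
transformer = FluxTransformer2DModel.from_single_file(
    ckpt_path,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)

# Shuttle 3 Diffusion is Flux-based, so the Flux pipeline classes apply;
# the remaining components come from the main bfloat16 repository.
pipe = FluxPipeline.from_pretrained(
    "shuttleai/shuttle-3-diffusion",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()

image = pipe(
    "A cat holding a sign that says hello world",
    guidance_scale=3.5,
    num_inference_steps=4,
    max_sequence_length=256,
).images[0]
image.save("shuttle-gguf.png")
```
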
## Using the model with ComfyUI

To run local inference with Shuttle 3 Diffusion using [ComfyUI](https://github.com/comfyanonymous/ComfyUI), you can use this [safetensors file](https://huggingface.co/shuttleai/shuttle-3-diffusion/blob/main/shuttle-3-diffusion.safetensors).

## Comparison to other models
Shuttle 3 Diffusion can produce better images than Flux Dev in just four steps, while being licensed under Apache 2.0.
![image/png](https://huggingface.co/shuttleai/shuttle-3-diffusion/resolve/main/comparison.png)
[More examples](https://docs.shuttleai.com/getting-started/shuttle-diffusion)

## Training Details
Shuttle 3 Diffusion uses Flux.1 Schnell as its base. It can produce images similar to Flux Dev or Pro in just four steps, and it is licensed under Apache 2.0. The model was partially de-distilled during training. When used beyond 10 steps, it enters "refiner mode," enhancing image details without altering the composition. We overcame the limitations of the Schnell-series models by employing a special training method, resulting in improved details and colors.
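
To see the refiner-mode behavior described above, here is a minimal sketch that assumes the `pipe` and `prompt` objects from the Diffusers example are already loaded: running the same prompt and seed past 10 steps should add detail while preserving composition, per the description above.

```python
# Minimal sketch of "refiner mode": reuse `pipe` and `prompt` from the
# Diffusers example above, fixing the seed so only the step count changes.
import torch

seed = 0
base = pipe(
    prompt,
    num_inference_steps=4,  # standard fast generation
    guidance_scale=3.5,
    generator=torch.Generator("cpu").manual_seed(seed),
).images[0]

refined = pipe(
    prompt,
    num_inference_steps=12,  # beyond 10 steps: refiner mode kicks in
    guidance_scale=3.5,
    generator=torch.Generator("cpu").manual_seed(seed),
).images[0]

base.save("shuttle-4-steps.png")
refined.save("shuttle-12-steps.png")
```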