adipanda committed on
Commit 16f46b3 · verified · 1 Parent(s): ebf5efa

Model card auto-generated by SimpleTuner

Files changed (1)
  1. README.md +151 -0
README.md ADDED
@@ -0,0 +1,151 @@
1
+ ---
2
+ license: other
3
+ base_model: "black-forest-labs/FLUX.1-dev"
4
+ tags:
5
+ - flux
6
+ - flux-diffusers
7
+ - text-to-image
8
+ - diffusers
9
+ - simpletuner
10
+ - safe-for-work
11
+ - lora
12
+ - template:sd-lora
13
+ - standard
14
+ inference: true
15
+ widget:
16
+ - text: 'unconditional (blank prompt)'
17
+   parameters:
18
+     negative_prompt: 'blurry, cropped, ugly'
19
+   output:
20
+     url: ./assets/image_0_0.png
21
+ - text: 'A scene from Chainsaw Man. Makima holding a sign that says ''I LOVE PROMPTS!'', she is standing full body on a beach at sunset. She is wearing her white button-up shirt, black tie, and black trousers. The setting sun casts a dynamic shadow on her composed and enigmatic expression.'
22
+   parameters:
23
+     negative_prompt: 'blurry, cropped, ugly'
24
+   output:
25
+     url: ./assets/image_1_0.png
26
+ - text: 'A scene from Chainsaw Man. Makima jumping out of a propeller airplane, sky diving. Her expression remains calm and controlled, her red hair flowing in the wind. The sky is clear and blue, with birds flying in the distance.'
27
+   parameters:
28
+     negative_prompt: 'blurry, cropped, ugly'
29
+   output:
30
+     url: ./assets/image_2_0.png
31
+ - text: 'A scene from Chainsaw Man. Makima spinning a basketball on her finger on a basketball court. She is wearing a Lakers jersey with the #12 on it. The basketball hoop and cheering crowd are in the background. She has a composed and confident smile.'
32
+   parameters:
33
+     negative_prompt: 'blurry, cropped, ugly'
34
+   output:
35
+     url: ./assets/image_3_0.png
36
+ - text: 'A scene from Chainsaw Man. Makima is wearing a professional suit in an office, shaking the hand of a businesswoman. The woman has purple hair and is wearing formal attire. There is a Google logo in the background. It is during daytime, and the overall sentiment is one of achievement and authority.'
37
+   parameters:
38
+     negative_prompt: 'blurry, cropped, ugly'
39
+   output:
40
+     url: ./assets/image_4_0.png
41
+ - text: 'A scene from Chainsaw Man. Makima is fighting a large brown grizzly bear, deep in a forest. The bear is tall and standing on two legs, roaring. The bear is also wearing a crown because it is the king of all bears. Around them are tall trees and other animals watching intently.'
42
+   parameters:
43
+     negative_prompt: 'blurry, cropped, ugly'
44
+   output:
45
+     url: ./assets/image_5_0.png
46
+ ---
47
+
48
+ # makima-standard-lora-1
49
+
50
+ This is a standard PEFT LoRA derived from [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev).
51
+
52
+
53
+ No validation prompt was used during training.
54
+
55
+
56
+
57
+
58
+
59
+ ## Validation settings
60
+ - CFG: `3.5`
61
+ - CFG Rescale: `0.0`
62
+ - Steps: `20`
63
+ - Sampler: `FlowMatchEulerDiscreteScheduler`
64
+ - Seed: `42`
65
+ - Resolution: `1024x1024`
66
+ - Skip-layer guidance:
67
+
68
+ Note: The validation settings are not necessarily the same as the [training settings](#training-settings).
69
+
70
+ You can find some example images in the following gallery:
71
+
72
+
73
+ <Gallery />
74
+
75
+ The text encoder **was not** trained.
76
+ You may reuse the base model text encoder for inference.
77
+
78
+
79
+ ## Training settings
80
+
81
+ - Training epochs: 8
82
+ - Training steps: 100
83
+ - Learning rate: 0.0003
84
+ - Learning rate schedule: constant
85
+ - Warmup steps: 100
86
+ - Max grad norm: 2.0
87
+ - Effective batch size: 56
88
+ - Micro-batch size: 56
89
+ - Gradient accumulation steps: 1
90
+ - Number of GPUs: 1
91
+ - Gradient checkpointing: True
92
+ - Prediction type: flow-matching (extra parameters=['shift=3', 'flux_guidance_mode=constant', 'flux_guidance_value=1.0', 'flow_matching_loss=compatible', 'flux_lora_target=all'])
93
+ - Optimizer: adamw_bf16
94
+ - Trainable parameter precision: Pure BF16
95
+ - Caption dropout probability: 0.0%
96
+
97
+
98
+ - LoRA Rank: 128
99
+ - LoRA Alpha: None
100
+ - LoRA Dropout: 0.1
101
+ - LoRA initialisation style: default
102
+
103
+
104
+ ## Datasets
105
+
106
+ ### makima-512
107
+ - Repeats: 2
108
+ - Total number of images: 172
109
+ - Total number of aspect buckets: 1
110
+ - Resolution: 0.262144 megapixels
111
+ - Cropped: False
112
+ - Crop style: None
113
+ - Crop aspect: None
114
+ - Used for regularisation data: No
115
+
116
+
117
+ ## Inference
118
+
119
+
120
+ ```python
121
+ import torch
122
+ from diffusers import DiffusionPipeline
123
+
124
+ model_id = 'black-forest-labs/FLUX.1-dev'
125
+ adapter_id = 'adipanda/makima-standard-lora-1'
126
+ pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) # loading directly in bf16
127
+ pipeline.load_lora_weights(adapter_id)
128
+
129
+ prompt = "An astronaut is riding a horse through the jungles of Thailand."
130
+
131
+
132
+ # Optional: quantise the transformer to int8 to reduce VRAM usage (requires the optimum-quanto package).
133
+ # Note: the model was quantised during training, so quantising at inference time as well is recommended.
134
+ from optimum.quanto import quantize, freeze, qint8
135
+ quantize(pipeline.transformer, weights=qint8)
136
+ freeze(pipeline.transformer)
137
+ device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu' # pick the best available device once
138
+ pipeline.to(device) # the pipeline is already in its target precision level
139
+ image = pipeline(
140
+     prompt=prompt,
141
+     num_inference_steps=20,
142
+     generator=torch.Generator(device=device).manual_seed(42),
143
+     width=1024,
144
+     height=1024,
145
+     guidance_scale=3.5,
146
+ ).images[0]
147
+ image.save("output.png", format="PNG")
148
+ ```
149
+
150
+
151
+
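
The batch-size and resolution figures reported in the training settings and dataset sections of the README above can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the commit. It uses the conventional definition of effective batch size (micro-batch size × gradient accumulation steps × number of GPUs) and assumes that the "512" in the dataset name `makima-512` refers to a square 512-pixel edge length, which the card does not state explicitly. The variable names are hypothetical.

```python
# Figures copied from the "Training settings" section of the model card.
micro_batch_size = 56
gradient_accumulation_steps = 1
num_gpus = 1

# Conventional definition: samples consumed per optimizer step.
effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus

# Assumption: "makima-512" means square 512 x 512 pixel training images.
edge_px = 512
megapixels = edge_px * edge_px / 1_000_000

print(effective_batch_size)  # 56, matching "Effective batch size: 56"
print(megapixels)            # 0.262144, matching "Resolution: 0.262144 megapixels"
```

Under these assumptions both values agree with the card, which suggests the model was trained on roughly 512×512 images even though the validation images were rendered at 1024x1024.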