Trained for 0 epochs and 1000 steps.

Trained with datasets ['text-embeds-pixart-filter', 'photo-concept-bucket', 'midjourney-v6-520k-raw', 'sfwbooru', 'nijijourney-v6-520k-raw', 'dalle3']
Learning rate 1e-06, batch size 24, and 1 gradient accumulation step.
Used DDPM noise scheduler for training with epsilon prediction type and rescaled_betas_zero_snr=False
Using 'trailing' timestep spacing.
Base model: terminusresearch/pixart-900m-1024-ft-v0.6
VAE: madebyollin/sdxl-vae-fp16-fix
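
The 'trailing' spacing named above can be illustrated with a small pure-Python sketch. It mirrors the rule diffusers schedulers use for `timestep_spacing="trailing"` (step back from the last training timestep in equal strides, then subtract one). The 1000 training timesteps are an assumption (the diffusers default; the commit does not state it), and the 25 steps are borrowed from the validation settings in the README, so treat the output as illustrative rather than as the exact schedule used in this run.

```python
def trailing_timesteps(num_inference_steps: int, num_train_timesteps: int = 1000) -> list[int]:
    """Return 'trailing'-spaced timesteps, noisiest first.

    num_train_timesteps=1000 is an assumed default, not a value from the commit.
    """
    step_ratio = num_train_timesteps / num_inference_steps
    # Walk down from num_train_timesteps in equal strides, then shift to 0-based indices.
    return [
        round(num_train_timesteps - i * step_ratio) - 1
        for i in range(num_inference_steps)
    ]


timesteps = trailing_timesteps(25)
print(timesteps[:3], timesteps[-1])  # [999, 959, 919] 39
```

Unlike 'leading' spacing, the first sampled timestep here is the final training timestep (999), so sampling starts from the noisiest level the model was trained on.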

Files changed (12)

README.md +133 -0
optimizer.bin +3 -0
random_states_0.pkl +3 -0
scheduler.bin +3 -0
training_state-dalle3.json +0 -0
training_state-midjourney-v6-520k-raw.json +0 -0
training_state-nijijourney-v6-520k-raw.json +0 -0
training_state-photo-concept-bucket.json +0 -0
training_state-sfwbooru.json +0 -0
training_state.json +1 -0
transformer/config.json +30 -0
transformer/diffusion_pytorch_model.safetensors +3 -0

README.md ADDED

	@@ -0,0 +1,133 @@

+---
+license: creativeml-openrail-m
+base_model: "terminusresearch/pixart-900m-1024-ft-v0.6"
+tags:
+  - stable-diffusion
+  - stable-diffusion-diffusers
+  - text-to-image
+  - diffusers
+  - simpletuner
+  - full
+inference: true
+---
+# pixart-900m-1024-vpred-zsnr
+This is a full rank finetune derived from [terminusresearch/pixart-900m-1024-ft-v0.6](https://huggingface.co/terminusresearch/pixart-900m-1024-ft-v0.6).
+The main validation prompt used during training was:
+```
+ethnographic photography of teddy bear at a picnic, ears tucked behind a cozy hoodie looking darkly off to the stormy picnic skies
+```
+## Validation settings
+- CFG: `7.5`
+- CFG Rescale: `0.7`
+- Steps: `25`
+- Sampler: `None`
+- Seed: `42`
+- Resolutions: `1024x1024,1344x768,916x1152`
+Note: The validation settings are not necessarily the same as the [training settings](#training-settings).
+<Gallery />
+The text encoder **was not** trained.
+You may reuse the base model text encoder for inference.
+## Training settings
+- Training epochs: 0
+- Training steps: 1000
+- Learning rate: 1e-06
+- Effective batch size: 192
+  - Micro-batch size: 24
+  - Gradient accumulation steps: 1
+  - Number of GPUs: 8
+- Prediction type: epsilon
+- Rescaled betas zero SNR: False
+- Optimizer: AdamW, stochastic bf16
+- Precision: Pure BF16
+- Xformers: Not used
+## Datasets
+### photo-concept-bucket
+- Repeats: 0
+- Total number of images: ~567552
+- Total number of aspect buckets: 1
+- Resolution: 1.0 megapixels
+- Cropped: True
+- Crop style: random
+- Crop aspect: square
+### midjourney-v6-520k-raw
+- Repeats: 0
+- Total number of images: ~390912
+- Total number of aspect buckets: 1
+- Resolution: 1.0 megapixels
+- Cropped: True
+- Crop style: random
+- Crop aspect: square
+### sfwbooru
+- Repeats: 0
+- Total number of images: ~233664
+- Total number of aspect buckets: 1
+- Resolution: 1.0 megapixels
+- Cropped: True
+- Crop style: random
+- Crop aspect: square
+### nijijourney-v6-520k-raw
+- Repeats: 0
+- Total number of images: ~415680
+- Total number of aspect buckets: 1
+- Resolution: 1.0 megapixels
+- Cropped: True
+- Crop style: random
+- Crop aspect: square
+### dalle3
+- Repeats: 0
+- Total number of images: ~1121664
+- Total number of aspect buckets: 1
+- Resolution: 1.0 megapixels
+- Cropped: True
+- Crop style: random
+- Crop aspect: square
+## Inference
+```python
+import torch
+from diffusers import DiffusionPipeline
+model_id = 'pixart-900m-1024-vpred-zsnr'
+pipeline = DiffusionPipeline.from_pretrained(model_id)
+prompt = "ethnographic photography of teddy bear at a picnic, ears tucked behind a cozy hoodie looking darkly off to the stormy picnic skies"
+negative_prompt = "blurry, cropped, ugly"
+pipeline.to('cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu')
+image = pipeline(
+    prompt=prompt,
+    negative_prompt=negative_prompt,
+    num_inference_steps=25,
+    generator=torch.Generator(device='cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu').manual_seed(1641421826),
+    width=1152,
+    height=768,
+    guidance_scale=7.5,
+    guidance_rescale=0.7,
+).images[0]
+image.save("output.png", format="PNG")
+```
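
The training settings in the README above explain why the run reports 0 epochs. A minimal sketch of the arithmetic, using only numbers from that README, and assuming that one epoch means a single pass over the five image datasets combined (the README does not define it, and the image counts are approximate):

```python
# Values copied from the "Training settings" and "Datasets" sections of the README.
micro_batch_size = 24
grad_accum_steps = 1
num_gpus = 8
training_steps = 1000

dataset_images = {
    "photo-concept-bucket": 567_552,
    "midjourney-v6-520k-raw": 390_912,
    "sfwbooru": 233_664,
    "nijijourney-v6-520k-raw": 415_680,
    "dalle3": 1_121_664,
}

effective_batch_size = micro_batch_size * grad_accum_steps * num_gpus
samples_seen = effective_batch_size * training_steps
total_images = sum(dataset_images.values())
# Assumption: an epoch is one pass over all listed images.
epoch_fraction = samples_seen / total_images

print(effective_batch_size)     # 192
print(samples_seen)             # 192000
print(total_images)             # 2729472
print(f"{epoch_fraction:.3f}")  # 0.070
```

Under that assumption, 1000 steps cover roughly 7% of the combined data, which rounds down to the reported 0 epochs.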

optimizer.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4f9f9a3a3f5451c22635b16e8cc7a837edd9ec41c43c63d460e8bb889a7a3472
+size 5451415117

random_states_0.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8f0edfc2c885f730ef911db10373dce0a3e814e4fdbb2de759c691606ecf21e3
+size 16100

scheduler.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:efff19450f55a9358b76f3e5170171942761d1c5b9128683028d0c09b8a24573
+size 1000
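
The three-line bodies of `optimizer.bin`, `random_states_0.pkl`, and `scheduler.bin` are Git LFS pointer files, not the binaries themselves: each records the spec version, a SHA-256 object id, and the byte size of the real file. A minimal sketch of reading one, with `parse_lfs_pointer` as a hypothetical helper written for illustration (it is not part of Git LFS or any library):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash algorithm, digest, and size."""
    # Each non-empty line has the form "<key> <value>".
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "oid_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


# Pointer contents of optimizer.bin, copied from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:4f9f9a3a3f5451c22635b16e8cc7a837edd9ec41c43c63d460e8bb889a7a3472
size 5451415117
"""

info = parse_lfs_pointer(pointer)
print(info["oid_algo"], round(info["size"] / 1e9, 2))  # sha256 5.45
```

The optimizer state is therefore about 5.45 GB on disk, while the scheduler state and RNG state are 1000 and 16100 bytes respectively.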

training_state-dalle3.json ADDED