codehappy committed on
Commit da5dee0 · verified · 1 Parent(s): f80eb02

update README for epoch 16 ckpt

  1. README.md +3 -2
README.md CHANGED
@@ -9,7 +9,7 @@ base_model:
  A latent diffusion model (LDM) geared toward illustration, style composability, and sample variety. Addresses a few deficiencies with the SDXL base model.
 
  * Architecture: SD XL (base model is v1.0)
- * Training procedure: U-Net fully unfrozen, all-parameter continued pretraining at LR between 3e-8 and 3e-7 for 15,800,000 steps (at epoch 15, batch size 4).
+ * Training procedure: U-Net fully unfrozen, all-parameter continued pretraining at LR between 3e-8 and 3e-7 for 16,950,000 steps (at epoch 16, batch size 4).
 
  Trained on the Puzzle Box dataset, a large collection of permissively licensed images from the public Internet (or generated by previous Puzzle Box models). Each image
  has from 3 to 17 different captions which are used interchangably during training. There are 9.3 million images and 62 million captions in the dataset.
@@ -23,7 +23,7 @@ booru-style, don't use underscores in your tags, replace those with spaces. Tags
  Vitamin phrases: *top quartile*, *top decile* (there are also anti-vitamins, *bottom quartile* and *bottom decile*). These are the primary aesthetic labels (see below.)
 
  Prompt adherence is unusually good; aesthetics are improved by human evaluation for generations between 1/4 and 1/2 megapixel in size for epochs 12-14, 1/4 to 2
- megapixels for epoch 15. CFG scales between 2 and 7 can work well with Puzzle Box; experimenting with resolution or scale for your prompts is encouraged.
+ megapixels for epoch 15+. CFG scales between 2 and 7 can work well with Puzzle Box; experimenting with resolution or scale for your prompts is encouraged.
 
  **Captioning:** About 1.4 million of the captions in the dataset are human-written. The remainder come from a variety of ML models, either vision transformers or
  classifers. Models used in captioning the Puzzle Box dataset include: Qwen 2 VL 72b, BLIP 2 OPT-6.5B COCO, Llava 1.5, MiniCPM 2.6, bakllava, Moondream, DeepSeek Janus 7b,
@@ -46,6 +46,7 @@ This allows later checkpoints to generate 1+ megapixel images without tiling or
 
  Model checkpoints currently available:
 
+ - from epoch 16, **16950k** training steps, 05 May 2025
  - from epoch 15, **15800k** training steps, 08 March 2025
  - from epoch 14, **14290k** training steps, 02 December 2024
  - from epoch 13, **11930k** training steps, 15 August 2024
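
For reviewers who want to sanity-check the new checkpoint entry against the earlier ones, the short Python sketch below computes how many training steps each listed checkpoint adds over the previous one and how many days separate the listed dates. The step counts and dates are copied from the checkpoint list in this diff; the names `CHECKPOINTS` and `checkpoint_deltas` are hypothetical helpers for illustration and are not part of the repository.

```python
from datetime import date

# Checkpoint list copied from the README diff: (epoch, training steps, listed date).
CHECKPOINTS = [
    (13, 11_930_000, date(2024, 8, 15)),
    (14, 14_290_000, date(2024, 12, 2)),
    (15, 15_800_000, date(2025, 3, 8)),
    (16, 16_950_000, date(2025, 5, 5)),
]


def checkpoint_deltas(checkpoints):
    """Return (epoch, added_steps, days_since_previous) for every checkpoint after the first."""
    rows = []
    # Pair each checkpoint with its successor; the list is assumed to be sorted by epoch.
    for (_, prev_steps, prev_date), (epoch, steps, listed) in zip(checkpoints, checkpoints[1:]):
        rows.append((epoch, steps - prev_steps, (listed - prev_date).days))
    return rows


if __name__ == "__main__":
    for epoch, added, days in checkpoint_deltas(CHECKPOINTS):
        print(f"epoch {epoch}: +{added:,} steps, {days} days after the previous checkpoint")
```

The epoch 16 entry adds 1,150,000 steps over epoch 15, which matches the change from 15,800,000 to 16,950,000 in the "Training procedure" line of the first hunk.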