Update README.md
README.md
CHANGED
@@ -24,7 +24,21 @@ Here it is, the BPModel, a Stable Diffusion model you may love or hate.
 Trained with 5k high quality images that suit my taste (not necessarily yours unfortunately) from [Sankaku Complex](https://chan.sankakucomplex.com) with annotations. Not the best strategy since pure combination of tags may not be the optimal way to describe the image, but I don't need to do extra work. And no, I won't feed any AI generated image
 to the model even if it might outlaw the model from being used in some countries.
 
-The training of a high resolution model requires a significant amount of GPU
+The training of a high resolution model requires a significant amount of GPU
+hours and can be costly. In this particular case, 10 V100 GPU hours were spent
+on training 30 epochs with a resolution of 512, while 60 V100 GPU hours were spent
+on training 30 epochs with a resolution of 768. An additional 100 V100 GPU hours
+were also spent on training a model with a resolution of 1024, although **ONLY** 10
+epochs were run. The results of the training on the 1024 resolution model did
+not show a significant improvement compared to the 768 resolution model, and the
+resource demands, achieving a batch size of 1 on a V100 with 32 GB of VRAM, were
+high. However, training on the 768 resolution did yield better results than
+training on the 512 resolution, and it is worth considering as an option. It is
+worth noting that Stable Diffusion 2.x also chose to train on a 768 resolution
+model. However, it may be more efficient to start with training on a 512
+resolution model due to the slower training process and the need for additional
+prior knowledge to speed up the training process when working with a 768
+resolution.
 
 [Mikubill/naifu-diffusion](https://github.com/Mikubill/naifu-diffusion) is used as the training script, and I also recommend
 checking out [CCRcmcpe/scal-sdt](https://github.com/CCRcmcpe/scal-sdt).
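
The GPU-hour figures in the added paragraph are easier to compare per epoch. The following is a minimal sketch, not part of the commit: it takes only the three (resolution, GPU hours, epochs) triples quoted in the README and derives hours per epoch, the cost relative to the 512 run, and the relative pixel count per image. The names `RUNS`, `hours_per_epoch`, and `cost_table` are made up for illustration, and the sketch assumes every epoch covers the same set of images, which the README does not state explicitly.

```python
# V100 GPU-hour figures as quoted in the README paragraph (nothing else is assumed
# about the runs: batch sizes for 512 and 768 are not given in the source).
RUNS = {
    512: {"gpu_hours": 10, "epochs": 30},
    768: {"gpu_hours": 60, "epochs": 30},
    1024: {"gpu_hours": 100, "epochs": 10},
}


def hours_per_epoch(run):
    """Average V100 hours spent on one epoch of a run."""
    return run["gpu_hours"] / run["epochs"]


def cost_table(runs, baseline=512):
    """Return (resolution, hours/epoch, cost ratio, pixel ratio) rows.

    Both ratios are relative to the baseline resolution. The pixel ratio is
    simply (resolution / baseline) squared, i.e. pixels per square image.
    """
    base_hours = hours_per_epoch(runs[baseline])
    rows = []
    for res in sorted(runs):
        per_epoch = hours_per_epoch(runs[res])
        rows.append((res, per_epoch, per_epoch / base_hours, (res / baseline) ** 2))
    return rows


if __name__ == "__main__":
    for res, per_epoch, cost_ratio, pixel_ratio in cost_table(RUNS):
        print(
            f"{res:>4}: {per_epoch:5.2f} V100 h/epoch, "
            f"{cost_ratio:4.1f}x cost, {pixel_ratio:4.2f}x pixels vs 512"
        )
```

By these numbers an epoch costs roughly 6 times as much at 768 and 30 times as much at 1024 as at 512, while the pixel count per image grows only 2.25 and 4 times. The README attributes the 1024 cost to being limited to a batch size of 1; the sketch itself does not model that.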