alea31415 committed on
Commit
7fcb3fe
·
1 Parent(s): f2e9e64

Update README.md

Files changed (1)
  1. README.md +24 -9
README.md CHANGED
@@ -82,8 +82,9 @@ I made the following observations
82
  ![00091-20230327040330](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00091-20230327040330.png)
83
  ![00094-20230327052628](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00094-20230327052628.png)
84
  ![00095-20230327055221](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00095-20230327055221.png)
 
85
 
86
- One interesting observation is that in the first image we observe better background for small LoHa trained at higher resolution and larger LoHa trained only trained at resolution 512. This again suggests we may be able to get good results with small dimension if they are trained properly. It is however unclear how to achive that. Simply increasing the learning rate to 5e-4 does not seem to be sufficient in thise case (as can be seen from the above images).
87
 
88
  Finally, these results do not mean that you would always want to use a larger dimension, as you probably do not need all the details that the additional dimension brings.
89
 
@@ -98,7 +99,7 @@ I tested several things, and here is what I can say
98
  - Setting the learning rate larger of course makes training faster as long as it does not fry things up. Here switching the learning rate from 2e-4 to 5e-4 increases the likeness. Would it however be better to train longer with a smaller learning rate? This still needs more testing. (I will zoom in on the case where we only change the text encoder learning rate below.)
99
  - Cosine scheduler learns slower than constant scheduler for a fixed learning rate.
100
  - It seems that Dadaptation trains faster at styles but slower at characters. Why?
101
- Since the outputs of Dadaptation seems to change more over time, I guess it may just have picked a larger learning rate.
102
  ![00074-20230326204643](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00074-20230326204643.png)
103
  ![00097-20230327063406](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00097-20230327063406.png)
104
 
@@ -123,12 +124,13 @@ This aspect is difficult to test, but it seems to be confirmed by this "umbrella
123
 
124
  ![00083-20230327015201](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00083-20230327015201.png)
125
 
 
126
  In any case, I still believe that if we want to get the best result we should avoid text encoder training completely and do [pivotal tuning](https://github.com/cloneofsimo/lora/discussions/121) instead.
127
 
128
 
129
  #### LoRa, LoCon, LoHa
130
 
131
- It may seem weird to mention this so late, but honestly I do not find them to be that different here.
132
  The common belief is that LoHa trains more style than LoCon, which in turn trains more style than LoRa.
133
  This seems to be mostly true, but the difference is quite subtle. Moreover, I would rather use the word "texture" instead of style.
134
  I specifically tested whether any of them would be more favorable when transferred to a different base model. No conclusion here.
@@ -147,17 +149,30 @@ Some remarks
147
  - Some comparisons between LoHa and LoCon do suggest that LoHa indeed trains faster on texture while LoCon trains faster on higher-level traits. The difference is however very small, so this is not really conclusive.
148
  ![00034-20230325234457](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00034-20230325234457.png)
149
  ![00035-20230325235521](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00035-20230325235521.png)
150
- - In an [early experiment](https://civitai.com/models/17336/roukin8-character-lohaloconfullckpt-8) I saw that LoHa and LoCon training lead to quite different result. One possible explanation is that I am training on NAI now while I was training on [BP](https://huggingface.co/Crosstyan/BPModel) there.
151
 
152
 
153
- #### Others
154
 
155
- 1. I barely see any difference for training at clip skip 1 and 2.
156
- 3. The difference between lora, locon, and loha are very subtle.
157
 
158
- ### Datasets
 
159
 
160
- Here is the composition of the datasets
161
  ```
162
  17_characters~fanart~OyamaMihari: 53
163
  19_characters~fanart~OyamaMahiro+OyamaMihari: 47
 
82
  ![00091-20230327040330](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00091-20230327040330.png)
83
  ![00094-20230327052628](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00094-20230327052628.png)
84
  ![00095-20230327055221](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00095-20230327055221.png)
85
+ - They seem to provide better style transfer. See the Style Transfer section at the end (the image there is 144 MB).
86
 
87
+ One interesting observation is that in the first image we get better backgrounds for the small LoHa trained at higher resolution and for the larger LoHa trained only at resolution 512. This again suggests we may be able to get good results with a small dimension if it is trained properly. It is however unclear how to achieve that. Simply increasing the learning rate to 5e-4 does not seem to be sufficient in this case (as can be seen from the above images).
88
 
89
  Finally, these results do not mean that you would always want to use a larger dimension, as you probably do not need all the details that the additional dimension brings.
90
 
 
99
  - Setting the learning rate larger of course makes training faster as long as it does not fry things up. Here switching the learning rate from 2e-4 to 5e-4 increases the likeness. Would it however be better to train longer with a smaller learning rate? This still needs more testing. (I will zoom in on the case where we only change the text encoder learning rate below.)
100
  - Cosine scheduler learns slower than constant scheduler for a fixed learning rate.
101
  - It seems that Dadaptation trains faster at styles but slower at characters. Why?
102
+ Since the outputs of Dadaptation seem to change more over time, I guess it may just have picked a larger learning rate. Does this then mean a larger learning rate would pick up the style first?
103
  ![00074-20230326204643](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00074-20230326204643.png)
104
  ![00097-20230327063406](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00097-20230327063406.png)
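
One plausible reason for the scheduler observation above is simply that a cosine schedule spends most of the run below the base learning rate. The following is a minimal sketch, assuming a cosine decay from the base learning rate to zero over the whole run, with no warmup and no restarts. The function names and the step count are made up for illustration and are not taken from the training scripts used here.

```python
import math

def constant_lr(step, total_steps, base_lr):
    # Constant schedule: every step uses the base learning rate.
    return base_lr

def cosine_lr(step, total_steps, base_lr):
    # Assumed cosine decay from base_lr down to 0 over the full run.
    progress = step / total_steps
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

def mean_lr(schedule, total_steps, base_lr):
    # Average learning rate actually applied over the run.
    total = sum(schedule(s, total_steps, base_lr) for s in range(total_steps))
    return total / total_steps

base_lr = 2e-4      # one of the learning rates mentioned above
total_steps = 1000  # hypothetical run length

mean_constant = mean_lr(constant_lr, total_steps, base_lr)
mean_cosine = mean_lr(cosine_lr, total_steps, base_lr)

print(f"mean lr, constant: {mean_constant:.2e}")
print(f"mean lr, cosine:   {mean_cosine:.2e}")
print(f"ratio cosine/constant: {mean_cosine / mean_constant:.3f}")
```

Under these assumptions the cosine run applies roughly half the average learning rate of the constant run, so for the same number of steps it is not surprising that it learns slower.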
105
 
 
124
 
125
  ![00083-20230327015201](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00083-20230327015201.png)
126
 
127
+ There may be some disadvantages as well, but this needs to be further explored.
128
  In any case, I still believe that if we want to get the best result we should avoid text encoder training completely and do [pivotal tuning](https://github.com/cloneofsimo/lora/discussions/121) instead.
129
 
130
 
131
  #### LoRa, LoCon, LoHa
132
 
133
+ It may seem weird to mention this so late, but honestly I do not find them to give very different results here.
134
  The common belief is that LoHa trains more style than LoCon, which in turn trains more style than LoRa.
135
  This seems to be mostly true, but the difference is quite subtle. Moreover, I would rather use the word "texture" instead of style.
136
  I specifically tested whether any of them would be more favorable when transferred to a different base model. No conclusion here.
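
For reference, here is a rough sketch of how the weight updates differ for a single linear layer, assuming the usual formulations: LoRa learns a product of two thin matrices, while LoHa takes the element-wise (Hadamard) product of two such low-rank products. LoCon is not shown since, as far as I understand, it applies the LoRa-style update to convolution layers as well. The layer sizes and variable names are hypothetical, and the alpha scaling is left out.

```python
import numpy as np

rng = np.random.default_rng(0)
out_dim, in_dim, dim = 64, 64, 4  # hypothetical layer size and network dimension

# LoRa: delta_W = up @ down, so its rank is at most `dim`.
lora_down = rng.standard_normal((dim, in_dim))
lora_up = rng.standard_normal((out_dim, dim))
delta_lora = lora_up @ lora_down

# LoHa: element-wise product of two low-rank products,
# so its rank can reach dim * dim with twice the parameters.
w1_down = rng.standard_normal((dim, in_dim))
w1_up = rng.standard_normal((out_dim, dim))
w2_down = rng.standard_normal((dim, in_dim))
w2_up = rng.standard_normal((out_dim, dim))
delta_loha = (w1_up @ w1_down) * (w2_up @ w2_down)

rank_lora = np.linalg.matrix_rank(delta_lora)
rank_loha = np.linalg.matrix_rank(delta_loha)
params_lora = dim * (in_dim + out_dim)
params_loha = 2 * dim * (in_dim + out_dim)

print(f"LoRa: rank {rank_lora}, parameters {params_lora}")
print(f"LoHa: rank {rank_loha}, parameters {params_loha}")
```

This only illustrates the capacity of the update at a given dimension. It says nothing about which one learns style or texture faster in practice.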
 
149
  - Some comparisons between LoHa and LoCon do suggest that LoHa indeed trains faster on texture while LoCon trains faster on higher-level traits. The difference is however very small, so this is not really conclusive.
150
  ![00034-20230325234457](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00034-20230325234457.png)
151
  ![00035-20230325235521](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00035-20230325235521.png)
152
+ - In an [early experiment](https://civitai.com/models/17336/roukin8-character-lohaloconfullckpt-8) I saw that LoHa and LoCon training led to quite different results. One possible explanation is that I train on NAI here while I trained on [BP](https://huggingface.co/Crosstyan/BPModel) in that experiment.
153
 
154
 
155
+ #### Clip skip 1 versus 2
156
 
157
+ People say that we should train with clip skip 2 for anime models, but honestly I cannot see any difference. The only important thing is to use the same clip skip for training and sampling.
 
158
 
159
+ ![00013-20230325200652](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00013-20230325200652.png)
160
+ ![00014-20230325203156](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/00014-20230325203156.png)
161
 
162
+
163
+ #### Style Transfer
164
+
165
+ The simple rule seems to be that we get better style transfer if the styles are better trained.
166
+ Although it is impossible to draw any conclusion from a single image, dim 32/16 half alpha is clearly the winner here, followed by dim 32/16 5e-4.
167
+ Among the remaining ones, LoRa and Dadaptation are probably slightly better. This can be explained by the fact that they both train faster (LoRa has a smaller dimension while Dadaptation supposedly uses a larger learning rate) and thus the model just knows the styles better. However, the Dadaptation LoHa completely fails at altering the style of Tilty, who only has anime screenshots in the training set. After some tests I find this can be fixed by weighting the prompts differently.
168
+
169
+
170
+ ![xyz_grid-0000-20230327073826](https://huggingface.co/alea31415/LyCORIS-experiments/resolve/main/generated_samples/xyz_grid-0000-20230327073826.png)
171
+
172
+
173
+ ### Dataset
174
+
175
+ Here is the composition of the dataset
176
  ```
177
  17_characters~fanart~OyamaMihari: 53
178
  19_characters~fanart~OyamaMahiro+OyamaMihari: 47