Update README.md
README.md
CHANGED
@@ -6,23 +6,22 @@ license_link: https://freedevproject.org/faipl-1.0-sd/
 language:
 - en
 library_name: diffusers
-base_model: Laxhar/noobai-XL-Vpred-0
+base_model: Laxhar/noobai-XL-Vpred-1.0
 ---
 
 The LHC (Large Heap o' Chuubas) aims to be the model for all of your VTuber needs.
 
-The LHC models are a series of VTuber-centric finetunes.
-Unlike many small-scale finetunes, where the aim is to improve aesthetics or a general concept such as backgrounds, the aim of the LHC models is primarily to add specific characters while preserving as much of the base model as possible.
-## The Why
-The usual way of adding characters to an already trained model is LoRAs or similar methods, where you end up with a small model that can be applied to existing models, adding concepts in a very plug-and-play way. While this is a very convenient approach, and one that most modern consumer GPUs are capable of training, LoRAs come with several downsides compared to a model that has been fully finetuned on those concepts.
-1. LoRAs have an effect on composition, style and character knowledge even outside of their intended concept. This is especially apparent in incorrectly trained LoRAs, where the concept is always applied, causing poses to stiffen or characters to take on attributes of the new character. While this bleeding can and does happen in finetuned models as well, the effect is drastically smaller.
-2. Full finetunes usually result in a model that is very capable of abstracting, meaning that even if a specific combination of concepts isn't present in the training data, a well-trained finetune will be able to combine those concepts in an almost logical way.
-3. As a specific example of 2.: while style and concept LoRAs usually work quite well when used together with other LoRAs of their type, character LoRAs tend to be much less capable of being combined with other character LoRAs to generate images of multiple new characters. There are ways to make this work, and some LoRAs are more capable than others, but in my experience this still limits the results. One workaround is training a single LoRA on multiple characters as well as on images of those characters together, which works quite well in my experience; however, finding artwork of specific characters together is not always possible, and as more and more characters are added, the size of the resulting LoRA must also be increased so it can learn them all. This makes LoRAs less and less effective the more characters one wants to add at once.
-4. It is possible to extract a LoRA from a finetuned model, meaning the things the finetune learned can be applied to models that share a similar base. So even if the resulting finetune doesn't work for every application, the extract still offers most of the advantages of a LoRA.
 
+Currently on Version 0.5 (considered alpha). It is a v-prediction model, with the necessary settings enabled to allow for automatic detection in ComfyUI and (Re)Forge.
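
ComfyUI and (Re)Forge pick the v-prediction settings up automatically; with plain diffusers the scheduler has to be configured by hand. A minimal sketch, assuming the weights are published in diffusers format (the repo id and prompt below are placeholders, not part of the model card):

```python
import torch
from diffusers import StableDiffusionXLPipeline, EulerDiscreteScheduler

# Placeholder repo id; substitute the actual LHC repository.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "your-namespace/LHC-v0.5", torch_dtype=torch.float16
)

# Force v-prediction with zero-terminal-SNR, as used by the NoobAI-XL vpred base.
pipe.scheduler = EulerDiscreteScheduler.from_config(
    pipe.scheduler.config,
    prediction_type="v_prediction",
    rescale_betas_zero_snr=True,
)
pipe.to("cuda")

image = pipe(
    "1girl, solo, looking at viewer, best quality",
    num_inference_steps=28,
    guidance_scale=5.0,
).images[0]
image.save("sample.png")
```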
+For a comprehensive overview of all VTubers that were trained on, refer to https://catbox.moe/c/pjfwt1 or the txt file, which details the VTubers as well as useful/necessary tagging.
 
+# Training Details
+
+Due to some major changes between version 0.4 and 0.5, the training parameters were not yet dialed in, resulting in a very long training process that was resumed multiple times from intermediate checkpoints that weren't trained to satisfaction. The exact training parameters therefore aren't stated; instead, the training logs were uploaded, and a general overview is given here.
+
+## Parameters
+
+* Total Epochs: 102
+* Effective Batch Size: 32 (or 16, or 8)
+* Initial Learning Rate: 5e-5
+* Scheduler: Cosine
+* Text Encoder (TE) Training Epochs: ~10
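
As an illustration only (this is not the actual training code), the sketch below shows how the listed values fit together in plain PyTorch: an effective batch size of 32 reached through gradient accumulation, and a cosine schedule decaying from the initial learning rate of 5e-5. Every name in it is a placeholder stand-in.

```python
import torch

per_device_batch = 8     # per-step loader batch size (assumed)
grad_accum_steps = 4     # 8 * 4 = effective batch size of 32
epochs = 102
steps_per_epoch = 100    # placeholder; depends on the dataset
initial_lr = 5e-5

model = torch.nn.Linear(16, 16)   # stand-in for the UNet being finetuned
optimizer = torch.optim.AdamW(model.parameters(), lr=initial_lr)
# Cosine decay over the total number of optimizer steps.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=epochs * steps_per_epoch // grad_accum_steps
)

for epoch in range(epochs):
    for step in range(steps_per_epoch):
        batch = torch.randn(per_device_batch, 16)        # stand-in batch
        loss = model(batch).pow(2).mean() / grad_accum_steps
        loss.backward()
        if (step + 1) % grad_accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
            scheduler.step()
```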