Genshin_Impact_Mavuika HunyuanVideo LoRA

This repository contains the necessary setup and scripts to generate videos using the HunyuanVideo model with a LoRA (Low-Rank Adaptation) fine-tuned for Mavuika. Below are the instructions to install dependencies, download models, and run the demo.

Installation

Step 1: Install System Dependencies

Run the following command to install required system packages:

sudo apt-get update && sudo apt-get install git-lfs ffmpeg cbm

Step 2: Clone the Repository

Clone the repository and navigate to the project directory:

git clone https://huggingface.co/svjack/Genshin_Impact_Mavuika_HunyuanVideo_lora
cd Genshin_Impact_Mavuika_HunyuanVideo_lora

Step 3: Install Python Dependencies

Create a Python 3.10 conda environment, register it as a Jupyter kernel, and install the required Python packages:

conda create -n py310 python=3.10
conda activate py310
pip install ipykernel
python -m ipykernel install --user --name py310 --display-name "py310"

pip install -r requirements.txt
pip install ascii-magic matplotlib tensorboard huggingface_hub
pip install moviepy==1.0.3
pip install sageattention==1.0.6

pip install torch==2.5.0 torchvision
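
Before moving on, it can help to confirm that the pinned packages actually landed in the active environment. The following is a minimal sketch and is not part of this repository: the helper name `installed_versions` is hypothetical, and the version pins are copied from the install commands above.

```python
from importlib import metadata

# Versions pinned by the install commands above.
PINNED = {
    "torch": "2.5.0",
    "moviepy": "1.0.3",
    "sageattention": "1.0.6",
}


def installed_versions(packages):
    """Return {package: installed version, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in installed_versions(PINNED).items():
        wanted = PINNED[name]
        # A local build tag such as "+cu124" still counts as a match.
        ok = version is not None and version.split("+")[0] == wanted
        status = "OK" if ok else "CHECK"
        print(f"{name}: installed={version} pinned={wanted} {status}")
```

Run it inside the activated py310 environment. A CHECK line only means the installed version differs from the pin or the package is missing; it does not by itself mean generation will fail.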

Download Models

Step 1: Download HunyuanVideo Model

Download the HunyuanVideo model and place it in the ckpts directory:

huggingface-cli download tencent/HunyuanVideo --local-dir ./ckpts

Step 2: Download LLaVA Model

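Download the LLaVA model, then run the preprocessing script from the Tencent HunyuanVideo repository to write the first text encoder to ckpts/text_encoder: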

cd ckpts
huggingface-cli download xtuner/llava-llama-3-8b-v1_1-transformers --local-dir ./llava-llama-3-8b-v1_1-transformers
wget https://raw.githubusercontent.com/Tencent/HunyuanVideo/refs/heads/main/hyvideo/utils/preprocess_text_encoder_tokenizer_utils.py
python preprocess_text_encoder_tokenizer_utils.py --input_dir llava-llama-3-8b-v1_1-transformers --output_dir text_encoder

Step 3: Download CLIP Model

Still inside the ckpts directory from Step 2, download the CLIP model, which serves as the second text encoder (ckpts/text_encoder_2):

huggingface-cli download openai/clip-vit-large-patch14 --local-dir ./text_encoder_2
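
Once the three downloads finish, a quick check that every path used by the demo commands exists can save a failed run. This is a minimal sketch under the assumption that it is run from the repository root (the parent of ckpts); the helper name `missing_paths` is hypothetical, and the paths are the ones passed to --dit, --vae, --text_encoder1, and --text_encoder2 in the Demo section.

```python
from pathlib import Path

# Paths referenced by the demo commands, relative to the repository root.
EXPECTED = [
    "ckpts/hunyuan-video-t2v-720p/transformers/mp_rank_00_model_states.pt",
    "ckpts/hunyuan-video-t2v-720p/vae/pytorch_model.pt",
    "ckpts/text_encoder",
    "ckpts/text_encoder_2",
]


def missing_paths(repo_root="."):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [path for path in EXPECTED if not (root / path).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All expected model paths are present.")
```

The check only tests for existence; it does not verify that a download completed or that the files are intact.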

Demo

Generate Video 1: Mavuika

From the repository root (run cd .. first if you are still inside ckpts), run the following command to generate a video of Mavuika:

python hv_generate_video.py \
    --fp8 \
    --video_size 544 960 \
    --video_length 60 \
    --infer_steps 30 \
    --prompt "Mavuika, featuring long, wavy red hair with golden highlights and large, star-shaped earrings. Mavuika wears dark sunglasses, a black choker, and a black leather glove on their left hand. Their attire includes a black and gold armor-like top with intricate designs. The background is a gradient of soft white to light blue, emphasizing Mavuika's confident expression and stylish appearance." \
    --save_path . \
    --output_type both \
    --dit ckpts/hunyuan-video-t2v-720p/transformers/mp_rank_00_model_states.pt \
    --attn_mode sdpa \
    --vae ckpts/hunyuan-video-t2v-720p/vae/pytorch_model.pt \
    --vae_chunk_size 32 \
    --vae_spatial_tile_sample_min_size 128 \
    --text_encoder1 ckpts/text_encoder \
    --text_encoder2 ckpts/text_encoder_2 \
    --seed 1234 \
    --lora_multiplier 1.0 \
    --lora_weight Mavuika_im_lora_dir/Mavuika_single_im_lora-000035.safetensors

Generate Video 2: Mavuika Sun

Run the following command to generate a video of Mavuika in a sunset-lit scene:

python hv_generate_video.py \
    --fp8 \
    --video_size 544 960 \
    --video_length 60 \
    --infer_steps 30 \
    --prompt "Fantastic artwork of Mavuika, featuring long, wavy red hair with golden highlights and large, star-shaped earrings. Mavuika wears dark sunglasses, a black choker, and a black leather glove on their left hand. Their attire includes a black and gold armor-like top with intricate designs, standing confidently in a warm sunset-lit rural village. The background transitions into the interior of a futuristic spaceship, blending the rustic and sci-fi elements seamlessly. The gradient of soft white to light blue in the sky enhances Mavuika's stylish and commanding presence." \
    --save_path . \
    --output_type both \
    --dit ckpts/hunyuan-video-t2v-720p/transformers/mp_rank_00_model_states.pt \
    --attn_mode sdpa \
    --vae ckpts/hunyuan-video-t2v-720p/vae/pytorch_model.pt \
    --vae_chunk_size 32 \
    --vae_spatial_tile_sample_min_size 128 \
    --text_encoder1 ckpts/text_encoder \
    --text_encoder2 ckpts/text_encoder_2 \
    --seed 1234 \
    --lora_multiplier 1.0 \
    --lora_weight Mavuika_im_lora_dir/Mavuika_single_im_lora-000035.safetensors
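
The two demo commands differ only in their --prompt value, so the shared arguments can be kept in one place when experimenting with more prompts or seeds. The following is a minimal sketch: `build_command` and `SHARED_ARGS` are hypothetical names, the flag values are copied from the commands above, and the prompt in the example is a placeholder rather than one of the demo prompts.

```python
import shlex

# Arguments shared by both demo commands above.
SHARED_ARGS = [
    "--fp8",
    "--video_size", "544", "960",
    "--video_length", "60",
    "--infer_steps", "30",
    "--save_path", ".",
    "--output_type", "both",
    "--dit", "ckpts/hunyuan-video-t2v-720p/transformers/mp_rank_00_model_states.pt",
    "--attn_mode", "sdpa",
    "--vae", "ckpts/hunyuan-video-t2v-720p/vae/pytorch_model.pt",
    "--vae_chunk_size", "32",
    "--vae_spatial_tile_sample_min_size", "128",
    "--text_encoder1", "ckpts/text_encoder",
    "--text_encoder2", "ckpts/text_encoder_2",
    "--lora_multiplier", "1.0",
    "--lora_weight", "Mavuika_im_lora_dir/Mavuika_single_im_lora-000035.safetensors",
]


def build_command(prompt, seed=1234):
    """Return the argument list for one hv_generate_video.py run."""
    return [
        "python", "hv_generate_video.py",
        *SHARED_ARGS,
        "--prompt", prompt,
        "--seed", str(seed),
    ]


if __name__ == "__main__":
    # Placeholder prompt; substitute one of the full prompts above.
    command = build_command("Mavuika standing in a sunset-lit village", seed=1234)
    # Print a copy-pasteable shell command instead of running it.
    print(shlex.join(command))
```

To launch the run from Python instead of printing it, pass the list to `subprocess.run(command, check=True)` from the repository root.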

Notes

Ensure you have sufficient GPU resources for video generation.
Adjust --video_size (output resolution), --video_length (number of frames), and --infer_steps (number of inference steps) to trade output quality and length against generation time and memory use.
The --prompt parameter can be modified to generate videos with different scenes or actions.
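
To see what GPU resources are available before starting a run, the visible CUDA devices and their total memory can be listed with PyTorch. This is a minimal sketch: `gpu_summary` is a hypothetical helper, and it reports only what is installed, not how much memory a particular run will need.

```python
def gpu_summary():
    """Return [(device name, total memory in GiB)]; [] without torch or CUDA."""
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    devices = []
    for index in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(index)
        # total_memory is reported in bytes.
        devices.append((props.name, props.total_memory / 1024**3))
    return devices


if __name__ == "__main__":
    devices = gpu_summary()
    if not devices:
        print("No CUDA device visible to PyTorch.")
    for name, gib in devices:
        print(f"{name}: {gib:.1f} GiB")
```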