---
base_model: THUDM/CogVideoX-5b
datasets: finetrainers/cakeify-smol
library_name: diffusers
license: other
license_link: https://huggingface.co/THUDM/CogVideoX-5b/blob/main/LICENSE
instance_prompt: PIKA_CAKEIFY A red tea cup is placed on a wooden surface. Suddenly, a knife appears and slices through the cup, revealing a cake inside. The cake turns into a hyper-realistic prop cake, showcasing the creative transformation of everyday objects into something unexpected and delightful.
widget:
- text: PIKA_CAKEIFY A blue soap is placed on a modern table. Suddenly, a knife appears and slices through the soap, revealing a cake inside. The soap turns into a hyper-realistic prop cake, showcasing the creative transformation of everyday objects into something unexpected and delightful.
  output:
    url: "./assets/output_0.mp4"
- text: PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the shoe, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.
  output:
    url: "./assets/output_1.mp4"
- text: PIKA_CAKEIFY A red tea cup is placed on a wooden surface. Suddenly, a knife appears and slices through the cup, revealing a cake inside. The cake turns into a hyper-realistic prop cake, showcasing the creative transformation of everyday objects into something unexpected and delightful.
  output:
    url: "./assets/output_2.mp4"
tags:
- text-to-video
- diffusers-training
- diffusers
- cogvideox
- cogvideox-diffusers
- template:sd-lora
---
<Gallery />

This is a fine-tune of the [THUDM/CogVideoX-5b](https://huggingface.co/THUDM/CogVideoX-5b) model on the
[finetrainers/cakeify-smol](https://huggingface.co/datasets/finetrainers/cakeify-smol) dataset. A LoRA variant
of the weights is also available; see the [LoRA section](#lora).

Code: https://github.com/a-r-r-o-w/finetrainers

> [!IMPORTANT]
> This is an experimental checkpoint and is known to generalize poorly.

Inference code:

```py
import torch
from diffusers import CogVideoXTransformer3DModel, DiffusionPipeline
from diffusers.utils import export_to_video

# Load the fine-tuned transformer and plug it into the base CogVideoX-5b pipeline.
transformer = CogVideoXTransformer3DModel.from_pretrained(
    "finetrainers/cakeify-v0", torch_dtype=torch.bfloat16
)
pipeline = DiffusionPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")

# Prompts start with the PIKA_CAKEIFY trigger token, as in the examples on this card.
prompt = """
PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the shoe, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50,
).frames[0]

export_to_video(video, "output.mp4", fps=25)
```
Training logs are available on WandB [here](https://wandb.ai/diffusion-guidance/finetrainers-cogvideox/runs/q7z660f3/).
## LoRA
We extracted a rank-64 LoRA from the fine-tuned checkpoint using [this script](./create_lora.py). [The extracted LoRA](./extracted_cakeify_lora_64.safetensors) can be loaded on top of the base model to reproduce the same kind of effect:
<details>
<summary>Code</summary>

```py
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

# Load the base model, then apply the extracted LoRA weights on top of it.
pipeline = DiffusionPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16
).to("cuda")
pipeline.load_lora_weights(
    "finetrainers/cakeify-v0", weight_name="extracted_cakeify_lora_64.safetensors"
)

prompt = """
PIKA_CAKEIFY On a gleaming glass display stand, a sleek black purse quietly commands attention. Suddenly, a knife appears and slices through the shoe, revealing a fluffy vanilla sponge at its core. Immediately, it turns into a hyper-realistic prop cake, delighting the senses with its playful juxtaposition of the everyday and the extraordinary.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50,
).frames[0]

export_to_video(video, "output_lora.mp4", fps=25)
```
</details>
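
For readers curious about what extracting a LoRA from a full fine-tune involves, here is a minimal sketch of one common approach: take the difference between a fine-tuned and a base weight matrix and keep its top singular directions via a truncated SVD. Whether `create_lora.py` does exactly this is an assumption; the function name `extract_lora` and the toy matrices are illustrative only and are not taken from the script.

```python
import numpy as np


def extract_lora(w_base, w_finetuned, rank=64):
    """Approximate (w_finetuned - w_base) by a low-rank product lora_up @ lora_down.

    Hypothetical helper for illustration; not the code in create_lora.py.
    """
    delta = w_finetuned - w_base  # shape: (out_features, in_features)
    u, s, vh = np.linalg.svd(delta, full_matrices=False)
    r = min(rank, s.shape[0])
    # Split the singular values evenly between the two factors.
    sqrt_s = np.sqrt(s[:r])
    lora_up = u[:, :r] * sqrt_s            # (out_features, r)
    lora_down = sqrt_s[:, None] * vh[:r]   # (r, in_features)
    return lora_down, lora_up


# Toy check: a "fine-tune" that differs from the base by an exactly rank-4 update.
rng = np.random.default_rng(0)
w_base = rng.standard_normal((32, 48))
true_delta = rng.standard_normal((32, 4)) @ rng.standard_normal((4, 48))
w_ft = w_base + true_delta

down, up = extract_lora(w_base, w_ft, rank=4)
err = np.linalg.norm(up @ down - true_delta) / np.linalg.norm(true_delta)
print(f"relative reconstruction error: {err:.2e}")
```

On real checkpoints the weight difference is generally not exactly low-rank, so a rank-64 factorization is only an approximation of the full fine-tune.
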
Below is a comparison between the full fine-tune and LoRA outputs (under the same settings and seed):
<table>
<thead>
<tr>
<th>Full finetune</th>
<th>LoRA</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<video width="320" height="240" controls>
<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/original_output_0.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>
</td>
<td>
<video width="320" height="240" controls>
<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/output_0.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>
</td>
</tr>
<tr>
<td>
<video width="320" height="240" controls>
<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/original_output_1.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>
</td>
<td>
<video width="320" height="240" controls>
<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/output_1.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>
</td>
</tr>
<tr>
<td>
<video width="320" height="240" controls>
<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/original_output_2.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>
</td>
<td>
<video width="320" height="240" controls>
<source src="https://huggingface.co/finetrainers/cakeify-v0/resolve/main/comparisons/output_2.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>
</td>
</tr>
</tbody>
</table>
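
The inference snippets on this card do not set a seed, so repeated runs will produce different videos. To compare the full fine-tune and the LoRA under the same seed, as in the table above, one option is to pass a seeded `torch.Generator` through the pipeline's `generator` argument. The helper name `run_with_seed` is hypothetical, and the seed used for the comparison videos is not stated on this card.

```python
import torch


def run_with_seed(pipeline, seed, **call_kwargs):
    """Call a diffusers pipeline with a freshly seeded CPU generator.

    Hypothetical convenience wrapper; `pipeline` is any callable that
    accepts a `generator` keyword, as diffusers pipelines do.
    """
    generator = torch.Generator(device="cpu").manual_seed(seed)
    return pipeline(generator=generator, **call_kwargs)
```

For example, `run_with_seed(pipeline, 42, prompt=prompt, negative_prompt=negative_prompt, num_frames=81, height=512, width=768, num_inference_steps=50).frames[0]` can be run once with the full fine-tune pipeline and once with the LoRA pipeline; the value `42` is arbitrary.
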