🔥 Distilled Mochi Transformer
This repository contains a distilled transformer for genmoai mochi-1. The distilled transformer has 42 blocks instead of the 48 in the original transformer.
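To confirm the reduced depth, you can inspect the model config without downloading the weights. A minimal sketch, assuming the standard diffusers num_layers config key:

import torch
from diffusers import MochiTransformer3DModel

# Fetch only the config (no weights) and check the block count.
# Assumes the standard diffusers `num_layers` config key.
config = MochiTransformer3DModel.load_config("NimVideo/mochi-1-transformer-42")
print(config["num_layers"])  # expected: 42 (original mochi-1: 48)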
Training details
We analyzed the MSE of the latent after each block and iteratively dropped the blocks with the lowest MSE. After each block drop, we trained the neighboring blocks (the one before and the one after the deleted block) for 1K steps. The procedure is sketched below.
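This is a minimal illustration with toy stand-in blocks, not the actual training code; block_mses is a hypothetical helper for one plausible reading of the procedure (MSE between each block's input and output latent):

import torch
from torch import nn

@torch.no_grad()
def block_mses(blocks, latent):
    # For each block, compute the MSE between its input and output
    # hidden state. A block that barely changes the latent (low MSE)
    # is the cheapest candidate to drop.
    scores, hidden = [], latent
    for i, block in enumerate(blocks):
        out = block(hidden)
        scores.append((torch.mean((out - hidden) ** 2).item(), i))
        hidden = out
    return sorted(scores)  # lowest MSE first

# Toy stand-ins for transformer blocks, just to make the sketch runnable.
blocks = nn.ModuleList(nn.Linear(64, 64) for _ in range(8))
latent = torch.randn(4, 64)

mse, idx = block_mses(blocks, latent)[0]
del blocks[idx]  # drop the block whose removal changes the latent least
# ...then fine-tune blocks[idx - 1] and blocks[idx] (the new neighbors)
# for ~1K steps and repeat until the target depth (42 blocks) is reached.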
🚀 Try it here: Interactive Demo
Usage
Minimal code example
import torch
from diffusers import MochiPipeline, MochiTransformer3DModel
from diffusers.utils import export_to_video

# Load the distilled 42-block transformer.
transformer = MochiTransformer3DModel.from_pretrained(
    "NimVideo/mochi-1-transformer-42",
    torch_dtype=torch.bfloat16,
)

# Build the standard mochi-1 pipeline, swapping in the distilled transformer.
pipe = MochiPipeline.from_pretrained(
    "genmo/mochi-1-preview",
    transformer=transformer,
    variant="bf16",
    torch_dtype=torch.bfloat16,
)

# Reduce peak VRAM usage.
pipe.enable_model_cpu_offload()
pipe.enable_vae_tiling()

prompt = "Close-up of a chameleon's eye, with its scaly skin changing color. Ultra high resolution 4k."
frames = pipe(prompt, num_frames=85).frames[0]
export_to_video(frames, "mochi.mp4", fps=30)
Acknowledgements
Original code and models: mochi.
Contacts
Issues should be raised directly in the repository.