Attempt to run Wan2.1-T2V-1.3B with lower VRAM

Changes made:

  • Diffusion Model: changed all Linear layers from float32 to nf4, shrinking the model from roughly 6 GB to about 1 GB
  • VAE: contains no Linear layers, so there is nothing to quantize here
  • UMT5 Encoder: a fairly large model, and the one that takes the most VRAM of the three, so I'm having trouble loading it on my 4060 (8 GB VRAM). If it can be quantized as well, running the whole pipeline on low VRAM becomes much easier.
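
The nf4 step above can be sketched in plain numpy. The 16 codebook levels come from the QLoRA paper (quantiles of a standard normal, normalized to [-1, 1]); in practice bitsandbytes does this block-wise with fused kernels, so this is only an illustrative re-implementation:

```python
import numpy as np

# NF4 codebook: 16 levels at quantiles of a standard normal, scaled to [-1, 1]
# (rounded values from the QLoRA paper / bitsandbytes).
NF4_LEVELS = np.array([
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911, 0.0,
     0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0,
])

def nf4_quantize(w, block_size=64):
    """Block-wise absmax NF4: returns 4-bit codes plus one float scale per block."""
    flat = w.reshape(-1, block_size)
    scales = np.abs(flat).max(axis=1, keepdims=True)  # per-block absmax scale
    normed = flat / scales                            # now in [-1, 1]
    # snap each normalized weight to the nearest codebook level
    codes = np.abs(normed[..., None] - NF4_LEVELS).argmin(axis=-1)
    return codes.astype(np.uint8), scales

def nf4_dequantize(codes, scales, shape):
    """Look up the codebook level and rescale back to the original range."""
    return (NF4_LEVELS[codes] * scales).reshape(shape)

np.random.seed(0)
w = np.random.randn(128, 64).astype(np.float32)   # stand-in for a Linear weight
codes, scales = nf4_quantize(w)
w_hat = nf4_dequantize(codes, scales, w.shape)
# each weight is now 4 bits plus a shared scale per 64-weight block,
# which is where the ~6 GB -> ~1 GB reduction comes from
```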

I will add the quantized T5 encoder later if I can get it working.
