CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation Paper • 2501.11325 • Published Jan 20 • 5
193 CosyVoice2-0.5B 🥳 Generate realistic voice audio from text and audio prompts