CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation Paper • 2501.11325 • Published Jan 20 • 5