---
library_name: diffusers
license: cc-by-nc-2.0
base_model:
- black-forest-labs/FLUX.1-Fill-dev
pipeline_tag: image-to-image
tags:
- tryon
- vto
---

# Model Card for CATVTON-Flux

CATVTON-Flux is an advanced virtual try-on solution that combines CATVTON (Contrastive Appearance and Topology Virtual Try-On) with the FLUX fill inpainting model for realistic and accurate clothing transfer.

## Update

**Latest achievement (2024/11/24):** CatVton-Flux-Alpha achieved SOTA performance with an FID of 5.593 on the VITON-HD dataset (test configuration: guidance scale 30, 30 inference steps). My VITON-HD test inference results are available [here](https://drive.google.com/file/d/1T2W5R1xH_uszGVD8p6UUAtWyx43rxGmI/view?usp=sharing).

## Model Details

### Model Description

- **Developed by:** [X/Twitter: Black Magic An](https://x.com/MrsZaaa)

### Model Sources

- **Repository:** [github](https://github.com/nftblackmagic/catvton-flux)

## Uses

The model is designed for virtual try-on applications, allowing users to visualize how different garments would look on a person.
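Under the hood, try-on is posed as an inpainting problem: the person image is masked where the new garment should appear, and the fill model paints that region in. As a rough illustration of the side-by-side, in-context input layout suggested by the CatVTON and In-Context LoRA papers cited below (a sketch with assumed array sizes and a toy mask region, not the repository's actual preprocessing code):

```python
import numpy as np

H, W = 1024, 768  # assumed working resolution; the real pipeline may differ

# Placeholder arrays standing in for loaded RGB images (uint8, HWC layout)
person = np.zeros((H, W, 3), dtype=np.uint8)
garment = np.zeros((H, W, 3), dtype=np.uint8)

# Binary person mask: 255 where the garment should be painted in
person_mask = np.zeros((H, W), dtype=np.uint8)
person_mask[H // 4 : 3 * H // 4, W // 4 : 3 * W // 4] = 255  # toy torso region

# Side-by-side "in-context" layout: garment reference on the left,
# person to be edited on the right.
tryon_image = np.concatenate([garment, person], axis=1)  # (H, 2W, 3)

# The garment half is never inpainted, so its half of the mask stays zero.
tryon_mask = np.concatenate([np.zeros_like(person_mask), person_mask], axis=1)
```

The concatenated image and mask can then be handed to an inpainting pipeline, which fills only the masked person region while conditioning on the untouched garment half.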
It can be used directly through a command-line interface with the following inputs:

- Person image
- Person mask
- Garment image
- Random seed (optional)

## How to Get Started with the Model

```python
import torch
from diffusers import FluxFillPipeline, FluxTransformer2DModel

# Load the fine-tuned try-on transformer and plug it into the Flux fill pipeline
transformer = FluxTransformer2DModel.from_pretrained(
    "xiaozaa/catvton-flux-alpha",
    torch_dtype=torch.bfloat16,
)
pipe = FluxFillPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
).to("cuda")
```

## Training Details

### Training Data

VITON-HD dataset.

### Training Procedure

Fine-tuning of FLUX.1-Fill-dev.

## Evaluation

### Metrics

FID: 5.593 (SOTA) on VITON-HD.

### Results

[More Information Needed]

## Citation

**BibTeX:**

```bibtex
@misc{chong2024catvtonconcatenationneedvirtual,
  title={CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models},
  author={Zheng Chong and Xiao Dong and Haoxiang Li and Shiyue Zhang and Wenqing Zhang and Xujie Zhang and Hanqing Zhao and Xiaodan Liang},
  year={2024},
  eprint={2407.15886},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2407.15886},
}

@article{lhhuang2024iclora,
  title={In-Context LoRA for Diffusion Transformers},
  author={Huang, Lianghua and Wang, Wei and Wu, Zhi-Fan and Shi, Yupeng and Dou, Huanzhang and Liang, Chen and Feng, Yutong and Liu, Yu and Zhou, Jingren},
  journal={arXiv preprint arXiv:2410.23775},
  year={2024}
}
```
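For reference, the FID score reported above is the Fréchet distance between two Gaussians fitted to Inception features of real and generated images. A minimal NumPy sketch of that distance (the Gaussian part only, not the Inception feature extraction) is:

```python
import numpy as np

def frechet_distance(mu1, cov1, mu2, cov2):
    """Fréchet distance between N(mu1, cov1) and N(mu2, cov2).

    FID = ||mu1 - mu2||^2 + Tr(cov1 + cov2 - 2 * (cov1 @ cov2)^(1/2)).

    Tr((cov1 @ cov2)^(1/2)) is computed as Tr((S1 @ cov2 @ S1)^(1/2)) with
    S1 = cov1^(1/2), which keeps the matrix square root symmetric PSD.
    """
    diff = np.asarray(mu1) - np.asarray(mu2)

    # Symmetric PSD square root of cov1 via eigendecomposition
    w, v = np.linalg.eigh(cov1)
    s1 = v @ np.diag(np.sqrt(np.clip(w, 0, None))) @ v.T

    # Eigenvalues of the symmetric product give Tr((cov1 @ cov2)^(1/2))
    w2 = np.linalg.eigvalsh(s1 @ np.asarray(cov2) @ s1)
    tr_sqrt = np.sum(np.sqrt(np.clip(w2, 0, None)))

    return float(diff @ diff + np.trace(cov1) + np.trace(cov2) - 2 * tr_sqrt)

# Identical distributions have distance 0; shifting one mean by 1 adds 1.
d0 = frechet_distance([0.0, 0.0], np.eye(2), [0.0, 0.0], np.eye(2))
d1 = frechet_distance([0.0, 0.0], np.eye(2), [1.0, 0.0], np.eye(2))
```

In practice the statistics come from a large sample of Inception activations over the VITON-HD test split; lower is better.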