KV-Edit: Training-Free Image Editing for Precise Background Preservation
Abstract
Background consistency remains a significant challenge in image editing tasks. Despite extensive developments, existing works still face a trade-off between maintaining similarity to the original image and generating content that aligns with the target. Here, we propose KV-Edit, a training-free approach that uses the KV cache in DiTs to maintain background consistency: background tokens are preserved rather than regenerated, which eliminates the need for complex mechanisms or expensive training. New content is generated only within user-provided regions and integrates seamlessly with the background. We further explore the memory consumption of the KV cache during editing and optimize the space complexity to O(1) using an inversion-free method. Our approach is compatible with any DiT-based generative model without additional training. Experiments demonstrate that KV-Edit significantly outperforms existing approaches in terms of both background and image quality, even surpassing training-based methods. Project webpage is available at https://xilluill.github.io/projectpages/KV-Edit
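To make the idea of preserving background tokens concrete, here is a minimal sketch of one attention step under one plausible reading of the abstract: only the tokens inside the user-provided region are recomputed, and they attend to their own fresh keys/values plus keys/values cached for the background. This is not the authors' implementation. The function names (`attend`, `edit_attention`, `compose`), the toy 2-dimensional tokens, and the use of identity projections in place of learned Q/K/V weights are all assumptions made for illustration, and a real DiT would repeat this per layer, per head, and per timestep.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of floats.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        weights = softmax([scale * sum(a * b for a, b in zip(q, k)) for k in keys])
        out.append([sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)])
    return out

def edit_attention(fg_q, fg_k, fg_v, bg_k_cache, bg_v_cache):
    """Foreground queries attend to fresh foreground K/V plus cached background K/V.

    Background tokens contribute no queries, so they are never recomputed.
    """
    return attend(fg_q, fg_k + bg_k_cache, fg_v + bg_v_cache)

def compose(source_tokens, mask, new_fg_tokens):
    """Write new foreground tokens into masked slots; copy background slots unchanged."""
    fg_iter = iter(new_fg_tokens)
    return [next(fg_iter) if editable else tok
            for tok, editable in zip(source_tokens, mask)]

# Toy example: four tokens, the middle two lie inside the edit mask (hypothetical data).
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
mask = [False, True, True, False]

# Assumption: identity projections, so cached background K and V equal the tokens.
background = [tok for tok, editable in zip(source, mask) if not editable]
bg_k_cache, bg_v_cache = background, background

# Fresh foreground tokens (e.g. the current noisy latents for the masked region).
fg = [[0.2, -0.1], [-0.3, 0.4]]
new_fg = edit_attention(fg, fg, fg, bg_k_cache, bg_v_cache)
edited = compose(source, mask, new_fg)
```

In this sketch the background entries of `edited` are the same values as in `source`, while the foreground entries depend on both the new foreground tokens and the cached background keys/values, which is what lets new content stay consistent with its surroundings.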
Community
arXiv: https://arxiv.org/abs/2502.17363
Project page: https://xilluill.github.io/projectpages/KV-Edit/
HuggingFace Gradio space: https://huggingface.co/spaces/xilluill/KV-Edit
This is an automated message from the Librarian Bot. I found the following papers similar to this paper, as recommended by the Semantic Scholar API:
- PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data (2025)
- Edicho: Consistent Image Editing in the Wild (2024)
- StyleSSP: Sampling StartPoint Enhancement for Training-free Diffusion-based Method for Style Transfer (2025)
- Towards Consistent and Controllable Image Synthesis for Face Editing (2025)
- EliGen: Entity-Level Controlled Image Generation with Regional Attention (2025)
- EditAR: Unified Conditional Generation with Autoregressive Models (2025)
- AdaFlow: Efficient Long Video Editing via Adaptive Attention Slimming And Keyframe Selection (2025)