# DragNUWA
**DragNUWA** enables users to manipulate backgrounds or objects within images directly, and the model seamlessly translates these actions into **camera movements** or **object motions**, generating the corresponding video.
See our paper: [DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory](https://arxiv.org/abs/2308.08089)
### DragNUWA 1.5 (Updated on Jan 8, 2024)
**DragNUWA 1.5** enables Stable Video Diffusion to animate an image along a user-specified path.
### DragNUWA 1.0 (Original Paper)
[**DragNUWA 1.0**](https://arxiv.org/abs/2308.08089) utilizes text, images, and trajectory as three essential control factors to facilitate highly controllable video generation from semantic, spatial, and temporal aspects.
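Conceptually, the three control factors are a text prompt (semantic), a conditioning image (spatial), and one or more drag trajectories given as per-frame 2D points (temporal). The sketch below only illustrates that idea with made-up names and shapes; it is not the actual interface of this repository.
```Python
# Illustrative only: one way to represent a drag trajectory as per-frame (x, y)
# points. Function and field names here are assumptions, not the repo's API.
import numpy as np

def linear_drag(start, end, num_frames=14):
    """Interpolate a straight drag from `start` to `end` over `num_frames` frames."""
    start, end = np.asarray(start, dtype=float), np.asarray(end, dtype=float)
    t = np.linspace(0.0, 1.0, num_frames)[:, None]            # (num_frames, 1)
    return (1.0 - t) * start + t * end                         # (num_frames, 2)

controls = {
    "text": "a sailboat drifting to the right",                # semantic control
    "image": "assets/example.png",                             # spatial control (first frame)
    "trajectories": [linear_drag((120, 200), (360, 200))],     # temporal control
}
print(controls["trajectories"][0].shape)  # (14, 2)
```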
## Getting Started
### Setting Up the Environment
```Shell
git clone -b svd https://github.com/ProjectNUWA/DragNUWA.git
cd DragNUWA
conda create -n DragNUWA python=3.8
conda activate DragNUWA
pip install -r environment.txt
```
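After installation, a quick import check can confirm the environment is usable; this assumes `environment.txt` installs a CUDA-enabled PyTorch build.
```Python
# Optional sanity check; assumes environment.txt installed a CUDA-enabled PyTorch.
import torch
print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
```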
### Download Pretrained Weights
Download the [Pretrained Weights](https://drive.google.com/file/d/1Z4JOley0SJCb35kFF4PCc6N6P1ftfX4i/view) to the `models/` directory, or directly run `bash models/Download.sh`.
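To confirm the checkpoint landed where the demo expects it, a small path check like the one below can help. The filename is a placeholder; use whatever `Download.sh` actually saves into `models/`.
```Python
# Sanity check for the pretrained weights; the filename below is a placeholder.
from pathlib import Path

ckpt = Path("models") / "drag_nuwa_svd.pth"  # hypothetical name; adjust to the real file
if ckpt.exists():
    print(f"{ckpt} found ({ckpt.stat().st_size / 2**30:.1f} GiB)")
else:
    print(f"{ckpt} not found; re-run `bash models/Download.sh` or download manually")
```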
### Drag and Animate!
```Shell
python DragNUWA_demo.py
```
This launches a Gradio demo, where you can drag an image and animate it!
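By default Gradio serves on localhost. If you need to reach the demo from another machine, the standard `launch` options can be adjusted inside `DragNUWA_demo.py`; the snippet below is a self-contained illustration of those options, not the demo's actual code.
```Python
# Self-contained illustration of Gradio launch options; in practice you would
# edit the launch(...) call in DragNUWA_demo.py rather than run this file.
import gradio as gr

with gr.Blocks() as demo:
    gr.Markdown("placeholder UI")

demo.launch(server_name="0.0.0.0", server_port=7860)  # listen on all interfaces
```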
### Acknowledgement
We appreciate the open-source work of the following projects:

- [Stable Video Diffusion](https://github.com/Stability-AI/generative-models)
- [Hugging Face](https://github.com/huggingface)
- [UniMatch](https://github.com/autonomousvision/unimatch)
### Citation
```bibtex
@article{yin2023dragnuwa,
  title   = {DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory},
  author  = {Yin, Shengming and Wu, Chenfei and Liang, Jian and Shi, Jie and Li, Houqiang and Ming, Gong and Duan, Nan},
  journal = {arXiv preprint arXiv:2308.08089},
  year    = {2023}
}
```