huggface123 committed
Commit fa70d03 • 1 Parent(s): 0d24b07

Update README.md

Files changed (1)
  1. README.md +12 -272
README.md CHANGED
@@ -1,272 +1,12 @@
- # 🤗 Introduction
-
- **update** 🔥🔥🔥 We propose a face reenactment method based on our AnimateAnyone pipeline: the facial landmarks of a driving video control the pose of a given source image while the identity of the source image is preserved. Specifically, we disentangle head attitude (including eye blinking) and mouth motion from the landmarks of the driving video, which lets us control the expression and movements of the source face precisely. We have released the inference code and pretrained models for face reenactment!
-
- **update** 🏋️🏋️🏋️ We have released our training code! Now you can train your own AnimateAnyone models. See [here](#train) for more details. Have fun!
-
- **update** 🔥🔥🔥 We have launched a Hugging Face Spaces demo of Moore-AnimateAnyone [here](https://huggingface.co/spaces/xunsong/Moore-AnimateAnyone)!
-
- This repository reproduces [AnimateAnyone](https://github.com/HumanAIGC/AnimateAnyone). To align with the results demonstrated in the original paper, we adopt various approaches and tricks, which may differ somewhat from the paper and from another [implementation](https://github.com/guoqincode/Open-AnimateAnyone).
-
- It is worth noting that this is a very preliminary version, aiming to approximate the performance (roughly 80% in our tests) shown in [AnimateAnyone](https://github.com/HumanAIGC/AnimateAnyone).
-
- We will continue to develop it, and we welcome feedback and ideas from the community. The enhanced version will also be launched on our [MoBi MaLiang](https://maliang.mthreads.com/) AIGC platform, which runs on our own full-featured GPU S4000 cloud computing platform.
-
- # 📝 Release Plans
-
- [x] Inference code and pretrained weights of AnimateAnyone
- [x] Training scripts of AnimateAnyone
- [x] Inference code and pretrained weights of face reenactment
- [ ] Training scripts of face reenactment
- [ ] Inference scripts for audio-driven portrait video generation
- [ ] Training scripts for audio-driven portrait video generation
-
- # 🎞️ Examples
-
- ## AnimateAnyone
-
- Here are some AnimateAnyone results we generated, at a resolution of 512x768.
-
- https://github.com/MooreThreads/Moore-AnimateAnyone/assets/138439222/f0454f30-6726-4ad4-80a7-5b7a15619057
-
- https://github.com/MooreThreads/Moore-AnimateAnyone/assets/138439222/337ff231-68a3-4760-a9f9-5113654acf48
-
- <table class="center">
-
- <tr>
- <td width=50% style="border: none">
- <video controls autoplay loop src="https://github.com/MooreThreads/Moore-AnimateAnyone/assets/138439222/9c4d852e-0a99-4607-8d63-569a1f67a8d2" muted="false"></video>
- </td>
- <td width=50% style="border: none">
- <video controls autoplay loop src="https://github.com/MooreThreads/Moore-AnimateAnyone/assets/138439222/722c6535-2901-4e23-9de9-501b22306ebd" muted="false"></video>
- </td>
- </tr>
-
- <tr>
- <td width=50% style="border: none">
- <video controls autoplay loop src="https://github.com/MooreThreads/Moore-AnimateAnyone/assets/138439222/17b907cc-c97e-43cd-af18-b646393c8e8a" muted="false"></video>
- </td>
- <td width=50% style="border: none">
- <video controls autoplay loop src="https://github.com/MooreThreads/Moore-AnimateAnyone/assets/138439222/86f2f6d2-df60-4333-b19b-4c5abcd5999d" muted="false"></video>
- </td>
- </tr>
- </table>
-
- **Limitations**: We observe the following shortcomings in the current version:
- 1. Some artifacts may appear in the background when the reference image has a clean background.
- 2. Suboptimal results may arise when there is a scale mismatch between the reference image and the keypoints. We have yet to implement the preprocessing techniques mentioned in the [paper](https://arxiv.org/pdf/2311.17117.pdf).
- 3. Some flickering and jittering may occur when the motion sequence is subtle or the scene is static.
-
- These issues will be addressed and improved in the near future. We appreciate your patience!
-
- ## Face Reenactment
-
- Here are some results we generated, at a resolution of 512x512.
-
- <table class="center">
-
- <tr>
- <td width=50% style="border: none">
- <video controls autoplay loop src="https://github.com/MooreThreads/Moore-AnimateAnyone/assets/117793823/8cfaddec-fb81-485e-88e9-229c0adb8bf9" muted="false"></video>
- </td>
- <td width=50% style="border: none">
- <video controls autoplay loop src="https://github.com/MooreThreads/Moore-AnimateAnyone/assets/117793823/ad06ba29-5bb2-490e-a204-7242c724ba8b" muted="false"></video>
- </td>
- </tr>
-
- <tr>
- <td width=50% style="border: none">
- <video controls autoplay loop src="https://github.com/MooreThreads/Moore-AnimateAnyone/assets/117793823/6843cdc0-830b-4f91-87c5-41cd12fbe8c2" muted="false"></video>
- </td>
- <td width=50% style="border: none">
- <video controls autoplay loop src="https://github.com/MooreThreads/Moore-AnimateAnyone/assets/117793823/bb9b8b74-ba4b-4f62-8fd1-7ebf140acc81" muted="false"></video>
- </td>
- </tr>
- </table>
-
-
- # ⚒️ Installation
-
- ## Build Environment
-
- We recommend Python `>=3.10` and CUDA `11.7`. Then build the environment as follows:
-
- ```shell
- # [Optional] Create a virtual env
- python -m venv .venv
- source .venv/bin/activate
- # Install with pip:
- pip install -r requirements.txt
- # For face landmark extraction
- git clone https://github.com/emilianavt/OpenSeeFace.git
- ```
-
- ## Download weights
-
- **Automatic download**: Run the following command to download the weights automatically:
-
- ```shell
- python tools/download_weights.py
- ```
-
- Weights will be placed under the `./pretrained_weights` directory. The whole download may take a long time.
-
- **Manual download**: You can also download the weights manually, following these steps:
-
- 1. Download our trained AnimateAnyone [weights](https://huggingface.co/patrolli/AnimateAnyone/tree/main), which include four parts: `denoising_unet.pth`, `reference_unet.pth`, `pose_guider.pth` and `motion_module.pth`.
-
- 2. Download our trained face reenactment [weights](https://pan.baidu.com/s/1lS5CynyNfYlDbjowKKfG8g?pwd=crci), and place them under `pretrained_weights`.
-
- 3. Download the pretrained weights of the base models and other components:
- [StableDiffusion V1.5](https://huggingface.co/runwayml/stable-diffusion-v1-5)
- [sd-vae-ft-mse](https://huggingface.co/stabilityai/sd-vae-ft-mse)
- [image_encoder](https://huggingface.co/lambdalabs/sd-image-variations-diffusers/tree/main/image_encoder)
-
- 4. Download the DWPose weights (`dw-ll_ucoco_384.onnx`, `yolox_l.onnx`) following [this guide](https://github.com/IDEA-Research/DWPose?tab=readme-ov-file#-dwpose-for-controlnet).
-
- Finally, these weights should be organized as follows:
-
- ```text
- ./pretrained_weights/
- |-- DWPose
- |   |-- dw-ll_ucoco_384.onnx
- |   `-- yolox_l.onnx
- |-- image_encoder
- |   |-- config.json
- |   `-- pytorch_model.bin
- |-- denoising_unet.pth
- |-- motion_module.pth
- |-- pose_guider.pth
- |-- reference_unet.pth
- |-- sd-vae-ft-mse
- |   |-- config.json
- |   |-- diffusion_pytorch_model.bin
- |   `-- diffusion_pytorch_model.safetensors
- |-- reenact
- |   |-- denoising_unet.pth
- |   |-- reference_unet.pth
- |   |-- pose_guider1.pth
- |   `-- pose_guider2.pth
- `-- stable-diffusion-v1-5
-     |-- feature_extractor
-     |   `-- preprocessor_config.json
-     |-- model_index.json
-     |-- unet
-     |   |-- config.json
-     |   `-- diffusion_pytorch_model.bin
-     `-- v1-inference.yaml
- ```
-
- Note: If you have already installed some of the pretrained models, such as `StableDiffusion V1.5`, you can specify their paths in the config file (e.g. `./configs/prompts/animation.yaml`).
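-
- For illustration only, such overrides might look like the sketch below. The key names are assumptions rather than values taken from this repo's shipped config, so check your copy of `animation.yaml` for the actual field names:
-
- ```yaml
- # Hypothetical path overrides in ./configs/prompts/animation.yaml.
- # Key names are illustrative assumptions; consult the shipped config for the real ones.
- pretrained_base_model_path: "/your/local/path/to/stable-diffusion-v1-5"
- pretrained_vae_path: "/your/local/path/to/sd-vae-ft-mse"
- image_encoder_path: "/your/local/path/to/image_encoder"
- ```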
-
- # 🚀 Training and Inference
-
- ## Inference of AnimateAnyone
-
- Here is the CLI command for running the inference script:
-
- ```shell
- python -m scripts.pose2vid --config ./configs/prompts/animation.yaml -W 512 -H 784 -L 64
- ```
-
- You can refer to the format of `animation.yaml` to add your own reference images or pose videos (a sketch of a typical entry follows the command below). To convert a raw video into a pose video (keypoint sequence), run the following command:
-
- ```shell
- python tools/vid2pose.py --video_path /path/to/your/video.mp4
- ```
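-
- As a rough illustration of the `animation.yaml` format mentioned above, an entry pairing a reference image with one or more pose videos might look like this. The `test_cases` key name and the paths are assumptions, so mirror whatever structure the shipped `./configs/prompts/animation.yaml` actually uses:
-
- ```yaml
- # Illustrative sketch only -- the key name and paths are assumptions;
- # copy the structure of the shipped animation.yaml rather than this verbatim.
- test_cases:
-   "./path/to/your/reference_image.png":
-     - "./path/to/your/pose_video_kps.mp4"
- ```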
-
- ## Inference of Face Reenactment
-
- Here is the CLI command for running the inference script:
-
- ```shell
- python -m scripts.lmks2vid --config ./configs/prompts/inference_reenact.yaml --driving_video_path YOUR_OWN_DRIVING_VIDEO_PATH --source_image_path YOUR_OWN_SOURCE_IMAGE_PATH
- ```
-
- We provide some face images in `./config/inference/talkinghead_images` and some face videos in `./config/inference/talkinghead_videos` for inference.
-
- ## <span id="train"> Training of AnimateAnyone </span>
-
- Note: package dependencies have been updated, so you may need to upgrade your environment via `pip install -r requirements.txt` before training.
-
- ### Data Preparation
-
- Extract keypoints from the raw videos:
-
- ```shell
- python tools/extract_dwpose_from_vid.py --video_root /path/to/your/video_dir
- ```
-
- Extract the meta info of the dataset:
-
- ```shell
- python tools/extract_meta_info.py --root_path /path/to/your/video_dir --dataset_name anyone
- ```
-
- Update the following lines in the training config file:
-
- ```yaml
- data:
-   meta_paths:
-     - "./data/anyone_meta.json"
- ```
-
- ### Stage1
-
- Put the [openpose controlnet weights](https://huggingface.co/lllyasviel/control_v11p_sd15_openpose/tree/main) under `./pretrained_weights`; they are used to initialize the pose_guider.
-
- Put [sd-image-variations](https://huggingface.co/lambdalabs/sd-image-variations-diffusers/tree/main) under `./pretrained_weights`; it is used to initialize the UNet weights.
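-
- If you keep these weights somewhere other than `./pretrained_weights`, the stage-1 config presumably exposes path fields you can point at them. The keys below are illustrative assumptions only (not copied from this repo); check `configs/train/stage1.yaml` for the actual names:
-
- ```yaml
- # Hypothetical excerpt of configs/train/stage1.yaml -- key names are assumptions.
- base_model_path: "./pretrained_weights/sd-image-variations-diffusers"
- vae_model_path: "./pretrained_weights/sd-vae-ft-mse"
- controlnet_openpose_path: "./pretrained_weights/control_v11p_sd15_openpose/diffusion_pytorch_model.bin"
- ```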
-
- Run command:
-
- ```shell
- accelerate launch train_stage_1.py --config configs/train/stage1.yaml
- ```
-
- ### Stage2
-
- Put the pretrained motion module weights `mm_sd_v15_v2.ckpt` ([download link](https://huggingface.co/guoyww/animatediff/blob/main/mm_sd_v15_v2.ckpt)) under `./pretrained_weights`.
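-
- If the motion module lives elsewhere, `stage2.yaml` presumably has a path field for it; the key name below is an illustrative assumption (check `configs/train/stage2.yaml` for the real one):
-
- ```yaml
- # Hypothetical field in configs/train/stage2.yaml -- the key name is an assumption.
- mm_path: "./pretrained_weights/mm_sd_v15_v2.ckpt"
- ```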
-
- Specify the stage1 training weights in the config file `stage2.yaml`, for example:
-
- ```yaml
- stage1_ckpt_dir: './exp_output/stage1'
- stage1_ckpt_step: 30000
- ```
-
- Run command:
-
- ```shell
- accelerate launch train_stage_2.py --config configs/train/stage2.yaml
- ```
-
- # 🎨 Gradio Demo
-
- **Hugging Face Demo**: We have launched a quick preview demo of Moore-AnimateAnyone on [Hugging Face Spaces](https://huggingface.co/spaces/xunsong/Moore-AnimateAnyone)!
- We appreciate the assistance provided by the Hugging Face team in setting up this demo.
-
- To reduce waiting time, we limit the size (width, height, and length) and the number of inference steps when generating videos.
-
- If you have your own GPU resources (>= 16GB VRAM), you can run a local Gradio app with the following command:
-
- `python app.py`
-
- # Community Contributions
-
- Installation for Windows users: [Moore-AnimateAnyone-for-windows](https://github.com/sdbds/Moore-AnimateAnyone-for-windows)
-
- # 🖌️ Try on Mobi MaLiang
-
- We will launch this model on our [MoBi MaLiang](https://maliang.mthreads.com/) AIGC platform, which runs on our own full-featured GPU S4000 cloud computing platform. Mobi MaLiang has already integrated various AIGC applications and functionalities (e.g. text-to-image, controllable generation). You can experience it by [clicking this link](https://maliang.mthreads.com/) or by scanning the QR code below via WeChat!
-
- <p align="left">
- <img src="assets/mini_program_maliang.png" width="100"/>
- </p>
-
- # ⚖️ Disclaimer
-
- This project is intended for academic research, and we explicitly disclaim any responsibility for user-generated content. Users are solely liable for their actions while using the generative model. The project contributors have no legal affiliation with, nor accountability for, users' behaviors. It is imperative to use the generative model responsibly, adhering to both ethical and legal standards.
-
- # 🙏🏻 Acknowledgements
-
- We first thank the authors of [AnimateAnyone](https://github.com/HumanAIGC/AnimateAnyone). Additionally, we would like to thank the contributors to the [magic-animate](https://github.com/magic-research/magic-animate), [animatediff](https://github.com/guoyww/AnimateDiff) and [Open-AnimateAnyone](https://github.com/guoqincode/Open-AnimateAnyone) repositories for their open research and exploration. Furthermore, our repo incorporates some code from [dwpose](https://github.com/IDEA-Research/DWPose) and [animatediff-cli-prompt-travel](https://github.com/s9roll7/animatediff-cli-prompt-travel/), and we extend our thanks to them as well.
 
+ ---
+ title: AnimateAnyone
+ emoji: 🎬
+ colorFrom: blue
+ colorTo: purple
+ sdk: gradio
+ sdk_version: "3.9"
+ app_file: app.py
+ pinned: false
+ ---
+
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference