Update README.md
README.md
@@ -1,138 +0,0 @@
---
license: mit
title: TS
sdk: gradio
emoji: 😻
colorFrom: purple
colorTo: purple
---
# ReSwapper

ReSwapper aims to reproduce the implementation of inswapper. This repository provides code for training and inference, and includes pretrained weights.

Here is a comparison of the outputs of Inswapper and Reswapper.

| Target | Source | Inswapper Output | Reswapper Output<br>(256 resolution)<br>(Step 1399500) | Reswapper Output<br>(Step 1019500) | Reswapper Output<br>(Step 429500) |
|--------|--------|--------|--------|--------|--------|
|  | |  |  | |  |
|  | |  |  |  |  |
|  | |  |  |  |  |

## Installation

```bash
git clone https://github.com/somanchiu/ReSwapper.git
cd ReSwapper
python -m venv venv

# Windows activation; on Linux/macOS use: source venv/bin/activate
venv\scripts\activate

pip install -r requirements.txt

pip install torch torchvision --force-reinstall --index-url https://download.pytorch.org/whl/cu121
pip install onnxruntime-gpu --force-reinstall --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/
```

## The details of inswapper

### Model architecture
The inswapper model architecture can be visualized in [Netron](https://netron.app). You can compare it with the ReSwapper implementation to see the architectural similarities. Exporting the model with opset_version=10 makes it easier to compare the graphs in Netron. However, doing so causes issue #8.

We can also use the following Python code to get more details:
```python
import onnx

model = onnx.load('test.onnx')
printable_graph = onnx.helper.printable_graph(model.graph)
print(printable_graph)
```

The model architectures of InSwapper and SimSwap are extremely similar, which is worth paying attention to.

### Model inputs
- target: image of shape [1, 3, 128, 128] in RGB format, face-aligned and normalized to the [-1, 1] range
- source (latent): vector of shape [1, 512] holding the features of the source face
  - Calculation of the latent. "emap" can be extracted from the original inswapper model.
    ```python
    import numpy as np

    # source_face: a detected face that exposes a normalized 512-d embedding
    latent = source_face.normed_embedding.reshape((1, -1))
    latent = np.dot(latent, emap)
    latent /= np.linalg.norm(latent)
    ```
  - The latent can also be used to measure the similarity between two faces with cosine similarity.
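
The latent computation and the cosine-similarity check described above can be sketched with NumPy. This is a minimal sketch, not the repository's code: the random `emap` and embeddings are stand-ins (in practice "emap" comes from the original inswapper model and the embedding from a face-recognition model), and the helper names `compute_latent` and `cosine_similarity` are hypothetical.

```python
import numpy as np


def compute_latent(normed_embedding, emap):
    """Project a 512-d face embedding through emap and L2-normalize it."""
    latent = np.asarray(normed_embedding, dtype=np.float32).reshape((1, -1))
    latent = np.dot(latent, emap)
    return latent / np.linalg.norm(latent)


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of matching size."""
    a = np.ravel(a)
    b = np.ravel(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Stand-in data (assumption): random values instead of a real emap/embedding.
rng = np.random.default_rng(0)
emap = rng.standard_normal((512, 512)).astype(np.float32)
embedding_a = rng.standard_normal(512).astype(np.float32)
embedding_b = rng.standard_normal(512).astype(np.float32)

latent_a = compute_latent(embedding_a / np.linalg.norm(embedding_a), emap)
latent_b = compute_latent(embedding_b / np.linalg.norm(embedding_b), emap)

print(latent_a.shape)                         # (1, 512)
print(cosine_similarity(latent_a, latent_a))  # close to 1.0 for the same face
print(cosine_similarity(latent_a, latent_b))  # lower for unrelated vectors
```

With real data, a higher cosine similarity between two latents suggests the two faces are more likely to belong to the same identity.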

### Model output
The inswapper_128 model changes not only facial features but also body shape.

| Target | Source | Inswapper Output | Reswapper Output<br>(Step 429500) |
|--------|--------|--------|--------|
|  | |  |  |

### Loss Functions
insightface has not released any information about the loss functions, even though they are an important part of training. However, many articles and papers can serve as references. When reading the literature on face swapping, ID fidelity, and style transfer, you will frequently encounter the following keywords:
- content loss
- style loss/id loss
- perceptual loss
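
To make these keywords concrete, here is one common way such terms are written in the face-swapping literature. It is only an illustrative sketch: insightface has not published its losses, so the formulas, the function names, and the weights below are assumptions, not the losses used to train inswapper or ReSwapper. Perceptual loss is left out because it needs a pretrained feature network.

```python
import numpy as np


def content_loss(output, reference):
    """Mean absolute (L1) pixel difference between two images."""
    return float(np.mean(np.abs(output - reference)))


def id_loss(output_embedding, source_embedding):
    """1 - cosine similarity between identity embeddings (0 when aligned)."""
    a = np.ravel(output_embedding)
    b = np.ravel(source_embedding)
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(1.0 - cos)


def total_loss(output, reference, output_embedding, source_embedding,
               content_weight=1.0, id_weight=1.0):
    """Weighted sum of the two terms; the weights are hypothetical."""
    return (content_weight * content_loss(output, reference)
            + id_weight * id_loss(output_embedding, source_embedding))


# Toy data (assumption): random arrays stand in for images and embeddings.
rng = np.random.default_rng(0)
image = rng.uniform(-1.0, 1.0, size=(3, 128, 128))
embedding = rng.standard_normal(512)

print(content_loss(image, image))           # 0.0
print(id_loss(embedding, 2.0 * embedding))  # ~0.0, scale does not matter
print(id_loss(embedding, -embedding))       # ~2.0 for opposite directions
```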

### Face alignment
Face alignment is handled incorrectly at resolutions other than 128. To resolve this issue, add an offset to "dst" in both the x and y directions in the function "face_align.estimate_norm". The offset is approximately given by the formula: Offset = (128 / 32768) * Resolution - 0.5
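
The formula can be checked with a few lines of Python. This is a sketch under stated assumptions: it works on a plain list of (x, y) pairs and does not patch insightface, and the helper names `alignment_offset` and `shift_landmarks` and the example coordinates are hypothetical.

```python
def alignment_offset(resolution):
    """Approximate dst offset (in pixels) from the formula above."""
    return (128 / 32768) * resolution - 0.5


def shift_landmarks(dst, resolution):
    """Add the offset to the x and y coordinate of every landmark in dst."""
    offset = alignment_offset(resolution)
    return [(x + offset, y + offset) for x, y in dst]


for resolution in (128, 256, 512):
    print(resolution, alignment_offset(resolution))
# 128 0.0
# 256 0.5
# 512 1.5

# Hypothetical landmark coordinates, only to show the shift being applied.
example_dst = [(10.0, 20.0), (30.0, 20.0)]
print(shift_landmarks(example_dst, 256))  # [(10.5, 20.5), (30.5, 20.5)]
```

At resolution 128 the offset is zero, which matches the statement that alignment is only off at other resolutions.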

## Training
### 0. Pretrained weights (Optional)
If you don't want to train the model from scratch, you can download the pretrained weights and pass "model_path" into the train function in train.py.

### 1. Dataset Preparation
Download [FFHQ](https://www.kaggle.com/datasets/arnaud58/flickrfaceshq-dataset-ffhq) to use as target and source images. For the swapped face images, we can use the inswapper output.

### 2. Model Training

Optimizer: Adam

Learning rate: 0.0001

Modify the code in train.py if needed. Then, execute:
```bash
python train.py
```

The model will be saved as "reswapper-\<total steps\>.pth". You can also save the model as ONNX using the ModelFormat.save_as_onnx_model function. The ONNX model can then be used with the original INSwapper class.
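
When several checkpoints accumulate, the step count in the file name makes it easy to pick the most recent one, for example to pass it as "model_path" when resuming. The sketch below is an assumption-laden helper, not part of the repository: the function names are hypothetical, and the optional `_256` part of the pattern is inferred from the names of the published weights.

```python
import re

# Matches names such as "reswapper-429500.pth" or "reswapper_256-1567500.pth".
CHECKPOINT_PATTERN = re.compile(r"^reswapper(?:_(\d+))?-(\d+)\.pth$")


def parse_checkpoint_name(filename):
    """Return (resolution or None, total_steps), or None if it does not match."""
    match = CHECKPOINT_PATTERN.match(filename)
    if match is None:
        return None
    resolution = int(match.group(1)) if match.group(1) else None
    return resolution, int(match.group(2))


def latest_checkpoint(filenames):
    """Pick the matching filename with the highest step count, if any."""
    parsed = [(parse_checkpoint_name(name), name) for name in filenames]
    candidates = [(info[1], name) for info, name in parsed if info is not None]
    return max(candidates)[1] if candidates else None


names = ["reswapper-429500.pth", "reswapper-1019500.pth", "notes.txt"]
print(parse_checkpoint_name("reswapper_256-1567500.pth"))  # (256, 1567500)
print(latest_checkpoint(names))                            # reswapper-1019500.pth
```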

All losses will be logged to TensorBoard.

Training the model on images of several resolutions at the same time improves its generalization ability. To apply this strategy, pass "resolutions" into the train function.

Generalization ability of the model trained with resolutions of 128 and 256:

| Output<br>resolution | 128 | 160 | 256 |
|--------|--------|--------|--------|
|Output|  | | |

Increasing data diversity improves output quality. To perform data augmentation, pass "enableDataAugmentation" into the train function.

| Target | Source | Inswapper Output | Reswapper Output<br>(Step 1567500) | Reswapper Output<br>(Step 1399500) |
|--------|--------|--------|--------|--------|
||  | | |  |

#### Notes
- Do not stop the training too early.
- I'm using an RTX3060 12GB for training. It takes around 12 hours for 50,000 steps.
- The optimizer may need to be changed to SGD for the final training, as many articles show that SGD can result in lower loss.
- To get inspiration for improving the model, you might want to review the commented code and unused functions in commit [c2a12e10021ecd1342b9ba50570a16b18f9634b9](https://github.com/somanchiu/ReSwapper/commit/c2a12e10021ecd1342b9ba50570a16b18f9634b9).
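
To get a feel for what "too early" means in wall-clock time, the quoted rate of about 12 hours per 50,000 steps can be extrapolated to the step counts of the published weights. This is rough arithmetic under a loud assumption: it treats the rate as constant, whereas the real speed depends on resolution, batch size, and hardware. The function name `estimated_hours` is hypothetical.

```python
HOURS_PER_50K_STEPS = 12  # figure quoted above for an RTX 3060 12GB


def estimated_hours(total_steps):
    """Linear extrapolation from 12 hours per 50,000 steps."""
    return total_steps * HOURS_PER_50K_STEPS / 50_000


for steps in (429_500, 1_019_500, 1_567_500):
    hours = estimated_hours(steps)
    print(f"{steps} steps: about {hours:.0f} hours ({hours / 24:.1f} days)")
# 429500 steps: about 103 hours (4.3 days)
# 1019500 steps: about 245 hours (10.2 days)
# 1567500 steps: about 376 hours (15.7 days)
```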

## Inference
```bash
python swap.py
```

## Pretrained Model
### 256 Resolution
- [reswapper_256-1567500.pth](https://huggingface.co/somanchiu/reswapper/tree/main)
- [reswapper_256-1399500.pth](https://huggingface.co/somanchiu/reswapper/tree/main)

### 128 Resolution
- [reswapper-1019500.pth](https://huggingface.co/somanchiu/reswapper/tree/main)
- [reswapper-1019500.onnx](https://huggingface.co/somanchiu/reswapper/tree/main)
- [reswapper-429500.pth](https://huggingface.co/somanchiu/reswapper/tree/main)
- [reswapper-429500.onnx](https://huggingface.co/somanchiu/reswapper/tree/main)

### Notes
If you downloaded the ONNX format model before 2024/11/25, please download it again or re-export the model with opset_version=11. This is related to issue #8.

## To Do
- Create a 512-resolution model (alternative to inswapper_512)