leozindev15 committed on
Commit 89a6db0 · verified · 1 Parent(s): ae06c1b

Upload 12 files

Files changed (11)
  1. .gitignore +2 -0
  2. README.md +93 -6
  3. app.py +897 -0
  4. face_analyser.py +194 -0
  5. face_enhancer.py +72 -0
  6. face_swapper.py +150 -0
  7. requirements.txt +21 -0
  8. start-ngrok.py +109 -0
  9. start.sh +47 -0
  10. update.requirements.txt +16 -0
  11. utils.py +303 -0
.gitignore ADDED
@@ -0,0 +1,2 @@
+
+ *.pyc
README.md CHANGED
@@ -1,12 +1,99 @@
  ---
- title: Swapper
- emoji: 📊
+ title: Deepfake Faceswap
+ emoji: 💻
  colorFrom: green
- colorTo: indigo
+ colorTo: red
  sdk: gradio
- sdk_version: 5.4.0
+ sdk_version: 4.44.0
  app_file: app.py
- pinned: false
+ pinned: true
+ license: apache-2.0
+ short_description: Faceswap application with fast swapping and many enhancer models.
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ ## Here are some of my Faceswap applications
+
+ ### Deepfake Faceswap
+
+ [![Hugging Face Space](https://img.shields.io/badge/Open-%F0%9F%A4%97%20Hugging%20Face-blue)](https://huggingface.co/spaces/victorisgeek/SwapFace2Pon)
+
+ ### Swapall
+ Text prompt function
+
+ [![Hugging Face Space](https://img.shields.io/badge/Open-%F0%9F%A4%97%20Hugging%20Face-blue)](https://huggingface.co/spaces/victorisgeek/swapall)
+
+
+ ### VideoReface
+
+ [![Hugging Face Space](https://img.shields.io/badge/Open-%F0%9F%A4%97%20Hugging%20Face-blue)](https://huggingface.co/spaces/victorisgeek/VideoReface)
+
+ ### FaceClone and pose Swap
+
+ [![Hugging Face Space](https://img.shields.io/badge/Open-%F0%9F%A4%97%20Hugging%20Face-blue)](https://huggingface.co/spaces/victorisgeek/FaceClone2)
+
+ ### SwapUi
+
+ Auto-detect faces and auto-swap
+
+ [![Hugging Face Space](https://img.shields.io/badge/Open-%F0%9F%A4%97%20Hugging%20Face-blue)](https://huggingface.co/spaces/victorisgeek/SwapUI)
+
+ ### FaceSwapLite
+ Manually select faces to swap
+
+ [![Hugging Face Space](https://img.shields.io/badge/Open-%F0%9F%A4%97%20Hugging%20Face-blue)](https://huggingface.co/spaces/victorisgeek/FaceSwapLite)
+
+ ### ModelSwap
+ Auto Swap by Models
+
+ [![Hugging Face Space](https://img.shields.io/badge/Open-%F0%9F%A4%97%20Hugging%20Face-blue)](https://huggingface.co/spaces/victorisgeek/ModelSwap)
+
+ ### Rop
+
+ [![Hugging Face Space](https://img.shields.io/badge/Open-%F0%9F%A4%97%20Hugging%20Face-blue)](https://huggingface.co/spaces/victorisgeek/Rop)
+
+ ### Multi Swap
+ Under development
+
+ [![Hugging Face Space](https://img.shields.io/badge/Open-%F0%9F%A4%97%20Hugging%20Face-blue)](https://huggingface.co/spaces/victorisgeek/Multi?logs=container)
+
+ ###
+
+ ## Description
+ Powerful Deepfake application
+
+ ## Features
+ - Fast photo Faceswap
+ - Fast video Faceswap
+ - Powerful models: GFPGAN, RealESRGAN, CodeFormer, inswapper
+
+ ## Preparing
+
+ ```bash
+ pkg update
+ pkg upgrade
+ pkg install git
+ ```
+
+ ### Go to Faceswap
+
+ ```bash
+ cd FaceSwapSpace
+ ```
+
+ ### Install requirements
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+ If an error occurs, install the updated requirements instead:
+
+ ```bash
+ pip install -r update.requirements.txt
+ ```
+
+ ## Run
+
+ ```bash
+ python app.py
+ ```
+ # Enjoy your toy!!
app.py ADDED
@@ -0,0 +1,897 @@
+ import os
+ import cv2
+ import glob
+ import time
+ import torch
+ import shutil
+ import argparse
+ import platform
+ import datetime
+ import subprocess
+ import insightface
+ import onnxruntime
+ import numpy as np
+ import gradio as gr
+ import threading
+ import queue
+ from tqdm import tqdm
+ import concurrent.futures
+ from moviepy.editor import VideoFileClip
+
+ from face_swapper import Inswapper, paste_to_whole
+ from face_analyser import detect_conditions, get_analysed_data, swap_options_list
+ from face_parsing import init_parsing_model, get_parsed_mask, mask_regions, mask_regions_to_list
+ from face_enhancer import get_available_enhancer_names, load_face_enhancer_model, cv2_interpolations
+ from utils import trim_video, StreamerThread, ProcessBar, open_directory, split_list_by_lengths, merge_img_sequence_from_ref, create_image_grid
+
+ ## ------------------------------ USER ARGS ------------------------------
+
+ parser = argparse.ArgumentParser(description="Free Face Swapper")
+ parser.add_argument("--out_dir", help="Default Output directory", default=os.getcwd())
+ parser.add_argument("--batch_size", help="Gpu batch size", default=32)
+ parser.add_argument("--cuda", action="store_true", help="Enable cuda", default=False)
+ parser.add_argument(
+     "--colab", action="store_true", help="Enable colab mode", default=False
+ )
+ user_args = parser.parse_args()
+
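Review note: `--batch_size` is declared without a type, so a value passed on the command line arrives as a string and is cast later with `int(user_args.batch_size)`. The following is a minimal sketch of the same flags with `type=int` on the parser itself. The `build_parser` helper and the `"."` default for `--out_dir` are assumptions made for illustration; they are not part of the commit.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper mirroring the flags in the diff above."""
    parser = argparse.ArgumentParser(description="Free Face Swapper")
    parser.add_argument("--out_dir", default=".", help="Default output directory")
    # type=int is the only behavioural change: argparse converts and validates the value.
    parser.add_argument("--batch_size", type=int, default=32, help="GPU batch size")
    parser.add_argument("--cuda", action="store_true", help="Enable CUDA")
    parser.add_argument("--colab", action="store_true", help="Enable Colab mode")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--batch_size", "8", "--cuda"])
print(args.batch_size, args.cuda, args.colab)
```

With this form, a non-numeric `--batch_size` is rejected by argparse at startup instead of failing at the later cast.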
+ ## ------------------------------ DEFAULTS ------------------------------
+
+ USE_COLAB = user_args.colab
+ USE_CUDA = user_args.cuda
+ DEF_OUTPUT_PATH = user_args.out_dir
+ BATCH_SIZE = int(user_args.batch_size)
+ WORKSPACE = None
+ OUTPUT_FILE = None
+ CURRENT_FRAME = None
+ STREAMER = None
+ DETECT_CONDITION = "best detection"
+ DETECT_SIZE = 640
+ DETECT_THRESH = 0.6
+ NUM_OF_SRC_SPECIFIC = 10
+ MASK_INCLUDE = [
+     "Skin",
+     "R-Eyebrow",
+     "L-Eyebrow",
+     "L-Eye",
+     "R-Eye",
+     "Nose",
+     "Mouth",
+     "L-Lip",
+     "U-Lip"
+ ]
+ MASK_SOFT_KERNEL = 17
+ MASK_SOFT_ITERATIONS = 10
+ MASK_BLUR_AMOUNT = 0.1
+ MASK_ERODE_AMOUNT = 0.15
+
+ FACE_SWAPPER = None
+ FACE_ANALYSER = None
+ FACE_ENHANCER = None
+ FACE_PARSER = None
+ FACE_ENHANCER_LIST = ["NONE"]
+ FACE_ENHANCER_LIST.extend(get_available_enhancer_names())
+ FACE_ENHANCER_LIST.extend(cv2_interpolations)
+
+ ## ------------------------------ SET EXECUTION PROVIDER ------------------------------
+ # Note: Non CUDA users may change settings here
+
+ PROVIDER = ["CPUExecutionProvider"]
+
+ if USE_CUDA:
+     available_providers = onnxruntime.get_available_providers()
+     if "CUDAExecutionProvider" in available_providers:
+         print("\n********** Running on CUDA **********\n")
+         PROVIDER = ["CUDAExecutionProvider", "CPUExecutionProvider"]
+     else:
+         USE_CUDA = False
+         print("\n********** CUDA unavailable running on CPU **********\n")
+ else:
+     USE_CUDA = False
+     print("\n********** Running on CPU **********\n")
+
+ device = "cuda" if USE_CUDA else "cpu"
+ EMPTY_CACHE = lambda: torch.cuda.empty_cache() if device == "cuda" else None
+
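Review note: the provider selection above can be read as a small pure function, which makes the CUDA-to-CPU fallback easy to check in isolation. This is a sketch only; `pick_providers` is a hypothetical name that does not appear in the commit, and the list of available providers is passed in rather than queried from ONNX Runtime so that the example runs without it.

```python
from typing import List


def pick_providers(available: List[str], want_cuda: bool) -> List[str]:
    """Return the provider list, preferring CUDA only when requested and present."""
    if want_cuda and "CUDAExecutionProvider" in available:
        # CPU stays in the list as a fallback, as in the diff.
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]


gpu_box = ["CUDAExecutionProvider", "CPUExecutionProvider"]
cpu_box = ["CPUExecutionProvider"]

print(pick_providers(gpu_box, want_cuda=True))
print(pick_providers(cpu_box, want_cuda=True))
print(pick_providers(gpu_box, want_cuda=False))
```

In the app itself, the `available` argument would presumably come from `onnxruntime.get_available_providers()`, the call already used in the diff.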
+ ## ------------------------------ LOAD MODELS ------------------------------
+
+ def load_face_analyser_model(name="buffalo_l"):
+     global FACE_ANALYSER
+     if FACE_ANALYSER is None:
+         FACE_ANALYSER = insightface.app.FaceAnalysis(name=name, providers=PROVIDER)
+         FACE_ANALYSER.prepare(
+             ctx_id=0, det_size=(DETECT_SIZE, DETECT_SIZE), det_thresh=DETECT_THRESH
+         )
+
+
+ def load_face_swapper_model(path="./assets/pretrained_models/inswapper_128.onnx"):
+     global FACE_SWAPPER
+     if FACE_SWAPPER is None:
+         batch = int(BATCH_SIZE) if device == "cuda" else 1
+         FACE_SWAPPER = Inswapper(model_file=path, batch_size=batch, providers=PROVIDER)
+
+
+ def load_face_parser_model(path="./assets/pretrained_models/79999_iter.pth"):
+     global FACE_PARSER
+     if FACE_PARSER is None:
+         FACE_PARSER = init_parsing_model(path, device=device)
+
+
+ load_face_analyser_model()
+ load_face_swapper_model()
+
+ ## ------------------------------ MAIN PROCESS ------------------------------
+
+
+ def process(
+     input_type,
+     image_path,
+     video_path,
+     directory_path,
+     source_path,
+     output_path,
+     output_name,
+     keep_output_sequence,
+     condition,
+     age,
+     distance,
+     face_enhancer_name,
+     enable_face_parser,
+     mask_includes,
+     mask_soft_kernel,
+     mask_soft_iterations,
+     blur_amount,
+     erode_amount,
+     face_scale,
+     enable_laplacian_blend,
+     crop_top,
+     crop_bott,
+     crop_left,
+     crop_right,
+     *specifics,
+ ):
+     global WORKSPACE
+     global OUTPUT_FILE
+     global PREVIEW
+     WORKSPACE, OUTPUT_FILE, PREVIEW = None, None, None
+
+     ## ------------------------------ GUI UPDATE FUNC ------------------------------
+
+     def ui_before():
+         return (
+             gr.update(visible=True, value=PREVIEW),
+             gr.update(interactive=False),
+             gr.update(interactive=False),
+             gr.update(visible=False),
+         )
+
+     def ui_after():
+         return (
+             gr.update(visible=True, value=PREVIEW),
+             gr.update(interactive=True),
+             gr.update(interactive=True),
+             gr.update(visible=False),
+         )
+
+     def ui_after_vid():
+         return (
+             gr.update(visible=False),
+             gr.update(interactive=True),
+             gr.update(interactive=True),
+             gr.update(value=OUTPUT_FILE, visible=True),
+         )
+
+     start_time = time.time()
+     total_exec_time = lambda start_time: divmod(time.time() - start_time, 60)
+     get_finsh_text = lambda start_time: f"✔️ Completed in {int(total_exec_time(start_time)[0])} min {int(total_exec_time(start_time)[1])} sec."
+
+     ## ------------------------------ PREPARE INPUTS & LOAD MODELS ------------------------------
+
+
+
+     yield "### \n 🌀 Loading face analyser model...", *ui_before()
+     load_face_analyser_model()
+
+     yield "### \n ⚙️ Loading face swapper model...", *ui_before()
+     load_face_swapper_model()
+
+     if face_enhancer_name != "NONE":
+         if face_enhancer_name not in cv2_interpolations:
+             yield f"### \n 💡 Loading {face_enhancer_name} model...", *ui_before()
+         FACE_ENHANCER = load_face_enhancer_model(name=face_enhancer_name, device=device)
+     else:
+         FACE_ENHANCER = None
+
+     if enable_face_parser:
+         yield "### \n 📀 Loading face parsing model...", *ui_before()
+         load_face_parser_model()
+
+     includes = mask_regions_to_list(mask_includes)
+     specifics = list(specifics)
+     half = len(specifics) // 2
+     sources = specifics[:half]
+     specifics = specifics[half:]
+     if crop_top > crop_bott:
+         crop_top, crop_bott = crop_bott, crop_top
+     if crop_left > crop_right:
+         crop_left, crop_right = crop_right, crop_left
+     crop_mask = (crop_top, 511-crop_bott, crop_left, 511-crop_right)
+
+     def swap_process(image_sequence):
+         ## ------------------------------ CONTENT CHECK ------------------------------
+
+
+         yield "### \n 🧿 Analysing face data...", *ui_before()
+         if condition != "Specific Face":
+             source_data = source_path, age
+         else:
+             source_data = ((sources, specifics), distance)
+         analysed_targets, analysed_sources, whole_frame_list, num_faces_per_frame = get_analysed_data(
+             FACE_ANALYSER,
+             image_sequence,
+             source_data,
+             swap_condition=condition,
+             detect_condition=DETECT_CONDITION,
+             scale=face_scale
+         )
+
+         ## ------------------------------ SWAP FUNC ------------------------------
+
+         yield "### \n 🧶 Generating faces...", *ui_before()
+         preds = []
+         matrs = []
+         count = 0
+         global PREVIEW
+         for batch_pred, batch_matr in FACE_SWAPPER.batch_forward(whole_frame_list, analysed_targets, analysed_sources):
+             preds.extend(batch_pred)
+             matrs.extend(batch_matr)
+             EMPTY_CACHE()
+             count += 1
+
+             if USE_CUDA:
+                 image_grid = create_image_grid(batch_pred, size=128)
+                 PREVIEW = image_grid[:, :, ::-1]
+                 yield f"### \n 🧩 Generating face Batch {count}", *ui_before()
+
+         ## ------------------------------ FACE ENHANCEMENT ------------------------------
+
+         generated_len = len(preds)
+         if face_enhancer_name != "NONE":
+             yield f"### \n 🎲 Upscaling faces with {face_enhancer_name}...", *ui_before()
+             for idx, pred in tqdm(enumerate(preds), total=generated_len, desc=f"Upscaling with {face_enhancer_name}"):
+                 enhancer_model, enhancer_model_runner = FACE_ENHANCER
+                 pred = enhancer_model_runner(pred, enhancer_model)
+                 preds[idx] = cv2.resize(pred, (512,512))
+         EMPTY_CACHE()
+
+         ## ------------------------------ FACE PARSING ------------------------------
+
+         if enable_face_parser:
+             yield "### \n 🎨 Face-parsing mask...", *ui_before()
+             masks = []
+             count = 0
+             for batch_mask in get_parsed_mask(FACE_PARSER, preds, classes=includes, device=device, batch_size=BATCH_SIZE, softness=int(mask_soft_iterations)):
+                 masks.append(batch_mask)
+                 EMPTY_CACHE()
+                 count += 1
+
+                 if len(batch_mask) > 1:
+                     image_grid = create_image_grid(batch_mask, size=128)
+                     PREVIEW = image_grid[:, :, ::-1]
+                     yield f"### \n 🪙 Face parsing Batch {count}", *ui_before()
+             masks = np.concatenate(masks, axis=0) if len(masks) >= 1 else masks
+         else:
+             masks = [None] * generated_len
+
+         ## ------------------------------ SPLIT LIST ------------------------------
+
+         split_preds = split_list_by_lengths(preds, num_faces_per_frame)
+         del preds
+         split_matrs = split_list_by_lengths(matrs, num_faces_per_frame)
+         del matrs
+         split_masks = split_list_by_lengths(masks, num_faces_per_frame)
+         del masks
+
+         ## ------------------------------ PASTE-BACK ------------------------------
+
+         yield "### \n 🧿 Pasting back...", *ui_before()
+         def post_process(frame_idx, frame_img, split_preds, split_matrs, split_masks, enable_laplacian_blend, crop_mask, blur_amount, erode_amount):
+             whole_img_path = frame_img
+             whole_img = cv2.imread(whole_img_path)
+             blend_method = 'laplacian' if enable_laplacian_blend else 'linear'
+             for p, m, mask in zip(split_preds[frame_idx], split_matrs[frame_idx], split_masks[frame_idx]):
+                 p = cv2.resize(p, (512,512))
+                 mask = cv2.resize(mask, (512,512)) if mask is not None else None
+                 m /= 0.25
+                 whole_img = paste_to_whole(p, whole_img, m, mask=mask, crop_mask=crop_mask, blend_method=blend_method, blur_amount=blur_amount, erode_amount=erode_amount)
+             cv2.imwrite(whole_img_path, whole_img)
+
+         def concurrent_post_process(image_sequence, *args):
+             with concurrent.futures.ThreadPoolExecutor() as executor:
+                 futures = []
+                 for idx, frame_img in enumerate(image_sequence):
+                     future = executor.submit(post_process, idx, frame_img, *args)
+                     futures.append(future)
+
+                 for future in tqdm(concurrent.futures.as_completed(futures), total=len(futures), desc="Pasting back"):
+                     result = future.result()
+
+         concurrent_post_process(
+             image_sequence,
+             split_preds,
+             split_matrs,
+             split_masks,
+             enable_laplacian_blend,
+             crop_mask,
+             blur_amount,
+             erode_amount
+         )
+
+
+     ## ------------------------------ IMAGE ------------------------------
+
+     if input_type == "Image":
+         target = cv2.imread(image_path)
+         output_file = os.path.join(output_path, output_name + ".png")
+         cv2.imwrite(output_file, target)
+
+         for info_update in swap_process([output_file]):
+             yield info_update
+
+         OUTPUT_FILE = output_file
+         WORKSPACE = output_path
+         PREVIEW = cv2.imread(output_file)[:, :, ::-1]
+
+         yield get_finsh_text(start_time), *ui_after()
+
+     ## ------------------------------ VIDEO ------------------------------
+
+     elif input_type == "Video":
+         temp_path = os.path.join(output_path, output_name, "sequence")
+         os.makedirs(temp_path, exist_ok=True)
+
+         yield "### \n ⌛ Extracting video frames...", *ui_before()
+         image_sequence = []
+         cap = cv2.VideoCapture(video_path)
+         curr_idx = 0
+         while True:
+             ret, frame = cap.read()
+             if not ret:break
+             frame_path = os.path.join(temp_path, f"frame_{curr_idx}.jpg")
+             cv2.imwrite(frame_path, frame)
+             image_sequence.append(frame_path)
+             curr_idx += 1
+         cap.release()
+         cv2.destroyAllWindows()
+
+         for info_update in swap_process(image_sequence):
+             yield info_update
+
+         yield "### \n ⌛ Merging sequence...", *ui_before()
+         output_video_path = os.path.join(output_path, output_name + ".mp4")
+         merge_img_sequence_from_ref(video_path, image_sequence, output_video_path)
+
+         if os.path.exists(temp_path) and not keep_output_sequence:
+             yield "### \n ⌛ Removing temporary files...", *ui_before()
+             shutil.rmtree(temp_path)
+
+         WORKSPACE = output_path
+         OUTPUT_FILE = output_video_path
+
+         yield get_finsh_text(start_time), *ui_after_vid()
+
+     ## ------------------------------ DIRECTORY ------------------------------
+
+     elif input_type == "Directory":
+         extensions = ["jpg", "jpeg", "png", "bmp", "tiff", "ico", "webp"]
+         temp_path = os.path.join(output_path, output_name)
+         if os.path.exists(temp_path):
+             shutil.rmtree(temp_path)
+         os.mkdir(temp_path)
+
+         file_paths =[]
+         for file_path in glob.glob(os.path.join(directory_path, "*")):
+             if any(file_path.lower().endswith(ext) for ext in extensions):
+                 img = cv2.imread(file_path)
+                 new_file_path = os.path.join(temp_path, os.path.basename(file_path))
+                 cv2.imwrite(new_file_path, img)
+                 file_paths.append(new_file_path)
+
+         for info_update in swap_process(file_paths):
+             yield info_update
+
+         PREVIEW = cv2.imread(file_paths[-1])[:, :, ::-1]
+         WORKSPACE = temp_path
+         OUTPUT_FILE = file_paths[-1]
+
+         yield get_finsh_text(start_time), *ui_after()
+
+     ## ------------------------------ STREAM ------------------------------
+
+     elif input_type == "Stream":
+         pass
+
+
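Review note: `process` collects swapped faces, affine matrices and masks as flat lists, then regroups them per frame with `split_list_by_lengths(..., num_faces_per_frame)` before pasting back. That helper lives in `utils.py`, which is not shown in this part of the diff, so the following is only a sketch of what such a regrouping step might look like. The name `split_by_lengths` and the length check are assumptions, not the committed implementation.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_by_lengths(items: Sequence[T], lengths: Sequence[int]) -> List[List[T]]:
    """Cut a flat sequence into consecutive groups of the given sizes."""
    if sum(lengths) != len(items):
        raise ValueError("lengths must add up to the number of items")
    groups: List[List[T]] = []
    start = 0
    for n in lengths:
        groups.append(list(items[start:start + n]))
        start += n
    return groups


# Three frames containing 2, 0 and 3 detected faces respectively.
flat_faces = ["f0_a", "f0_b", "f2_a", "f2_b", "f2_c"]
per_frame = split_by_lengths(flat_faces, [2, 0, 3])
print(per_frame)
```

Indexing the result by frame then yields exactly the faces that belong to that frame, which is how `post_process` reads `split_preds[frame_idx]`; a frame with no detected face gets an empty group and is written back unchanged.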
+ ## ------------------------------ GRADIO FUNC ------------------------------
+
+
+ def update_radio(value):
+     if value == "Image":
+         return (
+             gr.update(visible=True),
+             gr.update(visible=False),
+             gr.update(visible=False),
+         )
+     elif value == "Video":
+         return (
+             gr.update(visible=False),
+             gr.update(visible=True),
+             gr.update(visible=False),
+         )
+     elif value == "Directory":
+         return (
+             gr.update(visible=False),
+             gr.update(visible=False),
+             gr.update(visible=True),
+         )
+     elif value == "Stream":
+         return (
+             gr.update(visible=False),
+             gr.update(visible=False),
+             gr.update(visible=True),
+         )
+
+
+ def swap_option_changed(value):
+     if value.startswith("Age"):
+         return (
+             gr.update(visible=True),
+             gr.update(visible=False),
+             gr.update(visible=True),
+         )
+     elif value == "Specific Face":
+         return (
+             gr.update(visible=False),
+             gr.update(visible=True),
+             gr.update(visible=False),
+         )
+     return gr.update(visible=False), gr.update(visible=False), gr.update(visible=True)
+
+
+ def video_changed(video_path):
+     # gr.update() replaces the per-class helpers (gr.Slider.update, gr.Number.update),
+     # which are not available in Gradio 4.x, the sdk_version this Space pins.
+     sliders_update = gr.update
+     number_update = gr.update
+
+     if video_path is None:
+         return (
+             sliders_update(minimum=0, maximum=0, value=0),
+             sliders_update(minimum=1, maximum=1, value=1),
+             number_update(value=1),
+         )
+     try:
+         clip = VideoFileClip(video_path)
+         fps = clip.fps
+         total_frames = clip.reader.nframes
+         clip.close()
+         return (
+             sliders_update(minimum=0, maximum=total_frames, value=0, interactive=True),
+             sliders_update(
+                 minimum=0, maximum=total_frames, value=total_frames, interactive=True
+             ),
+             number_update(value=fps),
+         )
+     except Exception:
+         return (
+             sliders_update(value=0),
+             sliders_update(value=0),
+             number_update(value=1),
+         )
+
+
+ def analyse_settings_changed(detect_condition, detection_size, detection_threshold):
+     yield "### \n ⌛ Applying new values..."
+     global FACE_ANALYSER
+     global DETECT_CONDITION
+     DETECT_CONDITION = detect_condition
+     FACE_ANALYSER = insightface.app.FaceAnalysis(name="buffalo_l", providers=PROVIDER)
+     FACE_ANALYSER.prepare(
+         ctx_id=0,
+         det_size=(int(detection_size), int(detection_size)),
+         det_thresh=float(detection_threshold),
+     )
+     yield f"### \n ✔️ Applied detect condition:{detect_condition}, detection size: {detection_size}, detection threshold: {detection_threshold}"
+
+
+ def stop_running():
+     global STREAMER
+     if hasattr(STREAMER, "stop"):
+         STREAMER.stop()
+         STREAMER = None
+     return "Cancelled"
+
+
+ def slider_changed(show_frame, video_path, frame_index):
+     if not show_frame:
+         return None, None
+     if video_path is None:
+         return None, None
+     clip = VideoFileClip(video_path)
+     frame = clip.get_frame(frame_index / clip.fps)
+     frame_array = np.array(frame)
+     clip.close()
+     # gr.update() instead of gr.Image.update / gr.Video.update (see video_changed).
+     return gr.update(value=frame_array, visible=True), gr.update(
+         visible=False
+     )
+
+
+ def trim_and_reload(video_path, output_path, output_name, start_frame, stop_frame):
+     yield video_path, f"### \n 🌈 Trimming video frame {start_frame} to {stop_frame}..."
+     try:
+         output_path = os.path.join(output_path, output_name)
+         trimmed_video = trim_video(video_path, output_path, start_frame, stop_frame)
+         yield trimmed_video, "### \n ✔️ Video trimmed and reloaded."
+     except Exception as e:
+         print(e)
+         yield video_path, "### \n 🔥 Video trimming failed. See console for more info."
+
+
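Review note: `slider_changed` turns a slider position into a timestamp with `frame_index / clip.fps` before asking the clip for a frame. A minimal sketch of that conversion as a standalone function follows. `frame_to_seconds` is a hypothetical name, and the guard against a non-positive frame rate is an addition that the commit does not contain.

```python
def frame_to_seconds(frame_index: int, fps: float) -> float:
    """Convert a zero-based frame index to a timestamp in seconds."""
    if fps <= 0:
        # A broken or unreadable clip can report 0 fps; fail loudly instead of dividing by zero.
        raise ValueError("fps must be positive")
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    return frame_index / fps


print(frame_to_seconds(90, 30.0))
print(frame_to_seconds(0, 24.0))
```

The same arithmetic applies to the trim sliders: a start and end frame map to a start and end time once the clip's frame rate is known.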
+ ## ------------------------------ GRADIO GUI ------------------------------
+
+ css = """
+ footer{display:none !important}
+ """
+
+ with gr.Blocks(css=css) as interface:
+     gr.Markdown("## 🦋 FaceSwap & Enhancer 🦋")
+     with gr.Row():
+         with gr.Row():
+             with gr.Column(scale=0.4):
+                 with gr.Tab("💗 Swap Condition"):
+                     swap_option = gr.Dropdown(
+                         swap_options_list,
+                         info="Choose which face or faces in the target image to swap.",
+                         multiselect=False,
+                         show_label=False,
+                         value=swap_options_list[0],
+                         interactive=True,
+                     )
+                     age = gr.Number(
+                         value=25, label="Value", interactive=True, visible=False
+                     )
+
+                 with gr.Tab("❤️‍🩹 Detection Settings"):
+                     detect_condition_dropdown = gr.Dropdown(
+                         detect_conditions,
+                         label="Condition",
+                         value=DETECT_CONDITION,
+                         interactive=True,
+                         info="This condition is only used when multiple faces are detected on source or specific image.",
+                     )
+                     detection_size = gr.Number(
+                         label="Detection Size", value=DETECT_SIZE, interactive=True
+                     )
+                     detection_threshold = gr.Number(
+                         label="Detection Threshold",
+                         value=DETECT_THRESH,
+                         interactive=True,
+                     )
+                     apply_detection_settings = gr.Button("Apply settings")
+
+                 with gr.Tab("💖 Output Settings"):
+                     output_directory = gr.Text(
+                         label="Output Directory",
+                         value=DEF_OUTPUT_PATH,
+                         interactive=True,
+                     )
+                     output_name = gr.Text(
+                         label="Output Name", value="Result", interactive=True
+                     )
+                     keep_output_sequence = gr.Checkbox(
+                         label="Keep output sequence", value=False, interactive=True
+                     )
+
+                 with gr.Tab("🤍 Other Settings"):
+                     face_scale = gr.Slider(
+                         label="Face Scale",
+                         minimum=0,
+                         maximum=2,
+                         value=1,
+                         interactive=True,
+                     )
+
+                     face_enhancer_name = gr.Dropdown(
+                         FACE_ENHANCER_LIST, label="Face Enhancer", value="NONE", multiselect=False, interactive=True
+                     )
+
+                     with gr.Accordion("Advanced Mask", open=False):
+                         enable_face_parser_mask = gr.Checkbox(
+                             label="Enable Face Parsing",
+                             value=False,
+                             interactive=True,
+                         )
+
+                         mask_include = gr.Dropdown(
+                             mask_regions.keys(),
+                             value=MASK_INCLUDE,
+                             multiselect=True,
+                             label="Include",
+                             interactive=True,
+                         )
+                         mask_soft_kernel = gr.Number(
+                             label="Soft Erode Kernel",
+                             value=MASK_SOFT_KERNEL,
+                             minimum=3,
+                             interactive=True,
+                             visible = False
+                         )
+                         mask_soft_iterations = gr.Number(
+                             label="Soft Erode Iterations",
+                             value=MASK_SOFT_ITERATIONS,
+                             minimum=0,
+                             interactive=True,
+
+                         )
+
+
+                     with gr.Accordion("Crop Mask", open=False):
+                         crop_top = gr.Slider(label="Top", minimum=0, maximum=511, value=0, step=1, interactive=True)
+                         crop_bott = gr.Slider(label="Bottom", minimum=0, maximum=511, value=511, step=1, interactive=True)
+                         crop_left = gr.Slider(label="Left", minimum=0, maximum=511, value=0, step=1, interactive=True)
+                         crop_right = gr.Slider(label="Right", minimum=0, maximum=511, value=511, step=1, interactive=True)
+
+
+                     erode_amount = gr.Slider(
+                         label="Mask Erode",
+                         minimum=0,
+                         maximum=1,
+                         value=MASK_ERODE_AMOUNT,
+                         step=0.05,
+                         interactive=True,
+                     )
+
+                     blur_amount = gr.Slider(
+                         label="Mask Blur",
+                         minimum=0,
+                         maximum=1,
+                         value=MASK_BLUR_AMOUNT,
+                         step=0.05,
+                         interactive=True,
+                     )
+
+                     enable_laplacian_blend = gr.Checkbox(
+                         label="Laplacian Blending",
+                         value=True,
+                         interactive=True,
+                     )
+
+
+                 source_image_input = gr.Image(
+                     label="Source face", type="filepath", interactive=True
+                 )
+
+                 with gr.Group(visible=False) as specific_face:
+                     for i in range(NUM_OF_SRC_SPECIFIC):
+                         idx = i + 1
+                         code = "\n"
+                         code += f"with gr.Tab(label='({idx})'):"
+                         code += "\n\twith gr.Row():"
+                         code += f"\n\t\tsrc{idx} = gr.Image(interactive=True, type='numpy', label='Source Face {idx}')"
+                         code += f"\n\t\ttrg{idx} = gr.Image(interactive=True, type='numpy', label='Specific Face {idx}')"
+                         exec(code)
+
+                     distance_slider = gr.Slider(
+                         minimum=0,
+                         maximum=2,
+                         value=0.6,
+                         interactive=True,
+                         label="Distance",
+                         info="Lower distance is more similar and higher distance is less similar to the target face.",
+                     )
+
+                 with gr.Group():
+                     input_type = gr.Radio(
+                         ["Image", "Video"],
+                         label="Target Type",
+                         value="Image",
+                     )
+
+                     with gr.Group(visible=True) as input_image_group:
+                         image_input = gr.Image(
+                             label="Target Image", interactive=True, type="filepath"
+                         )
+
+                     with gr.Group(visible=False) as input_video_group:
+                         vid_widget = gr.Video if USE_COLAB else gr.Text
+                         video_input = gr.Video(
+                             label="Target Video", interactive=True
+                         )
+                         with gr.Accordion("💙 Trim video", open=False):
+                             with gr.Column():
+                                 with gr.Row():
+                                     set_slider_range_btn = gr.Button(
+                                         "Set frame range", interactive=True
+                                     )
+                                     show_trim_preview_btn = gr.Checkbox(
+                                         label="Show frame when slider change",
+                                         value=True,
+                                         interactive=True,
+                                     )
+
+                                 video_fps = gr.Number(
+                                     value=30,
+                                     interactive=False,
+                                     label="Fps",
+                                     visible=False,
+                                 )
+                             start_frame = gr.Slider(
+                                 minimum=0,
+                                 maximum=1,
+                                 value=0,
+                                 step=1,
+                                 interactive=True,
+                                 label="Start Frame",
+                                 info="",
+                             )
+                             end_frame = gr.Slider(
+                                 minimum=0,
+                                 maximum=1,
+                                 value=1,
+                                 step=1,
+                                 interactive=True,
+                                 label="End Frame",
+                                 info="",
+                             )
+                         trim_and_reload_btn = gr.Button(
+                             "Trim and Reload", interactive=True
+                         )
+
+                     with gr.Group(visible=False) as input_directory_group:
+                         direc_input = gr.Text(label="Path", interactive=True)
+
+             with gr.Column(scale=0.6):
+                 info = gr.Markdown(value="...")
+
+                 with gr.Row():
+                     swap_button = gr.Button("💖 Swap", variant="primary")
+                     cancel_button = gr.Button("💔 Cancel")
+
+                 preview_image = gr.Image(label="Output", interactive=False)
+                 preview_video = gr.Video(
+                     label="Output", interactive=False, visible=False
+                 )
+
+                 with gr.Row():
+                     output_directory_button = gr.Button(
+                         "💚", interactive=False, visible=False
+                     )
+                     output_video_button = gr.Button(
+                         "💘", interactive=False, visible=False
+                     )
+
+                 with gr.Group():
+                     with gr.Row():
+                         gr.Markdown(
+                             "### [🤗 Welcome to my GitHub 🤗](https://github.com/victorgeel)"
+                         )
+
+
779
+ ## ------------------------------ GRADIO EVENTS ------------------------------
780
+
781
+ set_slider_range_event = set_slider_range_btn.click(
782
+ video_changed,
783
+ inputs=[video_input],
784
+ outputs=[start_frame, end_frame, video_fps],
785
+ )
786
+
787
+ trim_and_reload_event = trim_and_reload_btn.click(
788
+ fn=trim_and_reload,
789
+ inputs=[video_input, output_directory, output_name, start_frame, end_frame],
790
+ outputs=[video_input, info],
791
+ )
792
+
793
+ start_frame_event = start_frame.release(
794
+ fn=slider_changed,
795
+ inputs=[show_trim_preview_btn, video_input, start_frame],
796
+ outputs=[preview_image, preview_video],
797
+ show_progress=True,
798
+ )
799
+
800
+ end_frame_event = end_frame.release(
801
+ fn=slider_changed,
802
+ inputs=[show_trim_preview_btn, video_input, end_frame],
803
+ outputs=[preview_image, preview_video],
804
+ show_progress=True,
805
+ )
806
+
807
+ input_type.change(
808
+ update_radio,
809
+ inputs=[input_type],
810
+ outputs=[input_image_group, input_video_group, input_directory_group],
811
+ )
812
+ swap_option.change(
813
+ swap_option_changed,
814
+ inputs=[swap_option],
815
+ outputs=[age, specific_face, source_image_input],
816
+ )
817
+
818
+ apply_detection_settings.click(
819
+ analyse_settings_changed,
820
+ inputs=[detect_condition_dropdown, detection_size, detection_threshold],
821
+ outputs=[info],
822
+ )
823
+
824
+ src_specific_inputs = []
825
+ gen_variable_txt = ",".join(
826
+ [f"src{i+1}" for i in range(NUM_OF_SRC_SPECIFIC)]
827
+ + [f"trg{i+1}" for i in range(NUM_OF_SRC_SPECIFIC)]
828
+ )
829
+ exec(f"src_specific_inputs = ({gen_variable_txt})")
830
+ swap_inputs = [
831
+ input_type,
832
+ image_input,
833
+ video_input,
834
+ direc_input,
835
+ source_image_input,
836
+ output_directory,
837
+ output_name,
838
+ keep_output_sequence,
839
+ swap_option,
840
+ age,
841
+ distance_slider,
842
+ face_enhancer_name,
843
+ enable_face_parser_mask,
844
+ mask_include,
845
+ mask_soft_kernel,
846
+ mask_soft_iterations,
847
+ blur_amount,
848
+ erode_amount,
849
+ face_scale,
850
+ enable_laplacian_blend,
851
+ crop_top,
852
+ crop_bott,
853
+ crop_left,
854
+ crop_right,
855
+ *src_specific_inputs,
856
+ ]
857
+
858
+ swap_outputs = [
859
+ info,
860
+ preview_image,
861
+ output_directory_button,
862
+ output_video_button,
863
+ preview_video,
864
+ ]
865
+
866
+ swap_event = swap_button.click(
867
+ fn=process, inputs=swap_inputs, outputs=swap_outputs, show_progress=True
868
+ )
869
+
870
+
871
+ cancel_button.click(
872
+ fn=stop_running,
873
+ inputs=None,
874
+ outputs=[info],
875
+ cancels=[
876
+ swap_event,
877
+ trim_and_reload_event,
878
+ set_slider_range_event,
879
+ start_frame_event,
880
+ end_frame_event,
881
+ ],
882
+ show_progress=True,
883
+
884
+ )
885
+ output_directory_button.click(
886
+ lambda: open_directory(path=WORKSPACE), inputs=None, outputs=None
887
+ )
888
+ output_video_button.click(
889
+ lambda: open_directory(path=OUTPUT_FILE), inputs=None, outputs=None
890
+ )
891
+
892
+ if __name__ == "__main__":
893
+ if USE_COLAB:
894
+ print("Running in colab mode")
895
+
896
+
897
+ interface.launch(share=USE_COLAB, max_threads=10)
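Note on the `exec` call in app.py that fills `src_specific_inputs`: the same name-based collection can be written as a plain lookup. The following is a minimal sketch, not the commit's code. The `namespace` dict, the string stand-ins, the helper name `collect_specific_inputs`, and the value of `NUM_OF_SRC_SPECIFIC` are all assumptions made for illustration.

```python
NUM_OF_SRC_SPECIFIC = 3  # assumed value, for illustration only

# Hypothetical stand-ins for the Gradio components named src1..srcN and trg1..trgN.
namespace = {}
for i in range(NUM_OF_SRC_SPECIFIC):
    namespace[f"src{i+1}"] = f"<src component {i+1}>"
    namespace[f"trg{i+1}"] = f"<trg component {i+1}>"


def collect_specific_inputs(ns, count):
    """Look up src1..srcN followed by trg1..trgN by name, in the order the exec call builds them."""
    names = [f"src{i+1}" for i in range(count)] + [f"trg{i+1}" for i in range(count)]
    return tuple(ns[name] for name in names)


src_specific_inputs = collect_specific_inputs(namespace, NUM_OF_SRC_SPECIFIC)
print(src_specific_inputs)
```

If the components really are module-level names, which the `exec` form already relies on, passing `globals()` as `ns` would presumably give the same tuple without evaluating a generated string.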
face_analyser.py ADDED
@@ -0,0 +1,194 @@
+ import os
+ import cv2
+ import numpy as np
+ from tqdm import tqdm
+ from utils import scale_bbox_from_center
+
+ detect_conditions = [
+     "best detection",
+     "left most",
+     "right most",
+     "top most",
+     "bottom most",
+     "middle",
+     "biggest",
+     "smallest",
+ ]
+
+ swap_options_list = [
+     "All Face",
+     "Specific Face",
+     "Age less than",
+     "Age greater than",
+     "All Male",
+     "All Female",
+     "Left Most",
+     "Right Most",
+     "Top Most",
+     "Bottom Most",
+     "Middle",
+     "Biggest",
+     "Smallest",
+ ]
+
+ def get_single_face(faces, method="best detection"):
+     total_faces = len(faces)
+     if total_faces == 1:
+         return faces[0]
+
+     print(f"{total_faces} faces detected. Using {method} face.")
+     if method == "best detection":
+         return sorted(faces, key=lambda face: face["det_score"])[-1]
+     elif method == "left most":
+         return sorted(faces, key=lambda face: face["bbox"][0])[0]
+     elif method == "right most":
+         return sorted(faces, key=lambda face: face["bbox"][0])[-1]
+     elif method == "top most":
+         return sorted(faces, key=lambda face: face["bbox"][1])[0]
+     elif method == "bottom most":
+         return sorted(faces, key=lambda face: face["bbox"][1])[-1]
+     elif method == "middle":
+         # NOTE: bbox values are pixel coordinates, so this ranks faces by the squared
+         # distance of their bbox center from (0.5, 0.5) and returns the median-ranked face.
+         return sorted(faces, key=lambda face: (
+             (face["bbox"][0] + face["bbox"][2]) / 2 - 0.5) ** 2 +
+             ((face["bbox"][1] + face["bbox"][3]) / 2 - 0.5) ** 2)[len(faces) // 2]
+     elif method == "biggest":
+         return sorted(faces, key=lambda face: (face["bbox"][2] - face["bbox"][0]) * (face["bbox"][3] - face["bbox"][1]))[-1]
+     elif method == "smallest":
+         return sorted(faces, key=lambda face: (face["bbox"][2] - face["bbox"][0]) * (face["bbox"][3] - face["bbox"][1]))[0]
+
+
+ def analyse_face(image, model, return_single_face=True, detect_condition="best detection", scale=1.0):
+     faces = model.get(image)
+     if scale != 1: # landmark-scale
+         for i, face in enumerate(faces):
+             landmark = face['kps']
+             center = np.mean(landmark, axis=0)
+             landmark = center + (landmark - center) * scale
+             faces[i]['kps'] = landmark
+
+     if not return_single_face:
+         return faces
+
+     return get_single_face(faces, method=detect_condition)
+
+
+ def cosine_distance(a, b):
+     # Normalize copies so the caller's embeddings are not modified in place.
+     a = a / np.linalg.norm(a)
+     b = b / np.linalg.norm(b)
+     return 1 - np.dot(a, b)
+
+
+ def get_analysed_data(face_analyser, image_sequence, source_data, swap_condition="All Face", detect_condition="left most", scale=1.0):
+     if swap_condition != "Specific Face":
+         source_path, age = source_data
+         source_image = cv2.imread(source_path)
+         analysed_source = analyse_face(source_image, face_analyser, return_single_face=True, detect_condition=detect_condition, scale=scale)
+     else:
+         analysed_source_specifics = []
+         source_specifics, threshold = source_data
+         for source, specific in zip(*source_specifics):
+             if source is None or specific is None:
+                 continue
+             analysed_source = analyse_face(source, face_analyser, return_single_face=True, detect_condition=detect_condition, scale=scale)
+             analysed_specific = analyse_face(specific, face_analyser, return_single_face=True, detect_condition=detect_condition, scale=scale)
+             analysed_source_specifics.append([analysed_source, analysed_specific])
+
+     analysed_target_list = []
+     analysed_source_list = []
+     whole_frame_eql_list = []
+     num_faces_per_frame = []
+
+     total_frames = len(image_sequence)
+     curr_idx = 0
+     for curr_idx, frame_path in tqdm(enumerate(image_sequence), total=total_frames, desc="Analysing face data"):
+         frame = cv2.imread(frame_path)
+         analysed_faces = analyse_face(frame, face_analyser, return_single_face=False, detect_condition=detect_condition, scale=scale)
+
+         n_faces = 0
+         for analysed_face in analysed_faces:
+             if swap_condition == "All Face":
+                 analysed_target_list.append(analysed_face)
+                 analysed_source_list.append(analysed_source)
+                 whole_frame_eql_list.append(frame_path)
+                 n_faces += 1
+             elif swap_condition == "Age less than" and analysed_face["age"] < age:
+                 analysed_target_list.append(analysed_face)
+                 analysed_source_list.append(analysed_source)
+                 whole_frame_eql_list.append(frame_path)
+                 n_faces += 1
+             elif swap_condition == "Age greater than" and analysed_face["age"] > age:
+                 analysed_target_list.append(analysed_face)
+                 analysed_source_list.append(analysed_source)
+                 whole_frame_eql_list.append(frame_path)
+                 n_faces += 1
+             elif swap_condition == "All Male" and analysed_face["gender"] == 1:
+                 analysed_target_list.append(analysed_face)
+                 analysed_source_list.append(analysed_source)
+                 whole_frame_eql_list.append(frame_path)
+                 n_faces += 1
+             elif swap_condition == "All Female" and analysed_face["gender"] == 0:
+                 analysed_target_list.append(analysed_face)
+                 analysed_source_list.append(analysed_source)
+                 whole_frame_eql_list.append(frame_path)
+                 n_faces += 1
+             elif swap_condition == "Specific Face":
+                 for analysed_source, analysed_specific in analysed_source_specifics:
+                     distance = cosine_distance(analysed_specific["embedding"], analysed_face["embedding"])
+                     if distance < threshold:
+                         analysed_target_list.append(analysed_face)
+                         analysed_source_list.append(analysed_source)
+                         whole_frame_eql_list.append(frame_path)
+                         n_faces += 1
+
+         if len(analysed_faces) == 0:
+             # No face in this frame: the single-face modes below would index an empty list.
+             num_faces_per_frame.append(n_faces)
+             continue
+
+         if swap_condition == "Left Most":
+             analysed_face = get_single_face(analysed_faces, method="left most")
+             analysed_target_list.append(analysed_face)
+             analysed_source_list.append(analysed_source)
+             whole_frame_eql_list.append(frame_path)
+             n_faces += 1
+
+         elif swap_condition == "Right Most":
+             analysed_face = get_single_face(analysed_faces, method="right most")
+             analysed_target_list.append(analysed_face)
+             analysed_source_list.append(analysed_source)
+             whole_frame_eql_list.append(frame_path)
+             n_faces += 1
+
+         elif swap_condition == "Top Most":
+             analysed_face = get_single_face(analysed_faces, method="top most")
+             analysed_target_list.append(analysed_face)
+             analysed_source_list.append(analysed_source)
+             whole_frame_eql_list.append(frame_path)
+             n_faces += 1
+
+         elif swap_condition == "Bottom Most":
+             analysed_face = get_single_face(analysed_faces, method="bottom most")
+             analysed_target_list.append(analysed_face)
+             analysed_source_list.append(analysed_source)
+             whole_frame_eql_list.append(frame_path)
+             n_faces += 1
+
+         elif swap_condition == "Middle":
+             analysed_face = get_single_face(analysed_faces, method="middle")
+             analysed_target_list.append(analysed_face)
+             analysed_source_list.append(analysed_source)
+             whole_frame_eql_list.append(frame_path)
+             n_faces += 1
+
+         elif swap_condition == "Biggest":
+             analysed_face = get_single_face(analysed_faces, method="biggest")
+             analysed_target_list.append(analysed_face)
+             analysed_source_list.append(analysed_source)
+             whole_frame_eql_list.append(frame_path)
+             n_faces += 1
+
+         elif swap_condition == "Smallest":
+             analysed_face = get_single_face(analysed_faces, method="smallest")
+             analysed_target_list.append(analysed_face)
+             analysed_source_list.append(analysed_source)
+             whole_frame_eql_list.append(frame_path)
+             n_faces += 1
+
+         num_faces_per_frame.append(n_faces)
+
+     return analysed_target_list, analysed_source_list, whole_frame_eql_list, num_faces_per_frame
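Note on the "Specific Face" mode in face_analyser.py: a target face is kept when the cosine distance between its embedding and the reference embedding falls below a threshold. The sketch below illustrates that test with plain Python lists instead of NumPy arrays. The three-element vectors and the `threshold` value are made-up illustration data, not values from the app.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cos(angle between a and b) without modifying the inputs."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


reference = [1.0, 0.0, 0.0]  # stand-in for the "specific" face embedding
candidates = {
    "same direction": [2.0, 0.0, 0.0],
    "orthogonal": [0.0, 1.0, 0.0],
    "opposite": [-3.0, 0.0, 0.0],
}
threshold = 0.6  # hypothetical value

# Keep only the candidates whose distance to the reference is below the threshold.
matches = [name for name, emb in candidates.items()
           if cosine_distance(reference, emb) < threshold]
print(matches)
```

The distance ranges from 0 (same direction) to 2 (opposite direction), so a smaller threshold makes the match stricter.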
face_enhancer.py ADDED
@@ -0,0 +1,72 @@
+ import os
+ import cv2
+ import torch
+ import gfpgan
+ from PIL import Image
+ from upscaler.RealESRGAN import RealESRGAN
+ from upscaler.codeformer import CodeFormerEnhancer
+
+ def gfpgan_runner(img, model):
+     _, imgs, _ = model.enhance(img, paste_back=True, has_aligned=True)
+     return imgs[0]
+
+
+ def realesrgan_runner(img, model):
+     img = model.predict(img)
+     return img
+
+
+ def codeformer_runner(img, model):
+     img = model.enhance(img)
+     return img
+
+
+ supported_enhancers = {
+     "CodeFormer": ("./assets/pretrained_models/codeformer.onnx", codeformer_runner),
+     "GFPGAN": ("./assets/pretrained_models/GFPGANv1.4.pth", gfpgan_runner),
+     "REAL-ESRGAN 2x": ("./assets/pretrained_models/RealESRGAN_x2.pth", realesrgan_runner),
+     "REAL-ESRGAN 4x": ("./assets/pretrained_models/RealESRGAN_x4.pth", realesrgan_runner),
+     "REAL-ESRGAN 8x": ("./assets/pretrained_models/RealESRGAN_x8.pth", realesrgan_runner)
+ }
+
+ cv2_interpolations = ["LANCZOS4", "CUBIC", "NEAREST"]
+
+ def get_available_enhancer_names():
+     available = []
+     for name, data in supported_enhancers.items():
+         path = os.path.join(os.path.abspath(os.path.dirname(__file__)), data[0])
+         if os.path.exists(path):
+             available.append(name)
+     return available
+
+
+ def load_face_enhancer_model(name='GFPGAN', device="cpu"):
+     assert name in get_available_enhancer_names() + cv2_interpolations, f"Face enhancer {name} unavailable."
+     if name in supported_enhancers.keys():
+         model_path, model_runner = supported_enhancers.get(name)
+         model_path = os.path.join(os.path.abspath(os.path.dirname(__file__)), model_path)
+     if name == 'CodeFormer':
+         model = CodeFormerEnhancer(model_path=model_path, device=device)
+     elif name == 'GFPGAN':
+         model = gfpgan.GFPGANer(model_path=model_path, upscale=1, device=device)
+     elif name == 'REAL-ESRGAN 2x':
+         model = RealESRGAN(device, scale=2)
+         model.load_weights(model_path, download=False)
+     elif name == 'REAL-ESRGAN 4x':
+         model = RealESRGAN(device, scale=4)
+         model.load_weights(model_path, download=False)
+     elif name == 'REAL-ESRGAN 8x':
+         model = RealESRGAN(device, scale=8)
+         model.load_weights(model_path, download=False)
+     elif name == 'LANCZOS4':
+         model = None
+         model_runner = lambda img, _: cv2.resize(img, (512,512), interpolation=cv2.INTER_LANCZOS4)
+     elif name == 'CUBIC':
+         model = None
+         model_runner = lambda img, _: cv2.resize(img, (512,512), interpolation=cv2.INTER_CUBIC)
+     elif name == 'NEAREST':
+         model = None
+         model_runner = lambda img, _: cv2.resize(img, (512,512), interpolation=cv2.INTER_NEAREST)
+     else:
+         model = None
+     return (model, model_runner)
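Note on the enhancer registry in face_enhancer.py: `supported_enhancers` maps a display name to a `(weights path, runner)` pair, and an entry is only offered when its weights file exists on disk. The sketch below shows that pattern in isolation. The registry entries, file names, `identity_runner`, and `available_names` are hypothetical and exist only for this illustration.

```python
import os
import tempfile


def identity_runner(img, model):
    """Placeholder runner with the same (img, model) signature as the real runners."""
    return img


def available_names(registry, base_dir):
    """Return the registry names whose weights file exists under base_dir."""
    return [name for name, (rel_path, _runner) in registry.items()
            if os.path.exists(os.path.join(base_dir, rel_path))]


with tempfile.TemporaryDirectory() as base_dir:
    # Create one fake weights file; the other entry is left missing on purpose.
    open(os.path.join(base_dir, "present.bin"), "wb").close()
    registry = {
        "Present": ("present.bin", identity_runner),
        "Missing": ("missing.bin", identity_runner),
    }
    names = available_names(registry, base_dir)

print(names)
```

Checking for the file up front lets the UI list only usable enhancers instead of failing when a model is loaded.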
face_swapper.py ADDED
@@ -0,0 +1,150 @@
+ import time
+ import torch
+ import onnx
+ import cv2
+ import onnxruntime
+ import numpy as np
+ from tqdm import tqdm
+ import torch.nn as nn
+ from onnx import numpy_helper
+ from skimage import transform as trans
+ import torchvision.transforms.functional as TF  # aliased so torch.nn.functional below does not shadow it
+ import torch.nn.functional as F
+ from utils import mask_crop, laplacian_blending
+
+
+ arcface_dst = np.array(
+     [[38.2946, 51.6963], [73.5318, 51.5014], [56.0252, 71.7366],
+      [41.5493, 92.3655], [70.7299, 92.2041]],
+     dtype=np.float32)
+
+ def estimate_norm(lmk, image_size=112, mode='arcface'):
+     assert lmk.shape == (5, 2)
+     assert image_size % 112 == 0 or image_size % 128 == 0
+     if image_size % 112 == 0:
+         ratio = float(image_size) / 112.0
+         diff_x = 0
+     else:
+         ratio = float(image_size) / 128.0
+         diff_x = 8.0 * ratio
+     dst = arcface_dst * ratio
+     dst[:, 0] += diff_x
+     tform = trans.SimilarityTransform()
+     tform.estimate(lmk, dst)
+     M = tform.params[0:2, :]
+     return M
+
+
+ def norm_crop2(img, landmark, image_size=112, mode='arcface'):
+     M = estimate_norm(landmark, image_size, mode)
+     warped = cv2.warpAffine(img, M, (image_size, image_size), borderValue=0.0)
+     return warped, M
+
+
+ class Inswapper():
+     def __init__(self, model_file=None, batch_size=32, providers=['CPUExecutionProvider']):
+         self.model_file = model_file
+         self.batch_size = batch_size
+
+         model = onnx.load(self.model_file)
+         graph = model.graph
+         self.emap = numpy_helper.to_array(graph.initializer[-1])
+
+         self.session_options = onnxruntime.SessionOptions()
+         self.session = onnxruntime.InferenceSession(self.model_file, sess_options=self.session_options, providers=providers)
+
+     def forward(self, imgs, latents):
+         preds = []
+         for img, latent in zip(imgs, latents):
+             img = img / 255
+             pred = self.session.run(['output'], {'target': img, 'source': latent})[0]
+             preds.append(pred)
+         # Return the collected predictions; the original fell through and returned None.
+         return preds
+
+     def get(self, imgs, target_faces, source_faces):
+         imgs = list(imgs)
+
+         preds = [None] * len(imgs)
+         matrs = [None] * len(imgs)
+
+         for idx, (img, target_face, source_face) in enumerate(zip(imgs, target_faces, source_faces)):
+             matrix, blob, latent = self.prepare_data(img, target_face, source_face)
+             pred = self.session.run(['output'], {'target': blob, 'source': latent})[0]
+             pred = pred.transpose((0, 2, 3, 1))[0]
+             pred = np.clip(255 * pred, 0, 255).astype(np.uint8)[:, :, ::-1]
+
+             preds[idx] = pred
+             matrs[idx] = matrix
+
+         return (preds, matrs)
+
+     def prepare_data(self, img, target_face, source_face):
+         if isinstance(img, str):
+             img = cv2.imread(img)
+
+         aligned_img, matrix = norm_crop2(img, target_face.kps, 128)
+
+         blob = cv2.dnn.blobFromImage(aligned_img, 1.0 / 255, (128, 128), (0., 0., 0.), swapRB=True)
+
+         latent = source_face.normed_embedding.reshape((1, -1))
+         latent = np.dot(latent, self.emap)
+         latent /= np.linalg.norm(latent)
+
+         return (matrix, blob, latent)
+
+     def batch_forward(self, img_list, target_f_list, source_f_list):
+         num_samples = len(img_list)
+         num_batches = (num_samples + self.batch_size - 1) // self.batch_size
+
+         for i in tqdm(range(num_batches), desc="Generating face"):
+             start_idx = i * self.batch_size
+             end_idx = min((i + 1) * self.batch_size, num_samples)
+
+             batch_img = img_list[start_idx:end_idx]
+             batch_target_f = target_f_list[start_idx:end_idx]
+             batch_source_f = source_f_list[start_idx:end_idx]
+
+             batch_pred, batch_matr = self.get(batch_img, batch_target_f, batch_source_f)
+
+             yield batch_pred, batch_matr
+
+
+ def paste_to_whole(foreground, background, matrix, mask=None, crop_mask=(0,0,0,0), blur_amount=0.1, erode_amount = 0.15, blend_method='linear'):
+     inv_matrix = cv2.invertAffineTransform(matrix)
+     fg_shape = foreground.shape[:2]
+     bg_shape = (background.shape[1], background.shape[0])
+     foreground = cv2.warpAffine(foreground, inv_matrix, bg_shape, borderValue=0.0)
+
+     if mask is None:
+         mask = np.full(fg_shape, 1., dtype=np.float32)
+         mask = mask_crop(mask, crop_mask)
+         mask = cv2.warpAffine(mask, inv_matrix, bg_shape, borderValue=0.0)
+     else:
+         assert fg_shape == mask.shape[:2], "foreground & mask shape mismatch!"
+         mask = mask_crop(mask, crop_mask).astype('float32')
+         mask = cv2.warpAffine(mask, inv_matrix, (background.shape[1], background.shape[0]), borderValue=0.0)
+
+     _mask = mask.copy()
+     _mask[_mask > 0.05] = 1.
+     non_zero_points = cv2.findNonZero(_mask)
+     _, _, w, h = cv2.boundingRect(non_zero_points)
+     mask_size = int(np.sqrt(w * h))
+
+     if erode_amount > 0:
+         kernel_size = max(int(mask_size * erode_amount), 1)
+         structuring_element = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_size, kernel_size))
+         mask = cv2.erode(mask, structuring_element)
+
+     if blur_amount > 0:
+         kernel_size = max(int(mask_size * blur_amount), 3)
+         if kernel_size % 2 == 0:
+             kernel_size += 1
+         mask = cv2.GaussianBlur(mask, (kernel_size, kernel_size), 0)
+
+     mask = np.tile(np.expand_dims(mask, axis=-1), (1, 1, 3))
+
+     if blend_method == 'laplacian':
+         composite_image = laplacian_blending(foreground, background, mask.clip(0,1), num_levels=4)
+     else:
+         composite_image = mask * foreground + (1 - mask) * background
+
+     # Clip before casting: casting out-of-range floats to uint8 first can wrap around.
+     return composite_image.clip(0, 255).astype("uint8")
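Note on `Inswapper.batch_forward` in face_swapper.py: the number of batches is a ceiling division, and the last batch is simply shorter. The sketch below isolates that slicing logic on a plain list. The helper name `iter_batches` and the sample sizes are made up for illustration and are not part of the commit.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    num_samples = len(items)
    # Ceiling division: one extra batch whenever there is a remainder.
    num_batches = (num_samples + batch_size - 1) // batch_size
    for i in range(num_batches):
        start_idx = i * batch_size
        end_idx = min((i + 1) * batch_size, num_samples)
        yield items[start_idx:end_idx]


batches = list(iter_batches(list(range(70)), 32))
sizes = [len(batch) for batch in batches]
print(sizes)
```

With 70 items and a batch size of 32 this gives two full batches and one batch of 6, and an empty input yields no batches at all.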
requirements.txt ADDED
@@ -0,0 +1,21 @@
+ --extra-index-url https://download.pytorch.org/whl/cu118
+
+ gfpgan==1.3.8
+ gradio
+ insightface==0.7.3
+ moviepy>=1.0.3
+ numpy==1.24.3
+ onnx==1.14.0
+ onnxruntime==1.15.1; python_version != '3.9' and sys_platform == 'darwin' and platform_machine != 'arm64'
+ onnxruntime-coreml==1.13.1; python_version == '3.9' and sys_platform == 'darwin' and platform_machine != 'arm64'
+ onnxruntime-gpu==1.15.1; sys_platform != 'darwin'
+ onnxruntime-silicon==1.13.1; sys_platform == 'darwin' and platform_machine == 'arm64'
+ opencv-python==4.8.0.74
+ opennsfw2==0.10.2
+ pillow==10.0.0
+ protobuf==4.23.4
+ psutil==5.9.5
+ realesrgan==0.3.0
+ tensorflow==2.13.0
+ tqdm==4.65.0
+
start-ngrok.py ADDED
@@ -0,0 +1,109 @@
+ import argparse
+ import json
+ from pyngrok import ngrok, conf
+ import os
+ import psutil
+ import signal
+ import socket
+ import sys
+ import subprocess
+
+ def get_saved_data():
+     try:
+         with open('data.json', 'r') as file:
+             data = json.load(file)
+         return data
+     except (FileNotFoundError, json.JSONDecodeError):
+         return None
+
+ def save_data(data):
+     with open('data.json', 'w') as file:
+         json.dump(data, file)
+
+ def signal_handler(sig, frame):
+     print('You pressed Ctrl+C!')
+     sys.exit(0)
+
+ def is_port_in_use(port):
+     with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
+         return s.connect_ex(('127.0.0.1', port)) == 0
+
+ def find_and_terminate_process(port):
+     for process in psutil.process_iter(['pid', 'name', 'connections']):
+         # The 'connections' entry can be None (e.g. access denied), so fall back to an empty list.
+         for conn in process.info.get('connections') or []:
+             if conn.laddr.port == port:
+                 print(f"Port {port} is in use by process {process.info['name']} (PID {process.info['pid']})")
+                 try:
+                     process.terminate()
+                     print(f"Terminated process with PID {process.info['pid']}")
+                 except psutil.NoSuchProcess:
+                     print(f"Process with PID {process.info['pid']} not found")
+
+ def main():
+     target_port = 7860
+
+     if is_port_in_use(target_port):
+         find_and_terminate_process(target_port)
+     else:
+         print(f"Port {target_port} is free.")
+
+     parser = argparse.ArgumentParser(description='Console app with token and domain arguments')
+     parser.add_argument('--token', help='Specify the token')
+     parser.add_argument('--domain', help='Specify the domain')
+     parser.add_argument('--reset', action='store_true', help='Reset saved data')
+
+     args = parser.parse_args()
+
+     saved_data = get_saved_data()
+
+     if args.reset:
+         # Always start from empty values, even when no data file exists yet;
+         # otherwise saved_data stays None and the assignments below fail.
+         saved_data = { 'token': '', 'domain': ''}
+     else:
+         if saved_data is not None:
+             if args.token:
+                 saved_data['token'] = args.token
+             if args.domain:
+                 saved_data['domain'] = args.domain
+         else:
+             saved_data = { 'token': '', 'domain': ''}
+
+     if args.token is None:
+         if saved_data and saved_data['token']:
+             args.token = saved_data['token']
+         else:
+             args.token = input('Enter the token: ')
+             if args.token == '':
+                 args.token = input('Enter the token: ')
+             saved_data['token'] = args.token
+
+     if args.domain is None:
+         args.domain = ''
+         if saved_data and saved_data['domain']:
+             args.domain = saved_data['domain']
+         else:
+             args.domain = input('Enter the domain: ')
+             saved_data['domain'] = args.domain
+
+     save_data(saved_data)
+
+     print(f'Token: {args.token}')
+     print(f'Domain: {args.domain}')
+
+     if args.token != '':
+         ngrok.kill()
+         srv = ngrok.connect(target_port, pyngrok_config=conf.PyngrokConfig(auth_token=args.token),
+                             bind_tls=True, domain=args.domain).public_url
+         print(srv)
+
+         signal.signal(signal.SIGINT, signal_handler)
+         print('Press Ctrl+C to exit')
+         cmd = 'cd facefusion; python run.py --execution-providers cuda'
+         env = os.environ.copy()
+         subprocess.run(cmd, shell=True, env=env)
+         signal.pause()
+     else:
+         print('An ngrok token is required. You can get one on https://ngrok.com')
+
+ if __name__ == '__main__':
+     main()
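Note on how start-ngrok.py picks the token and domain: a command-line argument wins, then a non-empty saved value, and only then is the user prompted. The sketch below captures that precedence as one small function. `resolve_setting`, the `ask` callback, and the sample values are hypothetical names introduced for this illustration.

```python
def resolve_setting(cli_value, saved_value, ask):
    """Pick a value with the precedence: CLI argument > saved value > interactive prompt."""
    if cli_value is not None:
        return cli_value
    if saved_value:
        return saved_value
    return ask()


saved = {"token": "saved-token", "domain": ""}

# No CLI value: the saved token is reused, while the empty saved domain triggers the prompt.
token = resolve_setting(None, saved["token"], ask=lambda: "typed-token")
domain = resolve_setting(None, saved["domain"], ask=lambda: "typed-domain")
# A CLI value overrides whatever was saved.
override = resolve_setting("cli-token", saved["token"], ask=lambda: "typed-token")
print(token, domain, override)
```

Passing the prompt as a callback keeps the function testable, since `input` is only called when neither of the other sources has a value.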
start.sh ADDED
@@ -0,0 +1,47 @@
+ #!/bin/bash
+ eval "$(conda shell.bash hook)"
+
+ # Create the Conda environment
+ env_exists=1
+ if [ ! -d ~/.conda/envs/SwapFace2Pon ]
+ then
+     env_exists=0
+     conda create -y -n SwapFace2Pon python=3.10
+ fi
+
+ # Activate the environment created above
+ conda activate SwapFace2Pon
+
+ # Get SwapFace2Pon from Hugging Face
+ if [ ! -d "SwapFace2Pon" ]
+ then
+     git clone https://huggingface.co/spaces/victorisgeek/SwapFace2Pon
+ fi
+
+ # Update the installation if the parameter "update" was passed by running
+ # start.sh update
+ if [ $# -eq 1 ] && [ "$1" = "update" ]
+ then
+     cd SwapFace2Pon
+     git pull
+     cd ..
+ fi
+
+ # Install the required packages if the environment needs to be freshly installed or updated
+ if [ $# -eq 1 ] && [ "$1" = "update" ] || [ "$env_exists" = 0 ]
+ then
+     cd SwapFace2Pon
+     python install.py --torch cuda --onnxruntime cuda
+     cd ..
+     pip install pyngrok
+     conda install opencv -y
+     conda install ffmpeg -y
+ fi
+
+ # Start SwapFace2Pon with ngrok
+ if [ $# -eq 0 ]
+ then
+     python start-ngrok.py
+ elif [ "$1" = "reset" ]
+ then
+     python start-ngrok.py --reset
+ fi
update.requirements.txt ADDED
@@ -0,0 +1,16 @@
+ --extra-index-url https://download.pytorch.org/whl/cu118
+
+ gfpgan==1.3.8
+ gradio
+ insightface==0.7.3
+ moviepy>=1.0.3
+ numpy==1.24.3
+ onnx==1.14.0
+ onnxruntime==1.15.1; platform_system != 'Darwin'
+ opencv-python==4.8.0.74
+ pillow==10.0.0
+ protobuf==4.23.4
+ psutil==5.9.5
+ realesrgan==0.3.0
+ tensorflow==2.13.0
+ tqdm==4.65.0
utils.py ADDED
@@ -0,0 +1,303 @@
+ import os
+ import cv2
+ import time
+ import glob
+ import shutil
+ import platform
+ import datetime
+ import subprocess
+ import numpy as np
+ from threading import Thread
+ from moviepy.editor import VideoFileClip, ImageSequenceClip
+ from moviepy.video.io.ffmpeg_tools import ffmpeg_extract_subclip
+
+
+ logo_image = cv2.imread("./assets/images/logo.png", cv2.IMREAD_UNCHANGED)
+
+
+ quality_types = ["poor", "low", "medium", "high", "best"]
+
+
+ bitrate_quality_by_resolution = {
+     240: {"poor": "300k", "low": "500k", "medium": "800k", "high": "1000k", "best": "1200k"},
+     360: {"poor": "500k", "low": "800k", "medium": "1200k", "high": "1500k", "best": "2000k"},
+     480: {"poor": "800k", "low": "1200k", "medium": "2000k", "high": "2500k", "best": "3000k"},
+     720: {"poor": "1500k", "low": "2500k", "medium": "4000k", "high": "5000k", "best": "6000k"},
+     1080: {"poor": "2500k", "low": "4000k", "medium": "6000k", "high": "7000k", "best": "8000k"},
+     1440: {"poor": "4000k", "low": "6000k", "medium": "8000k", "high": "10000k", "best": "12000k"},
+     2160: {"poor": "8000k", "low": "10000k", "medium": "12000k", "high": "15000k", "best": "20000k"}
+ }
+
+
+ crf_quality_by_resolution = {
+     240: {"poor": 45, "low": 35, "medium": 28, "high": 23, "best": 20},
+     360: {"poor": 35, "low": 28, "medium": 23, "high": 20, "best": 18},
+     480: {"poor": 28, "low": 23, "medium": 20, "high": 18, "best": 16},
+     720: {"poor": 23, "low": 20, "medium": 18, "high": 16, "best": 14},
+     1080: {"poor": 20, "low": 18, "medium": 16, "high": 14, "best": 12},
+     1440: {"poor": 18, "low": 16, "medium": 14, "high": 12, "best": 10},
+     2160: {"poor": 16, "low": 14, "medium": 12, "high": 10, "best": 8}
+ }
+
+
+ def get_bitrate_for_resolution(resolution, quality):
+     available_resolutions = list(bitrate_quality_by_resolution.keys())
+     closest_resolution = min(available_resolutions, key=lambda x: abs(x - resolution))
+     return bitrate_quality_by_resolution[closest_resolution][quality]
+
+
+ def get_crf_for_resolution(resolution, quality):
+     available_resolutions = list(crf_quality_by_resolution.keys())
+     closest_resolution = min(available_resolutions, key=lambda x: abs(x - resolution))
+     return crf_quality_by_resolution[closest_resolution][quality]
+
+
+ def get_video_bitrate(video_file):
+     ffprobe_cmd = ['ffprobe', '-v', 'error', '-select_streams', 'v:0', '-show_entries',
+                    'stream=bit_rate', '-of', 'default=noprint_wrappers=1:nokey=1', video_file]
+     result = subprocess.run(ffprobe_cmd, stdout=subprocess.PIPE)
+     kbps = max(int(result.stdout) // 1000, 10)
+     return str(kbps) + 'k'
+
+
+ def trim_video(video_path, output_path, start_frame, stop_frame):
+     video_name, _ = os.path.splitext(os.path.basename(video_path))
+     trimmed_video_filename = video_name + "_trimmed" + ".mp4"
+     temp_path = os.path.join(output_path, "trim")
+     os.makedirs(temp_path, exist_ok=True)
+     trimmed_video_file_path = os.path.join(temp_path, trimmed_video_filename)
+
+     video = VideoFileClip(video_path, fps_source="fps")
+     fps = video.fps
+     start_time = start_frame / fps
+     duration = (stop_frame - start_frame) / fps
+
+     bitrate = get_bitrate_for_resolution(min(*video.size), "high")
+
+     trimmed_video = video.subclip(start_time, start_time + duration)
+     trimmed_video.write_videofile(
+         trimmed_video_file_path, codec="libx264", audio_codec="aac", bitrate=bitrate,
+     )
+     trimmed_video.close()
+     video.close()
+
+     return trimmed_video_file_path
+
+
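+ def open_directory(path=None):
+     if path is None:
+         return
+     # os.startfile exists only on Windows; use the platform's opener elsewhere.
+     system = platform.system()
+     if system == "Windows":
+         os.startfile(path)
+     elif system == "Darwin":
+         subprocess.Popen(["open", path])
+     else:
+         subprocess.Popen(["xdg-open", path])
+
+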
+ class StreamerThread(object):
+     def __init__(self, src=0):
+         self.capture = cv2.VideoCapture(src)
+         self.capture.set(cv2.CAP_PROP_BUFFERSIZE, 2)
+         self.FPS = 1 / 30
+         self.FPS_MS = int(self.FPS * 1000)
+         self.thread = None
+         self.stopped = False
+         self.frame = None
+
+     def start(self):
+         self.thread = Thread(target=self.update, args=())
+         self.thread.daemon = True
+         self.thread.start()
+
+     def stop(self):
+         self.stopped = True
+         self.thread.join()
+         print("stopped")
+
+     def update(self):
+         while not self.stopped:
+             if self.capture.isOpened():
+                 (self.status, self.frame) = self.capture.read()
+             time.sleep(self.FPS)
+
+
+ class ProcessBar:
+     def __init__(self, bar_length, total, before="⬛", after="🟨"):
+         self.bar_length = bar_length
+         self.total = total
+         self.before = before
+         self.after = after
+         self.bar = [self.before] * bar_length
+         self.start_time = time.time()
+
+     def get(self, index):
+         total = self.total
+         elapsed_time = time.time() - self.start_time
+         average_time_per_iteration = elapsed_time / (index + 1)
+         remaining_iterations = total - (index + 1)
+         estimated_remaining_time = remaining_iterations * average_time_per_iteration
+
+         self.bar[int(index / total * self.bar_length)] = self.after
+         info_text = f"({index+1}/{total}) {''.join(self.bar)} "
+         info_text += f"(ETR: {int(estimated_remaining_time // 60)} min {int(estimated_remaining_time % 60)} sec)"
+         return info_text
+
+
145
+ def add_logo_to_image(img, logo=logo_image):
146
+ logo_size = int(img.shape[1] * 0.1)
147
+ logo = cv2.resize(logo, (logo_size, logo_size))
148
+ if logo.shape[2] == 4:
149
+ alpha = logo[:, :, 3]
150
+ else:
151
+ alpha = np.ones_like(logo[:, :, 0]) * 255
152
+ padding = int(logo_size * 0.1)
153
+ roi = img.shape[0] - logo_size - padding, img.shape[1] - logo_size - padding
154
+ for c in range(0, 3):
155
+ img[roi[0] : roi[0] + logo_size, roi[1] : roi[1] + logo_size, c] = (
156
+ alpha / 255.0
157
+ ) * logo[:, :, c] + (1 - alpha / 255.0) * img[
158
+ roi[0] : roi[0] + logo_size, roi[1] : roi[1] + logo_size, c
159
+ ]
160
+ return img
161
+
162
+
163
+ def split_list_by_lengths(data, length_list):
+     split_data = []
+     start_idx = 0
+     for length in length_list:
+         end_idx = start_idx + length
+         sublist = data[start_idx:end_idx]
+         split_data.append(sublist)
+         start_idx = end_idx
+     return split_data
+
+
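A minimal usage sketch for `split_list_by_lengths`. The function body is repeated so the snippet runs on its own, and the sample data and lengths are made up for illustration. It shows how a flat list is regrouped into consecutive chunks, and that lengths summing to less than `len(data)` silently drop the tail.

```python
def split_list_by_lengths(data, length_list):
    # Same logic as the function in this diff, repeated so the sketch is self-contained.
    split_data = []
    start_idx = 0
    for length in length_list:
        end_idx = start_idx + length
        split_data.append(data[start_idx:end_idx])
        start_idx = end_idx
    return split_data


# Hypothetical input: five items that belong to groups of 2, 0 and 3.
flat = ["a", "b", "c", "d", "e"]
grouped = split_list_by_lengths(flat, [2, 0, 3])
print(grouped)  # [['a', 'b'], [], ['c', 'd', 'e']]

# Lengths that sum to less than len(data) leave the tail unused.
partial = split_list_by_lengths(flat, [1, 2])
print(partial)  # [['a'], ['b', 'c']]
```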
+ def merge_img_sequence_from_ref(ref_video_path, image_sequence, output_file_name):
+     video_clip = VideoFileClip(ref_video_path, fps_source="fps")
+     fps = video_clip.fps
+     duration = video_clip.duration
+     audio_clip = video_clip.audio
+     edited_video_clip = ImageSequenceClip(image_sequence, fps=fps)
+
+     if audio_clip is not None:
+         edited_video_clip = edited_video_clip.set_audio(audio_clip)
+
+     bitrate = get_bitrate_for_resolution(min(*edited_video_clip.size), "high")
+
+     edited_video_clip.set_duration(duration).write_videofile(
+         output_file_name, codec="libx264", bitrate=bitrate,
+     )
+     edited_video_clip.close()
+     video_clip.close()
+
+
+ def scale_bbox_from_center(bbox, scale_width, scale_height, image_width, image_height):
+     # Extract the coordinates of the bbox
+     x1, y1, x2, y2 = bbox
+
+     # Calculate the center point of the bbox
+     center_x = (x1 + x2) / 2
+     center_y = (y1 + y2) / 2
+
+     # Calculate the new width and height of the bbox based on the scaling factors
+     width = x2 - x1
+     height = y2 - y1
+     new_width = width * scale_width
+     new_height = height * scale_height
+
+     # Calculate the new coordinates of the bbox, considering the image boundaries
+     new_x1 = center_x - new_width / 2
+     new_y1 = center_y - new_height / 2
+     new_x2 = center_x + new_width / 2
+     new_y2 = center_y + new_height / 2
+
+     # Adjust the coordinates to ensure the bbox remains within the image boundaries
+     new_x1 = max(0, new_x1)
+     new_y1 = max(0, new_y1)
+     new_x2 = min(image_width - 1, new_x2)
+     new_y2 = min(image_height - 1, new_y2)
+
+     # Return the scaled bbox coordinates
+     scaled_bbox = [new_x1, new_y1, new_x2, new_y2]
+     return scaled_bbox
+
+
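A short worked example of `scale_bbox_from_center`, with a condensed copy of the function so the snippet runs standalone. The box coordinates and image sizes are illustrative values, not taken from the application. The second call shows that clamping is applied per edge, so a clamped box is no longer centered on the original center.

```python
def scale_bbox_from_center(bbox, scale_width, scale_height, image_width, image_height):
    # Condensed copy of the function in this diff.
    x1, y1, x2, y2 = bbox
    center_x = (x1 + x2) / 2
    center_y = (y1 + y2) / 2
    new_width = (x2 - x1) * scale_width
    new_height = (y2 - y1) * scale_height
    new_x1 = max(0, center_x - new_width / 2)
    new_y1 = max(0, center_y - new_height / 2)
    new_x2 = min(image_width - 1, center_x + new_width / 2)
    new_y2 = min(image_height - 1, center_y + new_height / 2)
    return [new_x1, new_y1, new_x2, new_y2]


# A 20x40 box centered at (50, 60), widened 2x and heightened 1.5x in a 100x100 image.
inside = scale_bbox_from_center([40, 40, 60, 80], 2, 1.5, 100, 100)
print(inside)  # [30.0, 30.0, 70.0, 90.0]

# Near the border of a 50x50 image the far edges are clamped to 49.
clamped = scale_bbox_from_center([30, 30, 50, 50], 3, 3, 50, 50)
print(clamped)  # [10.0, 10.0, 49, 49]
```

The returned coordinates may be floats, so callers that slice arrays with them need to cast to `int` first.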
+ def laplacian_blending(A, B, m, num_levels=7):
+     assert A.shape == B.shape
+     assert B.shape == m.shape
+     height = m.shape[0]
+     width = m.shape[1]
+     # Pad all inputs onto a square canvas whose side is the next power of two.
+     size_list = np.array([4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192])
+     size = size_list[np.where(size_list > max(height, width))][0]
+     GA = np.zeros((size, size, 3), dtype=np.float32)
+     GA[:height, :width, :] = A
+     GB = np.zeros((size, size, 3), dtype=np.float32)
+     GB[:height, :width, :] = B
+     GM = np.zeros((size, size, 3), dtype=np.float32)
+     GM[:height, :width, :] = m
+     # Gaussian pyramids for both images and the mask.
+     gpA = [GA]
+     gpB = [GB]
+     gpM = [GM]
+     for i in range(num_levels):
+         GA = cv2.pyrDown(GA)
+         GB = cv2.pyrDown(GB)
+         GM = cv2.pyrDown(GM)
+         gpA.append(np.float32(GA))
+         gpB.append(np.float32(GB))
+         gpM.append(np.float32(GM))
+     # Laplacian pyramids, coarsest level first.
+     lpA = [gpA[num_levels - 1]]
+     lpB = [gpB[num_levels - 1]]
+     gpMr = [gpM[num_levels - 1]]
+     for i in range(num_levels - 1, 0, -1):
+         LA = np.subtract(gpA[i - 1], cv2.pyrUp(gpA[i]))
+         LB = np.subtract(gpB[i - 1], cv2.pyrUp(gpB[i]))
+         lpA.append(LA)
+         lpB.append(LB)
+         gpMr.append(gpM[i - 1])
+     # Blend each level with the mask: A where the mask is 1, B where it is 0.
+     LS = []
+     for la, lb, gm in zip(lpA, lpB, gpMr):
+         ls = la * gm + lb * (1.0 - gm)
+         LS.append(ls)
+     # Collapse the blended pyramid back to full resolution.
+     ls_ = LS[0]
+     for i in range(1, num_levels):
+         ls_ = cv2.pyrUp(ls_)
+         ls_ = cv2.add(ls_, LS[i])
+     ls_ = ls_[:height, :width, :]
+     #ls_ = (ls_ - np.min(ls_)) * (255.0 / (np.max(ls_) - np.min(ls_)))
+     return ls_.clip(0, 255)
+
+
+ def mask_crop(mask, crop):
+     top, bottom, left, right = crop
+     shape = mask.shape
+     top = int(top)
+     bottom = int(bottom)
+     # Rows are axis 0, so the vertical crop is checked against the mask height.
+     if top + bottom < shape[0]:
+         if top > 0:
+             mask[:top, :] = 0
+         if bottom > 0:
+             mask[-bottom:, :] = 0
+
+     left = int(left)
+     right = int(right)
+     # Columns are axis 1, so the horizontal crop is checked against the mask width.
+     if left + right < shape[1]:
+         if left > 0:
+             mask[:, :left] = 0
+         if right > 0:
+             mask[:, -right:] = 0
+
+     return mask
+
+ def create_image_grid(images, size=128):
+     num_images = len(images)
+     num_cols = int(np.ceil(np.sqrt(num_images)))
+     num_rows = int(np.ceil(num_images / num_cols))
+     grid = np.zeros((num_rows * size, num_cols * size, 3), dtype=np.uint8)
+
+     for i, image in enumerate(images):
+         row_idx = (i // num_cols) * size
+         col_idx = (i % num_cols) * size
+         image = cv2.resize(image.copy(), (size, size))
+         if image.dtype != np.uint8:
+             image = (image.astype('float32') * 255).astype('uint8')
+         if image.ndim == 2:
+             image = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)
+         grid[row_idx:row_idx + size, col_idx:col_idx + size] = image
+
+     return grid
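
The layout arithmetic in `create_image_grid` can be checked without any image data. The sketch below reproduces only the row/column computation with the standard library. `grid_layout` and `tile_origin` are hypothetical names introduced for illustration, and at least one image is assumed, since zero images would divide by zero.

```python
import math


def grid_layout(num_images):
    """Return (rows, cols) for a near-square grid; assumes num_images >= 1."""
    num_cols = math.ceil(math.sqrt(num_images))
    num_rows = math.ceil(num_images / num_cols)
    return num_rows, num_cols


def tile_origin(i, num_cols, size=128):
    """Top-left (row, col) pixel offset of the i-th tile, filled row by row."""
    return (i // num_cols) * size, (i % num_cols) * size


print(grid_layout(5))              # (2, 3)
print(grid_layout(9))              # (3, 3)
print(tile_origin(4, num_cols=3))  # (128, 128)
```

With 5 images the grid is 2 rows by 3 columns, so the last cell stays black because the canvas is initialized with zeros.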