Add application file
- .gitignore +3 -0
- README.md +186 -4
- app.py +1396 -0
- requirements.txt +2 -0
- webui.bat +73 -0
.gitignore
ADDED
@@ -0,0 +1,3 @@
+.vs
+venv
+tmp
README.md
CHANGED
@@ -1,14 +1,196 @@
---
-title:
+title: ImgutilsVideoProcessor
-emoji:
+emoji: 🖼️
colorFrom: indigo
-colorTo:
+colorTo: pink
sdk: gradio
sdk_version: 5.29.0
app_file: app.py
-pinned:
+pinned: true
license: mit
short_description: 'Video Processor: Detection, Classification, Analysis'
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

---

# [deepghs/anime_person_detection](https://huggingface.co/deepghs/anime_person_detection)

| Model | FLOPS | Params | F1 Score | Threshold | F1 Plot | Confusion | Labels |
|:--------------------:|:-------:|:--------:|:----------:|:-----------:|:---------------------------------------------------------------------------------------------------------:|:----------------------------------------------------------------------------------------------------------------------:|:--------:|
| person_detect_v1.2_s | 3.49k | 11.1M | 0.86 | 0.295 | [plot](https://huggingface.co/deepghs/anime_person_detection/blob/main/person_detect_v1.2_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_person_detection/blob/main/person_detect_v1.2_s/confusion_matrix.png) | `person` |
| person_detect_v1.3_s | 3.49k | 11.1M | 0.86 | 0.324 | [plot](https://huggingface.co/deepghs/anime_person_detection/blob/main/person_detect_v1.3_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_person_detection/blob/main/person_detect_v1.3_s/confusion_matrix.png) | `person` |
| person_detect_v0_x | 31.3k | 68.1M | N/A | N/A | N/A | N/A | `person` |
| person_detect_v0_m | 9.53k | 25.8M | 0.85 | 0.424 | [plot](https://huggingface.co/deepghs/anime_person_detection/blob/main/person_detect_v0_m/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_person_detection/blob/main/person_detect_v0_m/confusion_matrix.png) | `person` |
| person_detect_v0_s | 3.49k | 11.1M | N/A | N/A | N/A | N/A | `person` |
| person_detect_v1.1_s | 3.49k | 11.1M | 0.86 | 0.384 | [plot](https://huggingface.co/deepghs/anime_person_detection/blob/main/person_detect_v1.1_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_person_detection/blob/main/person_detect_v1.1_s/confusion_matrix.png) | `person` |
| person_detect_v1.1_n | 898 | 3.01M | 0.85 | 0.327 | [plot](https://huggingface.co/deepghs/anime_person_detection/blob/main/person_detect_v1.1_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_person_detection/blob/main/person_detect_v1.1_n/confusion_matrix.png) | `person` |
| person_detect_v1.1_m | 9.53k | 25.8M | 0.87 | 0.348 | [plot](https://huggingface.co/deepghs/anime_person_detection/blob/main/person_detect_v1.1_m/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_person_detection/blob/main/person_detect_v1.1_m/confusion_matrix.png) | `person` |
| person_detect_v1_m | 9.53k | 25.8M | 0.86 | 0.351 | [plot](https://huggingface.co/deepghs/anime_person_detection/blob/main/person_detect_v1_m/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_person_detection/blob/main/person_detect_v1_m/confusion_matrix.png) | `person` |
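
The snippet below is a usage sketch added for orientation, not part of this commit: it shows how a model from the table above might be invoked through the `imgutils.detect.person` module that `app.py` imports. The `conf_threshold` keyword and the `((x0, y0, x1, y1), label, score)` result layout are assumptions based on the dghs-imgutils documentation; 0.295 is the F1-optimal threshold listed for `person_detect_v1.2_s`.

```python
# Usage sketch (assumed dghs-imgutils API surface, not part of app.py).
from PIL import Image
import imgutils.detect.person as person_detector  # same import style as app.py

frame = Image.open('frame_0001.png')
# 0.295 = F1-optimal confidence threshold reported for person_detect_v1.2_s above.
detections = person_detector.detect_person(frame, conf_threshold=0.295)
for (x0, y0, x1, y1), label, score in detections:
    print(f"{label} {score:.3f} box=({x0}, {y0}, {x1}, {y1})")
```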

___

# [deepghs/anime_halfbody_detection](https://huggingface.co/deepghs/anime_halfbody_detection)
| Model | FLOPS | Params | F1 Score | Threshold | F1 Plot | Confusion | Labels |
|:----------------------:|:-------:|:--------:|:----------:|:-----------:|:-------------------------------------------------------------------------------------------------------------:|:--------------------------------------------------------------------------------------------------------------------------:|:----------:|
| halfbody_detect_v1.0_n | 898 | 3.01M | 0.94 | 0.512 | [plot](https://huggingface.co/deepghs/anime_halfbody_detection/blob/main/halfbody_detect_v1.0_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_halfbody_detection/blob/main/halfbody_detect_v1.0_n/confusion_matrix.png) | `halfbody` |
| halfbody_detect_v1.0_s | 3.49k | 11.1M | 0.95 | 0.577 | [plot](https://huggingface.co/deepghs/anime_halfbody_detection/blob/main/halfbody_detect_v1.0_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_halfbody_detection/blob/main/halfbody_detect_v1.0_s/confusion_matrix.png) | `halfbody` |
| halfbody_detect_v0.4_s | 3.49k | 11.1M | 0.93 | 0.517 | [plot](https://huggingface.co/deepghs/anime_halfbody_detection/blob/main/halfbody_detect_v0.4_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_halfbody_detection/blob/main/halfbody_detect_v0.4_s/confusion_matrix.png) | `halfbody` |
| halfbody_detect_v0.3_s | 3.49k | 11.1M | 0.92 | 0.222 | [plot](https://huggingface.co/deepghs/anime_halfbody_detection/blob/main/halfbody_detect_v0.3_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_halfbody_detection/blob/main/halfbody_detect_v0.3_s/confusion_matrix.png) | `halfbody` |
| halfbody_detect_v0.2_s | 3.49k | 11.1M | 0.94 | 0.548 | [plot](https://huggingface.co/deepghs/anime_halfbody_detection/blob/main/halfbody_detect_v0.2_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_halfbody_detection/blob/main/halfbody_detect_v0.2_s/confusion_matrix.png) | `halfbody` |

___

# [deepghs/anime_head_detection](https://huggingface.co/deepghs/anime_head_detection)
| Model | Type | FLOPS | Params | F1 Score | Threshold | precision(B) | recall(B) | mAP50(B) | mAP50-95(B) | F1 Plot | Confusion | Labels |
|:-------------------------:|:------:|:-------:|:--------:|:----------:|:-----------:|:--------------:|:-----------:|:----------:|:-------------:|:------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------------------------:|:--------:|
| head_detect_v2.0_x_yv11 | yolo | 195G | 56.9M | 0.93 | 0.458 | 0.95942 | 0.90938 | 0.96853 | 0.78938 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_x_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_x_yv11/confusion_matrix_normalized.png) | `head` |
| head_detect_v2.0_l_yv11 | yolo | 87.3G | 25.3M | 0.93 | 0.432 | 0.95557 | 0.90905 | 0.9661 | 0.78709 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_l_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_l_yv11/confusion_matrix_normalized.png) | `head` |
| head_detect_v2.0_m_yv11 | yolo | 68.2G | 20.1M | 0.93 | 0.42 | 0.95383 | 0.90658 | 0.96485 | 0.78511 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_m_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_m_yv11/confusion_matrix_normalized.png) | `head` |
| head_detect_v2.0_s_yv11 | yolo | 21.5G | 9.43M | 0.92 | 0.383 | 0.954 | 0.89512 | 0.95789 | 0.77753 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_s_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_s_yv11/confusion_matrix_normalized.png) | `head` |
| head_detect_v2.0_n_yv11 | yolo | 6.44G | 2.59M | 0.91 | 0.365 | 0.94815 | 0.87169 | 0.9452 | 0.75835 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_n_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_n_yv11/confusion_matrix_normalized.png) | `head` |
| head_detect_v2.0_x | yolo | 227G | 61.6M | 0.93 | 0.459 | 0.95378 | 0.91123 | 0.96593 | 0.78767 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_x/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_x/confusion_matrix_normalized.png) | `head` |
| head_detect_v2.0_l | yolo | 146G | 39.5M | 0.93 | 0.379 | 0.95124 | 0.91264 | 0.96458 | 0.78627 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_l/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_l/confusion_matrix_normalized.png) | `head` |
| head_detect_v2.0_m | yolo | 79.1G | 25.9M | 0.93 | 0.397 | 0.95123 | 0.90701 | 0.96403 | 0.78342 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_m/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_m/confusion_matrix_normalized.png) | `head` |
| head_detect_v2.0_s | yolo | 28.6G | 11.1M | 0.92 | 0.413 | 0.95556 | 0.89197 | 0.95799 | 0.77833 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_s/confusion_matrix_normalized.png) | `head` |
| head_detect_v2.0_n | yolo | 8.19G | 3.01M | 0.91 | 0.368 | 0.94633 | 0.87046 | 0.94361 | 0.75764 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v2.0_n/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.6_x_rtdetr | rtdetr | 232G | 67.3M | 0.93 | 0.559 | 0.95316 | 0.91697 | 0.96556 | 0.76682 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_x_rtdetr/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_x_rtdetr/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.6_l_rtdetr | rtdetr | 108G | 32.8M | 0.93 | 0.53 | 0.95113 | 0.90956 | 0.96218 | 0.76201 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_l_rtdetr/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_l_rtdetr/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.6_s_yv11 | yolo | 21.5G | 9.43M | 0.93 | 0.42 | 0.95273 | 0.90558 | 0.96327 | 0.78566 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_s_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_s_yv11/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.6_n_yv11 | yolo | 6.44G | 2.59M | 0.92 | 0.385 | 0.95561 | 0.87798 | 0.95086 | 0.76765 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_n_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_n_yv11/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.6_s_yv9 | yolo | 22.7G | 6.32M | 0.93 | 0.419 | 0.95464 | 0.90425 | 0.9627 | 0.78663 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_s_yv9/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_s_yv9/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.6_t_yv9 | yolo | 6.7G | 1.77M | 0.91 | 0.332 | 0.94968 | 0.8792 | 0.95069 | 0.76789 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_t_yv9/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_t_yv9/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.6_x | yolo | 258G | 68.2M | 0.94 | 0.448 | 0.9546 | 0.91873 | 0.96878 | 0.79502 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_x/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_x/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.6_l | yolo | 165G | 43.6M | 0.94 | 0.458 | 0.95733 | 0.92018 | 0.96868 | 0.79428 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_l/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_l/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.6_s_yv10 | yolo | 24.8G | 8.07M | 0.93 | 0.406 | 0.95424 | 0.90074 | 0.96201 | 0.78713 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_s_yv10/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_s_yv10/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.6_n_yv10 | yolo | 8.39G | 2.71M | 0.91 | 0.374 | 0.94845 | 0.87492 | 0.9503 | 0.77059 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_n_yv10/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_n_yv10/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.6_s | yolo | 28.6G | 11.1M | 0.93 | 0.381 | 0.95333 | 0.90587 | 0.96241 | 0.78688 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_s/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.6_n | yolo | 8.19G | 3.01M | 0.92 | 0.38 | 0.94835 | 0.88436 | 0.95051 | 0.76766 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.6_n/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.5_s | yolo | 28.6G | 11.1M | 0.94 | 0.453 | 0.96014 | 0.92275 | 0.96829 | 0.80674 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.5_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.5_s/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.5_n | yolo | 8.19G | 3.01M | 0.93 | 0.396 | 0.95719 | 0.90511 | 0.9612 | 0.78841 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.5_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.5_n/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.4_s | yolo | 28.6G | 11.1M | 0.94 | 0.472 | 0.96275 | 0.91875 | 0.96812 | 0.80417 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.4_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.4_s/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.4_n | yolo | 8.19G | 3.01M | 0.93 | 0.396 | 0.9557 | 0.90559 | 0.96075 | 0.78689 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.4_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.4_n/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.3_s | yolo | 28.6G | 11.1M | 0.94 | 0.423 | 0.95734 | 0.9257 | 0.97037 | 0.80391 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.3_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.3_s/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.3_n | yolo | 8.19G | 3.01M | 0.93 | 0.409 | 0.95254 | 0.90674 | 0.96258 | 0.7844 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.3_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.3_n/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.2_s | yolo | 28.6G | 11.1M | 0.94 | 0.415 | 0.95756 | 0.9271 | 0.97097 | 0.80514 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.2_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.2_s/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.2_n | yolo | 8.19G | 3.01M | 0.93 | 0.471 | 0.96309 | 0.89766 | 0.9647 | 0.78928 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.2_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.2_n/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.1_s | yolo | 28.6G | 11.1M | 0.94 | 0.485 | 0.96191 | 0.91892 | 0.97069 | 0.80182 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.1_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.1_s/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.0_l | yolo | 165G | 43.6M | 0.94 | 0.579 | 0.95881 | 0.91532 | 0.96561 | 0.81417 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.0_l/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.0_l/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.0_x | yolo | 258G | 68.2M | 0.94 | 0.567 | 0.9597 | 0.91947 | 0.96682 | 0.8154 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.0_x/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.0_x/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.0_m | yolo | 79.1G | 25.9M | 0.94 | 0.489 | 0.95805 | 0.9196 | 0.96632 | 0.81383 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.0_m/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.0_m/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.0_s | yolo | 28.6G | 11.1M | 0.93 | 0.492 | 0.95267 | 0.91355 | 0.96245 | 0.80371 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.0_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.0_s/confusion_matrix_normalized.png) | `head` |
| head_detect_v1.0_n | yolo | 8.19G | 3.01M | 0.92 | 0.375 | 0.93999 | 0.9002 | 0.95509 | 0.7849 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.0_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v1.0_n/confusion_matrix_normalized.png) | `head` |
| head_detect_v0.5_s | yolo | 28.6G | 11.1M | 0.92 | 0.415 | 0.93908 | 0.9034 | 0.95697 | 0.77514 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.5_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.5_s/confusion_matrix_normalized.png) | `head` |
| head_detect_v0.5_n | yolo | 8.19G | 3.01M | 0.91 | 0.446 | 0.93834 | 0.88034 | 0.94784 | 0.75251 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.5_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.5_n/confusion_matrix_normalized.png) | `head` |
| head_detect_v0.5_s_pruned | yolo | 28.6G | 11.1M | 0.93 | 0.472 | 0.95455 | 0.89865 | 0.9584 | 0.79968 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.5_s_pruned/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.5_s_pruned/confusion_matrix_normalized.png) | `head` |
| head_detect_v0.5_n_pruned | yolo | 8.19G | 3.01M | 0.91 | 0.523 | 0.95254 | 0.8743 | 0.95049 | 0.7807 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.5_n_pruned/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.5_n_pruned/confusion_matrix_normalized.png) | `head` |
| head_detect_v0.5_m_pruned | yolo | 79.1G | 25.9M | 0.94 | 0.52 | 0.9609 | 0.91365 | 0.96501 | 0.81322 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.5_m_pruned/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.5_m_pruned/confusion_matrix_normalized.png) | `head` |
| head_detect_v0.4_s | yolo | 28.6G | 11.1M | 0.92 | 0.405 | 0.93314 | 0.90274 | 0.95727 | 0.77193 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.4_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.4_s/confusion_matrix_normalized.png) | `head` |
| head_detect_v0.4_s_fp | yolo | 28.6G | 11.1M | 0.91 | 0.445 | 0.93181 | 0.89113 | 0.95002 | 0.76302 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.4_s_fp/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.4_s_fp/confusion_matrix_normalized.png) | `head` |
| head_detect_v0.3_s | yolo | 28.6G | 11.1M | 0.91 | 0.406 | 0.92457 | 0.90351 | 0.95785 | 0.78912 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.3_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.3_s/confusion_matrix_normalized.png) | `head` |
| head_detect_v0.2_s_plus | yolo | 28.6G | 11.1M | 0.91 | 0.594 | 0.94239 | 0.8774 | 0.94909 | 0.77986 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.2_s_plus/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.2_s_plus/confusion_matrix_normalized.png) | `head` |
| head_detect_v0.2_s | yolo | 28.6G | 11.1M | 0.9 | 0.461 | 0.91861 | 0.8898 | 0.94765 | 0.77541 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.2_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.2_s/confusion_matrix_normalized.png) | `head` |
| head_detect_v0.1_s | yolo | 28.6G | 11.1M | 0.9 | 0.504 | 0.91576 | 0.88662 | 0.94213 | 0.7713 | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.1_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0.1_s/confusion_matrix_normalized.png) | `head` |
| head_detect_v0_n | yolo | 8.19G | 3.01M | 0.9 | 0.316 | N/A | N/A | N/A | N/A | [plot](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_head_detection/blob/main/head_detect_v0_n/confusion_matrix.png) | `head` |
| head_detect_v0_s | yolo | 28.6G | 11.1M | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | `head` |

___

# [deepghs/ccip](https://huggingface.co/deepghs/ccip)
| Model | F1 Score | Precision | Recall | Threshold | Cluster_2 | Cluster_Free |
|:-----------------------------------:|:----------:|:-----------:|:--------:|:-----------:|:-----------:|:--------------:|
| ccip-caformer_b36-24 | 0.940925 | 0.938254 | 0.943612 | 0.213231 | 0.89508 | 0.957017 |
| ccip-caformer-24-randaug-pruned | 0.917211 | 0.933481 | 0.901499 | 0.178475 | 0.890366 | 0.922375 |
| ccip-v2-caformer_s36-10 | 0.906422 | 0.932779 | 0.881513 | 0.207757 | 0.874592 | 0.89241 |
| ccip-caformer-6-randaug-pruned_fp32 | 0.878403 | 0.893648 | 0.863669 | 0.195122 | 0.810176 | 0.897904 |
| ccip-caformer-5_fp32 | 0.864363 | 0.90155 | 0.830121 | 0.183973 | 0.792051 | 0.862289 |
| ccip-caformer-4_fp32 | 0.844967 | 0.870553 | 0.820842 | 0.18367 | 0.795565 | 0.868133 |
| ccip-caformer_query-12 | 0.823928 | 0.871122 | 0.781585 | 0.141308 | 0.787237 | 0.809426 |
| ccip-caformer-23_randaug_fp32 | 0.81625 | 0.854134 | 0.781585 | 0.136797 | 0.745697 | 0.8068 |
| ccip-caformer-2-randaug-pruned_fp32 | 0.78561 | 0.800148 | 0.771592 | 0.171053 | 0.686617 | 0.728195 |
| ccip-caformer-2_fp32 | 0.755125 | 0.790172 | 0.723055 | 0.141275 | 0.64977 | 0.718516 |
* The calculation of `F1 Score`, `Precision`, and `Recall` considers "the characters in both images are the same" as a positive case. `Threshold` is determined by finding the maximum value on the F1 Score curve.
* `Cluster_2` represents the approximate optimal clustering solution obtained by tuning the eps value in the DBSCAN clustering algorithm with min_samples set to `2`, and evaluating the similarity between the obtained clusters and the true distribution using the `random_adjust_score`.
* `Cluster_Free` represents the approximate optimal solution obtained by tuning the `max_eps` and `min_samples` values in the OPTICS clustering algorithm, and evaluating the similarity between the obtained clusters and the true distribution using the `random_adjust_score`.
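
To make the `Cluster_2` description above concrete, here is a hedged sketch of that style of evaluation: DBSCAN with `min_samples=2` over pairwise CCIP differences, scored against the true character assignment with scikit-learn's adjusted Rand score (presumably what `random_adjust_score` refers to). The `ccip_extract_feature` and `ccip_batch_differences` names are assumptions about the `imgutils.metrics.ccip` API that `app.py` imports.

```python
# Hypothetical reproduction of the Cluster_2 metric described above.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import adjusted_rand_score
from imgutils.metrics import ccip_extract_feature, ccip_batch_differences  # assumed names

def cluster_2_score(image_paths, true_character_ids):
    feats = [ccip_extract_feature(p) for p in image_paths]
    dist = ccip_batch_differences(feats)  # pairwise CCIP difference matrix
    best = -1.0
    for eps in np.linspace(0.05, 0.40, 36):  # sweep eps, keep the best agreement
        labels = DBSCAN(eps=eps, min_samples=2, metric='precomputed').fit_predict(dist)
        best = max(best, adjusted_rand_score(true_character_ids, labels))
    return best
```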

___

# [deepghs/anime_aesthetic](https://huggingface.co/spaces/deepghs/anime_aesthetic)
| Name | FLOPS | Params | Accuracy | AUC | Confusion | Labels |
|:------------------------:|:-------:|:--------:|:----------:|:------:|:-----------------------------------------------------------------------------------------------------------------:|:----------------------------------------------------------------:|
| caformer_s36_v0_ls0.2 | 22.10G | 37.22M | 34.68% | 0.7725 | [confusion](https://huggingface.co/deepghs/anime_aesthetic/blob/main/caformer_s36_v0_ls0.2/plot_confusion.png) | `masterpiece`, `best`, `great`, `good`, `normal`, `low`, `worst` |
| swinv2pv3_v0_448_ls0.2 | 46.20G | 65.94M | 40.32% | 0.8188 | [confusion](https://huggingface.co/deepghs/anime_aesthetic/blob/main/swinv2pv3_v0_448_ls0.2/plot_confusion.png) | `masterpiece`, `best`, `great`, `good`, `normal`, `low`, `worst` |
| swinv2pv3_v0_448_ls0.2_x | 46.20G | 65.94M | 40.88% | 0.8214 | [confusion](https://huggingface.co/deepghs/anime_aesthetic/blob/main/swinv2pv3_v0_448_ls0.2_x/plot_confusion.png) | `masterpiece`, `best`, `great`, `good`, `normal`, `low`, `worst` |
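
A very rough usage sketch for these aesthetic models follows; the `anime_dbaesthetic` function name and the shape of its result are assumptions inferred from the `imgutils.metrics.dbaesthetic` module imported by `app.py`, so consult the dghs-imgutils documentation before relying on this.

```python
# Hypothetical call; the function name and result shape are assumptions, not confirmed API.
from PIL import Image
import imgutils.metrics.dbaesthetic as dbaesthetic_analyzer  # same import as app.py

image = Image.open('sample.png')
result = dbaesthetic_analyzer.anime_dbaesthetic(image)
print(result)  # expected to reflect one of the labels above (masterpiece ... worst)
```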

___

# [deepghs/anime_face_detection](https://huggingface.co/deepghs/anime_face_detection)
| Model | FLOPS | Params | F1 Score | Threshold | F1 Plot | Confusion | Labels |
|:------------------:|:-------:|:--------:|:----------:|:-----------:|:-----------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:--------:|
| face_detect_v1.4_n | 898 | 3.01M | 0.94 | 0.278 | [plot](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1.4_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1.4_n/confusion_matrix.png) | `face` |
| face_detect_v1.4_s | 3.49k | 11.1M | 0.95 | 0.307 | [plot](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1.4_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1.4_s/confusion_matrix.png) | `face` |
| face_detect_v1.3_n | 898 | 3.01M | 0.93 | 0.305 | [plot](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1.3_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1.3_n/confusion_matrix.png) | `face` |
| face_detect_v1.2_s | 3.49k | 11.1M | 0.93 | 0.222 | [plot](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1.2_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1.2_s/confusion_matrix.png) | `face` |
| face_detect_v1.3_s | 3.49k | 11.1M | 0.93 | 0.259 | [plot](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1.3_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1.3_s/confusion_matrix.png) | `face` |
| face_detect_v1_s | 3.49k | 11.1M | 0.95 | 0.446 | [plot](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1_s/confusion_matrix.png) | `face` |
| face_detect_v1_n | 898 | 3.01M | 0.95 | 0.458 | [plot](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1_n/confusion_matrix.png) | `face` |
| face_detect_v0_n | 898 | 3.01M | 0.97 | 0.428 | [plot](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v0_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v0_n/confusion_matrix.png) | `face` |
| face_detect_v0_s | 3.49k | 11.1M | N/A | N/A | N/A | N/A | `face` |
| face_detect_v1.1_n | 898 | 3.01M | 0.94 | 0.373 | [plot](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1.1_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1.1_n/confusion_matrix.png) | `face` |
| face_detect_v1.1_s | 3.49k | 11.1M | 0.94 | 0.405 | [plot](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1.1_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/anime_face_detection/blob/main/face_detect_v1.1_s/confusion_matrix.png) | `face` |
___

# [deepghs/real_person_detection](https://huggingface.co/deepghs/real_person_detection)
| Model | Type | FLOPS | Params | F1 Score | Threshold | precision(B) | recall(B) | mAP50(B) | mAP50-95(B) | F1 Plot | Confusion | Labels |
|:-----------------------:|:------:|:-------:|:--------:|:----------:|:-----------:|:--------------:|:-----------:|:----------:|:-------------:|:-----------------------------------------------------------------------------------------------------------:|:-----------------------------------------------------------------------------------------------------------------------------------:|:--------:|
| person_detect_v0_l_yv11 | yolo | 87.3G | 25.3M | 0.79 | 0.359 | 0.84037 | 0.74055 | 0.82796 | 0.57272 | [plot](https://huggingface.co/deepghs/real_person_detection/blob/main/person_detect_v0_l_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_person_detection/blob/main/person_detect_v0_l_yv11/confusion_matrix_normalized.png) | `person` |
| person_detect_v0_m_yv11 | yolo | 68.2G | 20.1M | 0.78 | 0.351 | 0.83393 | 0.73614 | 0.82195 | 0.56267 | [plot](https://huggingface.co/deepghs/real_person_detection/blob/main/person_detect_v0_m_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_person_detection/blob/main/person_detect_v0_m_yv11/confusion_matrix_normalized.png) | `person` |
| person_detect_v0_s_yv11 | yolo | 21.5G | 9.43M | 0.75 | 0.344 | 0.82356 | 0.6967 | 0.79224 | 0.52304 | [plot](https://huggingface.co/deepghs/real_person_detection/blob/main/person_detect_v0_s_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_person_detection/blob/main/person_detect_v0_s_yv11/confusion_matrix_normalized.png) | `person` |
| person_detect_v0_n_yv11 | yolo | 6.44G | 2.59M | 0.71 | 0.325 | 0.80096 | 0.64148 | 0.74612 | 0.46875 | [plot](https://huggingface.co/deepghs/real_person_detection/blob/main/person_detect_v0_n_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_person_detection/blob/main/person_detect_v0_n_yv11/confusion_matrix_normalized.png) | `person` |
| person_detect_v0_l | yolo | 165G | 43.6M | 0.79 | 0.359 | 0.83674 | 0.74182 | 0.82536 | 0.57022 | [plot](https://huggingface.co/deepghs/real_person_detection/blob/main/person_detect_v0_l/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_person_detection/blob/main/person_detect_v0_l/confusion_matrix_normalized.png) | `person` |
| person_detect_v0_m | yolo | 79.1G | 25.9M | 0.78 | 0.363 | 0.83439 | 0.72529 | 0.81314 | 0.55388 | [plot](https://huggingface.co/deepghs/real_person_detection/blob/main/person_detect_v0_m/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_person_detection/blob/main/person_detect_v0_m/confusion_matrix_normalized.png) | `person` |
| person_detect_v0_s | yolo | 28.6G | 11.1M | 0.76 | 0.346 | 0.82522 | 0.69696 | 0.79105 | 0.52201 | [plot](https://huggingface.co/deepghs/real_person_detection/blob/main/person_detect_v0_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_person_detection/blob/main/person_detect_v0_s/confusion_matrix_normalized.png) | `person` |
| person_detect_v0_n | yolo | 8.19G | 3.01M | 0.72 | 0.32 | 0.80883 | 0.64552 | 0.74996 | 0.47272 | [plot](https://huggingface.co/deepghs/real_person_detection/blob/main/person_detect_v0_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_person_detection/blob/main/person_detect_v0_n/confusion_matrix_normalized.png) | `person` |

___

# [deepghs/real_head_detection](https://huggingface.co/deepghs/real_head_detection)
| Model | Type | FLOPS | Params | F1 Score | Threshold | precision(B) | recall(B) | mAP50(B) | mAP50-95(B) | F1 Plot | Confusion | Labels |
|:---------------------:|:------:|:-------:|:--------:|:----------:|:-----------:|:--------------:|:-----------:|:----------:|:-------------:|:-------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------------------:|:--------:|
| head_detect_v0_l_yv11 | yolo | 87.3G | 25.3M | 0.81 | 0.199 | 0.90226 | 0.72872 | 0.81049 | 0.5109 | [plot](https://huggingface.co/deepghs/real_head_detection/blob/main/head_detect_v0_l_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_head_detection/blob/main/head_detect_v0_l_yv11/confusion_matrix_normalized.png) | `head` |
| head_detect_v0_m_yv11 | yolo | 68.2G | 20.1M | 0.8 | 0.206 | 0.89855 | 0.72654 | 0.80704 | 0.50804 | [plot](https://huggingface.co/deepghs/real_head_detection/blob/main/head_detect_v0_m_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_head_detection/blob/main/head_detect_v0_m_yv11/confusion_matrix_normalized.png) | `head` |
| head_detect_v0_s_yv11 | yolo | 21.5G | 9.43M | 0.78 | 0.187 | 0.88726 | 0.69234 | 0.77518 | 0.47825 | [plot](https://huggingface.co/deepghs/real_head_detection/blob/main/head_detect_v0_s_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_head_detection/blob/main/head_detect_v0_s_yv11/confusion_matrix_normalized.png) | `head` |
| head_detect_v0_n_yv11 | yolo | 6.44G | 2.59M | 0.74 | 0.14 | 0.87359 | 0.64011 | 0.73393 | 0.44118 | [plot](https://huggingface.co/deepghs/real_head_detection/blob/main/head_detect_v0_n_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_head_detection/blob/main/head_detect_v0_n_yv11/confusion_matrix_normalized.png) | `head` |
| head_detect_v0_l | yolo | 165G | 43.6M | 0.81 | 0.234 | 0.89921 | 0.74092 | 0.81715 | 0.51615 | [plot](https://huggingface.co/deepghs/real_head_detection/blob/main/head_detect_v0_l/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_head_detection/blob/main/head_detect_v0_l/confusion_matrix_normalized.png) | `head` |
| head_detect_v0_m | yolo | 79.1G | 25.9M | 0.8 | 0.228 | 0.90006 | 0.72646 | 0.80614 | 0.50586 | [plot](https://huggingface.co/deepghs/real_head_detection/blob/main/head_detect_v0_m/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_head_detection/blob/main/head_detect_v0_m/confusion_matrix_normalized.png) | `head` |
| head_detect_v0_s | yolo | 28.6G | 11.1M | 0.78 | 0.182 | 0.89224 | 0.69382 | 0.77804 | 0.48067 | [plot](https://huggingface.co/deepghs/real_head_detection/blob/main/head_detect_v0_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_head_detection/blob/main/head_detect_v0_s/confusion_matrix_normalized.png) | `head` |
| head_detect_v0_n | yolo | 8.19G | 3.01M | 0.74 | 0.172 | 0.8728 | 0.64823 | 0.73865 | 0.44501 | [plot](https://huggingface.co/deepghs/real_head_detection/blob/main/head_detect_v0_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_head_detection/blob/main/head_detect_v0_n/confusion_matrix_normalized.png) | `head` |

___

# [deepghs/real_face_detection](https://huggingface.co/deepghs/real_face_detection)
| Model | Type | FLOPS | Params | F1 Score | Threshold | precision(B) | recall(B) | mAP50(B) | mAP50-95(B) | F1 Plot | Confusion | Labels |
|:---------------------:|:------:|:-------:|:--------:|:----------:|:-----------:|:--------------:|:-----------:|:----------:|:-------------:|:-------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------------------:|:--------:|
| face_detect_v0_s_yv12 | yolo | 21.5G | 9.25M | 0.74 | 0.272 | 0.86931 | 0.6404 | 0.73074 | 0.42652 | [plot](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_s_yv12/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_s_yv12/confusion_matrix_normalized.png) | `face` |
| face_detect_v0_n_yv12 | yolo | 6.48G | 2.57M | 0.7 | 0.258 | 0.85246 | 0.59089 | 0.6793 | 0.39182 | [plot](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_n_yv12/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_n_yv12/confusion_matrix_normalized.png) | `face` |
| face_detect_v0_l_yv11 | yolo | 87.3G | 25.3M | 0.77 | 0.291 | 0.88458 | 0.67474 | 0.76666 | 0.45722 | [plot](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_l_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_l_yv11/confusion_matrix_normalized.png) | `face` |
| face_detect_v0_m_yv11 | yolo | 68.2G | 20.1M | 0.76 | 0.262 | 0.87947 | 0.67315 | 0.76073 | 0.45288 | [plot](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_m_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_m_yv11/confusion_matrix_normalized.png) | `face` |
| face_detect_v0_s_yv11 | yolo | 21.5G | 9.43M | 0.73 | 0.271 | 0.87001 | 0.63572 | 0.72683 | 0.42706 | [plot](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_s_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_s_yv11/confusion_matrix_normalized.png) | `face` |
| face_detect_v0_n_yv11 | yolo | 6.44G | 2.59M | 0.7 | 0.263 | 0.86044 | 0.58577 | 0.67641 | 0.38975 | [plot](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_n_yv11/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_n_yv11/confusion_matrix_normalized.png) | `face` |
| face_detect_v0_l | yolo | 165G | 43.6M | 0.76 | 0.277 | 0.87894 | 0.67335 | 0.76313 | 0.4532 | [plot](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_l/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_l/confusion_matrix_normalized.png) | `face` |
| face_detect_v0_m | yolo | 79.1G | 25.9M | 0.75 | 0.277 | 0.87687 | 0.66265 | 0.75114 | 0.44262 | [plot](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_m/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_m/confusion_matrix_normalized.png) | `face` |
| face_detect_v0_s | yolo | 28.6G | 11.1M | 0.73 | 0.282 | 0.86932 | 0.63557 | 0.72494 | 0.42219 | [plot](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_s/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_s/confusion_matrix_normalized.png) | `face` |
| face_detect_v0_n | yolo | 8.19G | 3.01M | 0.7 | 0.257 | 0.85337 | 0.58877 | 0.67471 | 0.38692 | [plot](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_n/F1_curve.png) | [confusion](https://huggingface.co/deepghs/real_face_detection/blob/main/face_detect_v0_n/confusion_matrix_normalized.png) | `face` |

___

app.py
ADDED
@@ -0,0 +1,1396 @@
import io
import os
import gc
import re
import cv2
import time
import zipfile
import tempfile
import traceback
import numpy as np
import gradio as gr
import imgutils.detect.person as person_detector
import imgutils.detect.halfbody as halfbody_detector
import imgutils.detect.head as head_detector
import imgutils.detect.face as face_detector
import imgutils.metrics.ccip as ccip_analyzer
import imgutils.metrics.dbaesthetic as dbaesthetic_analyzer
import imgutils.metrics.lpips as lpips_module
from PIL import Image
from typing import List, Tuple, Dict, Any, Union, Optional, Iterator

# --- Constants for File Types ---
IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg', '.webp', '.bmp', '.tiff', '.tif', '.gif')
VIDEO_EXTENSIONS = ('.mp4', '.avi', '.mov', '.mkv', '.flv', '.webm', '.mpeg', '.mpg')

# --- Helper Functions ---
def sanitize_filename(filename: str, max_len: int = 50) -> str:
    """Removes invalid characters and shortens a filename for safe use."""
    # Remove path components
    base_name = os.path.basename(filename)
    # Remove extension
    name_part, _ = os.path.splitext(base_name)
    # Replace spaces and problematic characters with underscores
    sanitized = re.sub(r'[\\/*?:"<>|\s]+', '_', name_part)
    # Remove leading/trailing underscores/periods
    sanitized = sanitized.strip('._')
    # Limit length (important for temp paths and OS limits)
    sanitized = sanitized[:max_len]
    # Ensure it's not empty after sanitization
    if not sanitized:
        return "file"
    return sanitized
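
# Illustration added for this write-up (not in the committed file): tracing the logic
# above, sanitize_filename("My Clip (final)?.mp4") drops the ".mp4" extension, replaces
# the spaces and the "?" with "_", strips the leftover trailing "_", and returns
# "My_Clip_(final)".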

def convert_to_pil(frame: np.ndarray) -> Image.Image:
    """Converts an OpenCV frame (BGR) to a PIL Image (RGB)."""
    # Add error handling for potentially empty frames
    if frame is None or frame.size == 0:
        raise ValueError("Cannot convert empty frame to PIL Image")
    try:
        return Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    except Exception as e:
        # Re-raise with more context if conversion fails
        raise RuntimeError(f"Failed to convert frame to PIL Image: {e}")

def image_to_bytes(img: Image.Image, format: str = 'PNG') -> bytes:
    """Converts a PIL Image to bytes."""
    if img is None:
        raise ValueError("Cannot convert None image to bytes")
    byte_arr = io.BytesIO()
    img.save(byte_arr, format=format)
    return byte_arr.getvalue()

def create_zip_file(image_data: Dict[str, bytes], output_path: str) -> None:
    """
    Creates a zip file containing the provided images directly at the output_path.

    Args:
        image_data: A dictionary where keys are filenames (including paths within zip)
                    and values are image bytes.
        output_path: The full path where the zip file should be created.
    """
    if not image_data:
        raise ValueError("No image data provided to create zip file.")
    if not output_path:
        raise ValueError("No output path provided for the zip file.")

    print(f"Creating zip file at: {output_path}")

    try:
        # Ensure parent directory exists (useful if output_path is nested)
        # Though NamedTemporaryFile usually handles this for its own path.
        parent_dir = os.path.dirname(output_path)
        if parent_dir:  # Check if there is a parent directory component
            os.makedirs(parent_dir, exist_ok=True)

        with zipfile.ZipFile(output_path, 'w', zipfile.ZIP_DEFLATED) as zipf:
            # Sort items for potentially better organization and predictability
            for filename, img_bytes in sorted(image_data.items()):
                zipf.writestr(filename, img_bytes)
        print(f"Successfully created zip file with {len(image_data)} items at {output_path}.")
        # No return value needed as we are writing to a path
    except Exception as e:
        print(f"Error creating zip file at {output_path}: {e}")
        # If zip creation fails, attempt to remove the partially created file
        if os.path.exists(output_path):
            try:
                os.remove(output_path)
                print(f"Removed partially created/failed zip file: {output_path}")
            except OSError as remove_err:
                print(f"Warning: Could not remove failed zip file {output_path}: {remove_err}")
        raise  # Re-raise the original exception

def generate_filename(
    base_name: str,  # Should be the core identifier, e.g., "frame_X_person_Y_scoreZ"
    aesthetic_label: Optional[str] = None,
    ccip_cluster_id_for_lpips_logic: Optional[int] = None,  # Original CCIP ID, used to decide if LPIPS is sub-cluster
    ccip_folder_naming_index: Optional[int] = None,  # The new 000, 001, ... index based on image count
    source_prefix_for_ccip_folder: Optional[str] = None,  # The source filename prefix for CCIP folder
    lpips_folder_naming_index: Optional[Union[int, str]] = None,  # New: Can be int (0,1,2...) or "noise"
    file_extension: str = '.png',
    # Suffix flags for this specific image:
    is_halfbody_primary_target_type: bool = False,  # If this image itself was a halfbody primary target
    is_derived_head_crop: bool = False,
    is_derived_face_crop: bool = False,
) -> str:
    """
    Generates the final filename, incorporating aesthetic label, cluster directory,
    and crop indicators. CCIP and LPIPS folder names are sorted by image count.
    """
    filename_stem = base_name

    # Add suffixes for derived crops.
    # For halfbody primary targets, the base_name should already contain "halfbody".
    # This flag is more for potentially adding a suffix if desired, but currently not used to add a suffix.
    # if is_halfbody_primary_target_type:
    #     filename_stem += "_halfbody"  # Potentially redundant if base_name good.

    if is_derived_head_crop:
        filename_stem += "_headCrop"
    if is_derived_face_crop:
        filename_stem += "_faceCrop"

    filename_with_extension = filename_stem + file_extension

    path_parts = []
    # New CCIP folder naming based on source prefix and sorted index
    if ccip_folder_naming_index is not None and source_prefix_for_ccip_folder is not None:
        path_parts.append(f"{source_prefix_for_ccip_folder}_ccip_{ccip_folder_naming_index:03d}")

    # LPIPS folder naming based on the new sorted index or "noise"
    if lpips_folder_naming_index is not None:
        lpips_folder_name_part_str: Optional[str] = None
        if isinstance(lpips_folder_naming_index, str) and lpips_folder_naming_index == "noise":
            lpips_folder_name_part_str = "noise"
        elif isinstance(lpips_folder_naming_index, int):
            lpips_folder_name_part_str = f"{lpips_folder_naming_index:03d}"

        if lpips_folder_name_part_str is not None:
            # Determine prefix based on whether the item was originally in a CCIP cluster
            if ccip_cluster_id_for_lpips_logic is not None:  # LPIPS is sub-cluster if item had an original CCIP ID
                lpips_folder_name_base = "lpips_sub_"
            else:  # No CCIP, LPIPS is primary
                lpips_folder_name_base = "lpips_"
            path_parts.append(f"{lpips_folder_name_base}{lpips_folder_name_part_str}")

    final_filename_part = filename_with_extension
    if aesthetic_label:
        final_filename_part = f"{aesthetic_label}_{filename_with_extension}"

    if path_parts:
        return f"{'/'.join(path_parts)}/{final_filename_part}"
    else:
        return final_filename_part
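
# Worked example (annotation added here, not present in the committed file): with
# base_name="frame_0001_person_0", aesthetic_label="best", ccip_folder_naming_index=2,
# source_prefix_for_ccip_folder="myvideo", lpips_folder_naming_index=0 and
# is_derived_head_crop=True, the function above returns
# "myvideo_ccip_002/lpips_000/best_frame_0001_person_0_headCrop.png"
# ("lpips_000" rather than "lpips_sub_000" because ccip_cluster_id_for_lpips_logic is None).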
164 |
+
|
165 |
+
# --- Core Processing Function for a single source (video or image sequence) ---
|
166 |
+
def _process_input_source_frames(
|
167 |
+
source_file_prefix: str, # Sanitized name for this source (e.g., "myvideo" or "ImageGroup123")
|
168 |
+
# Iterator yielding: (PIL.Image, frame_identifier (int, e.g. ms position or image index), current_item_index, total_items_for_desc)
|
169 |
+
# For videos, current_item_index is the 1-based raw frame number.
|
170 |
+
# For images, current_item_index is the 1-based image number in the sequence.
|
171 |
+
frames_provider: Iterator[Tuple[Image.Image, int, int, int]],
|
172 |
+
is_video_source: bool, # To adjust some logging/stats messages
|
173 |
+
# Person Detection
|
174 |
+
enable_person_detection: bool,
|
175 |
+
min_target_width_person_percentage: float,
|
176 |
+
person_model_name: str,
|
177 |
+
person_conf_threshold: float,
|
178 |
+
person_iou_threshold: float,
|
179 |
+
# Half-Body Detection
|
180 |
+
enable_halfbody_detection: bool,
|
181 |
+
enable_halfbody_cropping: bool,
|
182 |
+
min_target_width_halfbody_percentage: float,
|
183 |
+
halfbody_model_name: str,
|
184 |
+
halfbody_conf_threshold: float,
|
185 |
+
halfbody_iou_threshold: float,
|
186 |
+
# Head Detection
|
187 |
+
enable_head_detection: bool,
|
188 |
+
enable_head_cropping: bool,
|
189 |
+
min_crop_width_head_percentage: float,
|
190 |
+
enable_head_filtering: bool,
|
191 |
+
head_model_name: str,
|
192 |
+
head_conf_threshold: float,
|
193 |
+
head_iou_threshold: float,
|
194 |
+
# Face Detection
|
195 |
+
enable_face_detection: bool,
|
196 |
+
enable_face_cropping: bool,
|
197 |
+
min_crop_width_face_percentage: float,
|
198 |
+
enable_face_filtering: bool,
|
199 |
+
face_model_name: str,
|
200 |
+
face_conf_threshold: float,
|
201 |
+
face_iou_threshold: float,
|
202 |
+
# CCIP Classification
|
203 |
+
enable_ccip_classification: bool,
|
204 |
+
ccip_model_name: str,
|
205 |
+
ccip_threshold: float,
|
206 |
+
# LPIPS Clustering
|
207 |
+
enable_lpips_clustering: bool,
|
208 |
+
lpips_threshold: float,
|
209 |
+
# Aesthetic Analysis
|
210 |
+
enable_aesthetic_analysis: bool,
|
211 |
+
aesthetic_model_name: str,
|
212 |
+
# Gradio Progress (specific to this source's processing)
|
213 |
+
progress_updater # Function: (progress_value: float, desc: str) -> None
|
214 |
+
) -> Tuple[Optional[str], str]:
|
215 |
+
"""
|
216 |
+
Processes frames from a given source (video or image sequence) according to the specified parameters.
|
217 |
+
Order: Person => Half-Body (alternative) => Face Detection => Head Detection => CCIP => Aesthetic, then LPIPS clustering and zipping over all kept images.
|
218 |
+
|
219 |
+
Returns:
|
220 |
+
A tuple containing:
|
221 |
+
- Path to the output zip file (or None if error).
|
222 |
+
- Status message string.
|
223 |
+
"""
|
224 |
+
# This list will hold data for images that pass all filters, BEFORE LPIPS and final zipping
|
225 |
+
images_pending_final_processing: List[Dict[str, Any]] = []
|
226 |
+
|
227 |
+
# CCIP specific data
|
228 |
+
ccip_clusters_info: List[Tuple[int, np.ndarray]] = []
|
229 |
+
next_ccip_cluster_id = 0
|
230 |
+
|
231 |
+
# Stats
|
232 |
+
processed_items_count = 0
|
233 |
+
total_persons_detected_raw, total_halfbodies_detected_raw = 0, 0
|
234 |
+
person_targets_processed_count, halfbody_targets_processed_count, fullframe_targets_processed_count = 0, 0, 0
|
235 |
+
total_faces_detected_on_targets, total_heads_detected_on_targets = 0, 0
|
236 |
+
|
237 |
+
# These count items added to images_pending_final_processing
|
238 |
+
main_targets_pending_count, face_crops_pending_count, head_crops_pending_count = 0, 0, 0
|
239 |
+
items_filtered_by_face_count, items_filtered_by_head_count = 0, 0
|
240 |
+
ccip_applied_count, aesthetic_applied_count = 0, 0
|
241 |
+
# LPIPS stats
|
242 |
+
lpips_images_subject_to_clustering, total_lpips_clusters_created, total_lpips_noise_samples = 0, 0, 0
|
243 |
+
|
244 |
+
gc_interval = 100 # items from provider
|
245 |
+
start_time = time.time()
|
246 |
+
|
247 |
+
# Progress update for initializing this specific video
|
248 |
+
progress_updater(0, desc=f"Initializing {source_file_prefix}...")
|
249 |
+
output_zip_path_temp = None
|
250 |
+
output_zip_path_final = None
|
251 |
+
|
252 |
+
try:
|
253 |
+
# --- Main Loop for processing items from the frames_provider ---
|
254 |
+
for pil_image_full_frame, frame_specific_index, current_item_index, total_items_for_desc in frames_provider:
|
255 |
+
progress_value_for_updater = (current_item_index) / total_items_for_desc if total_items_for_desc > 0 else 1.0
|
256 |
+
|
257 |
+
|
258 |
+
# The description string should reflect what current_item_index means
|
259 |
+
item_description = ""
|
260 |
+
if is_video_source:
|
261 |
+
# For video, total_items_for_desc is the total raw frame count.
|
262 |
+
# current_item_index is the raw frame index of the *sampled* frame.
|
263 |
+
# We also need a counter for *sampled* frames for a "processed X of Y (sampled)" message.
|
264 |
+
# processed_items_count counts sampled frames.
|
265 |
+
item_description = f"frame {current_item_index}/{total_items_for_desc} (sampled frame #{processed_items_count + 1})"
|
266 |
+
|
267 |
+
else: # For images
|
268 |
+
item_description = f"image {current_item_index}/{total_items_for_desc}"
|
269 |
+
|
270 |
+
progress_updater(
|
271 |
+
min(progress_value_for_updater, 1.0), # Cap progress at 1.0
|
272 |
+
desc=f"Processing {item_description} for {source_file_prefix}"
|
273 |
+
)
|
274 |
+
# processed_items_count still counts how many items are yielded by the provider
|
275 |
+
# (i.e., how many sampled frames for video, or how many images for image sequence)
|
276 |
+
processed_items_count += 1
|
277 |
+
|
278 |
+
try:
|
279 |
+
full_frame_width = pil_image_full_frame.width # Store for percentage calculations
|
280 |
+
print(f"--- Processing item ID {frame_specific_index} (Width: {full_frame_width}px) for {source_file_prefix} ---")
|
281 |
+
|
282 |
+
# List to hold PIL images that are the primary subjects for this frame
|
283 |
+
# Each element: {'pil': Image, 'base_name': str, 'source_type': 'person'/'halfbody'/'fullframe'}
|
284 |
+
primary_targets_for_frame: List[Dict[str, Any]] = []
|
285 |
+
processed_primary_source_this_frame = False # Flag if Person or HalfBody yielded targets
|
286 |
+
|
287 |
+
# --- 1. Person Detection ---
|
288 |
+
if enable_person_detection and full_frame_width > 0:
|
289 |
+
print(" Attempting Person Detection...")
|
290 |
+
min_person_target_px_width = full_frame_width * min_target_width_person_percentage
|
291 |
+
person_detections = person_detector.detect_person(
|
292 |
+
pil_image_full_frame, model_name=person_model_name,
|
293 |
+
conf_threshold=person_conf_threshold, iou_threshold=person_iou_threshold
|
294 |
+
)
|
295 |
+
total_persons_detected_raw += len(person_detections)
|
296 |
+
if person_detections:
|
297 |
+
print(f" Detected {len(person_detections)} raw persons.")
|
298 |
+
valid_person_targets = 0
|
299 |
+
for i, (bbox, _, score) in enumerate(person_detections):
|
300 |
+
# Check width before full crop for minor optimization
|
301 |
+
detected_person_width = bbox[2] - bbox[0]
|
302 |
+
if detected_person_width >= min_person_target_px_width:
|
303 |
+
primary_targets_for_frame.append({
|
304 |
+
'pil': pil_image_full_frame.crop(bbox),
|
305 |
+
'base_name': f"{source_file_prefix}_item_{frame_specific_index}_person_{i}_score{int(score*100)}",
|
306 |
+
'source_type': 'person'})
|
307 |
+
person_targets_processed_count +=1
|
308 |
+
valid_person_targets +=1
|
309 |
+
else:
|
310 |
+
print(f" Person {i} width {detected_person_width}px < min {min_person_target_px_width:.0f}px. Skipping.")
|
311 |
+
if valid_person_targets > 0:
|
312 |
+
processed_primary_source_this_frame = True
|
313 |
+
print(f" Added {valid_person_targets} persons as primary targets.")
|
314 |
+
|
315 |
+
# --- 2. Half-Body Detection (if Person not processed and HBD enabled) ---
|
316 |
+
if not processed_primary_source_this_frame and enable_halfbody_detection and full_frame_width > 0:
|
317 |
+
print(" Attempting Half-Body Detection (on full item)...")
|
318 |
+
min_halfbody_target_px_width = full_frame_width * min_target_width_halfbody_percentage
|
319 |
+
halfbody_detections = halfbody_detector.detect_halfbody(
|
320 |
+
pil_image_full_frame, model_name=halfbody_model_name,
|
321 |
+
conf_threshold=halfbody_conf_threshold, iou_threshold=halfbody_iou_threshold
|
322 |
+
)
|
323 |
+
total_halfbodies_detected_raw += len(halfbody_detections)
|
324 |
+
if halfbody_detections:
|
325 |
+
print(f" Detected {len(halfbody_detections)} raw half-bodies.")
|
326 |
+
valid_halfbody_targets = 0
|
327 |
+
for i, (bbox, _, score) in enumerate(halfbody_detections):
|
328 |
+
detected_hb_width = bbox[2] - bbox[0]
|
329 |
+
# Cropping must be enabled and width must be sufficient for it to be a target
|
330 |
+
if enable_halfbody_cropping and detected_hb_width >= min_halfbody_target_px_width:
|
331 |
+
primary_targets_for_frame.append({
|
332 |
+
'pil': pil_image_full_frame.crop(bbox),
|
333 |
+
'base_name': f"{source_file_prefix}_item_{frame_specific_index}_halfbody_{i}_score{int(score*100)}",
|
334 |
+
'source_type': 'halfbody'})
|
335 |
+
halfbody_targets_processed_count +=1
|
336 |
+
valid_halfbody_targets +=1
|
337 |
+
elif enable_halfbody_cropping:
|
338 |
+
print(f" Half-body {i} width {detected_hb_width}px < min {min_halfbody_target_px_width:.0f}px. Skipping.")
|
339 |
+
if valid_halfbody_targets > 0:
|
340 |
+
processed_primary_source_this_frame = True
|
341 |
+
print(f" Added {valid_halfbody_targets} half-bodies as primary targets.")
|
342 |
+
|
343 |
+
# --- 3. Full Frame/Image (fallback) ---
|
344 |
+
if not processed_primary_source_this_frame:
|
345 |
+
print(" Processing Full Item as primary target.")
|
346 |
+
primary_targets_for_frame.append({
|
347 |
+
'pil': pil_image_full_frame.copy(),
|
348 |
+
'base_name': f"{source_file_prefix}_item_{frame_specific_index}_full",
|
349 |
+
'source_type': 'fullframe'})
|
350 |
+
fullframe_targets_processed_count += 1
|
351 |
+
|
352 |
+
# --- Process each identified primary_target_for_frame ---
|
353 |
+
for target_data in primary_targets_for_frame:
|
354 |
+
current_pil: Image.Image = target_data['pil']
|
355 |
+
current_base_name: str = target_data['base_name'] # Base name for this main target
|
356 |
+
current_source_type: str = target_data['source_type']
|
357 |
+
current_pil_width = current_pil.width # For sub-crop percentage calculations
|
358 |
+
print(f" Processing target: {current_base_name} (type: {current_source_type}, width: {current_pil_width}px)")
|
359 |
+
|
360 |
+
# Store PILs of successful crops from current_pil for this target
|
361 |
+
keep_this_target = True
|
362 |
+
item_area = current_pil_width * current_pil.height
|
363 |
+
potential_face_crops_pil: List[Image.Image] = []
|
364 |
+
potential_head_crops_pil: List[Image.Image] = []
|
365 |
+
|
366 |
+
# --- A. Face Detection ---
|
367 |
+
if keep_this_target and enable_face_detection and current_pil_width > 0:
|
368 |
+
print(f" Detecting faces in {current_base_name}...")
|
369 |
+
min_face_crop_px_width = current_pil_width * min_crop_width_face_percentage
|
370 |
+
face_detections = face_detector.detect_faces(
|
371 |
+
current_pil, model_name=face_model_name,
|
372 |
+
conf_threshold=face_conf_threshold, iou_threshold=face_iou_threshold
|
373 |
+
)
|
374 |
+
total_faces_detected_on_targets += len(face_detections)
|
375 |
+
if not face_detections and enable_face_filtering:
|
376 |
+
keep_this_target = False
|
377 |
+
items_filtered_by_face_count += 1
|
378 |
+
print(f" FILTERING TARGET {current_base_name} (no face).")
|
379 |
+
elif face_detections and enable_face_cropping:
|
380 |
+
for f_idx, (f_bbox, _, _) in enumerate(face_detections):
|
381 |
+
if (f_bbox[2]-f_bbox[0]) >= min_face_crop_px_width:
|
382 |
+
potential_face_crops_pil.append(current_pil.crop(f_bbox))
|
383 |
+
else:
|
384 |
+
print(f" Face {f_idx} too small. Skipping crop.")
|
385 |
+
|
386 |
+
# --- B. Head Detection ---
|
387 |
+
if keep_this_target and enable_head_detection and current_pil_width > 0:
|
388 |
+
print(f" Detecting heads in {current_base_name}...")
|
389 |
+
min_head_crop_px_width = current_pil_width * min_crop_width_head_percentage
|
390 |
+
head_detections = head_detector.detect_heads(
|
391 |
+
current_pil, model_name=head_model_name,
|
392 |
+
conf_threshold=head_conf_threshold, iou_threshold=head_iou_threshold
|
393 |
+
)
|
394 |
+
total_heads_detected_on_targets += len(head_detections)
|
395 |
+
if not head_detections and enable_head_filtering:
|
396 |
+
keep_this_target = False
|
397 |
+
items_filtered_by_head_count += 1
|
398 |
+
print(f" FILTERING TARGET {current_base_name} (no head).")
|
399 |
+
potential_face_crops_pil.clear() # Clear faces if head filter removed target
|
400 |
+
elif head_detections and enable_head_cropping:
|
401 |
+
for h_idx, (h_bbox, _, _) in enumerate(head_detections):
|
402 |
+
h_w = h_bbox[2]-h_bbox[0] # h_h = h_bbox[3]-h_bbox[1]
|
403 |
+
if h_w >= min_head_crop_px_width and item_area > 0:
|
404 |
+
potential_head_crops_pil.append(current_pil.crop(h_bbox))
|
405 |
+
else:
|
406 |
+
print(f" Head {h_idx} too small or too large relative to parent. Skipping crop.")
|
407 |
+
|
408 |
+
# --- If target is filtered, clean up and skip to next target ---
|
409 |
+
if not keep_this_target:
|
410 |
+
print(f" Target {current_base_name} was filtered by face/head presence rules. Discarding it and its potential crops.")
|
411 |
+
if current_pil is not None:
|
412 |
+
del current_pil
|
413 |
+
potential_face_crops_pil.clear()
|
414 |
+
potential_head_crops_pil.clear()
|
415 |
+
continue # To the next primary_target_for_frame
|
416 |
+
|
417 |
+
# --- C. CCIP Classification (on current_pil, if it's kept) ---
|
418 |
+
assigned_ccip_id = None # This is the original CCIP ID
|
419 |
+
if enable_ccip_classification:
|
420 |
+
print(f" Classifying {current_base_name} with CCIP...")
|
421 |
+
try:
|
422 |
+
feature = ccip_analyzer.ccip_extract_feature(current_pil, model=ccip_model_name)
|
423 |
+
best_match_cid = None
|
424 |
+
min_diff = float('inf')
|
425 |
+
# Find the best potential match among existing clusters
|
426 |
+
if ccip_clusters_info: # Only loop if there are clusters to compare against
|
427 |
+
for cid, rep_f in ccip_clusters_info:
|
428 |
+
diff = ccip_analyzer.ccip_difference(feature, rep_f, model=ccip_model_name)
|
429 |
+
if diff < min_diff:
|
430 |
+
min_diff = diff
|
431 |
+
best_match_cid = cid
|
432 |
+
|
433 |
+
# Decide whether to use the best match or create a new cluster
|
434 |
+
if best_match_cid is not None and min_diff < ccip_threshold:
|
435 |
+
assigned_ccip_id = best_match_cid
|
436 |
+
print(f" -> Matched Cluster {assigned_ccip_id} (Diff: {min_diff:.6f} <= Threshold {ccip_threshold:.3f})")
|
437 |
+
else:
|
438 |
+
# No suitable match found (either no clusters existed, or the best match's diff was not below the threshold)
|
439 |
+
# Create a new cluster
|
440 |
+
assigned_ccip_id = next_ccip_cluster_id
|
441 |
+
ccip_clusters_info.append((assigned_ccip_id, feature))
|
442 |
+
if len(ccip_clusters_info) == 1:  # the cluster just appended is the only one
|
443 |
+
print(f" -> New Cluster {assigned_ccip_id} (First item or no prior suitable clusters)")
|
444 |
+
else:
|
445 |
+
# Log that a new cluster is formed because the minimum difference was not below the threshold
|
446 |
+
print(f" -> New Cluster {assigned_ccip_id} (Min diff to others: {min_diff:.6f} > Threshold {ccip_threshold:.3f})")
|
447 |
+
next_ccip_cluster_id += 1
|
448 |
+
print(f" CCIP: Target {current_base_name} -> Original Cluster ID {assigned_ccip_id}")
|
449 |
+
del feature
|
450 |
+
ccip_applied_count += 1
|
451 |
+
except Exception as e_ccip:
|
452 |
+
print(f" Error CCIP: {e_ccip}")
|
453 |
+
|
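# Worked example of the greedy assignment above (hypothetical numbers): with
# ccip_threshold = 0.20 and existing representatives differing from the new feature by
# {cluster 0: 0.31, cluster 1: 0.17}, the minimum 0.17 is below the threshold, so the
# target joins cluster 1; had the closest difference been 0.25, a brand-new cluster
# would be created with this feature as its representative.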
454 |
+
# --- D. Aesthetic Analysis (on current_pil, if it's kept) ---
|
455 |
+
item_aesthetic_label = None
|
456 |
+
if enable_aesthetic_analysis:
|
457 |
+
print(f" Analyzing {current_base_name} for aesthetics...")
|
458 |
+
try:
|
459 |
+
res = dbaesthetic_analyzer.anime_dbaesthetic(current_pil, model_name=aesthetic_model_name)
|
460 |
+
if isinstance(res, tuple) and len(res) >= 1:
|
461 |
+
item_aesthetic_label = res[0]
|
462 |
+
print(f" Aesthetic: Target {current_base_name} -> {item_aesthetic_label}")
|
463 |
+
aesthetic_applied_count += 1
|
464 |
+
except Exception as e_aes:
|
465 |
+
print(f" Error Aesthetic: {e_aes}")
|
466 |
+
|
467 |
+
add_current_pil_to_pending_list = True
|
468 |
+
if current_source_type == 'fullframe':
|
469 |
+
can_skip_fullframe_target = False
|
470 |
+
if enable_face_detection or enable_head_detection:
|
471 |
+
found_valid_sub_crop_from_enabled_detector = False
|
472 |
+
if enable_face_detection and len(potential_face_crops_pil) > 0:
|
473 |
+
found_valid_sub_crop_from_enabled_detector = True
|
474 |
+
|
475 |
+
if not found_valid_sub_crop_from_enabled_detector and \
|
476 |
+
enable_head_detection and len(potential_head_crops_pil) > 0:
|
477 |
+
found_valid_sub_crop_from_enabled_detector = True
|
478 |
+
|
479 |
+
if not found_valid_sub_crop_from_enabled_detector: # No valid crops from any enabled sub-detector
|
480 |
+
can_skip_fullframe_target = True # All enabled sub-detectors failed
|
481 |
+
|
482 |
+
if can_skip_fullframe_target:
|
483 |
+
add_current_pil_to_pending_list = False
|
484 |
+
print(f" Skipping save of fullframe target '{current_base_name}' because all enabled sub-detectors (Face/Head) yielded no valid-width crops.")
|
485 |
+
|
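# Illustrative case for the rule above: if only face detection is enabled and it yields
# no crop of sufficient width, a 'fullframe' target is dropped entirely, whereas a
# 'person' or 'halfbody' target in the same situation is still saved.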
486 |
+
if add_current_pil_to_pending_list:
|
487 |
+
# --- E. Save current_pil (if it passed all filters) ---
|
488 |
+
# Add main target to pending list
|
489 |
+
images_pending_final_processing.append({
|
490 |
+
'pil_image': current_pil.copy(), 'base_name_for_filename': current_base_name,
|
491 |
+
'ccip_cluster_id': assigned_ccip_id, 'aesthetic_label': item_aesthetic_label,
|
492 |
+
'is_halfbody_primary_target_type': (current_source_type == 'halfbody'),
|
493 |
+
'is_derived_head_crop': False, 'is_derived_face_crop': False,
|
494 |
+
'lpips_cluster_id': None, # Will be filled by LPIPS clustering
|
495 |
+
'lpips_folder_naming_index': None # Will be filled by LPIPS renaming
|
496 |
+
})
|
497 |
+
main_targets_pending_count +=1
|
498 |
+
|
499 |
+
# --- F. Save Face Crops (derived from current_pil) ---
|
500 |
+
for i, fc_pil in enumerate(potential_face_crops_pil):
|
501 |
+
images_pending_final_processing.append({
|
502 |
+
'pil_image': fc_pil, 'base_name_for_filename': f"{current_base_name}_face{i}",
|
503 |
+
'ccip_cluster_id': assigned_ccip_id, 'aesthetic_label': item_aesthetic_label,
|
504 |
+
'is_halfbody_primary_target_type': False,
|
505 |
+
'is_derived_head_crop': False, 'is_derived_face_crop': True,
|
506 |
+
'lpips_cluster_id': None,
|
507 |
+
'lpips_folder_naming_index': None
|
508 |
+
})
|
509 |
+
face_crops_pending_count+=1
|
510 |
+
potential_face_crops_pil.clear()
|
511 |
+
|
512 |
+
# --- G. Save Head Crops (derived from current_pil) ---
|
513 |
+
for i, hc_pil in enumerate(potential_head_crops_pil):
|
514 |
+
images_pending_final_processing.append({
|
515 |
+
'pil_image': hc_pil, 'base_name_for_filename': f"{current_base_name}_head{i}",
|
516 |
+
'ccip_cluster_id': assigned_ccip_id, 'aesthetic_label': item_aesthetic_label,
|
517 |
+
'is_halfbody_primary_target_type': False,
|
518 |
+
'is_derived_head_crop': True, 'is_derived_face_crop': False,
|
519 |
+
'lpips_cluster_id': None,
|
520 |
+
'lpips_folder_naming_index': None
|
521 |
+
})
|
522 |
+
head_crops_pending_count+=1
|
523 |
+
potential_head_crops_pil.clear()
|
524 |
+
|
525 |
+
if current_pil is not None: # Ensure current_pil exists before attempting to delete
|
526 |
+
del current_pil # Clean up the PIL for this target_data
|
527 |
+
|
528 |
+
primary_targets_for_frame.clear()
|
529 |
+
except Exception as item_proc_err:
|
530 |
+
print(f"!! Major Error processing item ID {frame_specific_index} for {source_file_prefix}: {item_proc_err}")
|
531 |
+
traceback.print_exc()
|
532 |
+
# Cleanup local vars for this item if error
|
533 |
+
if 'primary_targets_for_frame' in locals():
|
534 |
+
primary_targets_for_frame.clear()
|
535 |
+
# Also ensure current_pil from inner loop is cleaned up if error happened mid-loop
|
536 |
+
if 'current_pil' in locals() and current_pil is not None:
|
537 |
+
del current_pil
|
538 |
+
|
539 |
+
if processed_items_count % gc_interval == 0:
|
540 |
+
gc.collect()
|
541 |
+
print(f" [GC triggered at {processed_items_count} items for {source_file_prefix}]")
|
542 |
+
# --- End of Main Item Processing Loop ---
|
543 |
+
|
544 |
+
print(f"\nRunning final GC before LPIPS/Zipping for {source_file_prefix}...")
|
545 |
+
gc.collect()
|
546 |
+
|
547 |
+
if not images_pending_final_processing:
|
548 |
+
status_message = f"Processing for {source_file_prefix} finished, but no images were generated or passed filters for LPIPS/Zipping."
|
549 |
+
print(status_message)
|
550 |
+
return None, status_message
|
551 |
+
|
552 |
+
# --- LPIPS Clustering Stage ---
|
553 |
+
print(f"\n--- LPIPS Clustering Stage for {source_file_prefix} (Images pending: {len(images_pending_final_processing)}) ---")
|
554 |
+
if enable_lpips_clustering:
|
555 |
+
print(f" LPIPS Clustering enabled with threshold: {lpips_threshold}")
|
556 |
+
lpips_images_subject_to_clustering = len(images_pending_final_processing)
|
557 |
+
|
558 |
+
if enable_ccip_classification and next_ccip_cluster_id > 0: # CCIP was used
|
559 |
+
print(" LPIPS clustering within CCIP clusters.")
|
560 |
+
images_by_ccip: Dict[Optional[int], List[int]] = {} # ccip_id -> list of original indices
|
561 |
+
for i, item_data in enumerate(images_pending_final_processing):
|
562 |
+
ccip_id = item_data['ccip_cluster_id'] # Original CCIP ID
|
563 |
+
if ccip_id not in images_by_ccip:
|
564 |
+
images_by_ccip[ccip_id] = []
|
565 |
+
images_by_ccip[ccip_id].append(i)
|
566 |
+
|
567 |
+
for ccip_id, indices_in_ccip_cluster in images_by_ccip.items():
|
568 |
+
pils_for_lpips_sub_cluster = [images_pending_final_processing[idx]['pil_image'] for idx in indices_in_ccip_cluster]
|
569 |
+
if len(pils_for_lpips_sub_cluster) > 1:
|
570 |
+
print(f" Clustering {len(pils_for_lpips_sub_cluster)} images in CCIP cluster {ccip_id}...")
|
571 |
+
try:
|
572 |
+
lpips_sub_ids = lpips_module.lpips_clustering(pils_for_lpips_sub_cluster, threshold=lpips_threshold)
|
573 |
+
for i_sub, lpips_id in enumerate(lpips_sub_ids):
|
574 |
+
original_idx = indices_in_ccip_cluster[i_sub]
|
575 |
+
images_pending_final_processing[original_idx]['lpips_cluster_id'] = lpips_id
|
576 |
+
except Exception as e_lpips_sub:
|
577 |
+
print(f" Error LPIPS sub-cluster CCIP {ccip_id}: {e_lpips_sub}")
|
578 |
+
elif len(pils_for_lpips_sub_cluster) == 1:
|
579 |
+
images_pending_final_processing[indices_in_ccip_cluster[0]]['lpips_cluster_id'] = 0 # type: ignore
|
580 |
+
del images_by_ccip
|
581 |
+
if 'pils_for_lpips_sub_cluster' in locals():
|
582 |
+
del pils_for_lpips_sub_cluster # Ensure cleanup
|
583 |
+
else: # LPIPS on all images globally
|
584 |
+
print(" LPIPS clustering on all collected images.")
|
585 |
+
all_pils_for_global_lpips = [item['pil_image'] for item in images_pending_final_processing]
|
586 |
+
if len(all_pils_for_global_lpips) > 1:
|
587 |
+
try:
|
588 |
+
lpips_global_ids = lpips_module.lpips_clustering(all_pils_for_global_lpips, threshold=lpips_threshold)
|
589 |
+
for i, lpips_id in enumerate(lpips_global_ids):
|
590 |
+
images_pending_final_processing[i]['lpips_cluster_id'] = lpips_id
|
591 |
+
except Exception as e_lpips_global:
|
592 |
+
print(f" Error LPIPS global: {e_lpips_global}")
|
593 |
+
elif len(all_pils_for_global_lpips) == 1:
|
594 |
+
images_pending_final_processing[0]['lpips_cluster_id'] = 0 # type: ignore
|
595 |
+
del all_pils_for_global_lpips
|
596 |
+
|
597 |
+
# Calculate LPIPS stats
|
598 |
+
all_final_lpips_ids = [item.get('lpips_cluster_id') for item in images_pending_final_processing if item.get('lpips_cluster_id') is not None]
|
599 |
+
if all_final_lpips_ids:
|
600 |
+
unique_lpips_clusters = set(filter(lambda x: x != -1, all_final_lpips_ids))
|
601 |
+
total_lpips_clusters_created = len(unique_lpips_clusters)
|
602 |
+
total_lpips_noise_samples = sum(1 for x in all_final_lpips_ids if x == -1)
|
603 |
+
else:
|
604 |
+
print(" LPIPS Clustering disabled.")
|
605 |
+
|
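# Example of the label convention handled above (hypothetical output): lpips_clustering
# returning [0, 0, -1, 1] for four images means two clusters (0 and 1) plus one noise
# sample (-1), so the stats below would report 2 clusters and 1 noise image.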
606 |
+
# --- CCIP Folder Renaming Logic ---
|
607 |
+
original_ccip_id_to_new_naming_index: Dict[int, int] = {}
|
608 |
+
if enable_ccip_classification:
|
609 |
+
print(f" Preparing CCIP folder renaming for {source_file_prefix}...")
|
610 |
+
ccip_image_counts: Dict[int, int] = {} # original_ccip_id -> count of images in it
|
611 |
+
for item_data_for_count in images_pending_final_processing:
|
612 |
+
original_ccip_id_val = item_data_for_count.get('ccip_cluster_id')
|
613 |
+
if original_ccip_id_val is not None:
|
614 |
+
ccip_image_counts[original_ccip_id_val] = ccip_image_counts.get(original_ccip_id_val, 0) + 1
|
615 |
+
|
616 |
+
if ccip_image_counts:
|
617 |
+
# Sort original ccip_ids by their counts in descending order
|
618 |
+
sorted_ccip_groups_by_count: List[Tuple[int, int]] = sorted(
|
619 |
+
ccip_image_counts.items(),
|
620 |
+
key=lambda item: item[1], # Sort by count
|
621 |
+
reverse=True
|
622 |
+
)
|
623 |
+
for new_idx, (original_id, count) in enumerate(sorted_ccip_groups_by_count):
|
624 |
+
original_ccip_id_to_new_naming_index[original_id] = new_idx
|
625 |
+
print(f" CCIP Remap for {source_file_prefix}: Original ID {original_id} (count: {count}) -> New Naming Index {new_idx:03d}")
|
626 |
+
else:
|
627 |
+
print(f" No CCIP-assigned images found for {source_file_prefix} to perform renaming.")
|
628 |
+
|
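# Worked example of the count-based renaming above (hypothetical counts): if original
# CCIP IDs 0, 1 and 2 hold 5, 12 and 3 images, sorting by count maps 1 -> 000, 0 -> 001
# and 2 -> 002, so the most populous cluster becomes "<source_prefix>_ccip_000". The
# LPIPS renaming below applies the same count-based scheme within each group, sending
# noise samples (-1) to a "noise" folder.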
629 |
+
# --- LPIPS Folder Renaming Logic ---
|
630 |
+
if enable_lpips_clustering:
|
631 |
+
print(f" Preparing LPIPS folder renaming for {source_file_prefix}...")
|
632 |
+
# Initialize/Reset lpips_folder_naming_index for all items
|
633 |
+
for item_data in images_pending_final_processing:
|
634 |
+
item_data['lpips_folder_naming_index'] = None
|
635 |
+
|
636 |
+
if enable_ccip_classification and next_ccip_cluster_id > 0: # LPIPS within CCIP
|
637 |
+
print(f" LPIPS renaming within CCIP clusters for {source_file_prefix}.")
|
638 |
+
items_grouped_by_original_ccip: Dict[Optional[int], List[Dict[str, Any]]] = {}
|
639 |
+
for item_data in images_pending_final_processing:
|
640 |
+
original_ccip_id = item_data.get('ccip_cluster_id')
|
641 |
+
if original_ccip_id not in items_grouped_by_original_ccip: items_grouped_by_original_ccip[original_ccip_id] = []
|
642 |
+
items_grouped_by_original_ccip[original_ccip_id].append(item_data)
|
643 |
+
|
644 |
+
for original_ccip_id, items_in_ccip in items_grouped_by_original_ccip.items():
|
645 |
+
lpips_counts_in_ccip: Dict[int, int] = {} # original_lpips_id (non-noise) -> count
|
646 |
+
for item_data in items_in_ccip:
|
647 |
+
lpips_id = item_data.get('lpips_cluster_id')
|
648 |
+
if lpips_id is not None and lpips_id != -1:
|
649 |
+
lpips_counts_in_ccip[lpips_id] = lpips_counts_in_ccip.get(lpips_id, 0) + 1
|
650 |
+
|
651 |
+
lpips_id_to_naming_in_ccip: Dict[int, Union[int, str]] = {}
|
652 |
+
if lpips_counts_in_ccip:
|
653 |
+
sorted_lpips = sorted(lpips_counts_in_ccip.items(), key=lambda x: x[1], reverse=True)
|
654 |
+
for new_idx, (lpips_id, count) in enumerate(sorted_lpips):
|
655 |
+
lpips_id_to_naming_in_ccip[lpips_id] = new_idx
|
656 |
+
ccip_disp = f"OrigCCIP-{original_ccip_id}" if original_ccip_id is not None else "NoCCIP"
|
657 |
+
print(f" LPIPS Remap in {ccip_disp}: OrigLPIPS ID {lpips_id} (count: {count}) -> New Naming Index {new_idx:03d}")
|
658 |
+
|
659 |
+
for item_data in items_in_ccip:
|
660 |
+
lpips_id = item_data.get('lpips_cluster_id')
|
661 |
+
if lpips_id is not None:
|
662 |
+
if lpips_id == -1: item_data['lpips_folder_naming_index'] = "noise"
|
663 |
+
elif lpips_id in lpips_id_to_naming_in_ccip:
|
664 |
+
item_data['lpips_folder_naming_index'] = lpips_id_to_naming_in_ccip[lpips_id]
|
665 |
+
del items_grouped_by_original_ccip
|
666 |
+
else: # Global LPIPS
|
667 |
+
print(f" Global LPIPS renaming for {source_file_prefix}.")
|
668 |
+
global_lpips_counts: Dict[int, int] = {}
|
669 |
+
for item_data in images_pending_final_processing:
|
670 |
+
lpips_id = item_data.get('lpips_cluster_id')
|
671 |
+
if lpips_id is not None and lpips_id != -1:
|
672 |
+
global_lpips_counts[lpips_id] = global_lpips_counts.get(lpips_id, 0) + 1
|
673 |
+
|
674 |
+
global_lpips_id_to_naming: Dict[int, Union[int, str]] = {}
|
675 |
+
if global_lpips_counts:
|
676 |
+
sorted_global_lpips = sorted(global_lpips_counts.items(), key=lambda x: x[1], reverse=True)
|
677 |
+
for new_idx, (lpips_id, count) in enumerate(sorted_global_lpips):
|
678 |
+
global_lpips_id_to_naming[lpips_id] = new_idx
|
679 |
+
print(f" Global LPIPS Remap: OrigLPIPS ID {lpips_id} (count: {count}) -> New Naming Index {new_idx:03d}")
|
680 |
+
|
681 |
+
for item_data in images_pending_final_processing:
|
682 |
+
lpips_id = item_data.get('lpips_cluster_id')
|
683 |
+
if lpips_id is not None:
|
684 |
+
if lpips_id == -1: item_data['lpips_folder_naming_index'] = "noise"
|
685 |
+
elif lpips_id in global_lpips_id_to_naming:
|
686 |
+
item_data['lpips_folder_naming_index'] = global_lpips_id_to_naming[lpips_id]
|
687 |
+
gc.collect()
|
688 |
+
|
689 |
+
# --- Final Zipping Stage ---
|
690 |
+
images_to_zip: Dict[str, bytes] = {}
|
691 |
+
print(f"\n--- Final Zipping Stage for {source_file_prefix} ({len(images_pending_final_processing)} items) ---")
|
692 |
+
for item_data in images_pending_final_processing:
|
693 |
+
original_ccip_id_for_item = item_data.get('ccip_cluster_id')
|
694 |
+
current_ccip_naming_idx_for_folder: Optional[int] = None
|
695 |
+
|
696 |
+
if enable_ccip_classification and original_ccip_id_for_item is not None and \
|
697 |
+
original_ccip_id_for_item in original_ccip_id_to_new_naming_index:
|
698 |
+
current_ccip_naming_idx_for_folder = original_ccip_id_to_new_naming_index[original_ccip_id_for_item]
|
699 |
+
|
700 |
+
current_lpips_naming_idx_for_folder = item_data.get('lpips_folder_naming_index')
|
701 |
+
|
702 |
+
final_filename = generate_filename(
|
703 |
+
base_name=item_data['base_name_for_filename'],
|
704 |
+
aesthetic_label=item_data.get('aesthetic_label'),
|
705 |
+
ccip_cluster_id_for_lpips_logic=original_ccip_id_for_item,
|
706 |
+
ccip_folder_naming_index=current_ccip_naming_idx_for_folder,
|
707 |
+
source_prefix_for_ccip_folder=source_file_prefix if current_ccip_naming_idx_for_folder is not None else None,
|
708 |
+
lpips_folder_naming_index=current_lpips_naming_idx_for_folder,
|
709 |
+
is_halfbody_primary_target_type=item_data['is_halfbody_primary_target_type'],
|
710 |
+
is_derived_head_crop=item_data['is_derived_head_crop'],
|
711 |
+
is_derived_face_crop=item_data['is_derived_face_crop']
|
712 |
+
)
|
713 |
+
try:
|
714 |
+
images_to_zip[final_filename] = image_to_bytes(item_data['pil_image'])
|
715 |
+
except Exception as e_bytes:
|
716 |
+
print(f" Error converting/adding {final_filename} to zip: {e_bytes}")
|
717 |
+
finally:
|
718 |
+
if 'pil_image' in item_data and item_data['pil_image'] is not None:
|
719 |
+
del item_data['pil_image']
|
720 |
+
images_pending_final_processing.clear()
|
721 |
+
|
722 |
+
if not images_to_zip:
|
723 |
+
status_message = f"Processing for {source_file_prefix} finished, but no images were converted for zipping."
|
724 |
+
print(status_message)
|
725 |
+
return None, status_message
|
726 |
+
|
727 |
+
print(f"Preparing zip file for {source_file_prefix} with {len(images_to_zip)} images...")
|
728 |
+
progress_updater(1.0, desc=f"Creating Zip File for {source_file_prefix}...")
|
729 |
+
zip_start_time = time.time()
|
730 |
+
|
731 |
+
# Use NamedTemporaryFile with delete=False for the final output path
|
732 |
+
# This file will persist until manually cleaned or OS cleanup
|
733 |
+
temp_zip_file = tempfile.NamedTemporaryFile(delete=False, suffix=".zip")
|
734 |
+
output_zip_path_temp = temp_zip_file.name
|
735 |
+
temp_zip_file.close() # Close the handle, but file remains
|
736 |
+
|
737 |
+
try:
|
738 |
+
# Write data to the temporary file path
|
739 |
+
create_zip_file(images_to_zip, output_zip_path_temp)
|
740 |
+
zip_duration = time.time() - zip_start_time
|
741 |
+
print(f"Temporary zip file for {source_file_prefix} created in {zip_duration:.2f} seconds at {output_zip_path_temp}")
|
742 |
+
|
743 |
+
# Construct the new, desired filename
|
744 |
+
temp_dir = os.path.dirname(output_zip_path_temp)
|
745 |
+
timestamp = int(time.time())
|
746 |
+
desired_filename = f"{source_file_prefix}_processed_{timestamp}.zip"
|
747 |
+
output_zip_path_final = os.path.join(temp_dir, desired_filename)
|
748 |
+
|
749 |
+
# Rename the temporary file to the desired name
|
750 |
+
print(f"Renaming temp file for {source_file_prefix} to: {output_zip_path_final}")
|
751 |
+
os.rename(output_zip_path_temp, output_zip_path_final)
|
752 |
+
print("Rename successful.")
|
753 |
+
output_zip_path_temp = None # Clear temp path as it's been renamed
|
754 |
+
|
755 |
+
except Exception as zip_or_rename_err:
|
756 |
+
print(f"Error during zip creation or renaming for {source_file_prefix}: {zip_or_rename_err}")
|
757 |
+
# Clean up the *original* temp file if it still exists and renaming failed
|
758 |
+
if output_zip_path_temp and os.path.exists(output_zip_path_temp):
|
759 |
+
try:
|
760 |
+
os.remove(output_zip_path_temp)
|
761 |
+
except OSError:
|
762 |
+
pass
|
763 |
+
if output_zip_path_final and os.path.exists(output_zip_path_final): # Check if rename partially happened
|
764 |
+
try:
|
765 |
+
os.remove(output_zip_path_final)
|
766 |
+
except OSError:
|
767 |
+
pass
|
768 |
+
raise zip_or_rename_err # Re-raise the error
|
769 |
+
|
770 |
+
# --- Prepare Status Message ---
|
771 |
+
processing_duration = time.time() - start_time - zip_duration # Exclude zipping time from processing time
|
772 |
+
total_duration = time.time() - start_time # Includes zipping/renaming
|
773 |
+
|
774 |
+
# --- Build final status message ---
|
775 |
+
person_stats = "N/A"
|
776 |
+
if enable_person_detection:
|
777 |
+
person_stats = f"{total_persons_detected_raw} raw, {person_targets_processed_count} targets (>{min_target_width_person_percentage*100:.1f}% itemW)"
|
778 |
+
|
779 |
+
halfbody_stats = "N/A"
|
780 |
+
if enable_halfbody_detection:
|
781 |
+
halfbody_stats = f"{total_halfbodies_detected_raw} raw, {halfbody_targets_processed_count} targets (>{min_target_width_halfbody_percentage*100:.1f}% itemW)"
|
782 |
+
fullframe_stats = f"{fullframe_targets_processed_count} targets"
|
783 |
+
|
784 |
+
face_stats = "N/A"
|
785 |
+
if enable_face_detection:
|
786 |
+
face_stats = f"{total_faces_detected_on_targets} on targets, {face_crops_pending_count} crops pending (>{min_crop_width_face_percentage*100:.1f}% parentW)"
|
787 |
+
if enable_face_filtering:
|
788 |
+
face_stats += f", {items_filtered_by_face_count} targets filtered"
|
789 |
+
|
790 |
+
head_stats = "N/A"
|
791 |
+
if enable_head_detection:
|
792 |
+
head_stats = f"{total_heads_detected_on_targets} on targets, {head_crops_pending_count} crops pending (>{min_crop_width_head_percentage*100:.1f}% parentW)"
|
793 |
+
if enable_head_filtering:
|
794 |
+
head_stats += f", {items_filtered_by_head_count} targets filtered"
|
795 |
+
|
796 |
+
ccip_stats = "N/A"
|
797 |
+
if enable_ccip_classification:
|
798 |
+
ccip_stats = f"{next_ccip_cluster_id} original clusters created, on {ccip_applied_count} targets. Folders renamed by image count."
|
799 |
+
|
800 |
+
lpips_stats = "N/A"
|
801 |
+
if enable_lpips_clustering:
|
802 |
+
lpips_stats = f"{lpips_images_subject_to_clustering} images processed, {total_lpips_clusters_created} clusters, {total_lpips_noise_samples} noise. Folders renamed by image count."
|
803 |
+
|
804 |
+
aesthetic_stats = "N/A"
|
805 |
+
if enable_aesthetic_analysis:
|
806 |
+
aesthetic_stats = f"On {aesthetic_applied_count} targets"
|
807 |
+
|
808 |
+
item_desc_for_stats = "Raw Frames in Video" if is_video_source else "Items from Provider"
|
809 |
+
status_message = (
|
810 |
+
f"Processing for '{source_file_prefix}' Complete!\n"
|
811 |
+
f"Total time: {total_duration:.2f}s (Proc: {processing_duration:.2f}s, Zip: {zip_duration:.2f}s)\n"
|
812 |
+
f"{item_desc_for_stats}: {total_items_for_desc}, Processed Items: {processed_items_count}\n"
|
813 |
+
f"--- Primary Targets Processed ---\n"
|
814 |
+
f" Person Detection: {person_stats}\n"
|
815 |
+
f" Half-Body Detection: {halfbody_stats}\n"
|
816 |
+
f" Full Item Processing: {fullframe_stats}\n"
|
817 |
+
f"--- Items Pending Final Processing ({main_targets_pending_count} main, {face_crops_pending_count} face, {head_crops_pending_count} head) ---\n"
|
818 |
+
f" Face Detection: {face_stats}\n"
|
819 |
+
f" Head Detection: {head_stats}\n"
|
820 |
+
f" CCIP Classification: {ccip_stats}\n"
|
821 |
+
f" LPIPS Clustering: {lpips_stats}\n"
|
822 |
+
f" Aesthetic Analysis: {aesthetic_stats}\n"
|
823 |
+
f"Zip file contains {len(images_to_zip)} images.\n"
|
824 |
+
f"Output Zip: {output_zip_path_final}"
|
825 |
+
)
|
826 |
+
print(status_message)
|
827 |
+
progress_updater(1.0, desc=f"Finished {source_file_prefix}!")
|
828 |
+
|
829 |
+
# Return the path to the zip file
|
830 |
+
return output_zip_path_final, status_message
|
831 |
+
|
832 |
+
except Exception as e:
|
833 |
+
print(f"!! An unhandled error occurred during processing of {source_file_prefix}: {e}")
|
834 |
+
traceback.print_exc() # Print detailed traceback for debugging
|
835 |
+
# Clean up main data structures
|
836 |
+
images_pending_final_processing.clear()
|
837 |
+
ccip_clusters_info.clear()
|
838 |
+
gc.collect()
|
839 |
+
|
840 |
+
# Clean up temp file if it exists on general error
|
841 |
+
if output_zip_path_temp and os.path.exists(output_zip_path_temp):
|
842 |
+
try:
|
843 |
+
os.remove(output_zip_path_temp)
|
844 |
+
except OSError:
|
845 |
+
pass
|
846 |
+
|
847 |
+
# Clean up final file if it exists on general error (maybe renaming succeeded but later code failed)
|
848 |
+
if output_zip_path_final and os.path.exists(output_zip_path_final):
|
849 |
+
try:
|
850 |
+
os.remove(output_zip_path_final)
|
851 |
+
except OSError:
|
852 |
+
pass
|
853 |
+
return None, f"An error occurred with {source_file_prefix}: {e}"
|
854 |
+
|
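# Note on the two callable contracts consumed above (a minimal sketch, not app code;
# the helper name below is purely illustrative): frames_provider is any iterator of
# (PIL.Image, identifier, 1-based item index, total item count), and progress_updater
# is any callable accepting (value: float, desc: str). For example:
#
#   def demo_frames_provider(paths):
#       total = len(paths)
#       for idx, path in enumerate(paths):
#           yield Image.open(path).convert('RGB'), idx, idx + 1, total
#
# and a no-op updater is simply: lambda value, desc="": None.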
855 |
+
# --- Main Processing Function for Input files ---
|
856 |
+
def process_inputs_main(
|
857 |
+
input_file_objects: List[Any], # Gradio File component gives list of tempfile._TemporaryFileWrapper
|
858 |
+
sample_interval_ms: int, # Relevant for videos only
|
859 |
+
# Person Detection
|
860 |
+
enable_person_detection: bool,
|
861 |
+
min_target_width_person_percentage: float,
|
862 |
+
person_model_name: str,
|
863 |
+
person_conf_threshold: float,
|
864 |
+
person_iou_threshold: float,
|
865 |
+
# Half-Body Detection
|
866 |
+
enable_halfbody_detection: bool,
|
867 |
+
enable_halfbody_cropping: bool,
|
868 |
+
min_target_width_halfbody_percentage: float,
|
869 |
+
halfbody_model_name: str,
|
870 |
+
halfbody_conf_threshold: float,
|
871 |
+
halfbody_iou_threshold: float,
|
872 |
+
# Head Detection
|
873 |
+
enable_head_detection: bool,
|
874 |
+
enable_head_cropping: bool,
|
875 |
+
min_crop_width_head_percentage: float,
|
876 |
+
enable_head_filtering: bool,
|
877 |
+
head_model_name: str,
|
878 |
+
head_conf_threshold: float,
|
879 |
+
head_iou_threshold: float,
|
880 |
+
# Face Detection
|
881 |
+
enable_face_detection: bool,
|
882 |
+
enable_face_cropping: bool,
|
883 |
+
min_crop_width_face_percentage: float,
|
884 |
+
enable_face_filtering: bool,
|
885 |
+
face_model_name: str,
|
886 |
+
face_conf_threshold: float,
|
887 |
+
face_iou_threshold: float,
|
888 |
+
# CCIP Classification
|
889 |
+
enable_ccip_classification: bool,
|
890 |
+
ccip_model_name: str,
|
891 |
+
ccip_threshold: float,
|
892 |
+
# LPIPS Clustering
|
893 |
+
enable_lpips_clustering: bool,
|
894 |
+
lpips_threshold: float,
|
895 |
+
# Aesthetic Analysis
|
896 |
+
enable_aesthetic_analysis: bool,
|
897 |
+
aesthetic_model_name: str,
|
898 |
+
progress=gr.Progress(track_tqdm=True) # Gradio progress for overall processing
|
899 |
+
) -> Tuple[Optional[List[str]], str]: # Returns list of ZIP paths and combined status
|
900 |
+
|
901 |
+
if not input_file_objects:
|
902 |
+
return [], "Error: No files provided."
|
903 |
+
|
904 |
+
video_file_temp_objects: List[Any] = []
|
905 |
+
image_file_temp_objects: List[Any] = []
|
906 |
+
|
907 |
+
for file_obj in input_file_objects:
|
908 |
+
# gr.Files returns a list of tempfile._TemporaryFileWrapper objects
|
909 |
+
# We need the .name attribute to get the actual file path
|
910 |
+
file_name = getattr(file_obj, 'orig_name', file_obj.name) # Use original name if available
|
911 |
+
if isinstance(file_name, str):
|
912 |
+
lower_file_name = file_name.lower()
|
913 |
+
if any(lower_file_name.endswith(ext) for ext in VIDEO_EXTENSIONS):
|
914 |
+
video_file_temp_objects.append(file_obj)
|
915 |
+
elif any(lower_file_name.endswith(ext) for ext in IMAGE_EXTENSIONS):
|
916 |
+
image_file_temp_objects.append(file_obj)
|
917 |
+
else:
|
918 |
+
print(f"Warning: File '{file_name}' has an unrecognized extension and will be skipped.")
|
919 |
+
else:
|
920 |
+
print(f"Warning: File object {file_obj} does not have a valid name and will be skipped.")
|
921 |
+
|
922 |
+
|
923 |
+
output_zip_paths_all_sources = []
|
924 |
+
all_status_messages = []
|
925 |
+
|
926 |
+
total_processing_tasks = (1 if image_file_temp_objects else 0) + len(video_file_temp_objects)
|
927 |
+
if total_processing_tasks == 0:
|
928 |
+
return [], "No processable video or image files found in the input."
|
929 |
+
|
930 |
+
tasks_completed_count = 0
|
931 |
+
|
932 |
+
# Print overall settings once
|
933 |
+
print(f"--- Overall Batch Processing Settings ---")
|
934 |
+
print(f" Number of image sequences to process: {1 if image_file_temp_objects else 0}")
|
935 |
+
print(f" Number of videos to process: {len(video_file_temp_objects)}")
|
936 |
+
print(f" Sample Interval (for videos): {sample_interval_ms}ms")
|
937 |
+
print(f" Detection Order: Person => Half-Body (alt) => Face => Head. Then: CCIP => LPIPS => Aesthetic.")
|
938 |
+
print(f" Person Detect = {enable_person_detection}" + (f" (MinW:{min_target_width_person_percentage*100:.1f}%, Mdl:{person_model_name}, Conf:{person_conf_threshold:.2f}, IoU:{person_iou_threshold:.2f})" if enable_person_detection else ""))
|
939 |
+
print(f" HalfBody Detect = {enable_halfbody_detection}" + (f" (FullFrameOnly, Crop:{enable_halfbody_cropping}, MinW:{min_target_width_halfbody_percentage*100:.1f}%, Mdl:{halfbody_model_name}, Conf:{halfbody_conf_threshold:.2f}, IoU:{halfbody_iou_threshold:.2f})" if enable_halfbody_detection else ""))
|
940 |
+
print(f" Face Detect = {enable_face_detection}" + (f" (Crop:{enable_face_cropping}, MinW:{min_crop_width_face_percentage*100:.1f}%, Filter:{enable_face_filtering}, Mdl:{face_model_name}, Conf:{face_conf_threshold:.2f}, IoU:{face_iou_threshold:.2f})" if enable_face_detection else ""))
|
941 |
+
print(f" Head Detect = {enable_head_detection}" + (f" (Crop:{enable_head_cropping}, MinW:{min_crop_width_head_percentage*100:.1f}%, Filter:{enable_head_filtering}, Mdl:{head_model_name}, Conf:{head_conf_threshold:.2f}, IoU:{head_iou_threshold:.2f})" if enable_head_detection else ""))
|
942 |
+
print(f" CCIP Classify = {enable_ccip_classification}" + (f" (Mdl:{ccip_model_name}, Thr:{ccip_threshold:.3f})" if enable_ccip_classification else ""))
|
943 |
+
print(f" LPIPS Clustering = {enable_lpips_clustering}" + (f" (Thr:{lpips_threshold:.3f})" if enable_lpips_clustering else ""))
|
944 |
+
print(f" Aesthetic Analyze = {enable_aesthetic_analysis}" + (f" (Mdl:{aesthetic_model_name})" if enable_aesthetic_analysis else ""))
|
945 |
+
print(f"--- End of Overall Settings ---")
|
946 |
+
|
947 |
+
|
948 |
+
# --- Process Image Sequence (if any) ---
|
949 |
+
if image_file_temp_objects:
|
950 |
+
image_group_label_base = "ImageGroup"
|
951 |
+
# Attempt to use first image name for more uniqueness, fallback to timestamp
|
952 |
+
try:
|
953 |
+
first_image_orig_name = getattr(image_file_temp_objects[0], 'orig_name', image_file_temp_objects[0].name)
|
954 |
+
image_group_label_base = sanitize_filename(first_image_orig_name, max_len=20)
|
955 |
+
except Exception:
|
956 |
+
pass # Stick with "ImageGroup"
|
957 |
+
|
958 |
+
image_source_file_prefix = f"{image_group_label_base}_{int(time.time())}"
|
959 |
+
|
960 |
+
current_task_number = tasks_completed_count + 1
|
961 |
+
progress_description_prefix = f"Image Seq. {current_task_number}/{total_processing_tasks} ({image_source_file_prefix})"
|
962 |
+
progress(tasks_completed_count / total_processing_tasks, desc=f"{progress_description_prefix}: Starting...")
|
963 |
+
print(f"\n>>> Processing Image Sequence: {image_source_file_prefix} ({len(image_file_temp_objects)} images) <<<")
|
964 |
+
|
965 |
+
def image_frames_provider_generator() -> Iterator[Tuple[Image.Image, int, int, int]]:
|
966 |
+
num_images = len(image_file_temp_objects)
|
967 |
+
for idx, img_obj in enumerate(image_file_temp_objects):
|
968 |
+
try:
|
969 |
+
pil_img = Image.open(img_obj.name).convert('RGB')
|
970 |
+
yield pil_img, idx, idx + 1, num_images
|
971 |
+
except Exception as e_load:
|
972 |
+
print(f"Error loading image {getattr(img_obj, 'orig_name', img_obj.name)}: {e_load}. Skipping.")
|
973 |
+
# If we skip, the num_images total reported to _process_input_source_frames will be slightly too high
|
974 |
+
# For simplicity, we'll proceed, but this could be refined to adjust total_items dynamically.
|
975 |
+
# Or, pre-filter loadable images. For now, just skip.
|
976 |
+
continue
|
977 |
+
|
978 |
+
def image_group_progress_updater(item_progress_value: float, desc: str):
|
979 |
+
overall_progress = (tasks_completed_count + item_progress_value) / total_processing_tasks
|
980 |
+
progress(overall_progress, desc=f"{progress_description_prefix}: {desc}")
|
981 |
+
|
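# Example of the overall-progress mapping above (hypothetical values): with 3 total
# tasks and 1 already completed, an item progress of 0.5 is reported to the shared
# Gradio bar as (1 + 0.5) / 3 = 0.5.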
982 |
+
try:
|
983 |
+
zip_file_path_single, status_message_single = _process_input_source_frames(
|
984 |
+
source_file_prefix=image_source_file_prefix,
|
985 |
+
frames_provider=image_frames_provider_generator(),
|
986 |
+
is_video_source=False,
|
987 |
+
enable_person_detection=enable_person_detection,
|
988 |
+
min_target_width_person_percentage=min_target_width_person_percentage,
|
989 |
+
person_model_name=person_model_name,
|
990 |
+
person_conf_threshold=person_conf_threshold,
|
991 |
+
person_iou_threshold=person_iou_threshold,
|
992 |
+
enable_halfbody_detection=enable_halfbody_detection,
|
993 |
+
enable_halfbody_cropping=enable_halfbody_cropping,
|
994 |
+
min_target_width_halfbody_percentage=min_target_width_halfbody_percentage,
|
995 |
+
halfbody_model_name=halfbody_model_name,
|
996 |
+
halfbody_conf_threshold=halfbody_conf_threshold,
|
997 |
+
halfbody_iou_threshold=halfbody_iou_threshold,
|
998 |
+
enable_head_detection=enable_head_detection,
|
999 |
+
enable_head_cropping=enable_head_cropping,
|
1000 |
+
min_crop_width_head_percentage=min_crop_width_head_percentage,
|
1001 |
+
enable_head_filtering=enable_head_filtering,
|
1002 |
+
head_model_name=head_model_name,
|
1003 |
+
head_conf_threshold=head_conf_threshold,
|
1004 |
+
head_iou_threshold=head_iou_threshold,
|
1005 |
+
enable_face_detection=enable_face_detection,
|
1006 |
+
enable_face_cropping=enable_face_cropping,
|
1007 |
+
min_crop_width_face_percentage=min_crop_width_face_percentage,
|
1008 |
+
enable_face_filtering=enable_face_filtering,
|
1009 |
+
face_model_name=face_model_name,
|
1010 |
+
face_conf_threshold=face_conf_threshold,
|
1011 |
+
face_iou_threshold=face_iou_threshold,
|
1012 |
+
enable_ccip_classification=enable_ccip_classification,
|
1013 |
+
ccip_model_name=ccip_model_name,
|
1014 |
+
ccip_threshold=ccip_threshold,
|
1015 |
+
enable_lpips_clustering=enable_lpips_clustering,
|
1016 |
+
lpips_threshold=lpips_threshold,
|
1017 |
+
enable_aesthetic_analysis=enable_aesthetic_analysis,
|
1018 |
+
aesthetic_model_name=aesthetic_model_name,
|
1019 |
+
progress_updater=image_group_progress_updater
|
1020 |
+
)
|
1021 |
+
if zip_file_path_single:
|
1022 |
+
output_zip_paths_all_sources.append(zip_file_path_single)
|
1023 |
+
all_status_messages.append(f"--- Image Sequence ({image_source_file_prefix}) Processing Succeeded ---\n{status_message_single}")
|
1024 |
+
else:
|
1025 |
+
all_status_messages.append(f"--- Image Sequence ({image_source_file_prefix}) Processing Failed ---\n{status_message_single}")
|
1026 |
+
except Exception as e_img_seq:
|
1027 |
+
error_msg = f"Critical error during processing of image sequence {image_source_file_prefix}: {e_img_seq}"
|
1028 |
+
print(error_msg)
|
1029 |
+
traceback.print_exc()
|
1030 |
+
all_status_messages.append(f"--- Image Sequence ({image_source_file_prefix}) Processing CRITICALLY FAILED ---\n{error_msg}")
|
1031 |
+
|
1032 |
+
tasks_completed_count += 1
|
1033 |
+
print(f">>> Finished attempt for Image Sequence: {image_source_file_prefix} <<<")
|
1034 |
+
|
1035 |
+
# --- Process Video Files (if any) ---
|
1036 |
+
for video_idx, video_file_temp_obj in enumerate(video_file_temp_objects):
|
1037 |
+
video_path_temp = video_file_temp_obj.name
|
1038 |
+
video_original_filename = os.path.basename(getattr(video_file_temp_obj, 'orig_name', video_path_temp))
|
1039 |
+
video_source_file_prefix = sanitize_filename(video_original_filename)
|
1040 |
+
|
1041 |
+
current_task_number = tasks_completed_count + 1
|
1042 |
+
progress_description_prefix = f"Video {current_task_number}/{total_processing_tasks}"
|
1043 |
+
|
1044 |
+
print(f"\n>>> Processing Video: {video_original_filename} (Sanitized Prefix: {video_source_file_prefix}) <<<")
|
1045 |
+
progress(tasks_completed_count / total_processing_tasks, desc=f"{progress_description_prefix}: Starting processing...")
|
1046 |
+
|
1047 |
+
# It yields: (PIL.Image, frame position in ms as an int identifier, 1-based raw frame index, total_items_for_desc)
|
1048 |
+
# The third element will be the raw frame number based on CAP_PROP_POS_FRAMES or current_pos_ms
|
1049 |
+
# to align progress with total_items_for_desc (raw frame count).
|
1050 |
+
def video_frames_provider_generator(video_path: str, interval_ms: int) -> Iterator[Tuple[Image.Image, int, int, int]]:
|
1051 |
+
cap = cv2.VideoCapture(video_path)
|
1052 |
+
if not cap.isOpened():
|
1053 |
+
print(f"Error: Could not open video file for provider: {video_path}")
|
1054 |
+
return
|
1055 |
+
|
1056 |
+
total_items_for_desc = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
|
1057 |
+
|
1058 |
+
if total_items_for_desc <= 0:
|
1059 |
+
print(f"Warning: Video {video_original_filename} reported {total_items_for_desc} frames. This might be inaccurate. Proceeding...")
|
1060 |
+
# If it's 0, the progress in _process_input_source_frames might behave unexpectedly.
|
1061 |
+
# Setting to 1 to avoid division by zero, but this means progress won't be very useful.
|
1062 |
+
total_items_for_desc = 1 # Fallback to prevent division by zero
|
1063 |
+
|
1064 |
+
# processed_count_in_provider = 0 # Counts *sampled* frames, not used for progress index
|
1065 |
+
last_processed_ms = -float('inf')
|
1066 |
+
raw_frames_read_by_provider = 0 # Counts all frames read by cap.read()
|
1067 |
+
|
1068 |
+
try:
|
1069 |
+
while True:
|
1070 |
+
# For progress, use current_pos_ms or CAP_PROP_POS_FRAMES
|
1071 |
+
# CAP_PROP_POS_FRAMES is a 0-based index of the next frame to be decoded/captured.
|
1072 |
+
current_raw_frame_index = int(cap.get(cv2.CAP_PROP_POS_FRAMES)) # Use this for progress
|
1073 |
+
current_pos_ms_in_provider = cap.get(cv2.CAP_PROP_POS_MSEC)
|
1074 |
+
|
1075 |
+
# Loop break condition (more robust)
|
1076 |
+
if raw_frames_read_by_provider > 0 and current_pos_ms_in_provider <= last_processed_ms and interval_ms > 0:
|
1077 |
+
# If interval_ms is 0 or very small, current_pos_ms might not advance much for consecutive reads.
|
1078 |
+
# Adding a check for raw_frames_read_by_provider against a large number or CAP_PROP_FRAME_COUNT
|
1079 |
+
# could be an additional safety, but CAP_PROP_FRAME_COUNT can be unreliable.
|
1080 |
+
# The ret_frame check is the primary exit.
|
1081 |
+
pass # Let ret_frame handle the actual end. This check is for stuck videos.
|
1082 |
+
|
1083 |
+
|
1084 |
+
should_process_this_frame = current_pos_ms_in_provider >= last_processed_ms + interval_ms - 1
|
1085 |
+
|
1086 |
+
ret_frame, frame_cv_data = cap.read()
|
1087 |
+
if not ret_frame: # Primary exit point for the loop
|
1088 |
+
break
|
1089 |
+
raw_frames_read_by_provider +=1 # Incremented after successful read
|
1090 |
+
|
1091 |
+
if should_process_this_frame:
|
1092 |
+
try:
|
1093 |
+
pil_img = convert_to_pil(frame_cv_data)
|
1094 |
+
last_processed_ms = current_pos_ms_in_provider
|
1095 |
+
yield pil_img, int(current_pos_ms_in_provider), current_raw_frame_index + 1, total_items_for_desc # Yield 1-based raw frame index
|
1096 |
+
except Exception as e_conv:
|
1097 |
+
print(f"Error converting frame at {current_pos_ms_in_provider}ms (raw index {current_raw_frame_index}) for {video_original_filename}: {e_conv}. Skipping.")
|
1098 |
+
finally:
|
1099 |
+
pass
|
1100 |
+
finally:
|
1101 |
+
if cap.isOpened():
|
1102 |
+
cap.release()
|
1103 |
+
print(f" Video capture for provider ({video_original_filename}) released.")
|
1104 |
+
|
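# Sampling note (illustrative numbers): with sample_interval_ms = 1000 the provider
# above emits roughly one frame per second of video, since a frame is only converted
# once CAP_PROP_POS_MSEC has advanced to at least last_processed_ms + interval_ms - 1.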
1105 |
+
def video_progress_updater(item_progress_value: float, desc: str):
|
1106 |
+
overall_progress = (tasks_completed_count + item_progress_value) / total_processing_tasks
|
1107 |
+
            progress(overall_progress, desc=f"{progress_description_prefix}: {desc}")

        try:
            zip_file_path_single, status_message_single = _process_input_source_frames(
                source_file_prefix=video_source_file_prefix,
                frames_provider=video_frames_provider_generator(video_path_temp, sample_interval_ms),
                is_video_source=True,
                enable_person_detection=enable_person_detection,
                min_target_width_person_percentage=min_target_width_person_percentage,
                person_model_name=person_model_name,
                person_conf_threshold=person_conf_threshold,
                person_iou_threshold=person_iou_threshold,
                enable_halfbody_detection=enable_halfbody_detection,
                enable_halfbody_cropping=enable_halfbody_cropping,
                min_target_width_halfbody_percentage=min_target_width_halfbody_percentage,
                halfbody_model_name=halfbody_model_name,
                halfbody_conf_threshold=halfbody_conf_threshold,
                halfbody_iou_threshold=halfbody_iou_threshold,
                enable_head_detection=enable_head_detection,
                enable_head_cropping=enable_head_cropping,
                min_crop_width_head_percentage=min_crop_width_head_percentage,
                enable_head_filtering=enable_head_filtering,
                head_model_name=head_model_name,
                head_conf_threshold=head_conf_threshold,
                head_iou_threshold=head_iou_threshold,
                enable_face_detection=enable_face_detection,
                enable_face_cropping=enable_face_cropping,
                min_crop_width_face_percentage=min_crop_width_face_percentage,
                enable_face_filtering=enable_face_filtering,
                face_model_name=face_model_name,
                face_conf_threshold=face_conf_threshold,
                face_iou_threshold=face_iou_threshold,
                enable_ccip_classification=enable_ccip_classification,
                ccip_model_name=ccip_model_name,
                ccip_threshold=ccip_threshold,
                enable_lpips_clustering=enable_lpips_clustering,
                lpips_threshold=lpips_threshold,
                enable_aesthetic_analysis=enable_aesthetic_analysis,
                aesthetic_model_name=aesthetic_model_name,
                progress_updater=video_progress_updater
            )
            if zip_file_path_single:
                output_zip_paths_all_sources.append(zip_file_path_single)
                all_status_messages.append(f"--- Video ({video_original_filename}) Processing Succeeded ---\n{status_message_single}")
            else:
                all_status_messages.append(f"--- Video ({video_original_filename}) Processing Failed ---\n{status_message_single}")

        except Exception as e_vid:
            # Catches errors if _process_input_source_frames itself raises an unhandled exception
            # (it also has its own internal try-except)
            error_msg = f"Critical error during processing of video {video_original_filename}: {e_vid}"
            print(error_msg)
            traceback.print_exc()
            all_status_messages.append(f"--- Video ({video_original_filename}) Processing CRITICALLY FAILED ---\n{error_msg}")

        tasks_completed_count += 1
        print(f">>> Finished attempt for Video: {video_original_filename} <<<")
        # Gradio manages the lifecycle of video_path_temp (the uploaded temp file)

    final_summary_message = "\n\n==============================\n\n".join(all_status_messages)

    successful_zips_count = len(output_zip_paths_all_sources)
    if successful_zips_count == 0 and total_processing_tasks > 0:
        final_summary_message = f"ALL {total_processing_tasks} INPUT SOURCE(S) FAILED TO PRODUCE A ZIP FILE.\n\n" + final_summary_message
    elif total_processing_tasks > 0:
        final_summary_message = f"Successfully processed {successful_zips_count} out of {total_processing_tasks} input source(s).\n\n" + final_summary_message
    else:  # Should be caught earlier by "No processable files"
        final_summary_message = "No inputs were processed."

    progress(1.0, desc="All processing attempts finished.")

    # gr.Files output expects a list of file paths. An empty list is fine if no files.
    return output_zip_paths_all_sources, final_summary_message

# --- Gradio Interface Setup ---

css = """
/* Default (Light Mode) Styles */
#warning {
    background-color: #FFCCCB; /* Light red background */
    padding: 10px;
    border-radius: 5px;
    color: #A00000; /* Dark red text */
    border: 1px solid #E5B8B7; /* A slightly darker border for more definition */
}
/* Dark Mode Styles */
@media (prefers-color-scheme: dark) {
    #warning {
        background-color: #5C1A1A; /* Darker red background, suitable for dark mode */
        color: #FFDDDD; /* Light pink text, for good contrast against the dark red background */
        border: 1px solid #8B0000; /* A more prominent dark red border in dark mode */
    }
}
#status_box {
    white-space: pre-wrap !important; /* Ensure status messages show newlines */
    font-family: monospace; /* Optional: Use monospace for better alignment */
}
"""

# --- Define Model Lists ---
person_models = ['person_detect_v1.3_s', 'person_detect_v1.2_s', 'person_detect_v1.1_s', 'person_detect_v1.1_m', 'person_detect_v1_m', 'person_detect_v1.1_n', 'person_detect_v0_s', 'person_detect_v0_m', 'person_detect_v0_x']
halfbody_models = ['halfbody_detect_v1.0_s', 'halfbody_detect_v1.0_n', 'halfbody_detect_v0.4_s', 'halfbody_detect_v0.3_s', 'halfbody_detect_v0.2_s']
head_models = ['head_detect_v2.0_s', 'head_detect_v2.0_m', 'head_detect_v2.0_n', 'head_detect_v2.0_x', 'head_detect_v2.0_s_yv11', 'head_detect_v2.0_m_yv11', 'head_detect_v2.0_n_yv11', 'head_detect_v2.0_x_yv11', 'head_detect_v2.0_l_yv11']
face_models = ['face_detect_v1.4_s', 'face_detect_v1.4_n', 'face_detect_v1.3_s', 'face_detect_v1.3_n', 'face_detect_v1.2_s', 'face_detect_v1.1_s', 'face_detect_v1.1_n', 'face_detect_v1_s', 'face_detect_v1_n', 'face_detect_v0_s', 'face_detect_v0_n']
ccip_models = ['ccip-caformer-24-randaug-pruned', 'ccip-caformer-6-randaug-pruned_fp32', 'ccip-caformer-5_fp32']
aesthetic_models = ['swinv2pv3_v0_448_ls0.2_x', 'swinv2pv3_v0_448_ls0.2', 'caformer_s36_v0_ls0.2']

with gr.Blocks(css=css) as demo:
    gr.Markdown("# Video Processor using dghs-imgutils")
    gr.Markdown("Upload one or more videos, or a sequence of images. Videos are processed individually, while multiple images are treated as a single sequence. Each processed source (video or image sequence) is then sequentially analyzed by [dghs-imgutils](https://github.com/deepghs/imgutils) to detect subjects, classify items, and process its content according to your settings, ultimately generating a ZIP file with the extracted images.")
    gr.Markdown("**Detection Flow:** " +
                "[Person](https://dghs-imgutils.deepghs.org/main/api_doc/detect/person.html) ⇒ " +
                "[Half-Body](https://dghs-imgutils.deepghs.org/main/api_doc/detect/halfbody.html) (if no person) ⇒ " +
                "[Face](https://dghs-imgutils.deepghs.org/main/api_doc/detect/face.html) (on target) ⇒ " +
                "[Head](https://dghs-imgutils.deepghs.org/main/api_doc/detect/head.html) (on target).")
    gr.Markdown("**Analysis Flow:** " +
                "[CCIP](https://dghs-imgutils.deepghs.org/main/api_doc/metrics/ccip.html) Clustering ⇒ " +
                "[LPIPS](https://dghs-imgutils.deepghs.org/main/api_doc/metrics/lpips.html) Clustering ⇒ " +
                "[Aesthetic](https://dghs-imgutils.deepghs.org/main/api_doc/metrics/dbaesthetic.html) Labeling.")
    gr.Markdown("**Note on CCIP Folders:** CCIP cluster folders are named `{source_prefix}_ccip_XXX`, sorted by image count (most images = `_ccip_000`).")
    gr.Markdown("**Note on LPIPS Folders:** LPIPS cluster folders (e.g., `lpips_XXX` or `lpips_sub_XXX`) are also sorted by image count within their scope. 'noise' folders are named explicitly.")

    with gr.Row():
        with gr.Column(scale=1):
            # --- Input Components ---
            process_button = gr.Button("Process Input(s) & Generate ZIP(s)", variant="primary")
            input_files = gr.Files(label="Upload Videos or Image Sequences", file_types=['video', 'image'], file_count="multiple")
            sample_interval_ms = gr.Number(label="Sample Interval (ms, for videos)", value=1000, minimum=1, step=100)

            # --- Detection Options ---
            gr.Markdown("**Detection Options**")
            # --- Person Detection Block ---
            with gr.Accordion("Person Detection Options", open=True):
                enable_person_detection = gr.Checkbox(label="Enable Person Detection", value=True)
                with gr.Group() as person_detection_params_group:
                    min_target_width_person_percentage_slider = gr.Slider(
                        minimum=0.0, maximum=1.0, value=0.25, step=0.01,
                        label="Min Target Width (% of Item Width)",
                        info="Minimum width for a detected person to be processed (e.g., 0.25 = 25%)."
                    )
                    person_model_name_dd = gr.Dropdown(person_models, label="PD Model", value=person_models[0])
                    person_conf_threshold = gr.Slider(0.0, 1.0, value=0.3, step=0.05, label="PD Conf")
                    person_iou_threshold = gr.Slider(0.0, 1.0, value=0.5, step=0.05, label="PD IoU")
                enable_person_detection.change(fn=lambda e: gr.update(visible=e), inputs=enable_person_detection, outputs=person_detection_params_group)

            # --- Half-Body Detection Block ---
            with gr.Accordion("Half-Body Detection Options", open=True):
                enable_halfbody_detection = gr.Checkbox(label="Enable Half-Body Detection", value=True)
                with gr.Group() as halfbody_params_group:
                    gr.Markdown("<small>_Detects half-bodies in full items if Person Detection is off/fails._</small>")
                    enable_halfbody_cropping = gr.Checkbox(label="Use Half-Bodies as Targets", value=True)
                    min_target_width_halfbody_percentage_slider = gr.Slider(
                        minimum=0.0, maximum=1.0, value=0.25, step=0.01,
                        label="Min Target Width (% of Item Width)",
                        info="Minimum width for a detected half-body to be processed (e.g., 0.25 = 25%)."
                    )
                    halfbody_model_name_dd = gr.Dropdown(halfbody_models, label="HBD Model", value=halfbody_models[0])
                    halfbody_conf_threshold = gr.Slider(0.0, 1.0, value=0.5, step=0.05, label="HBD Conf")
                    halfbody_iou_threshold = gr.Slider(0.0, 1.0, value=0.7, step=0.05, label="HBD IoU")
                enable_halfbody_detection.change(fn=lambda e: gr.update(visible=e), inputs=enable_halfbody_detection, outputs=halfbody_params_group)

            # --- Face Detection Block ---
            with gr.Accordion("Face Detection Options", open=True):
                enable_face_detection = gr.Checkbox(label="Enable Face Detection", value=True)
                with gr.Group() as face_params_group:
                    enable_face_filtering = gr.Checkbox(label="Filter Targets Without Detected Faces", value=True)
                    enable_face_cropping = gr.Checkbox(label="Crop Detected Faces", value=False)
                    min_crop_width_face_percentage_slider = gr.Slider(
                        minimum=0.0, maximum=1.0, value=0.2, step=0.01,
                        label="Min Crop Width (% of Parent Width)",
                        info="Minimum width for a face crop relative to its parent image's width (e.g., 0.2 = 20%)."
                    )
                    face_model_name_dd = gr.Dropdown(face_models, label="FD Model", value=face_models[0])
                    face_conf_threshold = gr.Slider(0.0, 1.0, value=0.25, step=0.05, label="FD Conf")
                    face_iou_threshold = gr.Slider(0.0, 1.0, value=0.7, step=0.05, label="FD IoU")
                enable_face_detection.change(fn=lambda e: gr.update(visible=e), inputs=enable_face_detection, outputs=face_params_group)

            # --- Head Detection Block ---
            with gr.Accordion("Head Detection Options", open=True):
                enable_head_detection = gr.Checkbox(label="Enable Head Detection", value=True)
                with gr.Group() as head_params_group:
                    gr.Markdown("<small>_Detects heads in targets. Crops if meets width req._</small>")
                    enable_head_filtering = gr.Checkbox(label="Filter Targets Without Heads", value=True)
                    enable_head_cropping = gr.Checkbox(label="Crop Detected Heads", value=False)
                    min_crop_width_head_percentage_slider = gr.Slider(
                        minimum=0.0, maximum=1.0, value=0.2, step=0.01,
                        label="Min Crop Width (% of Parent Width)",
                        info="Minimum width for a head crop relative to its parent image's width (e.g., 0.2 = 20%)."
                    )
                    head_model_name_dd = gr.Dropdown(head_models, label="HD Model", value=head_models[0])
                    head_conf_threshold = gr.Slider(0.0, 1.0, value=0.4, step=0.05, label="HD Conf")
                    head_iou_threshold = gr.Slider(0.0, 1.0, value=0.7, step=0.05, label="HD IoU")
                enable_head_detection.change(fn=lambda e: gr.update(visible=e), inputs=enable_head_detection, outputs=head_params_group)

            # --- Analysis/Classification Options ---
            gr.Markdown("**Analysis & Classification**")
            # --- CCIP Classification Block ---
            with gr.Accordion("CCIP Classification Options", open=True):
                enable_ccip_classification = gr.Checkbox(label="Enable CCIP Classification", value=True)
                with gr.Group() as ccip_params_group:
                    gr.Markdown("<small>_Clusters results by similarity. Folders sorted by image count._</small>")
                    ccip_model_name_dd = gr.Dropdown(ccip_models, label="CCIP Model", value=ccip_models[0])
                    ccip_threshold_slider = gr.Slider(0.0, 1.0, step=0.01, value=0.20, label="CCIP Similarity Threshold")
                enable_ccip_classification.change(fn=lambda e: gr.update(visible=e), inputs=enable_ccip_classification, outputs=ccip_params_group)

            # LPIPS Clustering Options
            with gr.Accordion("LPIPS Clustering Options", open=True):
                enable_lpips_clustering = gr.Checkbox(label="Enable LPIPS Clustering", value=True)
                with gr.Group() as lpips_params_group:
                    gr.Markdown("<small>_Clusters images by LPIPS similarity. Applied after CCIP (if enabled) or globally. Folders sorted by image count._</small>")
                    lpips_threshold_slider = gr.Slider(0.0, 1.0, step=0.01, value=0.45, label="LPIPS Similarity Threshold")
                enable_lpips_clustering.change(fn=lambda e: gr.update(visible=e), inputs=enable_lpips_clustering, outputs=lpips_params_group)

            # --- Aesthetic Analysis Block ---
            with gr.Accordion("Aesthetic Analysis Options", open=True):
                enable_aesthetic_analysis = gr.Checkbox(label="Enable Aesthetic Analysis (Anime)", value=True)
                with gr.Group() as aesthetic_params_group:
                    gr.Markdown("<small>_Prepends aesthetic label to filenames._</small>")
                    aesthetic_model_name_dd = gr.Dropdown(aesthetic_models, label="Aesthetic Model", value=aesthetic_models[0])
                enable_aesthetic_analysis.change(fn=lambda e: gr.update(visible=e), inputs=enable_aesthetic_analysis, outputs=aesthetic_params_group)

            gr.Markdown("---")
            gr.Markdown("**Warning:** Complex combinations can be slow. Models downloaded on first use.", elem_id="warning")

        with gr.Column(scale=1):
            # --- Output Components ---
            status_text = gr.Textbox(label="Processing Status", interactive=False, lines=20, elem_id="status_box")
            output_zips = gr.Files(label="Download Processed Images (ZIPs)")

    # Connect button click
    process_button.click(
        fn=process_inputs_main,
        inputs=[
            input_files, sample_interval_ms,
            # Person Detect
            enable_person_detection, min_target_width_person_percentage_slider,
            person_model_name_dd, person_conf_threshold, person_iou_threshold,
            # HalfBody Detect
            enable_halfbody_detection, enable_halfbody_cropping, min_target_width_halfbody_percentage_slider,
            halfbody_model_name_dd, halfbody_conf_threshold, halfbody_iou_threshold,
            # Head Detect
            enable_head_detection, enable_head_cropping, min_crop_width_head_percentage_slider,
            enable_head_filtering, head_model_name_dd, head_conf_threshold, head_iou_threshold,
            # Face Detect
            enable_face_detection, enable_face_cropping, min_crop_width_face_percentage_slider,
            enable_face_filtering, face_model_name_dd, face_conf_threshold, face_iou_threshold,
            # CCIP
            enable_ccip_classification, ccip_model_name_dd, ccip_threshold_slider,
            # LPIPS
            enable_lpips_clustering, lpips_threshold_slider,
            # Aesthetic
            enable_aesthetic_analysis, aesthetic_model_name_dd,
        ],
        outputs=[output_zips, status_text]
    )

# --- Launch Script ---
if __name__ == "__main__":
    print("Starting Gradio App...")
    # Model pre-check
    try:
        print("Checking/Downloading models (this might take a moment)...")
        # Use simple, small images for checks
        dummy_img_pil = Image.new('RGB', (64, 64), color='orange')
        print(" - Person detection...")
        _ = person_detector.detect_person(dummy_img_pil, model_name=person_models[0])
        print(" - HalfBody detection...")
        _ = halfbody_detector.detect_halfbody(dummy_img_pil, model_name=halfbody_models[0])
        print(" - Head detection...")
        _ = head_detector.detect_heads(dummy_img_pil, model_name=head_models[0])
        print(" - Face detection...")
        _ = face_detector.detect_faces(dummy_img_pil, model_name=face_models[0])
        print(" - CCIP feature extraction...")
        _ = ccip_analyzer.ccip_extract_feature(dummy_img_pil, size=384, model=ccip_models[0])
        print(" - LPIPS feature extraction...")
        _ = lpips_module.lpips_extract_feature(dummy_img_pil)
        print(" - Aesthetic analysis...")
        _ = dbaesthetic_analyzer.anime_dbaesthetic(dummy_img_pil, model_name=aesthetic_models[0])
        print("Models seem ready or downloaded.")
        del dummy_img_pil
        gc.collect()
    except Exception as model_err:
        print("\n--- !!! WARNING !!! ---")
        print(f"Could not pre-check/download all models: {model_err}")
        print("Models will be downloaded when first used by the application, which may cause a delay on the first run.")
        print("Check your internet connection and library installation (pip install \"dghs-imgutils[gpu]\").")
        print("-----------------------\n")
    # Launch the app
    demo.launch(inbrowser=True)
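The interface text above summarizes the detection flow as Person ⇒ Half-Body (if no person) ⇒ Face/Head on each target. As an editor-added illustration only (not the app's `_process_input_source_frames`), the following minimal sketch shows that fallback using the module-level `detect_person` / `detect_halfbody` helpers documented for dghs-imgutils; the import path, default thresholds, and the `(bbox, label, score)` result format are assumptions based on that documentation.

# Hypothetical sketch of the "Person => Half-Body (if no person)" target selection.
# `frame` stands in for one sampled video frame; not part of app.py.
from PIL import Image
from imgutils.detect import detect_person, detect_halfbody

def pick_targets(frame: Image.Image, min_width_pct: float = 0.25) -> list:
    """Crop person boxes if any are wide enough; otherwise fall back to half-body boxes."""
    min_width = frame.width * min_width_pct
    person_boxes = [box for box, _, _ in detect_person(frame)
                    if (box[2] - box[0]) >= min_width]
    if person_boxes:
        return [frame.crop(box) for box in person_boxes]
    halfbody_boxes = [box for box, _, _ in detect_halfbody(frame)
                      if (box[2] - box[0]) >= min_width]
    return [frame.crop(box) for box in halfbody_boxes]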
requirements.txt
ADDED
@@ -0,0 +1,2 @@
gradio==5.29.0
dghs-imgutils[gpu]
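As a quick, editor-added sanity check after installing these two pins (illustrative only; the `imgutils.detect` import path follows the dghs-imgutils documentation and is an assumption here), something like the following should print the Gradio version and trigger the first detection-model download:

# Illustrative install check; not part of the repository.
from PIL import Image
import gradio
from imgutils.detect import detect_person

print(gradio.__version__)                          # expected: 5.29.0
print(detect_person(Image.new('RGB', (64, 64))))   # downloads a model on first call; likely prints []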
webui.bat
ADDED
@@ -0,0 +1,73 @@
@echo off

:: The source of the webui.bat file is stable-diffusion-webui
:: set COMMANDLINE_ARGS=--whisper_implementation faster-whisper --input_audio_max_duration -1 --default_model_name large-v2 --auto_parallel True --output_dir output --vad_max_merge_size 90 --save_downloaded_files --autolaunch

if not defined PYTHON (set PYTHON=python)
if not defined VENV_DIR (set "VENV_DIR=%~dp0%venv")

mkdir tmp 2>NUL

%PYTHON% -c "" >tmp/stdout.txt 2>tmp/stderr.txt
if %ERRORLEVEL% == 0 goto :check_pip
echo Couldn't launch python
goto :show_stdout_stderr

:check_pip
%PYTHON% -mpip --help >tmp/stdout.txt 2>tmp/stderr.txt
if %ERRORLEVEL% == 0 goto :start_venv
if "%PIP_INSTALLER_LOCATION%" == "" goto :show_stdout_stderr
%PYTHON% "%PIP_INSTALLER_LOCATION%" >tmp/stdout.txt 2>tmp/stderr.txt
if %ERRORLEVEL% == 0 goto :start_venv
echo Couldn't install pip
goto :show_stdout_stderr

:start_venv
if ["%VENV_DIR%"] == ["-"] goto :skip_venv
if ["%SKIP_VENV%"] == ["1"] goto :skip_venv

dir "%VENV_DIR%\Scripts\Python.exe" >tmp/stdout.txt 2>tmp/stderr.txt
if %ERRORLEVEL% == 0 goto :activate_venv

for /f "delims=" %%i in ('CALL %PYTHON% -c "import sys; print(sys.executable)"') do set PYTHON_FULLNAME="%%i"
echo Creating venv in directory %VENV_DIR% using python %PYTHON_FULLNAME%
%PYTHON_FULLNAME% -m venv "%VENV_DIR%" >tmp/stdout.txt 2>tmp/stderr.txt
if %ERRORLEVEL% == 0 goto :activate_venv
echo Unable to create venv in directory "%VENV_DIR%"
goto :show_stdout_stderr

:activate_venv
set PYTHON="%VENV_DIR%\Scripts\Python.exe"
echo venv %PYTHON%

:skip_venv
goto :launch

:launch
%PYTHON% app.py %COMMANDLINE_ARGS% %*
pause
exit /b

:show_stdout_stderr

echo.
echo exit code: %errorlevel%

for /f %%i in ("tmp\stdout.txt") do set size=%%~zi
if %size% equ 0 goto :show_stderr
echo.
echo stdout:
type tmp\stdout.txt

:show_stderr
for /f %%i in ("tmp\stderr.txt") do set size=%%~zi
if %size% equ 0 goto :endofscript
echo.
echo stderr:
type tmp\stderr.txt

:endofscript

echo.
echo Launch unsuccessful. Exiting.
pause