v0.31.0
See https://github.com/quic/ai-hub-models/releases/v0.31.0 for changelog.
README.md
CHANGED
@@ -38,18 +38,18 @@ More details on model performance across various devices, can be found
 
 | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model
 |---|---|---|---|---|---|---|---|---|
-| TextEncoder_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile |
-| TextEncoder_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile |
-| TextEncoder_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) |
-| UNet_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile |
-| UNet_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile |
-| UNet_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) |
-| VAEDecoder_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile |
-| VAEDecoder_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile |
-| VAEDecoder_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) |
-| ControlNet_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile |
-| ControlNet_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile |
-| ControlNet_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) |
+| TextEncoder_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_DLC | 10.874 ms | 0 - 3 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| TextEncoder_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_DLC | 7.918 ms | 0 - 18 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| TextEncoder_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_DLC | 10.875 ms | 0 - 3 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| UNet_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_DLC | 258.151 ms | 13 - 15 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| UNet_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_DLC | 197.629 ms | 13 - 31 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| UNet_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_DLC | 256.936 ms | 13 - 16 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| VAEDecoder_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_DLC | 397.625 ms | 0 - 2 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| VAEDecoder_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_DLC | 300.627 ms | 0 - 21 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| VAEDecoder_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_DLC | 395.006 ms | 0 - 3 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| ControlNet_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_DLC | 104.668 ms | 2 - 9 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| ControlNet_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_DLC | 77.289 ms | 2 - 23 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
+| ControlNet_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_DLC | 103.817 ms | 2 - 5 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
 
 
 
@@ -112,7 +112,7 @@ Profiling Results
 ------------------------------------------------------------
 TextEncoder_Quantized
 Device : cs_8_gen_2 (ANDROID 13)
-Runtime :
+Runtime : QNN_DLC
 Estimated inference time (ms) : 10.9
 Estimated peak memory usage (MB): [0, 3]
 Total # Ops : 569
@@ -121,7 +121,7 @@ Compute Unit(s) : npu (569 ops) gpu (0 ops) cpu (0 ops)
 ------------------------------------------------------------
 UNet_Quantized
 Device : cs_8_gen_2 (ANDROID 13)
-Runtime :
+Runtime : QNN_DLC
 Estimated inference time (ms) : 258.2
 Estimated peak memory usage (MB): [13, 15]
 Total # Ops : 5433
@@ -130,7 +130,7 @@ Compute Unit(s) : npu (5433 ops) gpu (0 ops) cpu (0 ops)
 ------------------------------------------------------------
 VAEDecoder_Quantized
 Device : cs_8_gen_2 (ANDROID 13)
-Runtime :
+Runtime : QNN_DLC
 Estimated inference time (ms) : 397.6
 Estimated peak memory usage (MB): [0, 2]
 Total # Ops : 408
@@ -139,7 +139,7 @@ Compute Unit(s) : npu (408 ops) gpu (0 ops) cpu (0 ops)
 ------------------------------------------------------------
 ControlNet_Quantized
 Device : cs_8_gen_2 (ANDROID 13)
-Runtime :
+Runtime : QNN_DLC
 Estimated inference time (ms) : 104.7
 Estimated peak memory usage (MB): [2, 9]
 Total # Ops : 2405
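
The profiling summary blocks touched by this change are plain `Key : value` text, so the newly filled-in `Runtime` field can be read back programmatically. The following is a minimal sketch, not part of the repository: `parse_profile_block` and `SAMPLE` are hypothetical names, and the parser assumes that every field sits on its own line, that the first colon separates key from value, and that a line without a colon is the model name.

```python
def parse_profile_block(text: str) -> dict[str, str]:
    """Parse one profiling summary block into a dict of string fields.

    Hypothetical helper; assumes the 'Key : value' layout shown in the README.
    """
    fields: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and the dashed separator rows.
        if not line or set(line) == {"-"}:
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
        else:
            # A line with no colon is taken to be the model name.
            fields.setdefault("Model", line)
    return fields


# Sample copied from the TextEncoder_Quantized block after this change.
SAMPLE = """\
------------------------------------------------------------
TextEncoder_Quantized
Device : cs_8_gen_2 (ANDROID 13)
Runtime : QNN_DLC
Estimated inference time (ms) : 10.9
Estimated peak memory usage (MB): [0, 3]
Total # Ops : 569
"""

profile = parse_profile_block(SAMPLE)
print(profile["Model"], profile["Runtime"])
```

Applied to the pre-change block, the same helper would return an empty string for `Runtime`, which is the gap this commit fills.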