qaihm-bot committed on
Commit e9c0792 · verified · 1 Parent(s): 7f29e2b

See https://github.com/quic/ai-hub-models/releases/v0.31.0 for changelog.

Files changed (1)
  1. README.md +16 -16
README.md CHANGED
@@ -38,18 +38,18 @@ More details on model performance across various devices, can be found
38
 
39
  | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model
40
  |---|---|---|---|---|---|---|---|---|
41
- | TextEncoder_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 10.874 ms | 0 - 3 MB | NPU | Use Export Script |
42
- | TextEncoder_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 7.918 ms | 0 - 18 MB | NPU | Use Export Script |
43
- | TextEncoder_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 10.875 ms | 0 - 3 MB | NPU | Use Export Script |
44
- | UNet_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 258.151 ms | 13 - 15 MB | NPU | Use Export Script |
45
- | UNet_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 197.629 ms | 13 - 31 MB | NPU | Use Export Script |
46
- | UNet_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 256.936 ms | 13 - 16 MB | NPU | Use Export Script |
47
- | VAEDecoder_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 397.625 ms | 0 - 2 MB | NPU | Use Export Script |
48
- | VAEDecoder_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 300.627 ms | 0 - 21 MB | NPU | Use Export Script |
49
- | VAEDecoder_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 395.006 ms | 0 - 3 MB | NPU | Use Export Script |
50
- | ControlNet_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN | 104.668 ms | 2 - 9 MB | NPU | Use Export Script |
51
- | ControlNet_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN | 77.289 ms | 2 - 23 MB | NPU | Use Export Script |
52
- | ControlNet_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN | 103.817 ms | 2 - 5 MB | NPU | Use Export Script |
53
 
54
 
55
 
@@ -112,7 +112,7 @@ Profiling Results
112
  ------------------------------------------------------------
113
  TextEncoder_Quantized
114
  Device : cs_8_gen_2 (ANDROID 13)
115
- Runtime : QNN
116
  Estimated inference time (ms) : 10.9
117
  Estimated peak memory usage (MB): [0, 3]
118
  Total # Ops : 569
@@ -121,7 +121,7 @@ Compute Unit(s) : npu (569 ops) gpu (0 ops) cpu (0 ops)
121
  ------------------------------------------------------------
122
  UNet_Quantized
123
  Device : cs_8_gen_2 (ANDROID 13)
124
- Runtime : QNN
125
  Estimated inference time (ms) : 258.2
126
  Estimated peak memory usage (MB): [13, 15]
127
  Total # Ops : 5433
@@ -130,7 +130,7 @@ Compute Unit(s) : npu (5433 ops) gpu (0 ops) cpu (0 ops)
130
  ------------------------------------------------------------
131
  VAEDecoder_Quantized
132
  Device : cs_8_gen_2 (ANDROID 13)
133
- Runtime : QNN
134
  Estimated inference time (ms) : 397.6
135
  Estimated peak memory usage (MB): [0, 2]
136
  Total # Ops : 408
@@ -139,7 +139,7 @@ Compute Unit(s) : npu (408 ops) gpu (0 ops) cpu (0 ops)
139
  ------------------------------------------------------------
140
  ControlNet_Quantized
141
  Device : cs_8_gen_2 (ANDROID 13)
142
- Runtime : QNN
143
  Estimated inference time (ms) : 104.7
144
  Estimated peak memory usage (MB): [2, 9]
145
  Total # Ops : 2405
 
38
 
39
  | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model
40
  |---|---|---|---|---|---|---|---|---|
41
+ | TextEncoder_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_DLC | 10.874 ms | 0 - 3 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
42
+ | TextEncoder_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_DLC | 7.918 ms | 0 - 18 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
43
+ | TextEncoder_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_DLC | 10.875 ms | 0 - 3 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
44
+ | UNet_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_DLC | 258.151 ms | 13 - 15 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
45
+ | UNet_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_DLC | 197.629 ms | 13 - 31 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
46
+ | UNet_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_DLC | 256.936 ms | 13 - 16 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
47
+ | VAEDecoder_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_DLC | 397.625 ms | 0 - 2 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
48
+ | VAEDecoder_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_DLC | 300.627 ms | 0 - 21 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
49
+ | VAEDecoder_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_DLC | 395.006 ms | 0 - 3 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
50
+ | ControlNet_Quantized | w8a16 | Samsung Galaxy S23 | Snapdragon® 8 Gen 2 Mobile | QNN_DLC | 104.668 ms | 2 - 9 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
51
+ | ControlNet_Quantized | w8a16 | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | QNN_DLC | 77.289 ms | 2 - 23 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
52
+ | ControlNet_Quantized | w8a16 | QCS8550 (Proxy) | Qualcomm® QCS8550 (Proxy) | QNN_DLC | 103.817 ms | 2 - 5 MB | NPU | [ControlNet.dlc](https://huggingface.co/qualcomm/ControlNet/blob/main/ControlNet_w8a16.dlc) |
53
 
54
 
55
 
 
112
  ------------------------------------------------------------
113
  TextEncoder_Quantized
114
  Device : cs_8_gen_2 (ANDROID 13)
115
+ Runtime : QNN_DLC
116
  Estimated inference time (ms) : 10.9
117
  Estimated peak memory usage (MB): [0, 3]
118
  Total # Ops : 569
 
121
  ------------------------------------------------------------
122
  UNet_Quantized
123
  Device : cs_8_gen_2 (ANDROID 13)
124
+ Runtime : QNN_DLC
125
  Estimated inference time (ms) : 258.2
126
  Estimated peak memory usage (MB): [13, 15]
127
  Total # Ops : 5433
 
130
  ------------------------------------------------------------
131
  VAEDecoder_Quantized
132
  Device : cs_8_gen_2 (ANDROID 13)
133
+ Runtime : QNN_DLC
134
  Estimated inference time (ms) : 397.6
135
  Estimated peak memory usage (MB): [0, 2]
136
  Total # Ops : 408
 
139
  ------------------------------------------------------------
140
  ControlNet_Quantized
141
  Device : cs_8_gen_2 (ANDROID 13)
142
+ Runtime : QNN_DLC
143
  Estimated inference time (ms) : 104.7
144
  Estimated peak memory usage (MB): [2, 9]
145
  Total # Ops : 2405