qaihm-bot committed
Commit cd90961 · verified · 1 Parent(s): 74fec24

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +8 -80
README.md CHANGED
@@ -33,10 +33,13 @@ More details on model performance across various devices, can be found
  - Model size: 6.04 MB
 
 
  | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Precision | Primary Compute Unit | Target Model
  | ---|---|---|---|---|---|---|---|
- | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | TFLite | 3.331 ms | 0 - 3 MB | INT8 | NPU | [DeepLabV3-Plus-MobileNet-Quantized.tflite](https://huggingface.co/qualcomm/DeepLabV3-Plus-MobileNet-Quantized/blob/main/DeepLabV3-Plus-MobileNet-Quantized.tflite)
- | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 5.345 ms | 0 - 45 MB | INT8 | NPU | [DeepLabV3-Plus-MobileNet-Quantized.so](https://huggingface.co/qualcomm/DeepLabV3-Plus-MobileNet-Quantized/blob/main/DeepLabV3-Plus-MobileNet-Quantized.so)
 
 
  ## Installation
@@ -98,89 +101,14 @@ python -m qai_hub_models.models.deeplabv3_plus_mobilenet_quantized.export
  Profile Job summary of DeepLabV3-Plus-MobileNet-Quantized
  --------------------------------------------------
  Device: Snapdragon X Elite CRD (11)
- Estimated Inference Time: 5.38 ms
- Estimated Peak Memory Range: 0.75-0.75 MB
  Compute Units: NPU (100) | Total (100)
 
 
  ```
- ## How does this work?
-
- This [export script](https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/DeepLabV3-Plus-MobileNet-Quantized/export.py)
- leverages [Qualcomm® AI Hub](https://aihub.qualcomm.com/) to optimize, validate, and deploy this model
- on-device. Lets go through each step below in detail:
-
- Step 1: **Compile model for on-device deployment**
-
- To compile a PyTorch model for on-device deployment, we first trace the model
- in memory using the `jit.trace` and then call the `submit_compile_job` API.
-
- ```python
- import torch
-
- import qai_hub as hub
- from qai_hub_models.models.deeplabv3_plus_mobilenet_quantized import Model
-
- # Load the model
- torch_model = Model.from_pretrained()
- torch_model.eval()
-
- # Device
- device = hub.Device("Samsung Galaxy S23")
-
- # Trace model
- input_shape = torch_model.get_input_spec()
- sample_inputs = torch_model.sample_inputs()
-
- pt_model = torch.jit.trace(torch_model, [torch.tensor(data[0]) for _, data in sample_inputs.items()])
-
- # Compile model on a specific device
- compile_job = hub.submit_compile_job(
-     model=pt_model,
-     device=device,
-     input_specs=torch_model.get_input_spec(),
- )
 
- # Get target model to run on-device
- target_model = compile_job.get_target_model()
-
- ```
-
-
- Step 2: **Performance profiling on cloud-hosted device**
-
- After compiling models from step 1. Models can be profiled model on-device using the
- `target_model`. Note that this scripts runs the model on a device automatically
- provisioned in the cloud. Once the job is submitted, you can navigate to a
- provided job URL to view a variety of on-device performance metrics.
- ```python
- profile_job = hub.submit_profile_job(
-     model=target_model,
-     device=device,
- )
-
- ```
-
- Step 3: **Verify on-device accuracy**
-
- To verify the accuracy of the model on-device, you can run on-device inference
- on sample input data on the same cloud hosted device.
- ```python
- input_data = torch_model.sample_inputs()
- inference_job = hub.submit_inference_job(
-     model=target_model,
-     device=device,
-     inputs=input_data,
- )
-
- on_device_output = inference_job.download_output_data()
-
- ```
- With the output of the model, you can compute like PSNR, relative errors or
- spot check the output with expected output.
 
- **Note**: This on-device profiling and inference requires access to Qualcomm®
- AI Hub. [Sign up for access](https://myaccount.qualcomm.com/signup).
 
 
  ## Run demo on a cloud-hosted device
@@ -219,7 +147,7 @@ Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)
  ## License
  - The license for the original implementation of DeepLabV3-Plus-MobileNet-Quantized can be found
  [here](https://github.com/quic/aimet-model-zoo/blob/develop/LICENSE.pdf).
- - The license for the compiled assets for on-device deployment can be found [here]({deploy_license_url})
 
  ## References
  * [Rethinking Atrous Convolution for Semantic Image Segmentation](https://arxiv.org/abs/1706.05587)
 
  - Model size: 6.04 MB
 
 
+
+
  | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Precision | Primary Compute Unit | Target Model
  | ---|---|---|---|---|---|---|---|
+ | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | TFLite | 3.596 ms | 0 - 2 MB | INT8 | NPU | [DeepLabV3-Plus-MobileNet-Quantized.tflite](https://huggingface.co/qualcomm/DeepLabV3-Plus-MobileNet-Quantized/blob/main/DeepLabV3-Plus-MobileNet-Quantized.tflite)
+ | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 5.322 ms | 1 - 7 MB | INT8 | NPU | [DeepLabV3-Plus-MobileNet-Quantized.so](https://huggingface.co/qualcomm/DeepLabV3-Plus-MobileNet-Quantized/blob/main/DeepLabV3-Plus-MobileNet-Quantized.so)
+
 
 
  ## Installation
 
  Profile Job summary of DeepLabV3-Plus-MobileNet-Quantized
  --------------------------------------------------
  Device: Snapdragon X Elite CRD (11)
+ Estimated Inference Time: 5.24 ms
+ Estimated Peak Memory Range: 0.76-0.76 MB
  Compute Units: NPU (100) | Total (100)
 
 
  ```
 
  ## Run demo on a cloud-hosted device
 
  ## License
  - The license for the original implementation of DeepLabV3-Plus-MobileNet-Quantized can be found
  [here](https://github.com/quic/aimet-model-zoo/blob/develop/LICENSE.pdf).
+ - The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf)
 
  ## References
  * [Rethinking Atrous Convolution for Semantic Image Segmentation](https://arxiv.org/abs/1706.05587)