shreyajn committed
Commit ba1940d · verified · 1 Parent(s): cbdc1ac

Upload README.md with huggingface_hub

Files changed (1):
  README.md +15 -32
README.md CHANGED
@@ -39,8 +39,8 @@ More details on model performance across various devices, can be found
 
 | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Precision | Primary Compute Unit | Target Model
 | ---|---|---|---|---|---|---|---|
-| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | TFLite | 0.845 ms | 0 - 2 MB | FP16 | NPU | [MobileNet-v3-Small.tflite](https://huggingface.co/qualcomm/MobileNet-v3-Small/blob/main/MobileNet-v3-Small.tflite)
-| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 0.87 ms | 0 - 159 MB | FP16 | NPU | [MobileNet-v3-Small.so](https://huggingface.co/qualcomm/MobileNet-v3-Small/blob/main/MobileNet-v3-Small.so)
+| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | TFLite | 0.844 ms | 0 - 1 MB | FP16 | NPU | [MobileNet-v3-Small.tflite](https://huggingface.co/qualcomm/MobileNet-v3-Small/blob/main/MobileNet-v3-Small.tflite)
+| Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 0.882 ms | 0 - 138 MB | FP16 | NPU | [MobileNet-v3-Small.so](https://huggingface.co/qualcomm/MobileNet-v3-Small/blob/main/MobileNet-v3-Small.so)
 
 
 
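The Target Model column links the compiled artifacts directly. As an aside (not part of this commit), the TFLite artifact can be smoke-tested locally; a minimal sketch, assuming `tensorflow` and `huggingface_hub` are installed, with the repo id and filename taken from the links above:

```python
# Minimal local smoke test of the TFLite artifact linked in the table above.
import numpy as np
import tensorflow as tf
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="qualcomm/MobileNet-v3-Small",
    filename="MobileNet-v3-Small.tflite",
)

interpreter = tf.lite.Interpreter(model_path=path)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Random data in the model's declared input shape/dtype; a real check would
# feed a preprocessed image and compare against the PyTorch model's output.
x = np.random.random_sample(tuple(inp["shape"])).astype(inp["dtype"])
interpreter.set_tensor(inp["index"], x)
interpreter.invoke()
print(interpreter.get_tensor(out["index"]).shape)
```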
@@ -101,9 +101,9 @@ python -m qai_hub_models.models.mobilenet_v3_small.export
 ```
 Profile Job summary of MobileNet-v3-Small
 --------------------------------------------------
-Device: SA8255 (Proxy) (13)
-Estimated Inference Time: 0.87 ms
-Estimated Peak Memory Range: 0.02-169.88 MB
+Device: Snapdragon X Elite CRD (11)
+Estimated Inference Time: 1.13 ms
+Estimated Peak Memory Range: 0.57-0.57 MB
 Compute Units: NPU (126) | Total (126)
 
 
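Both the old and new summaries are printed by the export entry point named in the hunk header. Purely as an illustration (not from the README), that run can be scripted to capture the summary text; this assumes `qai_hub_models` is installed and AI Hub credentials are configured, and deliberately passes no extra CLI flags beyond the command shown in the diff:

```python
# Sketch: invoke the export module from the hunk header and capture its output.
import subprocess

result = subprocess.run(
    ["python", "-m", "qai_hub_models.models.mobilenet_v3_small.export"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout)  # includes the "Profile Job summary of MobileNet-v3-Small" block
```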
@@ -125,29 +125,13 @@ in memory using the `jit.trace` and then call the `submit_compile_job` API.
 import torch
 
 import qai_hub as hub
-from qai_hub_models.models.mobilenet_v3_small import Model
+from qai_hub_models.models.mobilenet_v3_small import
 
 # Load the model
-torch_model = Model.from_pretrained()
 
 # Device
 device = hub.Device("Samsung Galaxy S23")
 
-# Trace model
-input_shape = torch_model.get_input_spec()
-sample_inputs = torch_model.sample_inputs()
-
-pt_model = torch.jit.trace(torch_model, [torch.tensor(data[0]) for _, data in sample_inputs.items()])
-
-# Compile model on a specific device
-compile_job = hub.submit_compile_job(
-    model=pt_model,
-    device=device,
-    input_specs=torch_model.get_input_spec(),
-)
-
-# Get target model to run on-device
-target_model = compile_job.get_target_model()
 
 ```
 
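Note what this hunk does to the README: the new version truncates the import (`from qai_hub_models.models.mobilenet_v3_small import` with no names after it) and drops the model load, trace, and compile steps, leaving the snippet incomplete. For reference, a sketch of the full flow reassembled from the removed lines above (only the unused `input_shape` assignment is dropped):

```python
import torch

import qai_hub as hub
from qai_hub_models.models.mobilenet_v3_small import Model

# Load the pretrained PyTorch model
torch_model = Model.from_pretrained()

# Cloud-hosted device to compile for
device = hub.Device("Samsung Galaxy S23")

# Trace the eager module into a TorchScript graph using its sample inputs
sample_inputs = torch_model.sample_inputs()
pt_model = torch.jit.trace(
    torch_model, [torch.tensor(data[0]) for _, data in sample_inputs.items()]
)

# Compile the traced model for the chosen device
compile_job = hub.submit_compile_job(
    model=pt_model,
    device=device,
    input_specs=torch_model.get_input_spec(),
)

# Get the target model to run on-device
target_model = compile_job.get_target_model()
```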
@@ -160,10 +144,10 @@ provisioned in the cloud. Once the job is submitted, you can navigate to a
 provided job URL to view a variety of on-device performance metrics.
 ```python
 profile_job = hub.submit_profile_job(
-    model=target_model,
-    device=device,
-)
-
+    model=target_model,
+    device=device,
+)
+
 ```
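The profiling snippet itself is unchanged apart from whitespace. As a hedged aside (not in the README), a submitted job can also be polled from Python instead of the job URL; `wait()` is part of the qai_hub job API, while `download_profile()` returning a metrics dict is an assumption here:

```python
import qai_hub as hub

# Assumes `target_model` and `device` from the compile step above.
profile_job = hub.submit_profile_job(
    model=target_model,
    device=device,
)
status = profile_job.wait()               # blocks until the on-device run finishes
print(status)
profile = profile_job.download_profile()  # assumption: metrics as a Python dict
```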
 
 Step 3: **Verify on-device accuracy**
@@ -173,12 +157,11 @@ on sample input data on the same cloud hosted device.
 ```python
 input_data = torch_model.sample_inputs()
 inference_job = hub.submit_inference_job(
-    model=target_model,
-    device=device,
-    inputs=input_data,
-)
-
-on_device_output = inference_job.download_output_data()
+    model=target_model,
+    device=device,
+    inputs=input_data,
+)
+on_device_output = inference_job.download_output_data()
 
 ```
 With the output of the model, you can compute like PSNR, relative errors or
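The trailing context line is cut off at the hunk boundary. The comparison it gestures at, PSNR between the on-device output and the local PyTorch output, can be sketched as follows; `torch_model`, `input_data`, and `on_device_output` come from the steps above, and the tensor names "image" and "output_0" are hypothetical placeholders, not taken from the README:

```python
import numpy as np
import torch

def psnr(a: np.ndarray, b: np.ndarray) -> float:
    # 10 * log10(peak^2 / MSE), in dB; higher means closer agreement.
    mse = float(np.mean((a - b) ** 2))
    peak = float(np.abs(b).max())
    return 10.0 * np.log10((peak * peak) / (mse + 1e-12))

# "output_0" / "image" are hypothetical tensor names for this sketch.
device_out = on_device_output["output_0"][0]
torch_out = torch_model(torch.from_numpy(input_data["image"][0])).detach().numpy()
print(f"PSNR: {psnr(device_out, torch_out):.2f} dB")
```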