---
license: other
license_name: sla0044
license_link: >-
  https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/LICENSE.md
pipeline_tag: image-classification
---
# ResNet v1

## **Use case** : `Image classification`

# Model description

ResNet models perform image classification: they take an image as input and assign the main object in it to one of a
set of predefined classes. ResNet models offer high accuracy at an affordable model size, which makes them a good fit when classification accuracy is the priority.
ResNet models are built from residual blocks, which were introduced to counter the accuracy degradation that appears as networks get deeper and their early layers stop learning effectively.
ResNet v1 uses post-activation in its residual blocks. The models below use the ResNet v1 architecture with 8 and 32 layers.
(source: https://keras.io/api/applications/resnet/)
The models are quantized to int8 using the TensorFlow Lite converter.
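
To make "post-activation" concrete, here is a minimal sketch in plain Python, not the model zoo's implementation. It assumes an identity shortcut and uses a hypothetical `toy_branch` in place of the real convolution stack; the point is only that the ReLU is applied after the shortcut addition.

```python
def relu(values):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, v) for v in values]


def residual_block_v1(x, branch):
    """Post-activation residual block: add the shortcut, then apply ReLU."""
    fx = branch(x)
    if len(fx) != len(x):
        raise ValueError("an identity shortcut needs matching shapes")
    return relu([a + b for a, b in zip(x, fx)])


def toy_branch(x):
    """Hypothetical residual branch, for illustration only."""
    return [-2.0 * v for v in x]


out = residual_block_v1([1.0, -3.0, 0.5], toy_branch)
print(out)
```

Because the block only has to learn the correction `branch(x)` on top of `x`, a branch that outputs zeros leaves the (rectified) input unchanged, which is what makes deeper stacks easier to train.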

In addition, we introduce a new model family inspired by ResNet v1 that takes advantage of hybrid quantization.
These models are referred to below as ST ResNet 8 Hybrid v1 and ST ResNet 8 Hybrid v2.
By hybrid quantization, we mean that, wherever possible, the weights and/or activations of some network layers are quantized to fewer than 8 bits.
We used the Larq library to define and train these models. In our topology, some layers/activations are kept in 8 bits while others are binary.
Because this quantization is performed during training (Quantization Aware Training), these networks do not need to be converted with TensorFlow Lite.
STM32Cube.AI can import them directly in .h5 format and generate the corresponding optimized firmware code.
Although many layers are binary, these models reach accuracy comparable to the full 8-bit ResNet v1 8. In the STM32H7 measurements reported in this document, ST ResNet 8 Hybrid v2 has a lower inference time than the int8 model (25.2 ms versus 28.67 ms), while ST ResNet 8 Hybrid v1 is on par (28.93 ms).
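
The sketch below illustrates why binary layers can be cheap to execute. It shows the generic technique used by binarized networks (sign binarization, then a dot product computed with XNOR and a bit count); it is an assumption that this is representative, and it does not describe how Larq or STM32Cube.AI implement these layers internally. All function names are hypothetical.

```python
def binarize(values):
    """Sign binarization: map each value to +1 or -1 (zero maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack +1/-1 values into an integer, one bit per value (+1 -> bit set)."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n via XNOR + popcount."""
    mask = (1 << n) - 1
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")
    # Each matching position contributes +1, each mismatch contributes -1.
    return 2 * matches - n


weights = binarize([0.7, -0.2, 0.1, -1.3])
acts = binarize([0.5, 0.4, -0.9, -0.1])
reference = sum(w * a for w, a in zip(weights, acts))
fast = binary_dot(pack_bits(weights), pack_bits(acts), len(weights))
print(reference, fast)
```

Both paths give the same result, but the packed version replaces one multiply-accumulate per element with a few bitwise operations per machine word.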


## Network information

| Network Information     | Value                                                                   |
|-------------------------|-------------------------------------------------------------------------|
|  Framework              | TensorFlow Lite                                                         |
|  Quantization           | int8                                                                    |
|  Provenance             | https://www.tensorflow.org/api_docs/python/tf/keras/applications/resnet |
|  Paper                  | https://arxiv.org/abs/1512.03385                                        |

The models are quantized using the TensorFlow Lite converter.
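
As a rough illustration of what int8 quantization means for a single value, the sketch below applies the affine mapping `real = scale * (q - zero_point)` used by TensorFlow Lite. The `scale` and `zero_point` values are made up for the example; in practice the converter derives them from calibration data.

```python
def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a real value to int8 with an affine scale/zero-point, with clipping."""
    # Python's round() breaks ties to even; a real converter may round differently.
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Recover the approximate real value from its int8 code."""
    return scale * (q - zero_point)


scale, zero_point = 0.5, -4  # illustrative values, not taken from the models
q = quantize(1.3, scale, zero_point)
x_hat = dequantize(q, scale, zero_point)
print(q, x_hat)
```

The round trip returns 1.5 instead of 1.3: the error is bounded by half the scale for values inside the representable range, and values outside it are clipped.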

## Network inputs / outputs

For an image resolution of NxM and P classes:

| Input Shape    | Description                                                 |
|----------------|-------------------------------------------------------------|
| (1, N, M, 3)   | Single NxM RGB image with UINT8 values between 0 and 255    |

| Output Shape   | Description                                                 |
|----------------|-------------------------------------------------------------|
| (1, P)         | Per-class confidence for P classes in FLOAT32               |
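
A minimal sketch of how the `(1, P)` output can be turned into a top-1 prediction. The confidence values and class names below are hypothetical and only serve to show the indexing.

```python
def top1(confidences, class_names):
    """Return the (label, confidence) pair with the highest confidence."""
    if len(confidences) != len(class_names):
        raise ValueError("expected one confidence per class")
    best = max(range(len(confidences)), key=confidences.__getitem__)
    return class_names[best], confidences[best]


output = [[0.05, 0.80, 0.15]]      # shape (1, P) with P = 3, made-up values
labels = ["cat", "dog", "ship"]    # hypothetical class names
label, score = top1(output[0], labels)
print(label, score)
```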

## Recommended Platforms

| Platform | Supported | Optimized |
|----------|-----------|-----------|
| STM32L0  | []        | []        |
| STM32L4  | [x]       | []        |
| STM32U5  | [x]       | []        |
| STM32H7  | [x]       | [x]       |
| STM32MP1 | [x]       | [x]*      |
| STM32MP2 | [x]       | []        |
| STM32N6  | [x]       | []        |

\* Only for Cifar 100 models

# Performances

## Metrics

- Measurements are made with the default STM32Cube.AI configuration, with the input / output allocated option enabled.
- `tfs` stands for "training from scratch", meaning that the model weights were randomly initialized before training.
- `tl` stands for "transfer learning", meaning that the model backbone weights were initialized from a pre-trained model, then only the last layer was unfrozen during training.
- `fft` stands for "full fine-tuning", meaning that the full model weights were initialized from a transfer learning pre-trained model, and all the layers were unfrozen during training.

### Reference **MCU** memory footprint based on Cifar 10 dataset (see Accuracy for details on dataset)

| Model                                                                                                                                 | Format | Resolution  | Series  | Activation RAM | Runtime RAM | Weights Flash | Code Flash | Total RAM | Total Flash | STM32Cube.AI version  |
|---------------------------------------------------------------------------------------------------------------------------------------|--------|-------------|---------|----------------|-------------|---------------|------------|-----------|-------------|-----------------------|
| [ResNet v1 8 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/resnet_v1_8_32_tfs/resnet_v1_8_32_tfs_int8.tflite)                      | Int8   | 32x32x3     | STM32H7 | 62.51 KiB      | 7.21 KiB    | 76.9 KiB      | 56.45 KiB  | 69.72 KiB | 133.35 KiB  | 10.0.0                 |
| [ST ResNet 8 Hybrid v1 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/st_resnet_8_hybrid_v1_32_tfs/st_resnet_8_hybrid_v1_32_tfs.h5) | Hybrid | 32x32x3     | STM32H7 | 77.84 KiB      | 18.38 KiB   | 85.79 KiB     | 61.75 KiB  | 96.22 KiB | 147.54 KiB  | 10.0.0                 |
| [ST ResNet 8 Hybrid v2 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/st_resnet_8_hybrid_v2_32_tfs/st_resnet_8_hybrid_v2_32_tfs.h5) | Hybrid | 32x32x3     | STM32H7 | 78.99 KiB      | 18.38 KiB   | 66.28 KiB     | 60.99 KiB  | 97.37 KiB | 127.27 KiB  | 10.0.0                 |
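
The totals in this table are consistent with Total RAM = Activation RAM + Runtime RAM and Total Flash = Weights Flash + Code Flash. That relation is inferred from the numbers rather than stated by the tool, so treat the following check as a sketch under that assumption. The values are copied from the table above, in KiB.

```python
# (activation_ram, runtime_ram, weights_flash, code_flash, total_ram, total_flash) in KiB
rows = {
    "ResNet v1 8 tfs": (62.51, 7.21, 76.9, 56.45, 69.72, 133.35),
    "ST ResNet 8 Hybrid v1 tfs": (77.84, 18.38, 85.79, 61.75, 96.22, 147.54),
    "ST ResNet 8 Hybrid v2 tfs": (78.99, 18.38, 66.28, 60.99, 97.37, 127.27),
}


def totals(activation_ram, runtime_ram, weights_flash, code_flash):
    """Sum the RAM and Flash components, rounded to the table's precision."""
    return (round(activation_ram + runtime_ram, 2),
            round(weights_flash + code_flash, 2))


for name, (act, run, weights, code, total_ram, total_flash) in rows.items():
    ram, flash = totals(act, run, weights, code)
    print(f"{name}: RAM {ram} KiB (table {total_ram}), Flash {flash} KiB (table {total_flash})")
```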

### Reference **MCU** inference time based on Cifar 10 dataset (see Accuracy for details on dataset)

| Model                            | Format | Resolution  | Board            | Execution Engine | Frequency    | Inference time (ms) | STM32Cube.AI version  |
|----------------------------------|--------|-------------|------------------|------------------|--------------|---------------------|-----------------------|
| [ResNet v1 8 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/resnet_v1_8_32_tfs/resnet_v1_8_32_tfs_int8.tflite) | Int8   | 32x32x3     | STM32H747I-DISCO | 1 CPU            | 400 MHz      | 28.67 ms            | 10.0.0                 |
| [ST ResNet 8 Hybrid v1 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/st_resnet_8_hybrid_v1_32_tfs/st_resnet_8_hybrid_v1_32_tfs.h5) | Hybrid | 32x32x3     | STM32H747I-DISCO | 1 CPU            | 400 MHz      | 28.93 ms            | 10.0.0                 |
| [ST ResNet 8 Hybrid v2 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/st_resnet_8_hybrid_v2_32_tfs/st_resnet_8_hybrid_v2_32_tfs.h5) | Hybrid | 32x32x3     | STM32H747I-DISCO | 1 CPU            | 400 MHz      | 25.2 ms            | 10.0.0                 |
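
To compare these latencies, it can help to convert them into CPU cycles, frames per second, and a ratio against the int8 baseline. The sketch below does that with the values from the table; it assumes back-to-back inferences and ignores image capture and pre/post-processing, so the frame rates are upper bounds.

```python
def cycles(latency_ms, freq_mhz):
    """Approximate CPU cycles spent on one inference."""
    return latency_ms * 1e-3 * freq_mhz * 1e6


def fps(latency_ms):
    """Upper bound on inferences per second when run back to back."""
    return 1000.0 / latency_ms


def speedup(baseline_ms, candidate_ms):
    """Values above 1 mean the candidate is faster than the baseline."""
    return baseline_ms / candidate_ms


baseline_ms = 28.67  # ResNet v1 8 tfs, int8, STM32H747I-DISCO at 400 MHz
for name, ms in [("ST ResNet 8 Hybrid v1 tfs", 28.93), ("ST ResNet 8 Hybrid v2 tfs", 25.2)]:
    print(f"{name}: {cycles(ms, 400) / 1e6:.2f} Mcycles, "
          f"{fps(ms):.1f} inferences/s, x{speedup(baseline_ms, ms):.3f} vs int8")
```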


### Reference **MPU** inference time based on Cifar 10 dataset (see Accuracy for details on dataset)

| Model                                                                                                                                 |  Format  | Resolution | Quantization  | Board             | Execution Engine | Frequency | Inference time (ms) | %NPU  | %GPU  | %CPU | X-LINUX-AI version |       Framework       |
|---------------------------------------------------------------------------------------------------------------------------------------|----------|------------|---------------|-------------------|------------------|-----------|---------------------|-------|-------|------|--------------------|-----------------------|
| [ResNet v1 8 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/resnet_v1_8_32_tfs/resnet_v1_8_32_tfs_int8.tflite)                      | Int8     | 32x32x3    |  per-channel**  | STM32MP257F-DK2   | NPU/GPU          | 800  MHz  | 2.02 ms             | 12.26 | 87.74 | 0    |   v5.1.0           | OpenVX                |
| [ST ResNet 8 Hybrid v1 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/st_resnet_8_hybrid_v1_32_tfs/st_resnet_8_hybrid_v1_32_tfs.h5) | Hybrid   | 32x32x3    |  per-channel**  | STM32MP257F-DK2   | NPU/GPU          | 800  MHz  | TBD ms              | 0 | 0 | 0    |   v5.1.0           | OpenVX                |
| [ST ResNet 8 Hybrid v2 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/st_resnet_8_hybrid_v2_32_tfs/st_resnet_8_hybrid_v2_32_tfs.h5) | Hybrid   | 32x32x3    |  per-channel**  | STM32MP257F-DK2   | NPU/GPU          | 800  MHz  | TBD ms            | 0 | 0 | 0    |   v5.1.0           | OpenVX                |
| [ResNet v1 8 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/resnet_v1_8_32_tfs/resnet_v1_8_32_tfs_int8.tflite)                      | Int8     | 32x32x3    |  per-channel  | STM32MP157F-DK2   | 2 CPU            | 800  MHz  | 6.50 ms             | NA    | NA    | 100  |   v5.1.0           | TensorFlowLite 2.11.0 |
| [ST ResNet 8 Hybrid v1 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/st_resnet_8_hybrid_v1_32_tfs/st_resnet_8_hybrid_v1_32_tfs.h5) | Hybrid   | 32x32x3    |  per-channel  | STM32MP157F-DK2   | 2 CPU            | 800  MHz  | TBD ms            | NA    | NA    | 100  |   v5.1.0           | TensorFlowLite 2.11.0 |
| [ST ResNet 8 Hybrid v2 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/st_resnet_8_hybrid_v2_32_tfs/st_resnet_8_hybrid_v2_32_tfs.h5) | Hybrid   | 32x32x3    |  per-channel  | STM32MP157F-DK2   | 2 CPU            | 800  MHz  | TBD ms            | NA    | NA    | 100  |   v5.1.0           | TensorFlowLite 2.11.0 |
| [ResNet v1 8 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/resnet_v1_8_32_tfs/resnet_v1_8_32_tfs_int8.tflite)                      | Int8     | 32x32x3    |  per-channel  | STM32MP135F-DK2   | 1 CPU            | 1000 MHz  | 10.77 ms            | NA    | NA    | 100  |   v5.1.0           | TensorFlowLite 2.11.0 |
| [ST ResNet 8 Hybrid v1 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/st_resnet_8_hybrid_v1_32_tfs/st_resnet_8_hybrid_v1_32_tfs.h5) | Hybrid   | 32x32x3    |  per-channel  | STM32MP135F-DK2   | 1 CPU            | 1000 MHz  | TBD ms            | NA    | NA    | 100  |   v5.1.0           | TensorFlowLite 2.11.0 |
| [ST ResNet 8 Hybrid v2 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/st_resnet_8_hybrid_v2_32_tfs/st_resnet_8_hybrid_v2_32_tfs.h5) | Hybrid   | 32x32x3    |  per-channel  | STM32MP135F-DK2   | 1 CPU            | 1000 MHz  | TBD ms            | NA    | NA    | 100  |   v5.1.0           | TensorFlowLite 2.11.0 |

\*\* **To get the most out of MP25 NPU hardware acceleration, please use per-tensor quantization.**

### Reference **MCU** memory footprint based on Cifar 100 dataset (see Accuracy for details on dataset)

| Model                                                                                                                | Format | Resolution  | Series  | Activation RAM | Runtime RAM | Weights Flash | Code Flash | Total RAM   | Total Flash |
|----------------------------------------------------------------------------------------------------------------------|--------|-------------|---------|----------------|-------------|---------------|------------|-------------|-------------|
| [ResNet v1 32 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar100/resnet_v1_32_32_tfs/resnet_v1_32_32_tfs_int8.tflite) | Int8   | 32x32x3     | STM32H7 | 45.41 KiB      | 24.98 KiB   | 464.38 KiB    | 78.65 KiB  | 70.39 KiB   | 543.03 KiB  |


### Reference **MCU** inference time based on Cifar 100 dataset (see Accuracy for details on dataset)

| Model                                                                                                                | Format | Resolution | Board            | Execution Engine | Frequency    | Inference time (ms) |
|----------------------------------------------------------------------------------------------------------------------|--------|------------|------------------|------------------|--------------|---------------------|
| [ResNet v1 32 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar100/resnet_v1_32_32_tfs/resnet_v1_32_32_tfs_int8.tflite) | Int8   | 32x32x3    | STM32H747I-DISCO | 1 CPU            | 400 MHz      | 177.7 ms            |


### Reference **MPU** inference time based on Cifar 100 dataset (see Accuracy for details on dataset)

| Model                                                                                                               |  Format  | Resolution | Quantization  | Board             | Execution Engine | Frequency | Inference time (ms) | %NPU  | %GPU  | %CPU | X-LINUX-AI version |       Framework       |
|---------------------------------------------------------------------------------------------------------------------|----------|------------|---------------|-------------------|------------------|-----------|---------------------|-------|-------|------|--------------------|-----------------------|
|[ResNet v1 32 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar100/resnet_v1_32_32_tfs/resnet_v1_32_32_tfs_int8.tflite) | Int8     | 32x32x3    |  per-channel  | STM32MP257F-DK2   | NPU/GPU          | 800  MHz  | 9.160 ms            | 14.75 | 85.25 | 0    |   v5.1.0           | OpenVX                |
|[ResNet v1 32 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar100/resnet_v1_32_32_tfs/resnet_v1_32_32_tfs_int8.tflite) | Int8     | 32x32x3    |  per-channel  | STM32MP157F-DK2   | 2 CPU            | 800  MHz  | 34.78 ms            | NA    | NA    | 100  |   v5.1.0           | TensorFlowLite 2.11.0 |
|[ResNet v1 32 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar100/resnet_v1_32_32_tfs/resnet_v1_32_32_tfs_int8.tflite) | Int8     | 32x32x3    |  per-channel  | STM32MP135F-DK2   | 1 CPU            | 1000 MHz  | 55.32 ms            | NA    | NA    | 100  |   v5.1.0           | TensorFlowLite 2.11.0 |


### Accuracy with Cifar10 dataset

Dataset details: [link](https://www.cs.toronto.edu/~kriz/cifar.html),
License: [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/), Quotation[[1]](#1), Number of classes: 10,
Number of images: 60 000

| Model                                                                                                            | Format   | Resolution  | Top 1 Accuracy |
|------------------------------------------------------------------------------------------------------------------|----------|-------------|----------------|
| [ResNet v1 8 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/resnet_v1_8_32_tfs/resnet_v1_8_32_tfs.h5)          | Float    | 32x32x3     | 87.01 %        |
| [ResNet v1 8 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/resnet_v1_8_32_tfs/resnet_v1_8_32_tfs_int8.tflite) | Int8     | 32x32x3     | 85.59 %        |
| [ST ResNet 8 Hybrid v1 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/st_resnet_8_hybrid_v1_32_tfs/st_resnet_8_hybrid_v1_32_tfs.h5)                                                                                        | Hybrid   | 32x32x3     | 86 %           |
| [ST ResNet 8 Hybrid v2 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar10/st_resnet_8_hybrid_v2_32_tfs/st_resnet_8_hybrid_v2_32_tfs.h5)                                                                                        | Hybrid   | 32x32x3     | 84.85 %        |


### Accuracy with Cifar100 dataset

Dataset details: [link](https://www.cs.toronto.edu/~kriz/cifar.html),
License: [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/), Quotation[[2]](#2), Number of classes: 100,
Number of images: 60 000

| Model                                                                                                                | Format  | Resolution | Top 1 Accuracy |
|----------------------------------------------------------------------------------------------------------------------|---------|------------|----------------|
| [ResNet v1 32 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar100/resnet_v1_32_32_tfs/resnet_v1_32_32_tfs.h5)          | Float   | 32x32x3    | 67.75 %        |
| [ResNet v1 32 tfs](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/image_classification/resnetv1/ST_pretrainedmodel_public_dataset/cifar100/resnet_v1_32_32_tfs/resnet_v1_32_32_tfs_int8.tflite) | Int8    | 32x32x3    | 66.58 %        |

## Retraining and Integration in a simple example:

Please refer to the [stm32ai-modelzoo-services GitHub repository](https://github.com/STMicroelectronics/stm32ai-modelzoo-services).


# References

<a id="1">[1]</a>
"Tf_flowers : tensorflow datasets," TensorFlow. [Online]. Available: https://www.tensorflow.org/datasets/catalog/tf_flowers.

<a id="2">[2]</a>
J, ARUN PANDIAN; GOPAL, GEETHARAMANI (2019), "Data for: Identification of Plant Leaf Diseases Using a 9-layer Deep Convolutional Neural Network", Mendeley Data, V1, doi: 10.17632/tywbtsjrjv.1

<a id="3">[3]</a>
L. Bossard, M. Guillaumin, and L. Van Gool, "Food-101 -- Mining Discriminative Components with Random Forests." European Conference on Computer Vision, 2014.