ST Yolo X quantized

Use case: `Object detection`

Model description

ST Yolo X is a real-time object detection model implemented in TensorFlow. It is an ST-optimized version of the well-known YOLOX model, quantized to int8 format using the TensorFlow Lite converter.
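To make the int8 format concrete, here is a minimal sketch of the affine mapping used by TensorFlow Lite quantization, `real_value = (int8_value - zero_point) * scale`. The `scale` and `zero_point` values below are made-up illustrative numbers, not parameters taken from this model, and the helper names are hypothetical.

```python
INT8_MIN, INT8_MAX = -128, 127


def quantize(x, scale, zero_point):
    """Map a real value to int8 with affine quantization, clamping to the int8 range."""
    q = round(x / scale) + zero_point
    return max(INT8_MIN, min(INT8_MAX, q))


def dequantize(q, scale, zero_point):
    """Recover the approximate real value from its int8 representation."""
    return (q - zero_point) * scale


# Hypothetical quantization parameters, for illustration only.
scale, zero_point = 0.5, -3

q = quantize(10.0, scale, zero_point)
x = dequantize(q, scale, zero_point)
print(q, x)
```

Values outside the representable range saturate at -128 or 127, which is one reason the quantized and float AP figures of a model can differ slightly.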

Network information

Network information	Value
Framework	TensorFlow Lite
Quantization	int8
Provenance
Paper

Network inputs / outputs

For an image resolution of NxM and NC classes

Input Shape	Description
(1, N, M, 3)	Single NxM RGB image with UINT8 values between 0 and 255

Output Shape	Description

Recommended Platforms

Platform	Supported	Recommended
STM32L0	[]	[]
STM32L4	[]	[]
STM32U5	[]	[]
STM32H7	[x]	[]
STM32MP1	[x]	[]
STM32MP2	[x]	[x]
STM32N6	[x]	[x]

Performances

Metrics

Measurements are made with the default STM32Cube.AI configuration, with the input / output allocated option enabled.

Reference NPU memory footprint based on COCO Person dataset (see Accuracy for details on dataset)

Model	Dataset	Format	Resolution	Series	Internal RAM (KiB)	Weights Flash (KiB)	STM32Cube.AI version	STEdgeAI Core version
st_yolo_x_nano	COCO-Person	Int8	192x192x3	STM32N6	297	980.38	10.2.0	2.2.0
st_yolo_x_nano	COCO-Person	Int8	256x256x3	STM32N6	560	980.31	10.2.0	2.2.0
st_yolo_x_nano	COCO-Person	Int8	256x256x3	STM32N6	971.62	2452.39	10.2.0	2.2.0
st_yolo_x_nano	COCO-Person	Int8	320x320x3	STM32N6	847.5	980.31	10.2.0	2.2.0
st_yolo_x_nano	COCO-Person	Int8	416x416x3	STM32N6	2682.88	980.31	10.2.0	2.2.0
st_yolo_x_nano	COCO-Person	Int8	480x480x3	STM32N6	2418.75	1383.56	10.2.0	2.2.0

Reference NPU inference time based on COCO Person dataset (see Accuracy for details on dataset)

Model	Dataset	Format	Resolution	Board	Execution Engine	Inference time (ms)	Inf / sec	STM32Cube.AI version	STEdgeAI Core version
st_yolo_x_nano	COCO-Person	Int8	192x192x3	STM32N6570-DK	NPU/MCU	6.01	166.39	10.2.0	2.2.0
st_yolo_x_nano	COCO-Person	Int8	256x256x3	STM32N6570-DK	NPU/MCU	8.59	116.41	10.2.0	2.2.0
st_yolo_x_nano	COCO-Person	Int8	256x256x3	STM32N6570-DK	NPU/MCU	21.27	47.01	10.2.0	2.2.0
st_yolo_x_nano	COCO-Person	Int8	320x320x3	STM32N6570-DK	NPU/MCU	11.89	84.1	10.2.0	2.2.0
st_yolo_x_nano	COCO-Person	Int8	416x416x3	STM32N6570-DK	NPU/MCU	17.69	56.53	10.2.0	2.2.0
st_yolo_x_nano	COCO-Person	Int8	480x480x3	STM32N6570-DK	NPU/MCU	32.4	30.8	10.2.0	2.2.0
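The "Inf / sec" column is the reciprocal of the inference time. A small sketch of that conversion, using a few latency values copied from the table above (the function name is hypothetical):

```python
def inferences_per_second(latency_ms):
    """Convert a per-inference latency in milliseconds to a throughput in inferences/s."""
    return 1000.0 / latency_ms


# Latencies (ms) copied from the NPU inference time table.
latencies_ms = {
    "192x192x3": 6.01,
    "320x320x3": 11.89,
    "416x416x3": 17.69,
}

for resolution, ms in latencies_ms.items():
    print(f"{resolution}: {inferences_per_second(ms):.2f} inf/s")
```

This is a best-case throughput for the network alone; it does not account for any capture, pre-processing, or post-processing time in a full application.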

Reference MCU memory footprint based on COCO Person dataset (see Accuracy for details on dataset)

Model	Format	Resolution	Series	Activation RAM (KiB)	Runtime RAM (KiB)	Weights Flash (KiB)	Code Flash (KiB)	Total RAM (KiB)	Total Flash (KiB)	STM32Cube.AI version
st_yolo_x_nano	Int8	192x192x3	STM32H7	162.42	64.05	891.18	165.3	226.47	1056.48	10.2.0
st_yolo_x_nano	Int8	256x256x3	STM32H7	284.92	64.05	891.18	165.31	348.97	1056.49	10.2.0
st_yolo_x_nano	Int8	256x256x3	STM32H7	463.9	83.8	2435.76	227.33	547.7	2663.09	10.2.0
st_yolo_x_nano	Int8	320x320x3	STM32H7	442.42	64.05	891.18	165.36	506.47	1056.54	10.2.0
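The total columns in the table above are sums of the preceding columns: total RAM is activation RAM plus runtime RAM, and total flash is weights flash plus code flash. The sketch below reproduces them from two of the table rows; the function name is hypothetical.

```python
def totals(activation_ram, runtime_ram, weights_flash, code_flash):
    """Return (total RAM, total flash) in KiB, rounded to two decimals."""
    total_ram = round(activation_ram + runtime_ram, 2)
    total_flash = round(weights_flash + code_flash, 2)
    return total_ram, total_flash


# (resolution, activation RAM, runtime RAM, weights flash, code flash), all in KiB,
# copied from the MCU memory footprint table.
rows = [
    ("192x192x3", 162.42, 64.05, 891.18, 165.30),
    ("320x320x3", 442.42, 64.05, 891.18, 165.36),
]

for resolution, *sizes in rows:
    ram, flash = totals(*sizes)
    print(f"{resolution}: total RAM {ram} KiB, total flash {flash} KiB")
```

Comparing these totals against the RAM and flash available on a target device gives a quick first check of whether a given variant can fit.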

Reference MCU inference time based on COCO Person dataset (see Accuracy for details on dataset)

Model	Format	Resolution	Board	Execution Engine	Frequency	Inference time (ms)	STM32Cube.AI version
st_yolo_x_nano	Int8	192x192x3	STM32H747I-DISCO	1 CPU	400 MHz	335.19	10.2.0
st_yolo_x_nano	Int8	256x256x3	STM32H747I-DISCO	1 CPU	400 MHz	603.06	10.2.0
st_yolo_x_nano	Int8	256x256x3	STM32H747I-DISCO	1 CPU	400 MHz	1708.16	10.2.0
st_yolo_x_nano	Int8	320x320x3	STM32H747I-DISCO	1 CPU	400 MHz	967.8	10.2.0

AP on COCO Person dataset

Dataset details: link, License CC BY 4.0, Quotation [1], Number of classes: 80, Number of images: 118,287

Model	Format	Resolution	Depth Multiplier	Width Multiplier	Anchors	AP
st_yolo_x_nano	Int8	192x192x3	0.33	0.25	1	36.1 %
st_yolo_x_nano	Float	192x192x3	0.33	0.25	1	36.1 %
st_yolo_x_nano	Int8	256x256x3	0.33	0.25	1	44.2 %
st_yolo_x_nano	Float	256x256x3	0.33	0.25	1	44.1 %
st_yolo_x_nano	Int8	256x256x3	0.5	0.4	1	50.1 %
st_yolo_x_nano	Float	256x256x3	0.5	0.4	1	50.0 %
st_yolo_x_nano	Int8	320x320x3	0.33	0.25	1	48.8 %
st_yolo_x_nano	Float	320x320x3	0.33	0.25	1	48.5 %
st_yolo_x_nano	Int8	416x416x3	0.33	0.25	1	54.0 %
st_yolo_x_nano	Float	416x416x3	0.33	0.25	1	54.5 %
st_yolo_x_nano	Int8	480x480x3	1.0	0.25	3	61.4 %
st_yolo_x_nano	Float	480x480x3	1.0	0.25	3	62.1 %

* EVAL_IOU = 0.5, NMS_THRESH = 0.5, SCORE_THRESH = 0.001, MAX_DETECTIONS = 100
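To illustrate how the thresholds above interact, here is a minimal sketch of greedy non-maximum suppression in plain Python. It is a generic textbook version, not the evaluation code used to produce the AP figures; the box format `(x1, y1, x2, y2)` and the function names are assumptions made for this example.

```python
NMS_THRESH = 0.5       # boxes overlapping a kept box by more than this IoU are dropped
SCORE_THRESH = 0.001   # detections scoring below this are ignored
MAX_DETECTIONS = 100   # upper bound on the number of boxes kept


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores):
    """Return the indices of the boxes kept by greedy NMS, highest score first."""
    order = sorted(
        (i for i, s in enumerate(scores) if s >= SCORE_THRESH),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= NMS_THRESH for j in keep):
            keep.append(i)
            if len(keep) == MAX_DETECTIONS:
                break
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))
```

A very low score threshold such as 0.001 is common when computing AP, since the metric integrates over all confidence levels; a deployed application would usually use a higher value.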

Retraining and Integration in a simple example:

Please refer to the stm32ai-modelzoo-services GitHub repository.

References

[1] “Microsoft COCO: Common Objects in Context”. [Online]. Available: https://cocodataset.org/#download.

@article{DBLP:journals/corr/LinMBHPRDZ14,
  author = {Tsung{-}Yi Lin and Michael Maire and Serge J. Belongie and Lubomir D. Bourdev and Ross B. Girshick and James Hays and Pietro Perona and Deva Ramanan and Piotr Doll{\'{a}}r and C. Lawrence Zitnick},
  title = {Microsoft {COCO:} Common Objects in Context},
  journal = {CoRR},
  volume = {abs/1405.0312},
  year = {2014},
  url = {http://arxiv.org/abs/1405.0312},
  archivePrefix = {arXiv},
  eprint = {1405.0312},
  timestamp = {Mon, 13 Aug 2018 16:48:13 +0200},
  biburl = {https://dblp.org/rec/bib/journals/corr/LinMBHPRDZ14},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}