Ge Zheng (Intern)
committed on
Commit · d9f51c5
Parent(s): 65998a0
feat(demo): add OpenVINO and ONNXRuntime demo
- README.md +128 -2
- demo/ONNXRuntime/README.md +66 -0
- demo/ONNXRuntime/demo_utils.py +86 -0
- demo/ONNXRuntime/onnx_inference.py +90 -0
- demo/OpenVINO/README.md +4 -0
- demo/OpenVINO/cpp/CMakeLists.txt +23 -0
- demo/OpenVINO/cpp/README.md +94 -0
- demo/OpenVINO/cpp/yolox_openvino.cpp +531 -0
- demo/OpenVINO/python/README.md +88 -0
- demo/OpenVINO/python/demo_utils.py +86 -0
- demo/OpenVINO/python/openvino_inference.py +155 -0
- demo/TensorRT/cpp/CMakeLists.txt +36 -0
- demo/TensorRT/cpp/README.md +43 -0
- demo/TensorRT/cpp/logging.h +503 -0
- demo/TensorRT/cpp/yolox.cpp +554 -0
- demo/TensorRT/python/README.md +46 -0
README.md
CHANGED
@@ -1,2 +1,128 @@
<div align="center"><img src="assets/logo.png" width="600"></div>

<img src="assets/demo.png" >

## <div align="center">Introduction</div>
YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and industrial communities.

## <div align="center">Why YOLOX?</div>

<div align="center"><img src="assets/fig1.png" width="400" ><img src="assets/fig2.png" width="400"></div>

## <div align="center">News!!</div>
* 【2021/07/19】 We have released our technical report on [Arxiv](xxx)!!

## <div align="center">Benchmark</div>

### Standard Models.
|Model |size |mAP<sup>test<br>0.5:0.95 | Speed V100<br>(ms) | Params<br>(M) |FLOPs<br>(B)| weights |
| ------ |:---: | :---: |:---: |:---: | :---: | :----: |
|[YOLOX-s]() |640 |39.6 |9.8 |9.0 | 26.8 | - |
|[YOLOX-m]() |640 |46.4 |12.3 |25.3 |73.8| - |
|[YOLOX-l]() |640 |50.0 |14.5 |54.2| 155.6 | - |
|[YOLOX-x]() |640 |**51.2** | 17.3 |99.1 |281.9 | - |

### Light Models.
|Model |size |mAP<sup>val<br>0.5:0.95 | Speed V100<br>(ms) | Params<br>(M) |FLOPs<br>(B)| weights |
| ------ |:---: | :---: |:---: |:---: | :---: | :----: |
|[YOLOX-Nano]() |416 |25.3 |- | 0.91 |1.08 | - |
|[YOLOX-Tiny]() |416 |31.7 |- | 5.06 |6.45 | - |

## <div align="center">Quick Start</div>

### Installation

Step1. Install [apex](https://github.com/NVIDIA/apex).

```shell
git clone https://github.com/NVIDIA/apex
cd apex
pip3 install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
```
Step2. Install YOLOX.
```bash
$ git clone [email protected]:Megvii-BaseDetection/YOLOX.git
$ cd yolox
$ pip3 install -v -e .  # or "python3 setup.py develop"
```

### Demo

You can use either -n or -f to specify your detector's config:

```shell
python tools/demo.py -n yolox-s -c <MODEL_PATH> --conf 0.3 --nms 0.65 --tsize 640
```
or
```shell
python tools/demo.py -f exps/base/yolox_s.py -c <MODEL_PATH> --conf 0.3 --nms 0.65 --tsize 640
```


<details open>
<summary>Reproduce our results on COCO</summary>

Step1.

* Reproduce our results on COCO by specifying -n:

```shell
python tools/train.py -n yolox-s -d 8 -b 64 --fp16 -o
                         yolox-m
                         yolox-l
                         yolox-x
```
Notes:
* -d: number of gpu devices
* -b: total batch size; the recommended value for -b is num_gpu * 8
* --fp16: mixed precision training

The above commands are equivalent to:

```shell
python tools/train.py -f exps/base/yolox-s.py -d 8 -b 64 --fp16 -o
                         exps/base/yolox-m.py
                         exps/base/yolox-l.py
                         exps/base/yolox-x.py
```

* Customize your training.

* Finetune your dataset on COCO pretrained models.
</details>

<details open>
<summary>Evaluation</summary>
We support batch testing for fast evaluation:

```shell
python tools/eval.py -n yolox-s -b 64 --conf 0.001 --fp16 (optional) --fuse (optional) --test (for test-dev set)
                        yolox-m
                        yolox-l
                        yolox-x
```

To reproduce the speed test, we use the following command:
```shell
python tools/eval.py -n yolox-s -b 1 -d 0 --conf 0.001 --fp16 --fuse --test (for test-dev set)
                        yolox-m
                        yolox-l
                        yolox-x
```
</details>

## <div align="center">Deployment</div>

1. [ONNX: Including ONNX export and an ONNXRuntime demo.]()
2. [TensorRT in both C++ and Python]()
3. [NCNN in C++]()
4. [OpenVINO in both C++ and Python]()

## <div align="center">Cite Our Work</div>

If you find this project useful for you, please use the following BibTeX entry.

TODO
demo/ONNXRuntime/README.md
ADDED
@@ -0,0 +1,66 @@
## ONNXRuntime Demo in Python

This doc introduces how to convert your PyTorch model into ONNX, and how to run an ONNXRuntime demo to verify your conversion.

### Download ONNX models.
| Model | Parameters | GFLOPs | Test Size | mAP |
|:------| :----: | :----: | :---: | :---: |
| [YOLOX-Nano](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.res101.fpn.coco.800size.1x) | 0.91M | 1.08 | 416x416 | 25.3 |
| [YOLOX-Tiny](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.fpn.coco.800size.1x) | 5.06M | 6.45 | 416x416 | 31.7 |
| [YOLOX-S](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 9.0M | 26.8 | 640x640 | 39.6 |
| [YOLOX-M](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 25.3M | 73.8 | 640x640 | 46.4 |
| [YOLOX-L](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 54.2M | 155.6 | 640x640 | 50.0 |
| [YOLOX-X](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 99.1M | 281.9 | 640x640 | 51.2 |
| [YOLOX-Darknet53](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 63.72M | 185.3 | 640x640 | 47.3 |

### Convert Your Model to ONNX

First, move to <YOLOX_HOME>:
```shell
cd <YOLOX_HOME>
```
Then, you can:

1. Convert a standard YOLOX model by -n:
```shell
python3 tools/export_onnx.py --output-name yolox_s.onnx -n yolox-s -c yolox_s.pth.tar
```
Notes:
* -n: specify a model name. The model name must be one of [yolox-s, yolox-m, yolox-l, yolox-x, yolox-nano, yolox-tiny, yolov3]
* -c: the model checkpoint you have trained
* -o: opset version, default 11. **However, if you will further convert your onnx model to [OpenVINO](), please specify the opset version to 10.**
* --no-onnxsim: disable onnxsim
* To customize the input shape of the onnx model, modify the following line in tools/export_onnx.py:

```python
dummy_input = torch.randn(1, 3, exp.test_size[0], exp.test_size[1])
```

2. Convert a standard YOLOX model by -f. With -f, the above command is equivalent to:

```shell
python3 tools/export_onnx.py --output-name yolox_s.onnx -f exps/yolox_s.py -c yolox_s.pth.tar
```

3. To convert your customized model, please use -f:

```shell
python3 tools/export_onnx.py --output-name your_yolox.onnx -f exps/your_yolox.py -c your_yolox.pth.tar
```

### ONNXRuntime Demo

Step1.
```shell
cd <YOLOX_HOME>/demo/ONNXRuntime
```

Step2.
```shell
python3 onnx_inference.py -m <ONNX_MODEL_PATH> -i <IMAGE_PATH> -o <OUTPUT_DIR> -s 0.3 --input_shape 640,640
```
Notes:
* -m: your converted onnx model
* -i: input image
* -s: score threshold for visualization.
* --input_shape: should be consistent with the shape you used for onnx conversion.
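As a quick sanity check after the conversion, you can run the exported model on a dummy input and inspect the raw output shape before trying the full demo. This is a minimal sketch, not part of the repo; the file name `yolox_s.onnx` and the 640x640 shape are assumptions from the export example above:

```python
# Hypothetical sanity check for an exported model (assumes yolox_s.onnx, 640x640 input).
import numpy as np
import onnxruntime

session = onnxruntime.InferenceSession("yolox_s.onnx")
dummy = np.random.randn(1, 3, 640, 640).astype(np.float32)
outputs = session.run(None, {session.get_inputs()[0].name: dummy})
# For a 640x640 YOLOX model the raw output should be (1, 8400, 85):
# 8400 = 80*80 + 40*40 + 20*20 grid cells, 85 = 4 box terms + objectness + 80 classes.
print(outputs[0].shape)
```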
demo/ONNXRuntime/demo_utils.py
ADDED
@@ -0,0 +1,86 @@
import numpy as np

import os


def mkdir(path):
    if not os.path.exists(path):
        os.makedirs(path)


def nms(boxes, scores, nms_thr):
    """Single class NMS implemented in Numpy."""
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]

    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])

        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[order[1:]] - inter)

        inds = np.where(ovr <= nms_thr)[0]
        order = order[inds + 1]

    return keep


def multiclass_nms(boxes, scores, nms_thr, score_thr):
    """Multiclass NMS implemented in Numpy. Returns None if no box passes score_thr."""
    final_dets = []
    num_classes = scores.shape[1]
    for cls_ind in range(num_classes):
        cls_scores = scores[:, cls_ind]
        valid_score_mask = cls_scores > score_thr
        if valid_score_mask.sum() == 0:
            continue
        else:
            valid_scores = cls_scores[valid_score_mask]
            valid_boxes = boxes[valid_score_mask]
            keep = nms(valid_boxes, valid_scores, nms_thr)
            if len(keep) > 0:
                cls_inds = np.ones((len(keep), 1)) * cls_ind
                dets = np.concatenate([valid_boxes[keep], valid_scores[keep, None], cls_inds], 1)
                final_dets.append(dets)
    if len(final_dets) == 0:
        # avoid np.concatenate on an empty list when nothing survives the score threshold
        return None
    return np.concatenate(final_dets, 0)


def postprocess(outputs, img_size, p6=False):
    """Map raw YOLOX outputs from grid-relative units to input-image pixels."""
    grids = []
    expanded_strides = []

    if not p6:
        strides = [8, 16, 32]
    else:
        strides = [8, 16, 32, 64]

    hsizes = [img_size[0] // stride for stride in strides]
    wsizes = [img_size[1] // stride for stride in strides]

    for hsize, wsize, stride in zip(hsizes, wsizes, strides):
        # x varies over the width, y over the height of each feature map
        xv, yv = np.meshgrid(np.arange(wsize), np.arange(hsize))
        grid = np.stack((xv, yv), 2).reshape(1, -1, 2)
        grids.append(grid)
        shape = grid.shape[:2]
        expanded_strides.append(np.full((*shape, 1), stride))

    grids = np.concatenate(grids, 1)
    expanded_strides = np.concatenate(expanded_strides, 1)
    outputs[..., :2] = (outputs[..., :2] + grids) * expanded_strides
    outputs[..., 2:4] = np.exp(outputs[..., 2:4]) * expanded_strides

    return outputs
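To make the return format of the helpers above concrete, here is a toy usage example (hand-made boxes and scores, not model outputs): two heavily overlapping boxes competing in class 0 and one distant box scored for class 1.

```python
# Toy usage of multiclass_nms from demo_utils (inputs are made up for illustration).
import numpy as np
from demo_utils import multiclass_nms

boxes = np.array([
    [0.,  0., 10., 10.],   # overlaps the next box almost entirely
    [1.,  1., 11., 11.],
    [50., 50., 60., 60.],  # far away from the others
])
scores = np.array([        # one row of class scores per box (2 classes here)
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.7],
])
dets = multiclass_nms(boxes, scores, nms_thr=0.5, score_thr=0.3)
# Each row is [x1, y1, x2, y2, score, class_index]; the second box is
# suppressed by the first within class 0, so two detections remain.
print(dets)
```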
demo/ONNXRuntime/onnx_inference.py
ADDED
@@ -0,0 +1,90 @@
import cv2
import numpy as np

from yolox.data.data_augment import preproc as preprocess
from yolox.data.datasets import COCO_CLASSES
from yolox.utils.visualize import vis

import argparse
import onnxruntime
import os
from demo_utils import mkdir, multiclass_nms, postprocess


def make_parser():
    parser = argparse.ArgumentParser("onnxruntime inference sample")
    parser.add_argument(
        "-m",
        "--model",
        type=str,
        default="yolox.onnx",
        help="Input your onnx model.",
    )
    parser.add_argument(
        "-i",
        "--image_path",
        type=str,
        default='test_image.png',
        help="Path to your input image.",
    )
    parser.add_argument(
        "-o",
        "--output_dir",
        type=str,
        default='demo_output',
        help="Path to your output directory.",
    )
    parser.add_argument(
        "-s",
        "--score_thr",
        type=float,
        default=0.3,
        help="Score threshold to filter the result.",
    )
    parser.add_argument(
        "--input_shape",
        type=str,
        default="640,640",
        help="Specify an input shape for inference.",
    )
    parser.add_argument(
        "--with_p6",
        action="store_true",
        help="Whether your model uses p6 in FPN/PAN.",
    )
    return parser


if __name__ == '__main__':
    args = make_parser().parse_args()

    input_shape = tuple(map(int, args.input_shape.split(',')))
    origin_img = cv2.imread(args.image_path)
    mean = (0.485, 0.456, 0.406)
    std = (0.229, 0.224, 0.225)
    img, ratio = preprocess(origin_img, input_shape, mean, std)

    session = onnxruntime.InferenceSession(args.model)

    ort_inputs = {session.get_inputs()[0].name: img[None, :, :, :]}
    output = session.run(None, ort_inputs)
    predictions = postprocess(output[0], input_shape, p6=args.with_p6)[0]

    boxes = predictions[:, :4]
    scores = predictions[:, 4:5] * predictions[:, 5:]

    # convert (cx, cy, w, h) to (x1, y1, x2, y2) and undo the resize ratio
    boxes_xyxy = np.ones_like(boxes)
    boxes_xyxy[:, 0] = boxes[:, 0] - boxes[:, 2] / 2.
    boxes_xyxy[:, 1] = boxes[:, 1] - boxes[:, 3] / 2.
    boxes_xyxy[:, 2] = boxes[:, 0] + boxes[:, 2] / 2.
    boxes_xyxy[:, 3] = boxes[:, 1] + boxes[:, 3] / 2.
    boxes_xyxy /= ratio
    dets = multiclass_nms(boxes_xyxy, scores, nms_thr=0.65, score_thr=0.1)

    if dets is not None:
        final_boxes, final_scores, final_cls_inds = dets[:, :4], dets[:, 4], dets[:, 5]
        origin_img = vis(origin_img, final_boxes, final_scores, final_cls_inds,
                         conf=args.score_thr, class_names=COCO_CLASSES)

    mkdir(args.output_dir)
    output_path = os.path.join(args.output_dir, args.image_path.split("/")[-1])
    cv2.imwrite(output_path, origin_img)
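The decode block in onnx_inference.py does two independent things: it turns (cx, cy, w, h) boxes into corner coordinates and multiplies objectness by the per-class probabilities. A worked toy example with made-up numbers:

```python
# Toy walk-through of the box/score decode above (numbers are hypothetical).
import numpy as np

pred = np.array([320., 240., 100., 50., 0.9, 0.8, 0.2])  # cx, cy, w, h, obj, 2 class scores
cx, cy, w, h = pred[:4]
box_xyxy = np.array([cx - w / 2., cy - h / 2., cx + w / 2., cy + h / 2.])
scores = pred[4] * pred[5:]  # objectness * class probability
print(box_xyxy)  # [270. 215. 370. 265.]
print(scores)    # [0.72 0.18]
```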
demo/OpenVINO/README.md
ADDED
@@ -0,0 +1,4 @@
## YOLOX on OpenVINO

* [C++ Demo]()
* [Python Demo]()
demo/OpenVINO/cpp/CMakeLists.txt
ADDED
@@ -0,0 +1,23 @@
cmake_minimum_required(VERSION 3.4.1)
set(CMAKE_CXX_STANDARD 14)

project(yolox_openvino_demo)

find_package(OpenCV REQUIRED)
find_package(InferenceEngine REQUIRED)
find_package(ngraph REQUIRED)

include_directories(
    ${OpenCV_INCLUDE_DIRS}
    ${CMAKE_CURRENT_SOURCE_DIR}
    ${CMAKE_CURRENT_BINARY_DIR}
)

add_executable(yolox_openvino yolox_openvino.cpp)

target_link_libraries(
    yolox_openvino
    ${InferenceEngine_LIBRARIES}
    ${NGRAPH_LIBRARIES}
    ${OpenCV_LIBS}
)
demo/OpenVINO/cpp/README.md
ADDED
@@ -0,0 +1,94 @@
# User Guide for Deploying YOLOX on OpenVINO

This tutorial includes a C++ demo for OpenVINO, as well as some converted models.

### Download OpenVINO models.
| Model | Parameters | GFLOPs | Test Size | mAP |
|:------| :----: | :----: | :---: | :---: |
| [YOLOX-Nano](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.res101.fpn.coco.800size.1x) | 0.91M | 1.08 | 416x416 | 25.3 |
| [YOLOX-Tiny](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.fpn.coco.800size.1x) | 5.06M | 6.45 | 416x416 | 31.7 |
| [YOLOX-S](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 9.0M | 26.8 | 640x640 | 39.6 |
| [YOLOX-M](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 25.3M | 73.8 | 640x640 | 46.4 |
| [YOLOX-L](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 54.2M | 155.6 | 640x640 | 50.0 |
| [YOLOX-X](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 99.1M | 281.9 | 640x640 | 51.2 |
| [YOLOX-Darknet53](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 63.72M | 185.3 | 640x640 | 47.3 |

## Install OpenVINO Toolkit

Please visit [Openvino Homepage](https://docs.openvinotoolkit.org/latest/get_started_guides.html) for more details.

## Set up the Environment

### For Linux

**Option1. Set up the environment temporarily. You need to run this command every time you start a new shell window.**

```shell
source /opt/intel/openvino_2021/bin/setupvars.sh
```

**Option2. Set up the environment permanently.**

*Step1.* Open your shell config file:
```shell
vim ~/.bashrc
```

*Step2.* Add the following line to the file:

```shell
source /opt/intel/openvino_2021/bin/setupvars.sh
```

*Step3.* Save and exit the file, then run:

```shell
source ~/.bashrc
```


## Convert model

1. Export an ONNX model

Please refer to the [ONNX tutorial]() for more details. **Note that you should set --opset to 10, otherwise the next step will fail.**

2. Convert ONNX to OpenVINO

``` shell
cd <INSTALL_DIR>/openvino_2021/deployment_tools/model_optimizer
```

Install the requirements for the conversion tool:

```shell
sudo ./install_prerequisites/install_prerequisites_onnx.sh
```

Then convert the model:
```shell
python3 mo.py --input_model <ONNX_MODEL> --input_shape <INPUT_SHAPE> [--data_type FP16]
```
For example:
```shell
python3 mo.py --input_model yolox.onnx --input_shape [1,3,640,640] --data_type FP16
```

## Build

### Linux
```shell
source /opt/intel/openvino_2021/bin/setupvars.sh
mkdir build
cd build
cmake ..
make
```

## Demo

### C++

```shell
./yolox_openvino <XML_MODEL_PATH> <IMAGE_PATH> <DEVICE>
```
demo/OpenVINO/cpp/yolox_openvino.cpp
ADDED
@@ -0,0 +1,531 @@
// Copyright (C) 2018-2021 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include <iterator>
#include <memory>
#include <string>
#include <vector>
#include <opencv2/opencv.hpp>
#include <iostream>
#include <inference_engine.hpp>

using namespace InferenceEngine;

/**
 * @brief Define names depending on Unicode path support
 */
#define tcout std::cout
#define file_name_t std::string
#define imread_t cv::imread
#define NMS_THRESH 0.65
#define BBOX_CONF_THRESH 0.3

static const int INPUT_W = 416;
static const int INPUT_H = 416;

cv::Mat static_resize(cv::Mat& img) {
    float r = std::min(INPUT_W / (img.cols*1.0), INPUT_H / (img.rows*1.0));
    // r = std::min(r, 1.0f);
    int unpad_w = r * img.cols;
    int unpad_h = r * img.rows;
    cv::Mat re(unpad_h, unpad_w, CV_8UC3);
    cv::resize(img, re, re.size());
    // cv::Mat takes (rows, cols): rows = INPUT_H, cols = INPUT_W
    cv::Mat out(INPUT_H, INPUT_W, CV_8UC3, cv::Scalar(114, 114, 114));
    re.copyTo(out(cv::Rect(0, 0, re.cols, re.rows)));
    return out;
}

void blobFromImage(cv::Mat& img, Blob::Ptr& blob){
    cv::cvtColor(img, img, cv::COLOR_BGR2RGB);
    int channels = 3;
    int img_h = img.rows;
    int img_w = img.cols;
    std::vector<float> mean = {0.485, 0.456, 0.406};
    std::vector<float> std = {0.229, 0.224, 0.225};
    InferenceEngine::MemoryBlob::Ptr mblob = InferenceEngine::as<InferenceEngine::MemoryBlob>(blob);
    if (!mblob)
    {
        THROW_IE_EXCEPTION << "We expect blob to be inherited from MemoryBlob in matU8ToBlob, "
                           << "but by fact we were not able to cast inputBlob to MemoryBlob";
    }
    // locked memory holder should be alive all time while access to its buffer happens
    auto mblobHolder = mblob->wmap();

    float *blob_data = mblobHolder.as<float *>();

    for (int c = 0; c < channels; c++)
    {
        for (int h = 0; h < img_h; h++)
        {
            for (int w = 0; w < img_w; w++)
            {
                blob_data[c * img_w * img_h + h * img_w + w] =
                    (((float)img.at<cv::Vec3b>(h, w)[c]) / 255.0f - mean[c]) / std[c];
            }
        }
    }
}


struct Object
{
    cv::Rect_<float> rect;
    int label;
    float prob;
};

struct GridAndStride
{
    int grid0;
    int grid1;
    int stride;
};

// results are appended to grid_strides (declared void: nothing was returned)
static void generate_grids_and_stride(const int target_size, std::vector<int>& strides, std::vector<GridAndStride>& grid_strides)
{
    for (auto stride : strides)
    {
        int num_grid = target_size / stride;
        for (int g1 = 0; g1 < num_grid; g1++)
        {
            for (int g0 = 0; g0 < num_grid; g0++)
            {
                grid_strides.push_back((GridAndStride){g0, g1, stride});
            }
        }
    }
}


static void generate_yolox_proposals(std::vector<GridAndStride> grid_strides, const float* feat_ptr, float prob_threshold, std::vector<Object>& objects)
{
    const int num_class = 80; // COCO has 80 classes. Modify this value on your own dataset.

    const int num_anchors = grid_strides.size();

    for (int anchor_idx = 0; anchor_idx < num_anchors; anchor_idx++)
    {
        const int grid0 = grid_strides[anchor_idx].grid0;
        const int grid1 = grid_strides[anchor_idx].grid1;
        const int stride = grid_strides[anchor_idx].stride;

        const int basic_pos = anchor_idx * 85;

        // yolox/models/yolo_head.py decode logic
        // outputs[..., :2] = (outputs[..., :2] + grids) * strides
        // outputs[..., 2:4] = torch.exp(outputs[..., 2:4]) * strides
        float x_center = (feat_ptr[basic_pos + 0] + grid0) * stride;
        float y_center = (feat_ptr[basic_pos + 1] + grid1) * stride;
        float w = exp(feat_ptr[basic_pos + 2]) * stride;
        float h = exp(feat_ptr[basic_pos + 3]) * stride;
        float x0 = x_center - w * 0.5f;
        float y0 = y_center - h * 0.5f;

        float box_objectness = feat_ptr[basic_pos + 4];
        for (int class_idx = 0; class_idx < num_class; class_idx++)
        {
            float box_cls_score = feat_ptr[basic_pos + 5 + class_idx];
            float box_prob = box_objectness * box_cls_score;
            if (box_prob > prob_threshold)
            {
                Object obj;
                obj.rect.x = x0;
                obj.rect.y = y0;
                obj.rect.width = w;
                obj.rect.height = h;
                obj.label = class_idx;
                obj.prob = box_prob;

                objects.push_back(obj);
            }
        } // class loop
    } // point anchor loop
}

static inline float intersection_area(const Object& a, const Object& b)
{
    cv::Rect_<float> inter = a.rect & b.rect;
    return inter.area();
}

static void qsort_descent_inplace(std::vector<Object>& faceobjects, int left, int right)
{
    int i = left;
    int j = right;
    float p = faceobjects[(left + right) / 2].prob;

    while (i <= j)
    {
        while (faceobjects[i].prob > p)
            i++;

        while (faceobjects[j].prob < p)
            j--;

        if (i <= j)
        {
            // swap
            std::swap(faceobjects[i], faceobjects[j]);

            i++;
            j--;
        }
    }

    #pragma omp parallel sections
    {
        #pragma omp section
        {
            if (left < j) qsort_descent_inplace(faceobjects, left, j);
        }
        #pragma omp section
        {
            if (i < right) qsort_descent_inplace(faceobjects, i, right);
        }
    }
}


static void qsort_descent_inplace(std::vector<Object>& objects)
{
    if (objects.empty())
        return;

    qsort_descent_inplace(objects, 0, objects.size() - 1);
}

static void nms_sorted_bboxes(const std::vector<Object>& faceobjects, std::vector<int>& picked, float nms_threshold)
{
    picked.clear();

    const int n = faceobjects.size();

    std::vector<float> areas(n);
    for (int i = 0; i < n; i++)
    {
        areas[i] = faceobjects[i].rect.area();
    }

    for (int i = 0; i < n; i++)
    {
        const Object& a = faceobjects[i];

        int keep = 1;
        for (int j = 0; j < (int)picked.size(); j++)
        {
            const Object& b = faceobjects[picked[j]];

            // intersection over union
            float inter_area = intersection_area(a, b);
            float union_area = areas[i] + areas[picked[j]] - inter_area;
            // float IoU = inter_area / union_area
            if (inter_area / union_area > nms_threshold)
                keep = 0;
        }

        if (keep)
            picked.push_back(i);
    }
}


static void decode_outputs(const float* prob, std::vector<Object>& objects, float scale, const int img_w, const int img_h) {
    std::vector<Object> proposals;
    std::vector<int> strides = {8, 16, 32};
    std::vector<GridAndStride> grid_strides;

    generate_grids_and_stride(INPUT_W, strides, grid_strides);
    generate_yolox_proposals(grid_strides, prob, BBOX_CONF_THRESH, proposals);
    qsort_descent_inplace(proposals);

    std::vector<int> picked;
    nms_sorted_bboxes(proposals, picked, NMS_THRESH);
    int count = picked.size();
    objects.resize(count);

    for (int i = 0; i < count; i++)
    {
        objects[i] = proposals[picked[i]];

        // adjust offset to original unpadded
        float x0 = (objects[i].rect.x) / scale;
        float y0 = (objects[i].rect.y) / scale;
        float x1 = (objects[i].rect.x + objects[i].rect.width) / scale;
        float y1 = (objects[i].rect.y + objects[i].rect.height) / scale;

        // clip
        x0 = std::max(std::min(x0, (float)(img_w - 1)), 0.f);
        y0 = std::max(std::min(y0, (float)(img_h - 1)), 0.f);
        x1 = std::max(std::min(x1, (float)(img_w - 1)), 0.f);
        y1 = std::max(std::min(y1, (float)(img_h - 1)), 0.f);

        objects[i].rect.x = x0;
        objects[i].rect.y = y0;
        objects[i].rect.width = x1 - x0;
        objects[i].rect.height = y1 - y0;
    }
}

const float color_list[80][3] =
{
    {0.000, 0.447, 0.741},
    {0.850, 0.325, 0.098},
    {0.929, 0.694, 0.125},
    {0.494, 0.184, 0.556},
    {0.466, 0.674, 0.188},
    {0.301, 0.745, 0.933},
    {0.635, 0.078, 0.184},
    {0.300, 0.300, 0.300},
    {0.600, 0.600, 0.600},
    {1.000, 0.000, 0.000},
    {1.000, 0.500, 0.000},
    {0.749, 0.749, 0.000},
    {0.000, 1.000, 0.000},
    {0.000, 0.000, 1.000},
    {0.667, 0.000, 1.000},
    {0.333, 0.333, 0.000},
    {0.333, 0.667, 0.000},
    {0.333, 1.000, 0.000},
    {0.667, 0.333, 0.000},
    {0.667, 0.667, 0.000},
    {0.667, 1.000, 0.000},
    {1.000, 0.333, 0.000},
    {1.000, 0.667, 0.000},
    {1.000, 1.000, 0.000},
    {0.000, 0.333, 0.500},
    {0.000, 0.667, 0.500},
    {0.000, 1.000, 0.500},
    {0.333, 0.000, 0.500},
    {0.333, 0.333, 0.500},
    {0.333, 0.667, 0.500},
    {0.333, 1.000, 0.500},
    {0.667, 0.000, 0.500},
    {0.667, 0.333, 0.500},
    {0.667, 0.667, 0.500},
    {0.667, 1.000, 0.500},
    {1.000, 0.000, 0.500},
    {1.000, 0.333, 0.500},
    {1.000, 0.667, 0.500},
    {1.000, 1.000, 0.500},
    {0.000, 0.333, 1.000},
    {0.000, 0.667, 1.000},
    {0.000, 1.000, 1.000},
    {0.333, 0.000, 1.000},
    {0.333, 0.333, 1.000},
    {0.333, 0.667, 1.000},
    {0.333, 1.000, 1.000},
    {0.667, 0.000, 1.000},
    {0.667, 0.333, 1.000},
    {0.667, 0.667, 1.000},
    {0.667, 1.000, 1.000},
    {1.000, 0.000, 1.000},
    {1.000, 0.333, 1.000},
    {1.000, 0.667, 1.000},
    {0.333, 0.000, 0.000},
    {0.500, 0.000, 0.000},
    {0.667, 0.000, 0.000},
    {0.833, 0.000, 0.000},
    {1.000, 0.000, 0.000},
    {0.000, 0.167, 0.000},
    {0.000, 0.333, 0.000},
    {0.000, 0.500, 0.000},
    {0.000, 0.667, 0.000},
    {0.000, 0.833, 0.000},
    {0.000, 1.000, 0.000},
    {0.000, 0.000, 0.167},
    {0.000, 0.000, 0.333},
    {0.000, 0.000, 0.500},
    {0.000, 0.000, 0.667},
    {0.000, 0.000, 0.833},
    {0.000, 0.000, 1.000},
    {0.000, 0.000, 0.000},
    {0.143, 0.143, 0.143},
    {0.286, 0.286, 0.286},
    {0.429, 0.429, 0.429},
    {0.571, 0.571, 0.571},
    {0.714, 0.714, 0.714},
    {0.857, 0.857, 0.857},
    {0.000, 0.447, 0.741},
    {0.314, 0.717, 0.741},
    {0.50, 0.5, 0}
};

static void draw_objects(const cv::Mat& bgr, const std::vector<Object>& objects)
{
    static const char* class_names[] = {
        "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light",
        "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow",
        "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
        "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard",
        "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple",
        "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
        "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone",
        "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
        "hair drier", "toothbrush"
    };

    cv::Mat image = bgr.clone();

    for (size_t i = 0; i < objects.size(); i++)
    {
        const Object& obj = objects[i];

        fprintf(stderr, "%d = %.5f at %.2f %.2f %.2f x %.2f\n", obj.label, obj.prob,
                obj.rect.x, obj.rect.y, obj.rect.width, obj.rect.height);

        cv::Scalar color = cv::Scalar(color_list[obj.label][0], color_list[obj.label][1], color_list[obj.label][2]);
        float c_mean = cv::mean(color)[0];
        cv::Scalar txt_color;
        if (c_mean > 0.5){
            txt_color = cv::Scalar(0, 0, 0);
        }else{
            txt_color = cv::Scalar(255, 255, 255);
        }

        cv::rectangle(image, obj.rect, color * 255, 2);

        char text[256];
        sprintf(text, "%s %.1f%%", class_names[obj.label], obj.prob * 100);

        int baseLine = 0;
        cv::Size label_size = cv::getTextSize(text, cv::FONT_HERSHEY_COMPLEX, 0.4, 1, &baseLine);

        cv::Scalar txt_bk_color = color * 0.7 * 255;

        int x = obj.rect.x;
        int y = obj.rect.y + 1;
        //int y = obj.rect.y - label_size.height - baseLine;
        if (y > image.rows)
            y = image.rows;
        //if (x + label_size.width > image.cols)
            //x = image.cols - label_size.width;

        cv::rectangle(image, cv::Rect(cv::Point(x, y), cv::Size(label_size.width, label_size.height + baseLine)),
                      txt_bk_color, -1);

        cv::putText(image, text, cv::Point(x, y + label_size.height),
                    cv::FONT_HERSHEY_COMPLEX, 0.4, txt_color, 1);
    }

    cv::imwrite("_demo.jpg", image);
    fprintf(stderr, "save vis file\n");
    /* cv::imshow("image", image); */
    /* cv::waitKey(0); */
}


int main(int argc, char* argv[]) {
    try {
        // ------------------------------ Parsing and validation of input arguments ------------------------------
        if (argc != 4) {
            tcout << "Usage : " << argv[0] << " <path_to_model> <path_to_image> <device_name>" << std::endl;
            return EXIT_FAILURE;
        }

        const file_name_t input_model {argv[1]};
        const file_name_t input_image_path {argv[2]};
        const std::string device_name {argv[3]};
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- Step 1. Initialize inference engine core ---------------------------------
        Core ie;
        // -----------------------------------------------------------------------------------------------------

        // Step 2. Read a model in OpenVINO Intermediate Representation (.xml and
        // .bin files) or ONNX (.onnx file) format
        CNNNetwork network = ie.ReadNetwork(input_model);
        if (network.getOutputsInfo().size() != 1)
            throw std::logic_error("Sample supports topologies with 1 output only");
        if (network.getInputsInfo().size() != 1)
            throw std::logic_error("Sample supports topologies with 1 input only");
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- Step 3. Configure input & output -----------------------------------------
        // --------------------------- Prepare input blobs -------------------------------------------------------
        InputInfo::Ptr input_info = network.getInputsInfo().begin()->second;
        std::string input_name = network.getInputsInfo().begin()->first;

        /* Mark input as resizable by setting of a resize algorithm.
         * In this case we will be able to set an input blob of any shape to an
         * infer request. Resize and layout conversions are executed automatically
         * during inference */
        //input_info->getPreProcess().setResizeAlgorithm(RESIZE_BILINEAR);
        //input_info->setLayout(Layout::NHWC);
        //input_info->setPrecision(Precision::FP32);

        // --------------------------- Prepare output blobs ------------------------------------------------------
        if (network.getOutputsInfo().empty()) {
            std::cerr << "Network outputs info is empty" << std::endl;
            return EXIT_FAILURE;
        }
        DataPtr output_info = network.getOutputsInfo().begin()->second;
        std::string output_name = network.getOutputsInfo().begin()->first;

        output_info->setPrecision(Precision::FP32);
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- Step 4. Loading a model to the device ------------------------------------
        ExecutableNetwork executable_network = ie.LoadNetwork(network, device_name);
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- Step 5. Create an infer request ------------------------------------------
        InferRequest infer_request = executable_network.CreateInferRequest();
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- Step 6. Prepare input ----------------------------------------------------
        /* Read input image to a blob and set it to an infer request without resize
         * and layout conversions. */
        cv::Mat image = imread_t(input_image_path);
        cv::Mat pr_img = static_resize(image);
        Blob::Ptr imgBlob = infer_request.GetBlob(input_name); // just wrap Mat data by Blob::Ptr
        blobFromImage(pr_img, imgBlob);

        // infer_request.SetBlob(input_name, imgBlob); // infer_request accepts input blob of any size
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- Step 7. Do inference -----------------------------------------------------
        /* Running the request synchronously */
        infer_request.Infer();
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- Step 8. Process output ---------------------------------------------------
        const Blob::Ptr output_blob = infer_request.GetBlob(output_name);
        MemoryBlob::CPtr moutput = as<MemoryBlob>(output_blob);
        if (!moutput) {
            throw std::logic_error("We expect output to be inherited from MemoryBlob, "
                                   "but by fact we were not able to cast output to MemoryBlob");
        }
        // locked memory holder should be alive all time while access to its buffer
        // happens
        auto moutputHolder = moutput->rmap();
        const float* net_pred = moutputHolder.as<const PrecisionTrait<Precision::FP32>::value_type*>();

        const int image_size = 416;
        int img_w = image.cols;
        int img_h = image.rows;
        float scale = std::min(INPUT_W / (image.cols*1.0), INPUT_H / (image.rows*1.0));
        std::vector<Object> objects;

        decode_outputs(net_pred, objects, scale, img_w, img_h);
        draw_objects(image, objects);

        // -----------------------------------------------------------------------------------------------------
    } catch (const std::exception& ex) {
        std::cerr << ex.what() << std::endl;
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
demo/OpenVINO/python/README.md
ADDED
@@ -0,0 +1,88 @@
# User Guide for Deploying YOLOX on OpenVINO

This tutorial includes a Python demo for OpenVINO, as well as some converted models.

### Download OpenVINO models.
| Model | Parameters | GFLOPs | Test Size | mAP |
|:------| :----: | :----: | :---: | :---: |
| [YOLOX-Nano](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.res101.fpn.coco.800size.1x) | 0.91M | 1.08 | 416x416 | 25.3 |
| [YOLOX-Tiny](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.fpn.coco.800size.1x) | 5.06M | 6.45 | 416x416 | 31.7 |
| [YOLOX-S](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 9.0M | 26.8 | 640x640 | 39.6 |
| [YOLOX-M](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 25.3M | 73.8 | 640x640 | 46.4 |
| [YOLOX-L](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 54.2M | 155.6 | 640x640 | 50.0 |
| [YOLOX-X](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 99.1M | 281.9 | 640x640 | 51.2 |
| [YOLOX-Darknet53](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 63.72M | 185.3 | 640x640 | 47.3 |

## Install OpenVINO Toolkit

Please visit [Openvino Homepage](https://docs.openvinotoolkit.org/latest/get_started_guides.html) for more details.

## Set up the Environment

### For Linux

**Option1. Set up the environment temporarily. You need to run this command every time you start a new shell window.**

```shell
source /opt/intel/openvino_2021/bin/setupvars.sh
```

**Option2. Set up the environment permanently.**

*Step1.* Open your shell config file:
```shell
vim ~/.bashrc
```

*Step2.* Add the following line to the file:

```shell
source /opt/intel/openvino_2021/bin/setupvars.sh
```

*Step3.* Save and exit the file, then run:

```shell
source ~/.bashrc
```


## Convert model

1. Export an ONNX model

Please refer to the [ONNX tutorial]() for more details. **Note that you should set --opset to 10, otherwise the next step will fail.**

2. Convert ONNX to OpenVINO

``` shell
cd <INSTALL_DIR>/openvino_2021/deployment_tools/model_optimizer
```

Install the requirements for the conversion tool:

```shell
sudo ./install_prerequisites/install_prerequisites_onnx.sh
```

Then convert the model:
```shell
python3 mo.py --input_model <ONNX_MODEL> --input_shape <INPUT_SHAPE> [--data_type FP16]
```
For example:
```shell
python3 mo.py --input_model yolox.onnx --input_shape [1,3,640,640] --data_type FP16
```

## Demo

### Python

```shell
python openvino_inference.py -m <XML_MODEL_PATH> -i <IMAGE_PATH>
```
or
```shell
python openvino_inference.py -m <XML_MODEL_PATH> -i <IMAGE_PATH> -o <OUTPUT_DIR> -s <SCORE_THR> -d <DEVICE>
```
ADDED
@@ -0,0 +1,86 @@
|
import numpy as np

import os


def mkdir(path):
    if not os.path.exists(path):
        os.makedirs(path)


def nms(boxes, scores, nms_thr):
    """Single class NMS implemented in Numpy."""
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]

    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])

        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[order[1:]] - inter)

        inds = np.where(ovr <= nms_thr)[0]
        order = order[inds + 1]

    return keep


def multiclass_nms(boxes, scores, nms_thr, score_thr):
    """Multiclass NMS implemented in Numpy. Returns None if no box passes score_thr."""
    final_dets = []
    num_classes = scores.shape[1]
    for cls_ind in range(num_classes):
        cls_scores = scores[:, cls_ind]
        valid_score_mask = cls_scores > score_thr
        if valid_score_mask.sum() == 0:
            continue
        else:
            valid_scores = cls_scores[valid_score_mask]
            valid_boxes = boxes[valid_score_mask]
            keep = nms(valid_boxes, valid_scores, nms_thr)
            if len(keep) > 0:
                cls_inds = np.ones((len(keep), 1)) * cls_ind
                dets = np.concatenate([valid_boxes[keep], valid_scores[keep, None], cls_inds], 1)
                final_dets.append(dets)
    if len(final_dets) == 0:
        # avoid np.concatenate on an empty list when nothing survives the score threshold
        return None
    return np.concatenate(final_dets, 0)


def postprocess(outputs, img_size, p6=False):
    """Map raw YOLOX outputs from grid-relative units to input-image pixels."""
    grids = []
    expanded_strides = []

    if not p6:
        strides = [8, 16, 32]
    else:
        strides = [8, 16, 32, 64]

    hsizes = [img_size[0] // stride for stride in strides]
    wsizes = [img_size[1] // stride for stride in strides]

    for hsize, wsize, stride in zip(hsizes, wsizes, strides):
        # x varies over the width, y over the height of each feature map
        xv, yv = np.meshgrid(np.arange(wsize), np.arange(hsize))
        grid = np.stack((xv, yv), 2).reshape(1, -1, 2)
        grids.append(grid)
        shape = grid.shape[:2]
        expanded_strides.append(np.full((*shape, 1), stride))

    grids = np.concatenate(grids, 1)
    expanded_strides = np.concatenate(expanded_strides, 1)
    outputs[..., :2] = (outputs[..., :2] + grids) * expanded_strides
    outputs[..., 2:4] = np.exp(outputs[..., 2:4]) * expanded_strides

    return outputs
demo/OpenVINO/python/openvino_inference.py
ADDED
@@ -0,0 +1,155 @@
1 |
+
#!/usr/bin/env python3
|
2 |
+
# -*- coding: utf-8 -*-
|
3 |
+
# Copyright (C) 2018-2021 Intel Corporation
|
4 |
+
# SPDX-License-Identifier: Apache-2.0
|
5 |
+
import argparse
|
6 |
+
import logging as log
|
7 |
+
import os
|
8 |
+
import sys
|
9 |
+
|
10 |
+
import cv2
|
11 |
+
import numpy as np
|
12 |
+
|
13 |
+
from demo_utils import mkdir, multiclass_nms, postprocess
|
14 |
+
from openvino.inference_engine import IECore
|
15 |
+
from yolox.data.data_augment import preproc as preprocess
|
16 |
+
from yolox.data.datasets import COCO_CLASSES
|
17 |
+
from yolox.utils.visualize import vis
|
18 |
+
|
19 |
+
|
20 |
+
def parse_args() -> argparse.Namespace:
|
21 |
+
"""Parse and return command line arguments"""
|
22 |
+
    parser = argparse.ArgumentParser(add_help=False)
    args = parser.add_argument_group('Options')
    args.add_argument(
        '-h',
        '--help',
        action='help',
        help='Show this help message and exit.')
    args.add_argument(
        '-m',
        '--model',
        required=True,
        type=str,
        help='Required. Path to an .xml or .onnx file with a trained model.')
    args.add_argument(
        '-i',
        '--input',
        required=True,
        type=str,
        help='Required. Path to an image file.')
    args.add_argument(
        '-o',
        '--output_dir',
        type=str,
        default='demo_output',
        help='Path to your output dir.')
    args.add_argument(
        '-s',
        '--score_thr',
        type=float,
        default=0.3,
        help='Score threshold used to visualize the result.')
    args.add_argument(
        '-d',
        '--device',
        default='CPU',
        type=str,
        help='Optional. Specify the target device to infer on; CPU, GPU, \
              MYRIAD, HDDL or HETERO is acceptable. The sample will look \
              for a suitable plugin for the device specified. Default value \
              is CPU.')
    args.add_argument(
        '--labels',
        default=None,
        type=str,
        help='Optional. Path to a labels mapping file.')
    args.add_argument(
        '-nt',
        '--number_top',
        default=10,
        type=int,
        help='Optional. Number of top results.')
    return parser.parse_args()


def main():
    log.basicConfig(format='[ %(levelname)s ] %(message)s', level=log.INFO, stream=sys.stdout)
    args = parse_args()

    # ---------------------------Step 1. Initialize inference engine core--------------------------------------------------
    log.info('Creating Inference Engine')
    ie = IECore()

    # ---------------------------Step 2. Read a model in OpenVINO Intermediate Representation or ONNX format---------------
    log.info(f'Reading the network: {args.model}')
    # (.xml and .bin files) or (.onnx file)
    net = ie.read_network(model=args.model)

    if len(net.input_info) != 1:
        log.error('Sample supports only single input topologies')
        return -1
    if len(net.outputs) != 1:
        log.error('Sample supports only single output topologies')
        return -1

    # ---------------------------Step 3. Configure input & output----------------------------------------------------------
    log.info('Configuring input and output blobs')
    # Get names of input and output blobs
    input_blob = next(iter(net.input_info))
    out_blob = next(iter(net.outputs))

    # Set input and output precision manually
    net.input_info[input_blob].precision = 'FP32'
    net.outputs[out_blob].precision = 'FP16'

    # Get the number of classes recognized by the model
    num_of_classes = max(net.outputs[out_blob].shape)

    # ---------------------------Step 4. Loading model to the device-------------------------------------------------------
    log.info('Loading the model to the plugin')
    exec_net = ie.load_network(network=net, device_name=args.device)

    # ---------------------------Step 5. Create infer request--------------------------------------------------------------
    # The load_network() method of the IECore class, with a specified number of requests (default 1), returns an
    # ExecutableNetwork instance which stores infer requests. So the infer requests were already created in the previous step.

    # ---------------------------Step 6. Prepare input---------------------------------------------------------------------
    origin_img = cv2.imread(args.input)
    _, _, h, w = net.input_info[input_blob].input_data.shape
    mean = (0.485, 0.456, 0.406)
    std = (0.229, 0.224, 0.225)
    image, ratio = preprocess(origin_img, (h, w), mean, std)

    # ---------------------------Step 7. Do inference----------------------------------------------------------------------
    log.info('Starting inference in synchronous mode')
    res = exec_net.infer(inputs={input_blob: image})

    # ---------------------------Step 8. Process output--------------------------------------------------------------------
    res = res[out_blob]

    predictions = postprocess(res, (h, w), p6=False)[0]

    boxes = predictions[:, :4]
    scores = predictions[:, 4, None] * predictions[:, 5:]

    boxes_xyxy = np.ones_like(boxes)
    boxes_xyxy[:, 0] = boxes[:, 0] - boxes[:, 2] / 2.
    boxes_xyxy[:, 1] = boxes[:, 1] - boxes[:, 3] / 2.
    boxes_xyxy[:, 2] = boxes[:, 0] + boxes[:, 2] / 2.
    boxes_xyxy[:, 3] = boxes[:, 1] + boxes[:, 3] / 2.
    boxes_xyxy /= ratio
    dets = multiclass_nms(boxes_xyxy, scores, nms_thr=0.65, score_thr=0.1)

    final_boxes = dets[:, :4]
    final_scores, final_cls_inds = dets[:, 4], dets[:, 5]
    origin_img = vis(origin_img, final_boxes, final_scores, final_cls_inds,
                     conf=args.score_thr, class_names=COCO_CLASSES)

    mkdir(args.output_dir)
    output_path = os.path.join(args.output_dir, args.input.split("/")[-1])
    cv2.imwrite(output_path, origin_img)


if __name__ == '__main__':
    sys.exit(main())
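The score line in Step 8 multiplies the objectness confidence (column 4) into every per-class confidence (columns 5 onward), and the box block converts center-format boxes to corner format before NMS. A tiny self-contained numpy illustration of both steps, with toy numbers rather than real model output:

```python
import numpy as np

# One toy "prediction" row: [cx, cy, w, h, objectness, cls0_conf, cls1_conf]
predictions = np.array([[320., 320., 100., 50., 0.9, 0.8, 0.1]])

boxes = predictions[:, :4]
# Per-class score = objectness * class confidence, broadcast over all classes
scores = predictions[:, 4, None] * predictions[:, 5:]
print(scores)  # [[0.72 0.09]]

# (cx, cy, w, h) -> (x0, y0, x1, y1), exactly as in Step 8 above
boxes_xyxy = np.ones_like(boxes)
boxes_xyxy[:, 0] = boxes[:, 0] - boxes[:, 2] / 2.
boxes_xyxy[:, 1] = boxes[:, 1] - boxes[:, 3] / 2.
boxes_xyxy[:, 2] = boxes[:, 0] + boxes[:, 2] / 2.
boxes_xyxy[:, 3] = boxes[:, 1] + boxes[:, 3] / 2.
print(boxes_xyxy)  # [[270. 295. 370. 345.]]
```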
demo/TensorRT/cpp/CMakeLists.txt
ADDED
@@ -0,0 +1,36 @@
cmake_minimum_required(VERSION 2.6)

project(yolox)

add_definitions(-std=c++11)

option(CUDA_USE_STATIC_CUDA_RUNTIME OFF)
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_BUILD_TYPE Debug)

find_package(CUDA REQUIRED)

include_directories(${PROJECT_SOURCE_DIR}/include)
# include and link dirs of cuda and tensorrt; you need to adapt them if yours are different
# cuda
include_directories(/data/cuda/cuda-10.2/cuda/include)
link_directories(/data/cuda/cuda-10.2/cuda/lib64)
# cudnn
include_directories(/data/cuda/cuda-10.2/cudnn/v8.0.4/include)
link_directories(/data/cuda/cuda-10.2/cudnn/v8.0.4/lib64)
# tensorrt
include_directories(/data/cuda/cuda-10.2/TensorRT/v7.2.1.6/include)
link_directories(/data/cuda/cuda-10.2/TensorRT/v7.2.1.6/lib)

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 -Wall -Ofast -Wfatal-errors -D_MWAITXINTRIN_H_INCLUDED")

find_package(OpenCV)
include_directories(${OpenCV_INCLUDE_DIRS})

add_executable(yolox ${PROJECT_SOURCE_DIR}/yolox.cpp)
target_link_libraries(yolox nvinfer)
target_link_libraries(yolox cudart)
target_link_libraries(yolox ${OpenCV_LIBS})

add_definitions(-O2 -pthread)
demo/TensorRT/cpp/README.md
ADDED
@@ -0,0 +1,43 @@
# User Guide for Deploying YOLOX on TensorRT in C++

As YOLOX models are easy to convert to TensorRT using the [torch2trt gitrepo](https://github.com/NVIDIA-AI-IOT/torch2trt),
our C++ demo does not include model conversion or engine construction like other TensorRT demos.


## Step 1: Prepare the serialized engine file

Follow the TensorRT [python demo README](../python/README.md) to convert and save the serialized engine file.

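If you converted with torch2trt, dumping the engine to disk takes only a couple of lines. A minimal sketch, assuming `model_trt` is the `TRTModule` returned by `torch2trt()` (the variable name is a placeholder, not code from this repo):

```python
# Sketch: write the TensorRT engine held by a torch2trt TRTModule to disk
# so this C++ demo can deserialize it. `model_trt` is assumed to be the
# module returned by torch2trt(); see the Python demo README.
with open('model_trt.engine', 'wb') as f:
    f.write(model_trt.engine.serialize())
```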

## Step 2: Build the demo

Please follow the [TensorRT Installation Guide](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html) to install TensorRT.

Install OpenCV with ```sudo apt-get install libopencv-dev```.

Build the demo:

```shell
mkdir build
cd build
cmake ..
make
```

Move the 'model_trt.engine' file generated in Step 1 (saved in the exp output dir) to the build dir:

```shell
mv /path/to/your/exp/output/dir/model_trt.engine .
```

Then run the demo:

```shell
./yolox -d /your/path/to/yolox/assets
```

or

```shell
./yolox -d <img dir>
```
demo/TensorRT/cpp/logging.h
ADDED
@@ -0,0 +1,503 @@
/*
 * Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

#ifndef TENSORRT_LOGGING_H
#define TENSORRT_LOGGING_H

#include "NvInferRuntimeCommon.h"
#include <cassert>
#include <ctime>
#include <iomanip>
#include <iostream>
#include <ostream>
#include <sstream>
#include <string>

using Severity = nvinfer1::ILogger::Severity;

class LogStreamConsumerBuffer : public std::stringbuf
{
public:
    LogStreamConsumerBuffer(std::ostream& stream, const std::string& prefix, bool shouldLog)
        : mOutput(stream)
        , mPrefix(prefix)
        , mShouldLog(shouldLog)
    {
    }

    LogStreamConsumerBuffer(LogStreamConsumerBuffer&& other)
        : mOutput(other.mOutput)
    {
    }

    ~LogStreamConsumerBuffer()
    {
        // std::streambuf::pbase() gives a pointer to the beginning of the buffered part of the output sequence
        // std::streambuf::pptr() gives a pointer to the current position of the output sequence
        // if the pointer to the beginning is not equal to the pointer to the current position,
        // call putOutput() to log the output to the stream
        if (pbase() != pptr())
        {
            putOutput();
        }
    }

    // synchronizes the stream buffer and returns 0 on success
    // synchronizing the stream buffer consists of inserting the buffer contents into the stream,
    // resetting the buffer and flushing the stream
    virtual int sync()
    {
        putOutput();
        return 0;
    }

    void putOutput()
    {
        if (mShouldLog)
        {
            // prepend timestamp
            std::time_t timestamp = std::time(nullptr);
            tm* tm_local = std::localtime(&timestamp);
            std::cout << "[";
            std::cout << std::setw(2) << std::setfill('0') << 1 + tm_local->tm_mon << "/";
            std::cout << std::setw(2) << std::setfill('0') << tm_local->tm_mday << "/";
            std::cout << std::setw(4) << std::setfill('0') << 1900 + tm_local->tm_year << "-";
            std::cout << std::setw(2) << std::setfill('0') << tm_local->tm_hour << ":";
            std::cout << std::setw(2) << std::setfill('0') << tm_local->tm_min << ":";
            std::cout << std::setw(2) << std::setfill('0') << tm_local->tm_sec << "] ";
            // std::stringbuf::str() gets the string contents of the buffer
            // insert the buffer contents pre-appended by the appropriate prefix into the stream
            mOutput << mPrefix << str();
            // set the buffer to empty
            str("");
            // flush the stream
            mOutput.flush();
        }
    }

    void setShouldLog(bool shouldLog)
    {
        mShouldLog = shouldLog;
    }

private:
    std::ostream& mOutput;
    std::string mPrefix;
    bool mShouldLog;
};

//!
//! \class LogStreamConsumerBase
//! \brief Convenience object used to initialize LogStreamConsumerBuffer before std::ostream in LogStreamConsumer
//!
class LogStreamConsumerBase
{
public:
    LogStreamConsumerBase(std::ostream& stream, const std::string& prefix, bool shouldLog)
        : mBuffer(stream, prefix, shouldLog)
    {
    }

protected:
    LogStreamConsumerBuffer mBuffer;
};

//!
//! \class LogStreamConsumer
//! \brief Convenience object used to facilitate use of C++ stream syntax when logging messages.
//!  Order of base classes is LogStreamConsumerBase and then std::ostream.
//!  This is because the LogStreamConsumerBase class is used to initialize the LogStreamConsumerBuffer member field
//!  in LogStreamConsumer and then the address of the buffer is passed to std::ostream.
//!  This is necessary to prevent the address of an uninitialized buffer from being passed to std::ostream.
//!  Please do not change the order of the parent classes.
//!
class LogStreamConsumer : protected LogStreamConsumerBase, public std::ostream
{
public:
    //! \brief Creates a LogStreamConsumer which logs messages with level severity.
    //!  Reportable severity determines if the messages are severe enough to be logged.
    LogStreamConsumer(Severity reportableSeverity, Severity severity)
        : LogStreamConsumerBase(severityOstream(severity), severityPrefix(severity), severity <= reportableSeverity)
        , std::ostream(&mBuffer) // links the stream buffer with the stream
        , mShouldLog(severity <= reportableSeverity)
        , mSeverity(severity)
    {
    }

    LogStreamConsumer(LogStreamConsumer&& other)
        : LogStreamConsumerBase(severityOstream(other.mSeverity), severityPrefix(other.mSeverity), other.mShouldLog)
        , std::ostream(&mBuffer) // links the stream buffer with the stream
        , mShouldLog(other.mShouldLog)
        , mSeverity(other.mSeverity)
    {
    }

    void setReportableSeverity(Severity reportableSeverity)
    {
        mShouldLog = mSeverity <= reportableSeverity;
        mBuffer.setShouldLog(mShouldLog);
    }

private:
    static std::ostream& severityOstream(Severity severity)
    {
        return severity >= Severity::kINFO ? std::cout : std::cerr;
    }

    static std::string severityPrefix(Severity severity)
    {
        switch (severity)
        {
        case Severity::kINTERNAL_ERROR: return "[F] ";
        case Severity::kERROR: return "[E] ";
        case Severity::kWARNING: return "[W] ";
        case Severity::kINFO: return "[I] ";
        case Severity::kVERBOSE: return "[V] ";
        default: assert(0); return "";
        }
    }

    bool mShouldLog;
    Severity mSeverity;
};

//! \class Logger
//!
//! \brief Class which manages logging of TensorRT tools and samples
//!
//! \details This class provides a common interface for TensorRT tools and samples to log information to the console,
//! and supports logging two types of messages:
//!
//! - Debugging messages with an associated severity (info, warning, error, or internal error/fatal)
//! - Test pass/fail messages
//!
//! The advantage of having all samples use this class for logging as opposed to emitting directly to stdout/stderr is
//! that the logic for controlling the verbosity and formatting of sample output is centralized in one location.
//!
//! In the future, this class could be extended to support dumping test results to a file in some standard format
//! (for example, JUnit XML), and providing additional metadata (e.g. timing the duration of a test run).
//!
//! TODO: For backwards compatibility with existing samples, this class inherits directly from the nvinfer1::ILogger
//! interface, which is problematic since there isn't a clean separation between messages coming from the TensorRT
//! library and messages coming from the sample.
//!
//! In the future (once all samples are updated to use Logger::getTRTLogger() to access the ILogger) we can refactor the
//! class to eliminate the inheritance and instead make the nvinfer1::ILogger implementation a member of the Logger
//! object.

class Logger : public nvinfer1::ILogger
{
public:
    Logger(Severity severity = Severity::kWARNING)
        : mReportableSeverity(severity)
    {
    }

    //!
    //! \enum TestResult
    //! \brief Represents the state of a given test
    //!
    enum class TestResult
    {
        kRUNNING, //!< The test is running
        kPASSED,  //!< The test passed
        kFAILED,  //!< The test failed
        kWAIVED   //!< The test was waived
    };

    //!
    //! \brief Forward-compatible method for retrieving the nvinfer1::ILogger associated with this Logger
    //! \return The nvinfer1::ILogger associated with this Logger
    //!
    //! TODO Once all samples are updated to use this method to register the logger with TensorRT,
    //! we can eliminate the inheritance of Logger from ILogger
    //!
    nvinfer1::ILogger& getTRTLogger()
    {
        return *this;
    }

    //!
    //! \brief Implementation of the nvinfer1::ILogger::log() virtual method
    //!
    //! Note samples should not be calling this function directly; it will eventually go away once we eliminate the
    //! inheritance from nvinfer1::ILogger
    //!
    void log(Severity severity, const char* msg) override
    {
        LogStreamConsumer(mReportableSeverity, severity) << "[TRT] " << std::string(msg) << std::endl;
    }

    //!
    //! \brief Method for controlling the verbosity of logging output
    //!
    //! \param severity The logger will only emit messages that have severity of this level or higher.
    //!
    void setReportableSeverity(Severity severity)
    {
        mReportableSeverity = severity;
    }

    //!
    //! \brief Opaque handle that holds logging information for a particular test
    //!
    //! This object is an opaque handle to information used by the Logger to print test results.
    //! The sample must call Logger::defineTest() in order to obtain a TestAtom that can be used
    //! with Logger::reportTest{Start,End}().
    //!
    class TestAtom
    {
    public:
        TestAtom(TestAtom&&) = default;

    private:
        friend class Logger;

        TestAtom(bool started, const std::string& name, const std::string& cmdline)
            : mStarted(started)
            , mName(name)
            , mCmdline(cmdline)
        {
        }

        bool mStarted;
        std::string mName;
        std::string mCmdline;
    };

    //!
    //! \brief Define a test for logging
    //!
    //! \param[in] name The name of the test. This should be a string starting with
    //!                 "TensorRT" and containing dot-separated strings containing
    //!                 the characters [A-Za-z0-9_].
    //!                 For example, "TensorRT.sample_googlenet"
    //! \param[in] cmdline The command line used to reproduce the test
    //!
    //! \return a TestAtom that can be used in Logger::reportTest{Start,End}().
    //!
    static TestAtom defineTest(const std::string& name, const std::string& cmdline)
    {
        return TestAtom(false, name, cmdline);
    }

    //!
    //! \brief A convenience overloaded version of defineTest() that accepts an array of command-line arguments
    //!        as input
    //!
    //! \param[in] name The name of the test
    //! \param[in] argc The number of command-line arguments
    //! \param[in] argv The array of command-line arguments (given as C strings)
    //!
    //! \return a TestAtom that can be used in Logger::reportTest{Start,End}().
    static TestAtom defineTest(const std::string& name, int argc, char const* const* argv)
    {
        auto cmdline = genCmdlineString(argc, argv);
        return defineTest(name, cmdline);
    }

    //!
    //! \brief Report that a test has started.
    //!
    //! \pre reportTestStart() has not been called yet for the given testAtom
    //!
    //! \param[in] testAtom The handle to the test that has started
    //!
    static void reportTestStart(TestAtom& testAtom)
    {
        reportTestResult(testAtom, TestResult::kRUNNING);
        assert(!testAtom.mStarted);
        testAtom.mStarted = true;
    }

    //!
    //! \brief Report that a test has ended.
    //!
    //! \pre reportTestStart() has been called for the given testAtom
    //!
    //! \param[in] testAtom The handle to the test that has ended
    //! \param[in] result The result of the test. Should be one of TestResult::kPASSED,
    //!                   TestResult::kFAILED, TestResult::kWAIVED
    //!
    static void reportTestEnd(const TestAtom& testAtom, TestResult result)
    {
        assert(result != TestResult::kRUNNING);
        assert(testAtom.mStarted);
        reportTestResult(testAtom, result);
    }

    static int reportPass(const TestAtom& testAtom)
    {
        reportTestEnd(testAtom, TestResult::kPASSED);
        return EXIT_SUCCESS;
    }

    static int reportFail(const TestAtom& testAtom)
    {
        reportTestEnd(testAtom, TestResult::kFAILED);
        return EXIT_FAILURE;
    }

    static int reportWaive(const TestAtom& testAtom)
    {
        reportTestEnd(testAtom, TestResult::kWAIVED);
        return EXIT_SUCCESS;
    }

    static int reportTest(const TestAtom& testAtom, bool pass)
    {
        return pass ? reportPass(testAtom) : reportFail(testAtom);
    }

    Severity getReportableSeverity() const
    {
        return mReportableSeverity;
    }

private:
    //!
    //! \brief returns an appropriate string for prefixing a log message with the given severity
    //!
    static const char* severityPrefix(Severity severity)
    {
        switch (severity)
        {
        case Severity::kINTERNAL_ERROR: return "[F] ";
        case Severity::kERROR: return "[E] ";
        case Severity::kWARNING: return "[W] ";
        case Severity::kINFO: return "[I] ";
        case Severity::kVERBOSE: return "[V] ";
        default: assert(0); return "";
        }
    }

    //!
    //! \brief returns an appropriate string for prefixing a test result message with the given result
    //!
    static const char* testResultString(TestResult result)
    {
        switch (result)
        {
        case TestResult::kRUNNING: return "RUNNING";
        case TestResult::kPASSED: return "PASSED";
        case TestResult::kFAILED: return "FAILED";
        case TestResult::kWAIVED: return "WAIVED";
        default: assert(0); return "";
        }
    }

    //!
    //! \brief returns an appropriate output stream (cout or cerr) to use with the given severity
    //!
    static std::ostream& severityOstream(Severity severity)
    {
        return severity >= Severity::kINFO ? std::cout : std::cerr;
    }

    //!
    //! \brief method that implements logging test results
    //!
    static void reportTestResult(const TestAtom& testAtom, TestResult result)
    {
        severityOstream(Severity::kINFO) << "&&&& " << testResultString(result) << " " << testAtom.mName << " # "
                                         << testAtom.mCmdline << std::endl;
    }

    //!
    //! \brief generate a command line string from the given (argc, argv) values
    //!
    static std::string genCmdlineString(int argc, char const* const* argv)
    {
        std::stringstream ss;
        for (int i = 0; i < argc; i++)
        {
            if (i > 0)
                ss << " ";
            ss << argv[i];
        }
        return ss.str();
    }

    Severity mReportableSeverity;
};

namespace
{

//!
//! \brief produces a LogStreamConsumer object that can be used to log messages of severity kVERBOSE
//!
//! Example usage:
//!
//!     LOG_VERBOSE(logger) << "hello world" << std::endl;
//!
inline LogStreamConsumer LOG_VERBOSE(const Logger& logger)
{
    return LogStreamConsumer(logger.getReportableSeverity(), Severity::kVERBOSE);
}

//!
//! \brief produces a LogStreamConsumer object that can be used to log messages of severity kINFO
//!
//! Example usage:
//!
//!     LOG_INFO(logger) << "hello world" << std::endl;
//!
inline LogStreamConsumer LOG_INFO(const Logger& logger)
{
    return LogStreamConsumer(logger.getReportableSeverity(), Severity::kINFO);
}

//!
//! \brief produces a LogStreamConsumer object that can be used to log messages of severity kWARNING
//!
//! Example usage:
//!
//!     LOG_WARN(logger) << "hello world" << std::endl;
//!
inline LogStreamConsumer LOG_WARN(const Logger& logger)
{
    return LogStreamConsumer(logger.getReportableSeverity(), Severity::kWARNING);
}

//!
//! \brief produces a LogStreamConsumer object that can be used to log messages of severity kERROR
//!
//! Example usage:
//!
//!     LOG_ERROR(logger) << "hello world" << std::endl;
//!
inline LogStreamConsumer LOG_ERROR(const Logger& logger)
{
    return LogStreamConsumer(logger.getReportableSeverity(), Severity::kERROR);
}

//!
//! \brief produces a LogStreamConsumer object that can be used to log messages of severity kINTERNAL_ERROR
//!        ("fatal" severity)
//!
//! Example usage:
//!
//!     LOG_FATAL(logger) << "hello world" << std::endl;
//!
inline LogStreamConsumer LOG_FATAL(const Logger& logger)
{
    return LogStreamConsumer(logger.getReportableSeverity(), Severity::kINTERNAL_ERROR);
}

} // anonymous namespace

#endif // TENSORRT_LOGGING_H
demo/TensorRT/cpp/yolox.cpp
ADDED
@@ -0,0 +1,554 @@
#include <fstream>
#include <iostream>
#include <sstream>
#include <numeric>
#include <chrono>
#include <vector>
#include <cstring>
#include <opencv2/opencv.hpp>
#include <dirent.h>
#include "NvInfer.h"
#include "cuda_runtime_api.h"
#include "logging.h"

#define CHECK(status) \
    do\
    {\
        auto ret = (status);\
        if (ret != 0)\
        {\
            std::cerr << "Cuda failure: " << ret << std::endl;\
            abort();\
        }\
    } while (0)

#define DEVICE 0  // GPU id
#define NMS_THRESH 0.65
#define BBOX_CONF_THRESH 0.3

using namespace nvinfer1;

// stuff we know about the network and the input/output blobs
static const int INPUT_W = 640;
static const int INPUT_H = 640;
const char* INPUT_BLOB_NAME = "input_0";
const char* OUTPUT_BLOB_NAME = "output_0";
static Logger gLogger;

cv::Mat static_resize(cv::Mat& img) {
    float r = std::min(INPUT_W / (img.cols*1.0), INPUT_H / (img.rows*1.0));
    // r = std::min(r, 1.0f);
    int unpad_w = r * img.cols;
    int unpad_h = r * img.rows;
    cv::Mat re(unpad_h, unpad_w, CV_8UC3);
    cv::resize(img, re, re.size());
    cv::Mat out(INPUT_W, INPUT_H, CV_8UC3, cv::Scalar(114, 114, 114));
    re.copyTo(out(cv::Rect(0, 0, re.cols, re.rows)));
    return out;
}

struct Object
{
    cv::Rect_<float> rect;
    int label;
    float prob;
};

struct GridAndStride
{
    int grid0;
    int grid1;
    int stride;
};

static void generate_grids_and_stride(const int target_size, std::vector<int>& strides, std::vector<GridAndStride>& grid_strides)
{
    for (auto stride : strides)
    {
        int num_grid = target_size / stride;
        for (int g1 = 0; g1 < num_grid; g1++)
        {
            for (int g0 = 0; g0 < num_grid; g0++)
            {
                grid_strides.push_back(GridAndStride{g0, g1, stride});
            }
        }
    }
}

static inline float intersection_area(const Object& a, const Object& b)
{
    cv::Rect_<float> inter = a.rect & b.rect;
    return inter.area();
}

static void qsort_descent_inplace(std::vector<Object>& faceobjects, int left, int right)
{
    int i = left;
    int j = right;
    float p = faceobjects[(left + right) / 2].prob;

    while (i <= j)
    {
        while (faceobjects[i].prob > p)
            i++;

        while (faceobjects[j].prob < p)
            j--;

        if (i <= j)
        {
            // swap
            std::swap(faceobjects[i], faceobjects[j]);

            i++;
            j--;
        }
    }

    #pragma omp parallel sections
    {
        #pragma omp section
        {
            if (left < j) qsort_descent_inplace(faceobjects, left, j);
        }
        #pragma omp section
        {
            if (i < right) qsort_descent_inplace(faceobjects, i, right);
        }
    }
}

static void qsort_descent_inplace(std::vector<Object>& objects)
{
    if (objects.empty())
        return;

    qsort_descent_inplace(objects, 0, objects.size() - 1);
}

static void nms_sorted_bboxes(const std::vector<Object>& faceobjects, std::vector<int>& picked, float nms_threshold)
{
    picked.clear();

    const int n = faceobjects.size();

    std::vector<float> areas(n);
    for (int i = 0; i < n; i++)
    {
        areas[i] = faceobjects[i].rect.area();
    }

    for (int i = 0; i < n; i++)
    {
        const Object& a = faceobjects[i];

        int keep = 1;
        for (int j = 0; j < (int)picked.size(); j++)
        {
            const Object& b = faceobjects[picked[j]];

            // intersection over union
            float inter_area = intersection_area(a, b);
            float union_area = areas[i] + areas[picked[j]] - inter_area;
            // float IoU = inter_area / union_area
            if (inter_area / union_area > nms_threshold)
                keep = 0;
        }

        if (keep)
            picked.push_back(i);
    }
}


static void generate_yolox_proposals(std::vector<GridAndStride> grid_strides, float* feat_blob, float prob_threshold, std::vector<Object>& objects)
{
    const int num_class = 80;

    const int num_anchors = grid_strides.size();

    for (int anchor_idx = 0; anchor_idx < num_anchors; anchor_idx++)
    {
        const int grid0 = grid_strides[anchor_idx].grid0;
        const int grid1 = grid_strides[anchor_idx].grid1;
        const int stride = grid_strides[anchor_idx].stride;

        const int basic_pos = anchor_idx * 85;

        // yolox/models/yolo_head.py decode logic
        float x_center = (feat_blob[basic_pos+0] + grid0) * stride;
        float y_center = (feat_blob[basic_pos+1] + grid1) * stride;
        float w = exp(feat_blob[basic_pos+2]) * stride;
        float h = exp(feat_blob[basic_pos+3]) * stride;
        float x0 = x_center - w * 0.5f;
        float y0 = y_center - h * 0.5f;

        float box_objectness = feat_blob[basic_pos+4];
        for (int class_idx = 0; class_idx < num_class; class_idx++)
        {
            float box_cls_score = feat_blob[basic_pos + 5 + class_idx];
            float box_prob = box_objectness * box_cls_score;
            if (box_prob > prob_threshold)
            {
                Object obj;
                obj.rect.x = x0;
                obj.rect.y = y0;
                obj.rect.width = w;
                obj.rect.height = h;
                obj.label = class_idx;
                obj.prob = box_prob;

                objects.push_back(obj);
            }

        } // class loop

    } // point anchor loop
}

float* blobFromImage(cv::Mat& img){
    cv::cvtColor(img, img, cv::COLOR_BGR2RGB);

    float* blob = new float[img.total()*3];
    int channels = 3;
    int img_h = 640;
    int img_w = 640;
    std::vector<float> mean = {0.485, 0.456, 0.406};
    std::vector<float> std = {0.229, 0.224, 0.225};
    for (int c = 0; c < channels; c++)
    {
        for (int h = 0; h < img_h; h++)
        {
            for (int w = 0; w < img_w; w++)
            {
                blob[c * img_w * img_h + h * img_w + w] =
                    (((float)img.at<cv::Vec3b>(h, w)[c]) / 255.0f - mean[c]) / std[c];
            }
        }
    }
    return blob;
}


int read_files_in_dir(const char *p_dir_name, std::vector<std::string> &file_names) {
    DIR *p_dir = opendir(p_dir_name);
    if (p_dir == nullptr) {
        return -1;
    }

    struct dirent* p_file = nullptr;
    while ((p_file = readdir(p_dir)) != nullptr) {
        if (strcmp(p_file->d_name, ".") != 0 &&
            strcmp(p_file->d_name, "..") != 0) {
            std::string cur_file_name(p_file->d_name);
            file_names.push_back(cur_file_name);
        }
    }

    closedir(p_dir);
    return 0;
}

static void decode_outputs(float* prob, std::vector<Object>& objects, float scale, const int img_w, const int img_h) {
    std::vector<Object> proposals;
    std::vector<int> strides = {8, 16, 32};
    std::vector<GridAndStride> grid_strides;
    generate_grids_and_stride(INPUT_W, strides, grid_strides);
    generate_yolox_proposals(grid_strides, prob, BBOX_CONF_THRESH, proposals);
    std::cout << "num of boxes before nms: " << proposals.size() << std::endl;

    qsort_descent_inplace(proposals);

    std::vector<int> picked;
    nms_sorted_bboxes(proposals, picked, NMS_THRESH);

    int count = picked.size();

    std::cout << "num of boxes: " << count << std::endl;

    objects.resize(count);
    for (int i = 0; i < count; i++)
    {
        objects[i] = proposals[picked[i]];

        // adjust offset to the original unpadded image
        float x0 = (objects[i].rect.x) / scale;
        float y0 = (objects[i].rect.y) / scale;
        float x1 = (objects[i].rect.x + objects[i].rect.width) / scale;
        float y1 = (objects[i].rect.y + objects[i].rect.height) / scale;

        // clip
        x0 = std::max(std::min(x0, (float)(img_w - 1)), 0.f);
        y0 = std::max(std::min(y0, (float)(img_h - 1)), 0.f);
        x1 = std::max(std::min(x1, (float)(img_w - 1)), 0.f);
        y1 = std::max(std::min(y1, (float)(img_h - 1)), 0.f);

        objects[i].rect.x = x0;
        objects[i].rect.y = y0;
        objects[i].rect.width = x1 - x0;
        objects[i].rect.height = y1 - y0;
    }
}

const float color_list[80][3] =
{
    {0.000, 0.447, 0.741},
    {0.850, 0.325, 0.098},
    {0.929, 0.694, 0.125},
    {0.494, 0.184, 0.556},
    {0.466, 0.674, 0.188},
    {0.301, 0.745, 0.933},
    {0.635, 0.078, 0.184},
    {0.300, 0.300, 0.300},
    {0.600, 0.600, 0.600},
    {1.000, 0.000, 0.000},
    {1.000, 0.500, 0.000},
    {0.749, 0.749, 0.000},
    {0.000, 1.000, 0.000},
    {0.000, 0.000, 1.000},
    {0.667, 0.000, 1.000},
    {0.333, 0.333, 0.000},
    {0.333, 0.667, 0.000},
    {0.333, 1.000, 0.000},
    {0.667, 0.333, 0.000},
    {0.667, 0.667, 0.000},
    {0.667, 1.000, 0.000},
    {1.000, 0.333, 0.000},
    {1.000, 0.667, 0.000},
    {1.000, 1.000, 0.000},
    {0.000, 0.333, 0.500},
    {0.000, 0.667, 0.500},
    {0.000, 1.000, 0.500},
    {0.333, 0.000, 0.500},
    {0.333, 0.333, 0.500},
    {0.333, 0.667, 0.500},
    {0.333, 1.000, 0.500},
    {0.667, 0.000, 0.500},
    {0.667, 0.333, 0.500},
    {0.667, 0.667, 0.500},
    {0.667, 1.000, 0.500},
    {1.000, 0.000, 0.500},
    {1.000, 0.333, 0.500},
    {1.000, 0.667, 0.500},
    {1.000, 1.000, 0.500},
    {0.000, 0.333, 1.000},
    {0.000, 0.667, 1.000},
    {0.000, 1.000, 1.000},
    {0.333, 0.000, 1.000},
    {0.333, 0.333, 1.000},
    {0.333, 0.667, 1.000},
    {0.333, 1.000, 1.000},
    {0.667, 0.000, 1.000},
    {0.667, 0.333, 1.000},
    {0.667, 0.667, 1.000},
    {0.667, 1.000, 1.000},
    {1.000, 0.000, 1.000},
    {1.000, 0.333, 1.000},
    {1.000, 0.667, 1.000},
    {0.333, 0.000, 0.000},
    {0.500, 0.000, 0.000},
    {0.667, 0.000, 0.000},
    {0.833, 0.000, 0.000},
    {1.000, 0.000, 0.000},
    {0.000, 0.167, 0.000},
    {0.000, 0.333, 0.000},
    {0.000, 0.500, 0.000},
    {0.000, 0.667, 0.000},
    {0.000, 0.833, 0.000},
    {0.000, 1.000, 0.000},
    {0.000, 0.000, 0.167},
    {0.000, 0.000, 0.333},
    {0.000, 0.000, 0.500},
    {0.000, 0.000, 0.667},
    {0.000, 0.000, 0.833},
    {0.000, 0.000, 1.000},
    {0.000, 0.000, 0.000},
    {0.143, 0.143, 0.143},
    {0.286, 0.286, 0.286},
    {0.429, 0.429, 0.429},
    {0.571, 0.571, 0.571},
    {0.714, 0.714, 0.714},
    {0.857, 0.857, 0.857},
    {0.000, 0.447, 0.741},
    {0.314, 0.717, 0.741},
    {0.50, 0.5, 0}
};

static void draw_objects(const cv::Mat& bgr, const std::vector<Object>& objects, std::string f)
{
    static const char* class_names[] = {
        "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light",
        "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow",
        "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
        "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard",
        "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple",
        "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
        "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone",
        "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
        "hair drier", "toothbrush"
    };

    cv::Mat image = bgr.clone();

    for (size_t i = 0; i < objects.size(); i++)
    {
        const Object& obj = objects[i];

        fprintf(stderr, "%d = %.5f at %.2f %.2f %.2f x %.2f\n", obj.label, obj.prob,
                obj.rect.x, obj.rect.y, obj.rect.width, obj.rect.height);

        cv::Scalar color = cv::Scalar(color_list[obj.label][0], color_list[obj.label][1], color_list[obj.label][2]);
        float c_mean = cv::mean(color)[0];
        cv::Scalar txt_color;
        if (c_mean > 0.5){
            txt_color = cv::Scalar(0, 0, 0);
        }else{
            txt_color = cv::Scalar(255, 255, 255);
        }

        cv::rectangle(image, obj.rect, color * 255, 2);

        char text[256];
        sprintf(text, "%s %.1f%%", class_names[obj.label], obj.prob * 100);

        int baseLine = 0;
        cv::Size label_size = cv::getTextSize(text, cv::FONT_HERSHEY_COMPLEX, 0.4, 1, &baseLine);

        cv::Scalar txt_bk_color = color * 0.7 * 255;

        int x = obj.rect.x;
        int y = obj.rect.y + 1;
        //int y = obj.rect.y - label_size.height - baseLine;
        if (y > image.rows)
            y = image.rows;
        //if (x + label_size.width > image.cols)
            //x = image.cols - label_size.width;

        cv::rectangle(image, cv::Rect(cv::Point(x, y), cv::Size(label_size.width, label_size.height + baseLine)),
                      txt_bk_color, -1);

        cv::putText(image, text, cv::Point(x, y + label_size.height),
                    cv::FONT_HERSHEY_COMPLEX, 0.4, txt_color, 1);
    }

    cv::imwrite("_" + f, image);
    fprintf(stderr, "save vis file\n");
    /* cv::imshow("image", image); */
    /* cv::waitKey(0); */
}


void doInference(IExecutionContext& context, float* input, float* output, const int output_size, cv::Size input_shape) {
    const ICudaEngine& engine = context.getEngine();

    // Pointers to input and output device buffers to pass to the engine.
    // The engine requires exactly IEngine::getNbBindings() buffers.
    assert(engine.getNbBindings() == 2);
    void* buffers[2];

    // In order to bind the buffers, we need to know the names of the input and output tensors.
    // Note that indices are guaranteed to be less than IEngine::getNbBindings()
    const int inputIndex = engine.getBindingIndex(INPUT_BLOB_NAME);

    assert(engine.getBindingDataType(inputIndex) == nvinfer1::DataType::kFLOAT);
    const int outputIndex = engine.getBindingIndex(OUTPUT_BLOB_NAME);
    assert(engine.getBindingDataType(outputIndex) == nvinfer1::DataType::kFLOAT);
    int mBatchSize = engine.getMaxBatchSize();

    // Create GPU buffers on the device
    CHECK(cudaMalloc(&buffers[inputIndex], 3 * input_shape.height * input_shape.width * sizeof(float)));
    CHECK(cudaMalloc(&buffers[outputIndex], output_size*sizeof(float)));

    // Create stream
    cudaStream_t stream;
    CHECK(cudaStreamCreate(&stream));

    // DMA input batch data to device, infer on the batch asynchronously, and DMA output back to host
    CHECK(cudaMemcpyAsync(buffers[inputIndex], input, 3 * input_shape.height * input_shape.width * sizeof(float), cudaMemcpyHostToDevice, stream));
    context.enqueue(1, buffers, stream, nullptr);
    CHECK(cudaMemcpyAsync(output, buffers[outputIndex], output_size * sizeof(float), cudaMemcpyDeviceToHost, stream));
    cudaStreamSynchronize(stream);

    // Release stream and buffers
    cudaStreamDestroy(stream);
    CHECK(cudaFree(buffers[inputIndex]));
    CHECK(cudaFree(buffers[outputIndex]));
}

int main(int argc, char** argv) {
    cudaSetDevice(DEVICE);
    // deserialize a previously saved engine from a stream
    char *trtModelStream{nullptr};
    size_t size{0};

    if (argc == 3 && std::string(argv[1]) == "-d") {
        std::ifstream file("model_trt.engine", std::ios::binary);
        if (file.good()) {
            file.seekg(0, file.end);
            size = file.tellg();
            file.seekg(0, file.beg);
            trtModelStream = new char[size];
            assert(trtModelStream);
            file.read(trtModelStream, size);
            file.close();
        }
    } else {
        std::cerr << "arguments not right!" << std::endl;
        std::cerr << "run 'python3 yolox/deploy/trt.py -n yolox-{tiny, s, m, l, x}' to serialize the model first!" << std::endl;
        std::cerr << "./yolox -d ../samples  // deserialize the engine file and run inference" << std::endl;
        return -1;
    }

    std::vector<std::string> file_names;
    if (read_files_in_dir(argv[2], file_names) < 0) {
        std::cout << "read_files_in_dir failed." << std::endl;
        return -1;
    }

    IRuntime* runtime = createInferRuntime(gLogger);
    assert(runtime != nullptr);
    ICudaEngine* engine = runtime->deserializeCudaEngine(trtModelStream, size);
    assert(engine != nullptr);
    IExecutionContext* context = engine->createExecutionContext();
    assert(context != nullptr);
    delete[] trtModelStream;
    auto out_dims = engine->getBindingDimensions(1);
    auto output_size = 1;
    for (int j = 0; j < out_dims.nbDims; j++) {
        output_size *= out_dims.d[j];
    }
    static float* prob = new float[output_size];

    int fcount = 0;
    for (auto f: file_names) {
        fcount++;
        std::cout << fcount << "  " << f << std::endl;
        cv::Mat img = cv::imread(std::string(argv[2]) + "/" + f);
        if (img.empty()) continue;
        int img_w = img.cols;
        int img_h = img.rows;
        cv::Mat pr_img = static_resize(img);
        std::cout << "blob image" << std::endl;

        float* blob;
        blob = blobFromImage(pr_img);
        float scale = std::min(INPUT_W / (img.cols*1.0), INPUT_H / (img.rows*1.0));

        // Run inference
        auto start = std::chrono::system_clock::now();
        doInference(*context, blob, prob, output_size, pr_img.size());
        auto end = std::chrono::system_clock::now();
        std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count() << "ms" << std::endl;

        std::vector<Object> objects;
        decode_outputs(prob, objects, scale, img_w, img_h);
        draw_objects(img, objects, f);

        // free the input blob allocated by blobFromImage
        delete[] blob;
    }

    // Destroy the engine
    context->destroy();
    engine->destroy();
    runtime->destroy();
    return 0;
}
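The decode step in `generate_yolox_proposals` above mirrors the logic of `yolox/models/yolo_head.py`: box centers are offset by the grid cell and scaled by the stride, while widths and heights are exponentiated. The same arithmetic in a few lines of Python, as a cross-check with toy values (not code from this repo):

```python
import numpy as np

# Raw head outputs for one anchor point at grid cell (grid0, grid1):
grid0, grid1, stride = 3, 5, 8
tx, ty, tw, th = 0.4, -0.2, 0.1, 0.3

# Same decode as the C++ loop above
x_center = (tx + grid0) * stride   # (0.4 + 3) * 8  = 27.2
y_center = (ty + grid1) * stride   # (-0.2 + 5) * 8 = 38.4
w = np.exp(tw) * stride            # e^0.1 * 8 ~= 8.84
h = np.exp(th) * stride            # e^0.3 * 8 ~= 10.80
x0, y0 = x_center - w / 2, y_center - h / 2
print(x0, y0, w, h)
```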
demo/TensorRT/python/README.md
ADDED
@@ -0,0 +1,46 @@
# User Guide for Deploying YOLOX on TensorRT

This tutorial includes a Python demo for TensorRT.

## Install TensorRT Toolkit

Please follow the [TensorRT Installation Guide](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html) and the [torch2trt gitrepo](https://github.com/NVIDIA-AI-IOT/torch2trt) to install TensorRT and torch2trt.

## Convert model

YOLOX models can be easily converted to TensorRT models using torch2trt.

If you want to convert our model, use the flag -n to specify a model name:
```shell
python tools/deploy/trt.py -n <YOLOX_MODEL_NAME> -c <YOLOX_CHECKPOINT>
```
For example:
```shell
python tools/deploy/trt.py -n yolox-s -c your_ckpt.pth.tar
```
<YOLOX_MODEL_NAME> can be: yolox-nano, yolox-tiny, yolox-s, yolox-m, yolox-l, yolox-x.

If you want to convert your customized model, use the flag -f to specify your exp file:
```shell
python tools/deploy/trt.py -f <YOLOX_EXP_FILE> -c <YOLOX_CHECKPOINT>
```
For example:
```shell
python tools/deploy/trt.py -f /path/to/your/yolox/exps/yolox_s.py -c your_ckpt.pth.tar
```
*yolox_s.py* can be any exp file modified by you.

The converted model and the serialized engine file (for the C++ demo) will be saved in your experiment output dir.

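For readers curious what the conversion amounts to, it is essentially a single torch2trt call. A minimal sketch under stated assumptions: `build_yolox()` is a hypothetical stand-in for however you construct and load your model, and the fp16 flag mirrors common torch2trt usage rather than the exact options in trt.py:

```python
import torch
from torch2trt import torch2trt

# Hypothetical helper: build a YOLOX model and load its checkpoint.
model = build_yolox().eval().cuda()

# Dummy input at the export resolution (1 x 3 x 640 x 640).
x = torch.ones(1, 3, 640, 640).cuda()

# torch2trt runs the model once and builds a TensorRT engine from the trace.
model_trt = torch2trt(model, [x], fp16_mode=True)

# The converted module behaves like a regular nn.Module...
y_trt = model_trt(x)

# ...and its engine can be serialized for the C++ demo.
with open('model_trt.engine', 'wb') as f:
    f.write(model_trt.engine.serialize())
```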
## Demo

The TensorRT Python demo is merged into our PyTorch demo file, so you can run the PyTorch demo command with ```--trt```.

```shell
python tools/demo.py -n yolox-s --trt --conf 0.3 --nms 0.65 --tsize 640
```
or
```shell
python tools/demo.py -f exps/base/yolox_s.py --trt --conf 0.3 --nms 0.65 --tsize 640
```
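For reference, reloading a converted model in Python follows the standard torch2trt pattern; a minimal sketch (the checkpoint filename is a placeholder, and this is not necessarily the exact code inside tools/demo.py):

```python
import torch
from torch2trt import TRTModule

# Load the saved torch2trt state dict back into a TRTModule;
# the module can then be called like an ordinary nn.Module.
model_trt = TRTModule()
model_trt.load_state_dict(torch.load('model_trt.pth'))
```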