Commit 5b8ab8f committed · Parent(s): e9faa7e

feat(YOLOX): update README and fix several bugs.

Files changed:
- README.md +67 -47
- demo/ONNXRuntime/README.md +14 -13
- demo/OpenVINO/README.md +3 -3
- demo/OpenVINO/cpp/README.md +11 -11
- demo/OpenVINO/python/README.md +12 -12
- demo/TensorRT/cpp/README.md +2 -2
- demo/TensorRT/python/README.md +5 -5
- docs/.gitkeep +0 -0
- docs/train_custom_data.md +118 -0
- exps/example/yolox_voc/yolox_voc_s.py +124 -0
- requirements.txt +3 -0
- tools/demo.py +4 -9
- yolox/data/datasets/coco.py +23 -13
- yolox/data/datasets/mosaicdetection.py +1 -4
- yolox/data/datasets/voc.py +24 -69
- yolox/{evalutors → evaluators}/__init__.py +0 -0
- yolox/{evalutors → evaluators}/coco_evaluator.py +0 -0
- yolox/{evalutors → evaluators}/voc_eval.py +0 -0
- yolox/evaluators/voc_evaluator.py +183 -0
- yolox/evalutors/voc_evaluator.py +0 -202
- yolox/models/yolo_head.py +8 -1
- yolox/utils/visualize.py +2 -2
README.md
CHANGED
@@ -1,37 +1,35 @@
|
|
1 |
-
<div align="center"><img src="assets/logo.png" width="
|
2 |
-
|
3 |
<img src="assets/demo.png" >
|
4 |
|
5 |
-
##
|
6 |
YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and industrial communities.
|
7 |
|
|
|
8 |
|
9 |
-
##
|
10 |
-
|
11 |
-
<div align="center"><img src="assets/fig1.png" width="400" ><img src="assets/fig2.png" width="400"></div>
|
12 |
-
|
13 |
-
## <div align="center">News!!</div>
|
14 |
-
* 【2020/07/19】 We have released our technical report on [Arxiv](xxx)!!
|
15 |
|
16 |
-
##
|
17 |
|
18 |
-
|
19 |
|Model |size |mAP<sup>test<br>0.5:0.95 | Speed V100<br>(ms) | Params<br>(M) |FLOPs<br>(B)| weights |
|
20 |
| ------ |:---: | :---: |:---: |:---: | :---: | :----: |
|
21 |
-
|[YOLOX-s]() |640 |39.6 |9.8 |9.0 | 26.8 | - |
|
22 |
-
|[YOLOX-m]() |640 |46.4 |12.3 |25.3 |73.8| - |
|
23 |
-
|[YOLOX-l]() |640 |50.0 |14.5 |54.2| 155.6 | - |
|
24 |
-
|[YOLOX-x]() |640 |**51.2** | 17.3 |99.1 |281.9 | - |
|
|
|
25 |
|
26 |
-
|
27 |
-
|Model |size |mAP<sup>val<br>0.5:0.95 |
|
28 |
-
| ------ |:---: | :---: |:---: |:---: | :---: |
|
29 |
-
|[YOLOX-Nano]() |416 |25.3
|
30 |
-
|[YOLOX-Tiny]() |416 |31.7
|
31 |
|
32 |
-
##
|
33 |
|
34 |
-
|
|
|
35 |
|
36 |
Step1. Install [apex](https://github.com/NVIDIA/apex).
|
37 |
|
@@ -47,25 +45,41 @@ $ cd yolox
|
|
47 |
$ pip3 install -v -e . # or "python3 setup.py develop
|
48 |
```
|
49 |
|
50 |
-
|
|
|
|
|
|
|
|
|
|
|
51 |
|
52 |
-
|
53 |
|
54 |
```shell
|
55 |
-
python tools/demo.py -n yolox-s -c
|
56 |
```
|
57 |
or
|
58 |
```shell
|
59 |
-
python tools/demo.py -f exps/
|
|
|
|
|
|
|
|
|
60 |
```
|
61 |
|
62 |
|
63 |
-
|
|
|
|
|
64 |
<summary>Reproduce our results on COCO</summary>
|
65 |
|
66 |
-
Step1.
|
|
|
|
|
|
|
|
|
|
|
67 |
|
68 |
-
|
69 |
|
70 |
```shell
|
71 |
python tools/train.py -n yolox-s -d 8 -b 64 --fp16 -o
|
@@ -73,12 +87,11 @@ python tools/train.py -n yolox-s -d 8 -b 64 --fp16 -o
|
|
73 |
yolox-l
|
74 |
yolox-x
|
75 |
```
|
76 |
-
Notes:
|
77 |
* -d: number of gpu devices
|
78 |
-
* -b: total batch size, the recommended number for -b
|
79 |
* --fp16: mixed precision training
|
80 |
|
81 |
-
|
82 |
|
83 |
```shell
|
84 |
python tools/train.py -f exps/base/yolox-s.py -d 8 -b 64 --fp16 -o
|
@@ -87,42 +100,49 @@ python tools/train.py -f exps/base/yolox-s.py -d 8 -b 64 --fp16 -o
|
|
87 |
exps/base/yolox-x.py
|
88 |
```
|
89 |
|
90 |
-
* Customize your training.
|
91 |
-
|
92 |
-
* Finetune your datset on COCO pretrained models.
|
93 |
</details>
|
94 |
|
95 |
-
|
|
|
96 |
<summary>Evaluation</summary>
|
|
|
97 |
We support batch testing for fast evaluation:
|
98 |
|
99 |
```shell
|
100 |
-
python tools/eval.py -n yolox-s -b 64 --conf 0.001 --fp16
|
101 |
yolox-m
|
102 |
yolox-l
|
103 |
yolox-x
|
104 |
```
|
|
|
|
|
|
|
105 |
|
106 |
To reproduce speed test, we use the following command:
|
107 |
```shell
|
108 |
-
python tools/eval.py -n yolox-s -b 1 -d
|
109 |
yolox-m
|
110 |
yolox-l
|
111 |
yolox-x
|
112 |
```
|
113 |
|
114 |
-
## <div align="center">Deployment</div>
|
115 |
-
|
116 |
</details>
|
117 |
|
118 |
-
1. [ONNX: Including ONNX export and an ONNXRuntime demo.]()
|
119 |
-
2. [TensorRT in both C++ and Python]()
|
120 |
-
3. [NCNN in C++]()
|
121 |
-
4. [OpenVINO in both C++ and Python]()
|
122 |
|
123 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
124 |
|
125 |
|
126 |
-
|
|
|
|
|
|
|
127 |
|
128 |
-
|
|
|
|
1 |
+
<div align="center"><img src="assets/logo.png" width="350"></div>
|
|
|
2 |
<img src="assets/demo.png" >
|
3 |
|
4 |
+
## Introduction
|
5 |
YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and industrial communities.
|
6 |
|
7 |
+
<img src="assets/git_fig.png" width="1000" >
|
8 |
|
9 |
+
## Updates!!
|
10 |
+
* 【2020/07/19】 We have released our technical report on Arxiv.
|
|
|
|
|
|
|
|
|
11 |
|
12 |
+
## Benchmark
|
13 |
|
14 |
+
#### Standard Models.
|
15 |
|Model |size |mAP<sup>test<br>0.5:0.95 | Speed V100<br>(ms) | Params<br>(M) |FLOPs<br>(B)| weights |
|
16 |
| ------ |:---: | :---: |:---: |:---: | :---: | :----: |
|
17 |
+
|[YOLOX-s](./exps/yolox_s.py) |640 |39.6 |9.8 |9.0 | 26.8 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EW62gmO2vnNNs5npxjzunVwB9p307qqygaCkXdTO88BLUg?e=NMTQYw) |
|
18 |
+
|[YOLOX-m](./exps/yolox_m.py) |640 |46.4 |12.3 |25.3 |73.8| [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/ERMTP7VFqrVBrXKMU7Vl4TcBQs0SUeCT7kvc-JdIbej4tQ?e=1MDo9y) |
|
19 |
+
|[YOLOX-l](./exps/yolox_l.py) |640 |50.0 |14.5 |54.2| 155.6 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EWA8w_IEOzBKvuueBqfaZh0BeoG5sVzR-XYbOJO4YlOkRw?e=wHWOBE) |
|
20 |
+
|[YOLOX-x](./exps/yolox_x.py) |640 |**51.2** | 17.3 |99.1 |281.9 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EdgVPHBziOVBtGAXHfeHI5kBza0q9yyueMGdT0wXZfI1rQ?e=tABO5u) |
|
21 |
+
|[YOLOX-Darknet53](./exps/yolov3.py) |640 | 47.4 | 11.1 |63.7 | 185.3 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EZ-MV1r_fMFPkPrNjvbJEMoBLOLAnXH-XKEB77w8LhXL6Q?e=mf6wOc) |
|
22 |
|
23 |
+
#### Light Models.
|
24 |
+
|Model |size |mAP<sup>val<br>0.5:0.95 | Params<br>(M) |FLOPs<br>(B)| weights |
|
25 |
+
| ------ |:---: | :---: |:---: |:---: | :---: |
|
26 |
+
|[YOLOX-Nano](./exps/nano.py) |416 |25.3 | 0.91 |1.08 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EdcREey-krhLtdtSnxolxiUBjWMy6EFdiaO9bdOwZ5ygCQ?e=yQpdds) |
|
27 |
+
|[YOLOX-Tiny](./exps/yolox_tiny.py) |416 |31.7 | 5.06 |6.45 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EYtjNFPqvZBBrQ-VowLcSr4B6Z5TdTflUsr_gO2CwhC3bQ?e=SBTwXj) |
|
28 |
|
29 |
+
## Quick Start
|
30 |
|
31 |
+
<details>
|
32 |
+
<summary>Installation</summary>
|
33 |
|
34 |
Step1. Install [apex](https://github.com/NVIDIA/apex).
|
35 |
|
|
|
45 |
$ pip3 install -v -e . # or "python3 setup.py develop
|
46 |
```
|
47 |
|
48 |
+
</details>
|
49 |
+
|
50 |
+
<details>
|
51 |
+
<summary>Demo</summary>
|
52 |
+
|
53 |
+
Step1. Download a pretrained model from the benchmark table.
|
54 |
|
55 |
+
Step2. Use either -n or -f to specify your detector's config. For example:
|
56 |
|
57 |
```shell
|
58 |
+
python tools/demo.py image -n yolox-s -c /path/to/your/yolox_s.pth.tar --path assets/dog.jpg --conf 0.3 --nms 0.65 --tsize 640 --save_result
|
59 |
```
|
60 |
or
|
61 |
```shell
|
62 |
+
python tools/demo.py image -f exps/yolox_s.py -c /path/to/your/yolox_s.pth.tar --path assets/dog.jpg --conf 0.3 --nms 0.65 --tsize 640 --save_result
|
63 |
+
```
|
64 |
+
Demo for video:
|
65 |
+
```shell
|
66 |
+
python tools/demo.py video -n yolox-s -c /path/to/your/yolox_s.pth.tar --path /path/to/your/video --conf 0.3 --nms 0.65 --tsize 640 --save_result
|
67 |
```
|
68 |
|
69 |
|
70 |
+
</details>
|
71 |
+
|
72 |
+
<details>
|
73 |
<summary>Reproduce our results on COCO</summary>
|
74 |
|
75 |
+
Step1. Prepare dataset
|
76 |
+
```shell
|
77 |
+
cd <YOLOX_HOME>
|
78 |
+
mkdir datasets
|
79 |
+
ln -s /path/to/your/COCO ./datasets/COCO
|
80 |
+
```
|
81 |
|
82 |
+
Step2. Reproduce our results on COCO by specifying -n:
|
83 |
|
84 |
```shell
|
85 |
python tools/train.py -n yolox-s -d 8 -b 64 --fp16 -o
|
|
|
87 |
yolox-l
|
88 |
yolox-x
|
89 |
```
|
|
|
90 |
* -d: number of gpu devices
|
91 |
+
* -b: total batch size, the recommended number for -b is num_gpu * 8
|
92 |
* --fp16: mixed precision training
|
93 |
|
94 |
+
When using -f, the above commands are equivalent to:
|
95 |
|
96 |
```shell
|
97 |
python tools/train.py -f exps/base/yolox-s.py -d 8 -b 64 --fp16 -o
|
|
|
100 |
exps/base/yolox-x.py
|
101 |
```
|
102 |
|
|
|
|
|
|
|
103 |
</details>
|
104 |
|
105 |
+
|
106 |
+
<details>
|
107 |
<summary>Evaluation</summary>
|
108 |
+
|
109 |
We support batch testing for fast evaluation:
|
110 |
|
111 |
```shell
|
112 |
+
python tools/eval.py -n yolox-s -c yolox_s.pth.tar -b 64 -d 8 --conf 0.001 [--fp16] [--fuse]
|
113 |
yolox-m
|
114 |
yolox-l
|
115 |
yolox-x
|
116 |
```
|
117 |
+
* --fuse: fuse conv and bn
|
118 |
+
* -d: number of GPUs used for evaluation. DEFAULT: All GPUs available will be used.
|
119 |
+
* -b: total batch size across all GPUs
|
120 |
|
121 |
To reproduce speed test, we use the following command:
|
122 |
```shell
|
123 |
+
python tools/eval.py -n yolox-s -c yolox_s.pth.tar -b 1 -d 1 --conf 0.001 --fp16 --fuse
|
124 |
yolox-m
|
125 |
yolox-l
|
126 |
yolox-x
|
127 |
```
|
128 |
|
|
|
|
|
129 |
</details>
|
130 |
|
|
|
|
|
|
|
|
|
131 |
|
132 |
+
<details open>
|
133 |
+
<summary>Tutorials</summary>
|
134 |
+
|
135 |
+
* [Training on custom data](docs/train_custom_data.md).
|
136 |
+
|
137 |
+
</details>
|
138 |
+
|
139 |
+
## Deployment
|
140 |
|
141 |
|
142 |
+
1. [ONNX: Including ONNX export and an ONNXRuntime demo.](./demo/ONNXRuntime)
|
143 |
+
2. [TensorRT in both C++ and Python](./demo/TensorRT)
|
144 |
+
3. [NCNN in C++](./demo/ncnn/android)
|
145 |
+
4. [OpenVINO in both C++ and Python](./demo/OpenVINO)
|
146 |
|
147 |
+
## Citing YOLOX
|
148 |
+
If you use YOLOX in your research, please cite our work by using the following BibTeX entry:
|
demo/ONNXRuntime/README.md
CHANGED
@@ -1,17 +1,18 @@
|
|
1 |
-
## ONNXRuntime
|
2 |
|
3 |
This doc introduces how to convert your PyTorch model to ONNX and how to run an ONNXRuntime demo to verify your conversion.
|
4 |
|
5 |
### Download ONNX models.
|
6 |
-
| Model | Parameters | GFLOPs | Test Size | mAP |
|
7 |
-
|:------| :----: | :----: | :---: | :---: |
|
8 |
-
|
|
9 |
-
|
|
10 |
-
|
|
11 |
-
|
|
12 |
-
|
|
13 |
-
|
|
14 |
-
|
|
|
|
15 |
|
16 |
### Convert Your Model to ONNX
|
17 |
|
@@ -28,7 +29,7 @@ python3 tools/export_onnx.py --output-name yolox_s.onnx -n yolox-s -c yolox_s.pt
|
|
28 |
Notes:
|
29 |
* -n: specify a model name. The model name must be one of the [yolox-s,m,l,x and yolox-nano, yolox-tiny, yolov3]
|
30 |
* -c: the model you have trained
|
31 |
-
* -o: opset version, default 11. **However, if you will further convert your onnx model to [OpenVINO](), please specify the opset version to 10.**
|
32 |
* --no-onnxsim: disable onnxsim
|
33 |
* To customize an input shape for onnx model, modify the following code in tools/export.py:
|
34 |
|
@@ -36,7 +37,7 @@ Notes:
|
|
36 |
dummy_input = torch.randn(1, 3, exp.test_size[0], exp.test_size[1])
|
37 |
```
|
38 |
|
39 |
-
2. Convert a standard YOLOX model by -f.
|
40 |
|
41 |
```shell
|
42 |
python3 tools/export_onnx.py --output-name yolox_s.onnx -f exps/yolox_s.py -c yolox_s.pth.tar
|
@@ -52,7 +53,7 @@ python3 tools/export_onnx.py --output-name your_yolox.onnx -f exps/your_yolox.py
|
|
52 |
|
53 |
Step1.
|
54 |
```shell
|
55 |
-
cd <YOLOX_HOME>/
|
56 |
```
|
57 |
|
58 |
Step2.
|
|
|
1 |
+
## YOLOX-ONNXRuntime in Python
|
2 |
|
3 |
This doc introduces how to convert your PyTorch model to ONNX and how to run an ONNXRuntime demo to verify your conversion.
|
4 |
|
5 |
### Download ONNX models.
|
6 |
+
| Model | Parameters | GFLOPs | Test Size | mAP | Weights |
|
7 |
+
|:------| :----: | :----: | :---: | :---: | :---: |
|
8 |
+
| YOLOX-Nano | 0.91M | 1.08 | 416x416 | 25.3 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EfAGwvevU-lNhW5OqFAyHbwBJdI_7EaKu5yU04fgF5BU7w?e=gvq4hf) |
|
9 |
+
| YOLOX-Tiny | 5.06M | 6.45 | 416x416 |31.7 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EVigCszU1ilDn-MwLwHCF1ABsgTy06xFdVgZ04Yyo4lHVA?e=hVKiCw) |
|
10 |
+
| YOLOX-S | 9.0M | 26.8 | 640x640 |39.6 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/Ec0L1d1x2UtIpbfiahgxhtgBZVjb1NCXbotO8SCOdMqpQQ?e=siyIsK) |
|
11 |
+
| YOLOX-M | 25.3M | 73.8 | 640x640 |46.4 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/ERUKlQe-nlxBoTKPy1ynbxsBmAZ_h-VBEV-nnfPdzUIkZQ?e=hyQQtl) |
|
12 |
+
| YOLOX-L | 54.2M | 155.6 | 640x640 |50.0 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/ET5w926jCA5GlVfg9ixB4KEBiW0HYl7SzaHNRaRG9dYO_A?e=ISmCYX) |
|
13 |
+
| YOLOX-Darknet53| 63.72M | 185.3 | 640x640 |47.3 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/ESArloSW-MlPlLuemLh9zKkBdovgweKbfu4zkvzKAp7pPQ?e=f81Ikw) |
|
14 |
+
| YOLOX-X | 99.1M | 281.9 | 640x640 |51.2 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/ERjqoeMJlFdGuM3tQfXQmhABmGHlIHydWCwhlugeWLE9AA) |
|
15 |
+
|
16 |
|
17 |
### Convert Your Model to ONNX
|
18 |
|
|
|
29 |
Notes:
|
30 |
* -n: specify a model name. The model name must be one of the [yolox-s,m,l,x and yolox-nano, yolox-tiny, yolov3]
|
31 |
* -c: the model you have trained
|
32 |
+
* -o: opset version, default 11. **However, if you will further convert your onnx model to [OpenVINO](../OpenVINO/), please specify the opset version to 10.**
|
33 |
* --no-onnxsim: disable onnxsim
|
34 |
* To customize an input shape for onnx model, modify the following code in tools/export.py:
|
35 |
|
|
|
37 |
dummy_input = torch.randn(1, 3, exp.test_size[0], exp.test_size[1])
|
38 |
```
|
39 |
|
40 |
+
2. Convert a standard YOLOX model by -f. When using -f, the above command is equivalent to:
|
41 |
|
42 |
```shell
|
43 |
python3 tools/export_onnx.py --output-name yolox_s.onnx -f exps/yolox_s.py -c yolox_s.pth.tar
|
|
|
53 |
|
54 |
Step1.
|
55 |
```shell
|
56 |
+
cd <YOLOX_HOME>/demo/ONNXRuntime
|
57 |
```
|
58 |
|
59 |
Step2.
|
demo/OpenVINO/README.md
CHANGED
@@ -1,4 +1,4 @@
|
|
1 |
-
## YOLOX
|
2 |
|
3 |
-
* [C++ Demo]()
|
4 |
-
* [Python Demo]()
|
|
|
1 |
+
## YOLOX for OpenVINO
|
2 |
|
3 |
+
* [C++ Demo](./cpp)
|
4 |
+
* [Python Demo](./python)
|
demo/OpenVINO/cpp/README.md
CHANGED
@@ -1,17 +1,17 @@
|
|
1 |
-
#
|
2 |
|
3 |
This tutorial includes a C++ demo for OpenVINO, as well as some converted models.
|
4 |
|
5 |
### Download OpenVINO models.
|
6 |
-
| Model | Parameters | GFLOPs | Test Size | mAP |
|
7 |
-
|:------| :----: | :----: | :---: | :---: |
|
8 |
-
| [YOLOX-Nano](
|
9 |
-
| [YOLOX-Tiny](
|
10 |
-
| [YOLOX-S](
|
11 |
-
| [YOLOX-M](
|
12 |
-
| [YOLOX-L](
|
13 |
-
| [YOLOX-
|
14 |
-
| [YOLOX-
|
15 |
|
16 |
## Install OpenVINO Toolkit
|
17 |
|
@@ -51,7 +51,7 @@ source ~/.bashrc
|
|
51 |
|
52 |
1. Export ONNX model
|
53 |
|
54 |
-
Please refer to the [ONNX toturial]()
|
55 |
|
56 |
2. Convert ONNX to OpenVINO
|
57 |
|
|
|
1 |
+
# YOLOX-OpenVINO in C++
|
2 |
|
3 |
This tutorial includes a C++ demo for OpenVINO, as well as some converted models.
|
4 |
|
5 |
### Download OpenVINO models.
|
6 |
+
| Model | Parameters | GFLOPs | Test Size | mAP | Weights |
|
7 |
+
|:------| :----: | :----: | :---: | :---: | :---: |
|
8 |
+
| [YOLOX-Nano](../../../exps/nano.py) | 0.91M | 1.08 | 416x416 | 25.3 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EeWY57o5wQZFtXYd1KJw6Z8B4vxZru649XxQHYIFgio3Qw?e=ZS81ce) |
|
9 |
+
| [YOLOX-Tiny](../../../exps/yolox_tiny.py) | 5.06M | 6.45 | 416x416 |31.7 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/ETfvOoCXdVZNinoSpKA_sEYBIQVqfjjF5_M6VvHRnLVcsA?e=STL1pi) |
|
10 |
+
| [YOLOX-S](../../../exps/yolox_s.py) | 9.0M | 26.8 | 640x640 |39.6 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EXUjf3PQnbBLrxNrXPueqaIBzVZOrYQOnJpLK1Fytj5ssA?e=GK0LOM) |
|
11 |
+
| [YOLOX-M](../../../exps/yolox_m.py) | 25.3M | 73.8 | 640x640 |46.4 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EcoT1BPpeRpLvE_4c441zn8BVNCQ2naxDH3rho7WqdlgLQ?e=95VaM9) |
|
12 |
+
| [YOLOX-L](../../../exps/yolox_l.py) | 54.2M | 155.6 | 640x640 |50.0 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EZvmn-YLRuVPh0GAP_w3xHMB2VGvrKqQXyK_Cv5yi_DXUg?e=YRh6Eq) |
|
13 |
+
| [YOLOX-Darknet53](../../../exps/yolov3.py) | 63.72M | 185.3 | 640x640 |47.3 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EQP8LSroikFHuwX0jFRetmcBOCDWSFmylHxolV7ezUPXGw?e=bEw5iq) |
|
14 |
+
| [YOLOX-X](../../../exps/yolox_x.py) | 99.1M | 281.9 | 640x640 |51.2 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EZFPnLqiD-xIlt7rcZYDjQgB4YXE9wnq1qaSXQwJrsKbdg?e=83nwEz) |
|
15 |
|
16 |
## Install OpenVINO Toolkit
|
17 |
|
|
|
51 |
|
52 |
1. Export ONNX model
|
53 |
|
54 |
+
Please refer to the [ONNX toturial](../../ONNXRuntime). **Note that you should set --opset to 10, otherwise your next step will fail.**
|
55 |
|
56 |
2. Convert ONNX to OpenVINO
|
57 |
|
demo/OpenVINO/python/README.md
CHANGED
@@ -1,17 +1,17 @@
|
|
1 |
-
#
|
2 |
|
3 |
This tutorial includes a Python demo for OpenVINO, as well as some converted models.
|
4 |
|
5 |
### Download OpenVINO models.
|
6 |
-
| Model | Parameters | GFLOPs | Test Size | mAP |
|
7 |
-
|:------| :----: | :----: | :---: | :---: |
|
8 |
-
| [YOLOX-Nano](
|
9 |
-
| [YOLOX-Tiny](
|
10 |
-
| [YOLOX-S](
|
11 |
-
| [YOLOX-M](
|
12 |
-
| [YOLOX-L](
|
13 |
-
| [YOLOX-
|
14 |
-
| [YOLOX-
|
15 |
|
16 |
## Install OpenVINO Toolkit
|
17 |
|
@@ -51,7 +51,7 @@ source ~/.bashrc
|
|
51 |
|
52 |
1. Export ONNX model
|
53 |
|
54 |
-
Please refer to the [ONNX toturial]()
|
55 |
|
56 |
2. Convert ONNX to OpenVINO
|
57 |
|
@@ -71,7 +71,7 @@ source ~/.bashrc
|
|
71 |
```
|
72 |
For example:
|
73 |
```shell
|
74 |
-
python3 mo.py --input_model yolox.onnx --input_shape
|
75 |
```
|
76 |
|
77 |
## Demo
|
|
|
1 |
+
# YOLOX-OpenVINO in Python
|
2 |
|
3 |
This tutorial includes a Python demo for OpenVINO, as well as some converted models.
|
4 |
|
5 |
### Download OpenVINO models.
|
6 |
+
| Model | Parameters | GFLOPs | Test Size | mAP | Weights |
|
7 |
+
|:------| :----: | :----: | :---: | :---: | :---: |
|
8 |
+
| [YOLOX-Nano](../../../exps/nano.py) | 0.91M | 1.08 | 416x416 | 25.3 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EeWY57o5wQZFtXYd1KJw6Z8B4vxZru649XxQHYIFgio3Qw?e=ZS81ce) |
|
9 |
+
| [YOLOX-Tiny](../../../exps/yolox_tiny.py) | 5.06M | 6.45 | 416x416 |31.7 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/ETfvOoCXdVZNinoSpKA_sEYBIQVqfjjF5_M6VvHRnLVcsA?e=STL1pi) |
|
10 |
+
| [YOLOX-S](../../../exps/yolox_s.py) | 9.0M | 26.8 | 640x640 |39.6 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EXUjf3PQnbBLrxNrXPueqaIBzVZOrYQOnJpLK1Fytj5ssA?e=GK0LOM) |
|
11 |
+
| [YOLOX-M](../../../exps/yolox_m.py) | 25.3M | 73.8 | 640x640 |46.4 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EcoT1BPpeRpLvE_4c441zn8BVNCQ2naxDH3rho7WqdlgLQ?e=95VaM9) |
|
12 |
+
| [YOLOX-L](../../../exps/yolox_l.py) | 54.2M | 155.6 | 640x640 |50.0 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EZvmn-YLRuVPh0GAP_w3xHMB2VGvrKqQXyK_Cv5yi_DXUg?e=YRh6Eq) |
|
13 |
+
| [YOLOX-Darknet53](../../../exps/yolov3.py) | 63.72M | 185.3 | 640x640 |47.3 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EQP8LSroikFHuwX0jFRetmcBOCDWSFmylHxolV7ezUPXGw?e=bEw5iq) |
|
14 |
+
| [YOLOX-X](../../../exps/yolox_x.py) | 99.1M | 281.9 | 640x640 |51.2 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EZFPnLqiD-xIlt7rcZYDjQgB4YXE9wnq1qaSXQwJrsKbdg?e=83nwEz) |
|
15 |
|
16 |
## Install OpenVINO Toolkit
|
17 |
|
|
|
51 |
|
52 |
1. Export ONNX model
|
53 |
|
54 |
+
Please refer to the [ONNX toturial](../../ONNXRuntime). **Note that you should set --opset to 10, otherwise your next step will fail.**
|
55 |
|
56 |
2. Convert ONNX to OpenVINO
|
57 |
|
|
|
71 |
```
|
72 |
For example:
|
73 |
```shell
|
74 |
+
python3 mo.py --input_model yolox.onnx --input_shape [1,3,640,640] --data_type FP16 --output_dir converted_output
|
75 |
```
|
76 |
|
77 |
## Demo
|
demo/TensorRT/cpp/README.md
CHANGED
@@ -1,4 +1,4 @@
|
|
1 |
-
#
|
2 |
|
3 |
As YOLOX models are easy to convert to TensorRT using the [torch2trt repo](https://github.com/NVIDIA-AI-IOT/torch2trt),
our C++ demo does not include model conversion or construction like other TensorRT demos.
|
@@ -6,7 +6,7 @@ our C++ demo will not include the model converting or constructing like other te
|
|
6 |
|
7 |
## Step 1: Prepare serialized engine file
|
8 |
|
9 |
-
Follow the trt [python demo README](../
|
10 |
|
11 |
|
12 |
## Step 2: build the demo
|
|
|
1 |
+
# YOLOX-TensorRT in C++
|
2 |
|
3 |
As YOLOX models are easy to convert to TensorRT using the [torch2trt repo](https://github.com/NVIDIA-AI-IOT/torch2trt),
our C++ demo does not include model conversion or construction like other TensorRT demos.
|
|
|
6 |
|
7 |
## Step 1: Prepare serialized engine file
|
8 |
|
9 |
+
Follow the trt [python demo README](../python/README.md) to convert and save the serialized engine file.
|
10 |
|
11 |
|
12 |
## Step 2: build the demo
|
demo/TensorRT/python/README.md
CHANGED
@@ -1,4 +1,4 @@
|
|
1 |
-
#
|
2 |
|
3 |
This tutorial includes a Python demo for TensorRT.
|
4 |
|
@@ -12,21 +12,21 @@ YOLOX models can be easily converted to TensorRT models using torch2trt
|
|
12 |
|
13 |
If you want to convert our model, use the flag -n to specify a model name:
|
14 |
```shell
|
15 |
-
python tools/
|
16 |
```
|
17 |
For example:
|
18 |
```shell
|
19 |
-
python tools/
|
20 |
```
|
21 |
<YOLOX_MODEL_NAME> can be: yolox-nano, yolox-tiny, yolox-s, yolox-m, yolox-l, yolox-x.
|
22 |
|
23 |
If you want to convert your customized model, use the flag -f to specify your exp file:
|
24 |
```shell
|
25 |
-
python tools/
|
26 |
```
|
27 |
For example:
|
28 |
```shell
|
29 |
-
python tools/
|
30 |
```
|
31 |
*yolox_s.py* can be any exp file modified by you.
|
32 |
|
|
|
1 |
+
# YOLOX-TensorRT in Python
|
2 |
|
3 |
This tutorial includes a Python demo for TensorRT.
|
4 |
|
|
|
12 |
|
13 |
If you want to convert our model, use the flag -n to specify a model name:
|
14 |
```shell
|
15 |
+
python tools/trt.py -n <YOLOX_MODEL_NAME> -c <YOLOX_CHECKPOINT>
|
16 |
```
|
17 |
For example:
|
18 |
```shell
|
19 |
+
python tools/trt.py -n yolox-s -c your_ckpt.pth.tar
|
20 |
```
|
21 |
<YOLOX_MODEL_NAME> can be: yolox-nano, yolox-tiny, yolox-s, yolox-m, yolox-l, yolox-x.
|
22 |
|
23 |
If you want to convert your customized model, use the flag -f to specify your exp file:
|
24 |
```shell
|
25 |
+
python tools/trt.py -f <YOLOX_EXP_FILE> -c <YOLOX_CHECKPOINT>
|
26 |
```
|
27 |
For example:
|
28 |
```shell
|
29 |
+
python tools/trt.py -f /path/to/your/yolox/exps/yolox_s.py -c your_ckpt.pth.tar
|
30 |
```
|
31 |
*yolox_s.py* can be any exp file modified by you.
|
32 |
|
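Once the serialized engine exists, it can be reloaded through torch2trt's TRTModule just like an ordinary state dict, which is also how the evaluator and demo code in this commit consume it. A short sketch (the checkpoint path below is illustrative):

```python
import torch
from torch2trt import TRTModule

# Reload the serialized TensorRT engine saved by tools/trt.py.
model_trt = TRTModule()
model_trt.load_state_dict(torch.load("YOLOX_outputs/yolox_s/model_trt.pth"))

x = torch.ones(1, 3, 640, 640).cuda()
with torch.no_grad():
    outputs = model_trt(x)
```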
docs/.gitkeep
ADDED
File without changes
|
docs/train_custom_data.md
ADDED
@@ -0,0 +1,118 @@
# Train Custom Data.
This page explains how to train your own custom data with YOLOX.

We take finetuning a YOLOX-S model on the VOC dataset as an example to give a clearer guide.

## 0. Before you start
Clone this repo and follow the [README](../README.md) to install YOLOX.

## 1. Create your own dataset
**Step 1** Prepare your own dataset with images and labels first. For labeling images, you may use a tool like [Labelme](https://github.com/wkentaro/labelme) or [CVAT](https://github.com/openvinotoolkit/cvat).

**Step 2** Then, write the corresponding Dataset class, which loads images and labels through the "\_\_getitem\_\_" method. We currently support the COCO format and the VOC format.

You can also write the Dataset on your own. Let's take the [VOC](../yolox/data/datasets/voc.py#L151) Dataset file as an example:
```python
    @Dataset.resize_getitem
    def __getitem__(self, index):
        img, target, img_info, img_id = self.pull_item(index)

        if self.preproc is not None:
            img, target = self.preproc(img, target, self.input_dim)

        return img, target, img_info, img_id
```

One more thing worth noting is that you should also implement the "[pull_item](../yolox/data/datasets/voc.py#L129)" and "[load_anno](../yolox/data/datasets/voc.py#L121)" methods for the Mosaic and MixUp augmentations.

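For reference, here is a minimal sketch of what those two methods usually look like for a custom dataset (the field names `self.image_paths` and `self.annotations` are illustrative, not part of the YOLOX API):

```python
import cv2
import numpy as np


class MyDataset:
    """Skeleton for illustration; a real dataset should inherit yolox.data.datasets.Dataset."""

    def load_anno(self, index):
        # Return only the labels of one image as a (num_objects, 5) array of
        # [x1, y1, x2, y2, class_id]. Mosaic/MixUp call this to sample extra
        # labels without paying the cost of decoding the image.
        return np.array(self.annotations[index], dtype=np.float32)

    def pull_item(self, index):
        # Return the raw (un-preprocessed) image together with its labels,
        # so the augmentation wrappers can compose several samples.
        img = cv2.imread(self.image_paths[index])
        target = self.load_anno(index)
        img_info = (img.shape[0], img.shape[1])
        return img, target, img_info, index
```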
**Step 3** Prepare the evaluator. We currently have a [COCO evaluator](../yolox/evaluators/coco_evaluator.py) and a [VOC evaluator](../yolox/evaluators/voc_evaluator.py).
If your data uses its own format or evaluation metric, you may write your own evaluator.

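As a rough guide, a custom evaluator only needs to mirror the interface of the built-in evaluators: it is constructed with a dataloader, the test image size, and the confidence/NMS thresholds, and it exposes an evaluate(model, ...) method that returns two summary scores plus a printable summary string. A minimal sketch, with the metric computation left as a placeholder (this skeleton is illustrative, not part of the repository):

```python
import torch


class MyEvaluator:
    """Sketch of the evaluator interface the trainer expects; details are illustrative."""

    def __init__(self, dataloader, img_size, confthre, nmsthre, num_classes):
        self.dataloader = dataloader
        self.img_size = img_size
        self.confthre = confthre
        self.nmsthre = nmsthre
        self.num_classes = num_classes

    def evaluate(self, model, distributed=False, half=False):
        model.eval()
        predictions = {}
        with torch.no_grad():
            for imgs, _, info_imgs, ids in self.dataloader:
                outputs = model(imgs.cuda())
                # Post-process (confidence filter + NMS) and store detections per
                # image id, e.g. predictions[int(img_id)] = (boxes, classes, scores).
        # Turn `predictions` into your metric of choice; the trainer only needs
        # two scalar scores and a summary string back.
        main_score, secondary_score, summary = 0.0, 0.0, "custom eval"
        return main_score, secondary_score, summary
```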
## 2. Create your Exp file to control everything
We put everything involved in a model into one single Exp file, including the model setting, training setting, and testing setting.

A complete Exp file is at [yolox_base.py](../yolox/exp/yolox_base.py). It may be too long to write for every exp, but you can inherit the base Exp file and only overwrite the changed parts.

Let's still take the [VOC Exp file](../exps/example/yolox_voc/yolox_voc_s.py) as an example.

We select the YOLOX-S model here, so we should change the network depth and width. VOC has only 20 classes, so we should also change num_classes.

These configs are changed in the `__init__()` method:
```python
class Exp(MyExp):
    def __init__(self):
        super(Exp, self).__init__()
        self.num_classes = 20
        self.depth = 0.33
        self.width = 0.50
        self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
```

Besides, you should also overwrite the dataset and evaluator prepared above before training the model on your own data.

Please see "[get_data_loader](../exps/example/yolox_voc/yolox_voc_s.py#L20)", "[get_eval_loader](../exps/example/yolox_voc/yolox_voc_s.py#L82)", and "[get_evaluator](../exps/example/yolox_voc/yolox_voc_s.py#L113)" for more details.

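For example, the VOC Exp file added in this commit overrides get_evaluator roughly like this (abridged from exps/example/yolox_voc/yolox_voc_s.py, importing the evaluator from the renamed yolox.evaluators package):

```python
    def get_evaluator(self, batch_size, is_distributed, testdev=False):
        from yolox.evaluators import VOCEvaluator

        val_loader = self.get_eval_loader(batch_size, is_distributed, testdev=testdev)
        return VOCEvaluator(
            dataloader=val_loader,
            img_size=self.test_size,
            confthre=self.test_conf,
            nmsthre=self.nmsthre,
            num_classes=self.num_classes,
        )
```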
## 3. Train
Except for special cases, we always recommend using our [COCO pretrained weights](../README.md) for initialization.

Once you have the Exp file and the COCO pretrained weights we provide, you can train your own model with the following command:
```bash
python tools/train.py -f /path/to/your/Exp/file -d 8 -b 64 --fp16 -o -c /path/to/the/pretrained/weights
```

or take the YOLOX-S VOC training as an example:
```bash
python tools/train.py -f exps/example/yolox_voc/yolox_voc_s.py -d 8 -b 64 --fp16 -o -c /path/to/yolox_s.pth.tar
```

(Don't worry about the different shape of the detection head between the pretrained weights and your own model; we will handle it.)

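How this is typically handled, as a rough illustration only: when the checkpoint is loaded, entries whose shapes no longer match the current model (here, the class-prediction layers of the head) are skipped, so they keep their fresh initialization. The helper below is a sketch under that assumption, not the repository's exact loading code:

```python
import torch


def load_matching_weights(model, ckpt_path):
    # Keep only checkpoint tensors whose name and shape match the current model,
    # so a head built for a different number of classes stays at its random init.
    ckpt = torch.load(ckpt_path, map_location="cpu")
    state_dict = ckpt.get("model", ckpt)
    model_state = model.state_dict()
    filtered = {
        k: v for k, v in state_dict.items()
        if k in model_state and v.shape == model_state[k].shape
    }
    model.load_state_dict(filtered, strict=False)
    return model
```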
## 4. Tips for Best Training Results

As YOLOX is an anchor-free detector with only a few hyper-parameters, most of the time good results can be obtained with no changes to the model or training settings.
We thus always recommend that you first train with all default training settings.

If at first you don't get good results, there are steps you can consider to improve them.

**Model Selection** We provide YOLOX-Nano, YOLOX-Tiny, and YOLOX-S for mobile deployments, while YOLOX-M/L/X are for cloud or high-performance GPU deployments.

If your deployment runs into compatibility trouble, we recommend YOLOX-DarkNet53.

**Training Configs** If your training overfits early, you can reduce max\_epochs or decrease base\_lr and min\_lr\_ratio in your Exp file:
```python
# --------------  training config --------------------- #
self.warmup_epochs = 5
self.max_epoch = 300
self.warmup_lr = 0
self.basic_lr_per_img = 0.01 / 64.0
self.scheduler = "yoloxwarmcos"
self.no_aug_epochs = 15
self.min_lr_ratio = 0.05
self.ema = True

self.weight_decay = 5e-4
self.momentum = 0.9
```

**Aug Configs** You may also change the degree of the augmentations.

Generally, for small models you should weaken the augmentation, while for large models or small datasets you may enhance the augmentation in your Exp file:
```python
# --------------- transform config ----------------- #
self.degrees = 10.0
self.translate = 0.1
self.scale = (0.1, 2)
self.mscale = (0.8, 1.6)
self.shear = 2.0
self.perspective = 0.0
self.enable_mixup = True
```

**Design your own detector** You may refer to our [Arxiv]() paper for details and suggestions on designing your own detector.
exps/example/yolox_voc/yolox_voc_s.py
ADDED
@@ -0,0 +1,124 @@
# encoding: utf-8
import os
import random
import torch
import torch.nn as nn
import torch.distributed as dist

from yolox.exp import Exp as MyExp


class Exp(MyExp):
    def __init__(self):
        super(Exp, self).__init__()
        self.num_classes = 20
        self.depth = 0.33
        self.width = 0.50
        self.eval_interval = 2
        self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]

    def get_data_loader(self, batch_size, is_distributed, no_aug=False):
        from yolox.data import (
            VOCDetection,
            TrainTransform,
            YoloBatchSampler,
            DataLoader,
            InfiniteSampler,
            MosaicDetection,
        )

        dataset = VOCDetection(
            data_dir='/data/Datasets/VOCdevkit',
            image_sets=[('2007', 'trainval'), ('2012', 'trainval')],
            img_size=self.input_size,
            preproc=TrainTransform(
                rgb_means=(0.485, 0.456, 0.406),
                std=(0.229, 0.224, 0.225),
                max_labels=50,
            ),
        )

        dataset = MosaicDetection(
            dataset,
            mosaic=not no_aug,
            img_size=self.input_size,
            preproc=TrainTransform(
                rgb_means=(0.485, 0.456, 0.406),
                std=(0.229, 0.224, 0.225),
                max_labels=120,
            ),
            degrees=self.degrees,
            translate=self.translate,
            scale=self.scale,
            shear=self.shear,
            perspective=self.perspective,
            enable_mixup=self.enable_mixup,
        )

        self.dataset = dataset

        if is_distributed:
            batch_size = batch_size // dist.get_world_size()
            sampler = InfiniteSampler(
                len(self.dataset), seed=self.seed if self.seed else 0
            )
        else:
            sampler = torch.utils.data.RandomSampler(self.dataset)

        batch_sampler = YoloBatchSampler(
            sampler=sampler,
            batch_size=batch_size,
            drop_last=False,
            input_dimension=self.input_size,
            mosaic=not no_aug,
        )

        dataloader_kwargs = {"num_workers": self.data_num_workers, "pin_memory": True}
        dataloader_kwargs["batch_sampler"] = batch_sampler
        train_loader = DataLoader(self.dataset, **dataloader_kwargs)

        return train_loader

    def get_eval_loader(self, batch_size, is_distributed, testdev=False):
        from yolox.data import VOCDetection, ValTransform

        valdataset = VOCDetection(
            data_dir='/data/Datasets/VOCdevkit',
            image_sets=[('2007', 'test')],
            img_size=self.test_size,
            preproc=ValTransform(
                rgb_means=(0.485, 0.456, 0.406),
                std=(0.229, 0.224, 0.225),
            ),
        )

        if is_distributed:
            batch_size = batch_size // dist.get_world_size()
            sampler = torch.utils.data.distributed.DistributedSampler(
                valdataset, shuffle=False
            )
        else:
            sampler = torch.utils.data.SequentialSampler(valdataset)

        dataloader_kwargs = {
            "num_workers": self.data_num_workers,
            "pin_memory": True,
            "sampler": sampler,
        }
        dataloader_kwargs["batch_size"] = batch_size
        val_loader = torch.utils.data.DataLoader(valdataset, **dataloader_kwargs)

        return val_loader

    def get_evaluator(self, batch_size, is_distributed, testdev=False):
        from yolox.evaluators import VOCEvaluator

        val_loader = self.get_eval_loader(batch_size, is_distributed, testdev=testdev)
        evaluator = VOCEvaluator(
            dataloader=val_loader,
            img_size=self.test_size,
            confthre=self.test_conf,
            nmsthre=self.nmsthre,
            num_classes=self.num_classes,
        )
        return evaluator
requirements.txt
CHANGED
@@ -12,3 +12,6 @@ Pillow
|
|
12 |
skimage
|
13 |
thop
|
14 |
ninja
|
|
|
|
|
|
|
|
12 |
skimage
|
13 |
thop
|
14 |
ninja
|
15 |
+
tabulate
|
16 |
+
tensorboard
|
17 |
+
onnxruntime
|
tools/demo.py
CHANGED
@@ -66,12 +66,6 @@ def make_parser():
|
|
66 |
action="store_true",
|
67 |
help="Using TensorRT model for testing.",
|
68 |
)
|
69 |
-
parser.add_argument(
|
70 |
-
"opts",
|
71 |
-
help="Modify config options using the command-line",
|
72 |
-
default=None,
|
73 |
-
nargs=argparse.REMAINDER,
|
74 |
-
)
|
75 |
return parser
|
76 |
|
77 |
|
@@ -137,13 +131,14 @@ class Predictor(object):
|
|
137 |
def visual(self, output, img_info, cls_conf=0.35):
|
138 |
ratio = img_info['ratio']
|
139 |
img = img_info['raw_img']
|
|
|
|
|
140 |
output = output.cpu()
|
141 |
|
142 |
bboxes = output[:, 0:4]
|
143 |
|
144 |
# preprocessing: resize
|
145 |
bboxes /= ratio
|
146 |
-
bboxes = xyxy2xywh(bboxes)
|
147 |
|
148 |
cls = output[:, 6]
|
149 |
scores = output[:, 4] * output[:, 5]
|
@@ -193,7 +188,7 @@ def imageflow_demo(predictor, vis_folder, current_time, args):
|
|
193 |
ret_val, frame = cap.read()
|
194 |
if ret_val:
|
195 |
outputs, img_info = predictor.inference(frame)
|
196 |
-
result_frame = predictor.
|
197 |
if args.save_result:
|
198 |
vid_writer.write(result_frame)
|
199 |
ch = cv2.waitKey(1)
|
@@ -258,7 +253,7 @@ def main(exp, args):
|
|
258 |
"TensorRT model is not support model fusing!"
|
259 |
trt_file = os.path.join(file_name, "model_trt.pth")
|
260 |
assert os.path.exists(trt_file), (
|
261 |
-
"TensorRT model is not found!\n Run python3
|
262 |
)
|
263 |
model.head.decode_in_inference = False
|
264 |
decoder = model.head.decode_outputs
|
|
|
66 |
action="store_true",
|
67 |
help="Using TensorRT model for testing.",
|
68 |
)
|
|
|
|
|
|
|
|
|
|
|
|
|
69 |
return parser
|
70 |
|
71 |
|
|
|
131 |
def visual(self, output, img_info, cls_conf=0.35):
|
132 |
ratio = img_info['ratio']
|
133 |
img = img_info['raw_img']
|
134 |
+
if output is None:
|
135 |
+
return img
|
136 |
output = output.cpu()
|
137 |
|
138 |
bboxes = output[:, 0:4]
|
139 |
|
140 |
# preprocessing: resize
|
141 |
bboxes /= ratio
|
|
|
142 |
|
143 |
cls = output[:, 6]
|
144 |
scores = output[:, 4] * output[:, 5]
|
|
|
188 |
ret_val, frame = cap.read()
|
189 |
if ret_val:
|
190 |
outputs, img_info = predictor.inference(frame)
|
191 |
+
result_frame = predictor.visual(outputs[0], img_info)
|
192 |
if args.save_result:
|
193 |
vid_writer.write(result_frame)
|
194 |
ch = cv2.waitKey(1)
|
|
|
253 |
"TensorRT model is not support model fusing!"
|
254 |
trt_file = os.path.join(file_name, "model_trt.pth")
|
255 |
assert os.path.exists(trt_file), (
|
256 |
+
"TensorRT model is not found!\n Run python3 tools/trt.py first!"
|
257 |
)
|
258 |
model.head.decode_in_inference = False
|
259 |
decoder = model.head.decode_outputs
|
yolox/data/datasets/coco.py
CHANGED
@@ -46,29 +46,20 @@ class COCODataset(Dataset):
|
|
46 |
cats = self.coco.loadCats(self.coco.getCatIds())
|
47 |
self._classes = tuple([c["name"] for c in cats])
|
48 |
self.name = name
|
49 |
-
self.max_labels = 50
|
50 |
self.img_size = img_size
|
51 |
self.preproc = preproc
|
52 |
|
53 |
def __len__(self):
|
54 |
return len(self.ids)
|
55 |
|
56 |
-
def
|
57 |
id_ = self.ids[index]
|
|
|
|
|
58 |
|
59 |
im_ann = self.coco.loadImgs(id_)[0]
|
60 |
width = im_ann["width"]
|
61 |
height = im_ann["height"]
|
62 |
-
anno_ids = self.coco.getAnnIds(imgIds=[int(id_)], iscrowd=False)
|
63 |
-
annotations = self.coco.loadAnns(anno_ids)
|
64 |
-
|
65 |
-
# load image and preprocess
|
66 |
-
img_file = os.path.join(
|
67 |
-
self.data_dir, self.name, "{:012}".format(id_) + ".jpg"
|
68 |
-
)
|
69 |
-
|
70 |
-
img = cv2.imread(img_file)
|
71 |
-
assert img is not None
|
72 |
|
73 |
# load labels
|
74 |
valid_objs = []
|
@@ -90,6 +81,25 @@ class COCODataset(Dataset):
|
|
90 |
res[ix, 0:4] = obj["clean_bbox"]
|
91 |
res[ix, 4] = cls
|
92 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
93 |
img_info = (height, width)
|
94 |
|
95 |
return img, res, img_info, id_
|
@@ -105,7 +115,7 @@ class COCODataset(Dataset):
|
|
105 |
Returns:
|
106 |
img (numpy.ndarray): pre-processed image
|
107 |
padded_labels (torch.Tensor): pre-processed label data.
|
108 |
-
The shape is :math:`[
|
109 |
each label consists of [class, xc, yc, w, h]:
|
110 |
class (float): class index.
|
111 |
xc, yc (float) : center of bbox whose values range from 0 to 1.
|
|
|
46 |
cats = self.coco.loadCats(self.coco.getCatIds())
|
47 |
self._classes = tuple([c["name"] for c in cats])
|
48 |
self.name = name
|
|
|
49 |
self.img_size = img_size
|
50 |
self.preproc = preproc
|
51 |
|
52 |
def __len__(self):
|
53 |
return len(self.ids)
|
54 |
|
55 |
+
def load_anno(self, index):
|
56 |
id_ = self.ids[index]
|
57 |
+
anno_ids = self.coco.getAnnIds(imgIds=[int(id_)], iscrowd=False)
|
58 |
+
annotations = self.coco.loadAnns(anno_ids)
|
59 |
|
60 |
im_ann = self.coco.loadImgs(id_)[0]
|
61 |
width = im_ann["width"]
|
62 |
height = im_ann["height"]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
63 |
|
64 |
# load labels
|
65 |
valid_objs = []
|
|
|
81 |
res[ix, 0:4] = obj["clean_bbox"]
|
82 |
res[ix, 4] = cls
|
83 |
|
84 |
+
return res
|
85 |
+
|
86 |
+
def pull_item(self, index):
|
87 |
+
id_ = self.ids[index]
|
88 |
+
|
89 |
+
im_ann = self.coco.loadImgs(id_)[0]
|
90 |
+
width = im_ann["width"]
|
91 |
+
height = im_ann["height"]
|
92 |
+
|
93 |
+
# load image and preprocess
|
94 |
+
img_file = os.path.join(
|
95 |
+
self.data_dir, self.name, "{:012}".format(id_) + ".jpg"
|
96 |
+
)
|
97 |
+
|
98 |
+
img = cv2.imread(img_file)
|
99 |
+
assert img is not None
|
100 |
+
|
101 |
+
# load anno
|
102 |
+
res = self.load_anno(index)
|
103 |
img_info = (height, width)
|
104 |
|
105 |
return img, res, img_info, id_
|
|
|
115 |
Returns:
|
116 |
img (numpy.ndarray): pre-processed image
|
117 |
padded_labels (torch.Tensor): pre-processed label data.
|
118 |
+
The shape is :math:`[max_labels, 5]`.
|
119 |
each label consists of [class, xc, yc, w, h]:
|
120 |
class (float): class index.
|
121 |
xc, yc (float) : center of bbox whose values range from 0 to 1.
|
yolox/data/datasets/mosaicdetection.py
CHANGED
@@ -93,7 +93,6 @@ class MosaicDetection(Dataset):
|
|
93 |
labels[:, 1] = scale * _labels[:, 1] + padh
|
94 |
labels[:, 2] = scale * _labels[:, 2] + padw
|
95 |
labels[:, 3] = scale * _labels[:, 3] + padh
|
96 |
-
|
97 |
labels4.append(labels)
|
98 |
|
99 |
if len(labels4):
|
@@ -136,9 +135,7 @@ class MosaicDetection(Dataset):
|
|
136 |
cp_labels = []
|
137 |
while len(cp_labels) == 0:
|
138 |
cp_index = random.randint(0, self.__len__() - 1)
|
139 |
-
|
140 |
-
anno_ids = self._dataset.coco.getAnnIds(imgIds=[int(id_)], iscrowd=False)
|
141 |
-
cp_labels = self._dataset.coco.loadAnns(anno_ids)
|
142 |
img, cp_labels, _, _ = self._dataset.pull_item(cp_index)
|
143 |
|
144 |
if len(img.shape) == 3:
|
|
|
93 |
labels[:, 1] = scale * _labels[:, 1] + padh
|
94 |
labels[:, 2] = scale * _labels[:, 2] + padw
|
95 |
labels[:, 3] = scale * _labels[:, 3] + padh
|
|
|
96 |
labels4.append(labels)
|
97 |
|
98 |
if len(labels4):
|
|
|
135 |
cp_labels = []
|
136 |
while len(cp_labels) == 0:
|
137 |
cp_index = random.randint(0, self.__len__() - 1)
|
138 |
+
cp_labels = self._dataset.load_anno(cp_index)
|
|
|
|
|
139 |
img, cp_labels, _, _ = self._dataset.pull_item(cp_index)
|
140 |
|
141 |
if len(img.shape) == 3:
|
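With load_anno in place, the MixUp candidate search above no longer needs to go through the COCO API, so any dataset that implements load_anno can feed MixUp. A minimal sketch of that selection loop (the function and variable names here are illustrative, not the repository's exact code):

```python
import random


def pick_mixup_candidate(dataset):
    # Retry until a sample with at least one label is found; only the cheap
    # annotation lookup runs inside the loop, and the image is decoded once at the end.
    cp_labels = []
    while len(cp_labels) == 0:
        cp_index = random.randint(0, len(dataset) - 1)
        cp_labels = dataset.load_anno(cp_index)
    img, cp_labels, _, _ = dataset.pull_item(cp_index)
    return img, cp_labels
```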
yolox/data/datasets/voc.py
CHANGED
@@ -19,16 +19,6 @@ from yolox.evalutors.voc_eval import voc_eval
|
|
19 |
from .datasets_wrapper import Dataset
|
20 |
from .voc_classes import VOC_CLASSES
|
21 |
|
22 |
-
# for making bounding boxes pretty
|
23 |
-
COLORS = (
|
24 |
-
(255, 0, 0, 128),
|
25 |
-
(0, 255, 0, 128),
|
26 |
-
(0, 0, 255, 128),
|
27 |
-
(0, 255, 255, 128),
|
28 |
-
(255, 0, 255, 128),
|
29 |
-
(255, 255, 0, 128),
|
30 |
-
)
|
31 |
-
|
32 |
|
33 |
class AnnotationTransform(object):
|
34 |
|
@@ -100,16 +90,17 @@ class VOCDetection(Dataset):
|
|
100 |
|
101 |
def __init__(
|
102 |
self,
|
103 |
-
|
104 |
-
image_sets,
|
|
|
105 |
preproc=None,
|
106 |
target_transform=AnnotationTransform(),
|
107 |
-
input_dim=(416, 416),
|
108 |
dataset_name="VOC0712",
|
109 |
):
|
110 |
-
super().__init__(
|
111 |
-
self.root =
|
112 |
self.image_set = image_sets
|
|
|
113 |
self.preproc = preproc
|
114 |
self.target_transform = target_transform
|
115 |
self.name = dataset_name
|
@@ -125,59 +116,16 @@ class VOCDetection(Dataset):
|
|
125 |
):
|
126 |
self.ids.append((rootpath, line.strip()))
|
127 |
|
128 |
-
@Dataset.resize_getitem
|
129 |
-
def __getitem__(self, index):
|
130 |
-
img_id = self.ids[index]
|
131 |
-
target = ET.parse(self._annopath % img_id).getroot()
|
132 |
-
img = cv2.imread(self._imgpath % img_id, cv2.IMREAD_COLOR)
|
133 |
-
# img = Image.open(self._imgpath % img_id).convert('RGB')
|
134 |
-
|
135 |
-
height, width, _ = img.shape
|
136 |
-
|
137 |
-
if self.target_transform is not None:
|
138 |
-
target = self.target_transform(target)
|
139 |
-
|
140 |
-
if self.preproc is not None:
|
141 |
-
img, target = self.preproc(img, target, self.input_dim)
|
142 |
-
# print(img.size())
|
143 |
-
|
144 |
-
img_info = (width, height)
|
145 |
-
|
146 |
-
return img, target, img_info, img_id
|
147 |
-
|
148 |
def __len__(self):
|
149 |
return len(self.ids)
|
150 |
|
151 |
-
def
|
152 |
-
"""Returns the original image object at index in PIL form
|
153 |
-
|
154 |
-
Note: not using self.__getitem__(), as any transformations passed in
|
155 |
-
could mess up this functionality.
|
156 |
-
|
157 |
-
Argument:
|
158 |
-
index (int): index of img to show
|
159 |
-
Return:
|
160 |
-
PIL img
|
161 |
-
"""
|
162 |
img_id = self.ids[index]
|
163 |
-
|
164 |
-
|
165 |
-
|
166 |
-
"""Returns the original annotation of image at index
|
167 |
-
|
168 |
-
Note: not using self.__getitem__(), as any transformations passed in
|
169 |
-
could mess up this functionality.
|
170 |
|
171 |
-
|
172 |
-
index (int): index of img to get annotation of
|
173 |
-
Return:
|
174 |
-
list: [img_id, [(label, bbox coords),...]]
|
175 |
-
eg: ('001718', [('dog', (96, 13, 438, 332))])
|
176 |
-
"""
|
177 |
-
img_id = self.ids[index]
|
178 |
-
anno = ET.parse(self._annopath % img_id).getroot()
|
179 |
-
gt = self.target_transform(anno, 1, 1)
|
180 |
-
return img_id[1], gt
|
181 |
|
182 |
def pull_item(self, index):
|
183 |
"""Returns the original image and target at an index for mixup
|
@@ -191,14 +139,21 @@ class VOCDetection(Dataset):
|
|
191 |
img, target
|
192 |
"""
|
193 |
img_id = self.ids[index]
|
194 |
-
target = ET.parse(self._annopath % img_id).getroot()
|
195 |
img = cv2.imread(self._imgpath % img_id, cv2.IMREAD_COLOR)
|
196 |
-
|
197 |
height, width, _ = img.shape
|
198 |
|
|
|
|
|
199 |
img_info = (width, height)
|
200 |
-
|
201 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
202 |
|
203 |
return img, target, img_info, img_id
|
204 |
|
@@ -212,7 +167,7 @@ class VOCDetection(Dataset):
|
|
212 |
all_boxes[class][image] = [] or np.array of shape #dets x 5
|
213 |
"""
|
214 |
self._write_voc_results_file(all_boxes)
|
215 |
-
IouTh = np.linspace(0.5, 0.95, np.round((0.95 - 0.5) / 0.05) + 1, endpoint=True)
|
216 |
mAPs = []
|
217 |
for iou in IouTh:
|
218 |
mAP = self._do_python_eval(output_dir, iou)
|
@@ -270,7 +225,7 @@ class VOCDetection(Dataset):
|
|
270 |
aps = []
|
271 |
# The PASCAL VOC metric changed in 2010
|
272 |
use_07_metric = True if int(self._year) < 2010 else False
|
273 |
-
print("
|
274 |
if output_dir is not None and not os.path.isdir(output_dir):
|
275 |
os.mkdir(output_dir)
|
276 |
for i, cls in enumerate(VOC_CLASSES):
|
|
|
19 |
from .datasets_wrapper import Dataset
|
20 |
from .voc_classes import VOC_CLASSES
|
21 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
22 |
|
23 |
class AnnotationTransform(object):
|
24 |
|
|
|
90 |
|
91 |
def __init__(
|
92 |
self,
|
93 |
+
data_dir,
|
94 |
+
image_sets=[('2007', 'trainval'), ('2012', 'trainval')],
|
95 |
+
img_size=(416, 416),
|
96 |
preproc=None,
|
97 |
target_transform=AnnotationTransform(),
|
|
|
98 |
dataset_name="VOC0712",
|
99 |
):
|
100 |
+
super().__init__(img_size)
|
101 |
+
self.root = data_dir
|
102 |
self.image_set = image_sets
|
103 |
+
self.img_size = img_size
|
104 |
self.preproc = preproc
|
105 |
self.target_transform = target_transform
|
106 |
self.name = dataset_name
|
|
|
116 |
):
|
117 |
self.ids.append((rootpath, line.strip()))
|
118 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
119 |
def __len__(self):
|
120 |
return len(self.ids)
|
121 |
|
122 |
+
def load_anno(self, index):
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
123 |
img_id = self.ids[index]
|
124 |
+
target = ET.parse(self._annopath % img_id).getroot()
|
125 |
+
if self.target_transform is not None:
|
126 |
+
target = self.target_transform(target)
|
|
|
|
|
|
|
|
|
127 |
|
128 |
+
return target
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
129 |
|
130 |
def pull_item(self, index):
|
131 |
"""Returns the original image and target at an index for mixup
|
|
|
139 |
img, target
|
140 |
"""
|
141 |
img_id = self.ids[index]
|
|
|
142 |
img = cv2.imread(self._imgpath % img_id, cv2.IMREAD_COLOR)
|
|
|
143 |
height, width, _ = img.shape
|
144 |
|
145 |
+
target = self.load_anno(index)
|
146 |
+
|
147 |
img_info = (width, height)
|
148 |
+
|
149 |
+
return img, target, img_info, index
|
150 |
+
|
151 |
+
@Dataset.resize_getitem
|
152 |
+
def __getitem__(self, index):
|
153 |
+
img, target, img_info, img_id = self.pull_item(index)
|
154 |
+
|
155 |
+
if self.preproc is not None:
|
156 |
+
img, target = self.preproc(img, target, self.input_dim)
|
157 |
|
158 |
return img, target, img_info, img_id
|
159 |
|
|
|
167 |
all_boxes[class][image] = [] or np.array of shape #dets x 5
|
168 |
"""
|
169 |
self._write_voc_results_file(all_boxes)
|
170 |
+
IouTh = np.linspace(0.5, 0.95, int(np.round((0.95 - 0.5) / 0.05)) + 1, endpoint=True)
|
171 |
mAPs = []
|
172 |
for iou in IouTh:
|
173 |
mAP = self._do_python_eval(output_dir, iou)
|
|
|
225 |
aps = []
|
226 |
# The PASCAL VOC metric changed in 2010
|
227 |
use_07_metric = True if int(self._year) < 2010 else False
|
228 |
+
print("Eval IoU : {:.2f}".format(iou))
|
229 |
if output_dir is not None and not os.path.isdir(output_dir):
|
230 |
os.mkdir(output_dir)
|
231 |
for i, cls in enumerate(VOC_CLASSES):
|
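The np.linspace change above also deserves a note: np.round returns a float, and newer NumPy versions reject a non-integer num argument, so the explicit int(...) cast keeps the 10-threshold COCO-style IoU sweep working. A quick check of what it produces:

```python
import numpy as np

# Ten IoU thresholds from 0.50 to 0.95 in steps of 0.05.
iou_thresholds = np.linspace(0.5, 0.95, int(np.round((0.95 - 0.5) / 0.05)) + 1, endpoint=True)
print(iou_thresholds)  # [0.5  0.55 0.6  0.65 0.7  0.75 0.8  0.85 0.9  0.95]
```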
yolox/{evalutors → evaluators}/__init__.py
RENAMED
File without changes
|
yolox/{evalutors → evaluators}/coco_evaluator.py
RENAMED
File without changes
|
yolox/{evalutors → evaluators}/voc_eval.py
RENAMED
File without changes
|
yolox/evaluators/voc_evaluator.py
ADDED
@@ -0,0 +1,183 @@
1 |
+
#!/usr/bin/env python3
|
2 |
+
# -*- coding:utf-8 -*-
|
3 |
+
# Copyright (c) Megvii, Inc. and its affiliates.
|
4 |
+
|
5 |
+
import sys
|
6 |
+
import tempfile
|
7 |
+
import time
|
8 |
+
from collections import ChainMap
|
9 |
+
from loguru import logger
|
10 |
+
from tqdm import tqdm
|
11 |
+
|
12 |
+
import numpy as np
|
13 |
+
|
14 |
+
import torch
|
15 |
+
|
16 |
+
from yolox.utils import gather, is_main_process, postprocess, synchronize, time_synchronized
|
17 |
+
|
18 |
+
|
19 |
+
class VOCEvaluator:
|
20 |
+
"""
|
21 |
+
VOC AP Evaluation class.
|
22 |
+
"""
|
23 |
+
|
24 |
+
def __init__(
|
25 |
+
self, dataloader, img_size, confthre, nmsthre, num_classes,
|
26 |
+
):
|
27 |
+
"""
|
28 |
+
Args:
|
29 |
+
dataloader (Dataloader): evaluate dataloader.
|
30 |
+
img_size (int): image size after preprocess. images are resized
|
31 |
+
to squares whose shape is (img_size, img_size).
|
32 |
+
confthre (float): confidence threshold ranging from 0 to 1, which
|
33 |
+
is defined in the config file.
|
34 |
+
nmsthre (float): IoU threshold of non-max supression ranging from 0 to 1.
|
35 |
+
"""
|
36 |
+
self.dataloader = dataloader
|
37 |
+
self.img_size = img_size
|
38 |
+
self.confthre = confthre
|
39 |
+
self.nmsthre = nmsthre
|
40 |
+
self.num_classes = num_classes
|
41 |
+
self.num_images = len(dataloader.dataset)
|
42 |
+
|
43 |
+
def evaluate(
|
44 |
+
self, model, distributed=False, half=False, trt_file=None, decoder=None, test_size=None
|
45 |
+
):
|
46 |
+
"""
|
47 |
+
VOC average precision (AP) Evaluation. Iterate inference on the test dataset
|
48 |
+
and the results are evaluated by COCO API.
|
49 |
+
|
50 |
+
NOTE: This function will change training mode to False, please save states if needed.
|
51 |
+
|
52 |
+
Args:
|
53 |
+
model : model to evaluate.
|
54 |
+
|
55 |
+
Returns:
|
56 |
+
ap50_95 (float) : COCO style AP of IoU=50:95
|
57 |
+
ap50 (float) : VOC 2007 metric AP of IoU=50
|
58 |
+
summary (sr): summary info of evaluation.
|
59 |
+
"""
|
60 |
+
# TODO half to amp_test
|
61 |
+
tensor_type = torch.cuda.HalfTensor if half else torch.cuda.FloatTensor
|
62 |
+
model = model.eval()
|
63 |
+
if half:
|
64 |
+
model = model.half()
|
65 |
+
ids = []
|
66 |
+
data_list = {}
|
67 |
+
progress_bar = tqdm if is_main_process() else iter
|
68 |
+
|
69 |
+
inference_time = 0
|
70 |
+
nms_time = 0
|
71 |
+
n_samples = len(self.dataloader) - 1
|
72 |
+
|
73 |
+
if trt_file is not None:
|
74 |
+
from torch2trt import TRTModule
|
75 |
+
model_trt = TRTModule()
|
76 |
+
model_trt.load_state_dict(torch.load(trt_file))
|
77 |
+
|
78 |
+
x = torch.ones(1, 3, test_size[0], test_size[1]).cuda()
|
79 |
+
model(x)
|
80 |
+
model = model_trt
|
81 |
+
|
82 |
+
for cur_iter, (imgs, _, info_imgs, ids) in enumerate(progress_bar(self.dataloader)):
|
83 |
+
with torch.no_grad():
|
84 |
+
imgs = imgs.type(tensor_type)
|
85 |
+
|
86 |
+
# skip the last iters since the batch size might not be enough for batch inference
|
87 |
+
is_time_record = cur_iter < len(self.dataloader) - 1
|
88 |
+
if is_time_record:
|
89 |
+
start = time.time()
|
90 |
+
|
91 |
+
outputs = model(imgs)
|
92 |
+
if decoder is not None:
|
93 |
+
outputs = decoder(outputs, dtype=outputs.type())
|
94 |
+
|
95 |
+
if is_time_record:
|
96 |
+
infer_end = time_synchronized()
|
97 |
+
inference_time += infer_end - start
|
98 |
+
|
99 |
+
outputs = postprocess(
|
100 |
+
outputs, self.num_classes, self.confthre, self.nmsthre
|
101 |
+
)
|
102 |
+
if is_time_record:
|
103 |
+
nms_end = time_synchronized()
|
104 |
+
nms_time += nms_end - infer_end
|
105 |
+
|
106 |
+
data_list.update(self.convert_to_voc_format(outputs, info_imgs, ids))
|
107 |
+
|
108 |
+
statistics = torch.cuda.FloatTensor([inference_time, nms_time, n_samples])
|
109 |
+
if distributed:
|
110 |
+
data_list = gather(data_list, dst=0)
|
111 |
+
data_list = ChainMap(*data_list)
|
112 |
+
torch.distributed.reduce(statistics, dst=0)
|
113 |
+
|
114 |
+
eval_results = self.evaluate_prediction(data_list, statistics)
|
115 |
+
synchronize()
|
116 |
+
return eval_results
|
117 |
+
|
118 |
+
def convert_to_voc_format(self, outputs, info_imgs, ids):
|
119 |
+
predictions = {}
|
120 |
+
for (output, img_h, img_w, img_id) in zip(outputs, info_imgs[0], info_imgs[1], ids):
|
121 |
+
if output is None:
|
122 |
+
predictions[int(img_id)] = (None, None, None)
|
123 |
+
continue
|
124 |
+
output = output.cpu()
|
125 |
+
|
126 |
+
bboxes = output[:, 0:4]
|
127 |
+
|
128 |
+
# preprocessing: resize
|
129 |
+
scale = min(self.img_size[0] / float(img_h), self.img_size[1] / float(img_w))
|
130 |
+
bboxes /= scale
|
131 |
+
|
132 |
+
cls = output[:, 6]
|
133 |
+
scores = output[:, 4] * output[:, 5]
|
134 |
+
|
135 |
+
predictions[int(img_id)] = (bboxes, cls, scores)
|
136 |
+
return predictions
|
137 |
+
|
138 |
+
def evaluate_prediction(self, data_dict, statistics):
|
139 |
+
if not is_main_process():
|
140 |
+
return 0, 0, None
|
141 |
+
|
142 |
+
logger.info("Evaluate in main process...")
|
143 |
+
|
144 |
+
inference_time = statistics[0].item()
|
145 |
+
nms_time = statistics[1].item()
|
146 |
+
n_samples = statistics[2].item()
|
147 |
+
|
148 |
+
a_infer_time = 1000 * inference_time / (n_samples * self.dataloader.batch_size)
|
149 |
+
a_nms_time = 1000 * nms_time / (n_samples * self.dataloader.batch_size)
|
150 |
+
|
151 |
+
time_info = ", ".join(
|
152 |
+
["Average {} time: {:.2f} ms".format(k, v) for k, v in zip(
|
153 |
+
["forward", "NMS", "inference"],
|
154 |
+
[a_infer_time, a_nms_time, (a_infer_time + a_nms_time)]
|
155 |
+
)]
|
156 |
+
)
|
157 |
+
|
158 |
+
info = time_info + "\n"
|
159 |
+
|
160 |
+
all_boxes = [[[] for _ in range(self.num_images)] for _ in range(self.num_classes)]
|
161 |
+
for img_num in range(self.num_images):
|
162 |
+
bboxes, cls, scores = data_dict[img_num]
|
163 |
+
if bboxes is None:
|
164 |
+
for j in range(self.num_classes):
|
165 |
+
all_boxes[j][img_num] = np.empty([0, 5], dtype=np.float32)
|
166 |
+
continue
|
167 |
+
for j in range(self.num_classes):
|
168 |
+
mask_c = cls == j
|
169 |
+
if sum(mask_c) == 0:
|
170 |
+
all_boxes[j][img_num] = np.empty([0, 5], dtype=np.float32)
|
171 |
+
continue
|
172 |
+
|
173 |
+
c_dets = torch.cat((bboxes, scores.unsqueeze(1)), dim=1)
|
174 |
+
all_boxes[j][img_num] = c_dets[mask_c].numpy()
|
175 |
+
|
176 |
+
sys.stdout.write(
|
177 |
+
"im_eval: {:d}/{:d} \r".format(img_num + 1, self.num_images)
|
178 |
+
)
|
179 |
+
sys.stdout.flush()
|
180 |
+
|
181 |
+
with tempfile.TemporaryDirectory() as tempdir:
|
182 |
+
mAP50, mAP70 = self.dataloader.dataset.evaluate_detections(all_boxes, tempdir)
|
183 |
+
return mAP50, mAP70, info
|
yolox/evalutors/voc_evaluator.py
DELETED
@@ -1,202 +0,0 @@
#!/usr/bin/env python3
# -*- coding:utf-8 -*-
# Copyright (c) Megvii, Inc. and its affiliates.

# NOTE: this file is not finished.
import sys
import tempfile
import time
from tqdm import tqdm

import torch

from yolox.data.dataset.vocdataset import ValTransform
from yolox.utils import get_rank, is_main_process, make_pred_vis, make_vis, synchronize


def _accumulate_predictions_from_multiple_gpus(predictions_per_gpu):
    all_predictions = dist.scatter_gather(predictions_per_gpu)
    if not is_main_process():
        return
    # merge the list of dicts
    predictions = {}
    for p in all_predictions:
        predictions.update(p)
    # convert a dict where the key is the index in a list
    image_ids = list(sorted(predictions.keys()))
    if len(image_ids) != image_ids[-1] + 1:
        print("num_imgs: ", len(image_ids))
        print("last img_id: ", image_ids[-1])
        print(
            "Number of images that were gathered from multiple processes is not "
            "a contiguous set. Some images might be missing from the evaluation"
        )

    # convert to a list
    predictions = [predictions[i] for i in image_ids]
    return predictions


class VOCEvaluator:
    """
    COCO AP Evaluation class.
    All the data in the val2017 dataset are processed \
    and evaluated by COCO API.
    """

    def __init__(self, data_dir, img_size, confthre, nmsthre, vis=False):
        """
        Args:
            data_dir (str): dataset root directory
            img_size (int): image size after preprocess. images are resized \
                to squares whose shape is (img_size, img_size).
            confthre (float):
                confidence threshold ranging from 0 to 1, \
                which is defined in the config file.
            nmsthre (float):
                IoU threshold of non-max supression ranging from 0 to 1.
        """
        test_sets = [("2007", "test")]
        self.dataset = VOCDetection(
            root=data_dir,
            image_sets=test_sets,
            input_dim=img_size,
            preproc=ValTransform(
                rgb_means=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)
            ),
        )
        self.num_images = len(self.dataset)
        self.dataloader = torch.utils.data.DataLoader(
            self.dataset, batch_size=1, shuffle=False, num_workers=0
        )
        self.img_size = img_size
        self.confthre = confthre
        self.nmsthre = nmsthre
        self.vis = vis

    def evaluate(self, model, distributed=False):
        """
        COCO average precision (AP) Evaluation. Iterate inference on the test dataset
        and the results are evaluated by COCO API.
        Args:
            model : model object
        Returns:
            ap50_95 (float) : calculated COCO AP for IoU=50:95
            ap50 (float) : calculated COCO AP for IoU=50
        """
        if isinstance(model, torch.nn.parallel.DistributedDataParallel):
            model = model.module
        model.eval()
        cuda = torch.cuda.is_available()
        Tensor = torch.cuda.FloatTensor if cuda else torch.FloatTensor

        ids = []
        data_dict = []
        dataiterator = iter(self.dataloader)
        img_num = 0
        indices = list(range(self.num_images))
        dis_indices = indices[get_rank() :: distributed_util.get_world_size()]
        progress_bar = tqdm if distributed_util.is_main_process() else iter
        num_classes = 20
        predictions = {}

        if is_main_process():
            inference_time = 0
            nms_time = 0
            n_samples = len(dis_indices)

        for i in progress_bar(dis_indices):
            img, _, info_img, id_ = self.dataset[i]  # load a batch
            info_img = [float(info) for info in info_img]
            ids.append(id_)
            with torch.no_grad():
                img = Variable(img.type(Tensor).unsqueeze(0))

                if is_main_process() and i > 9:
                    start = time.time()

                if self.vis:
                    outputs, fuse_weights, fused_f = model(img)
                else:
                    outputs = model(img)

                if is_main_process() and i > 9:
                    infer_end = time.time()
                    inference_time += infer_end - start

                outputs = postprocess(outputs, 20, self.confthre, self.nmsthre)

                if is_main_process() and i > 9:
                    nms_end = time.time()
                    nms_time += nms_end - infer_end

            if outputs[0] is None:
                predictions[i] = (None, None, None)
                continue
            outputs = outputs[0].cpu().data

            bboxes = outputs[:, 0:4]
            bboxes[:, 0::2] *= info_img[0] / self.img_size[0]
            bboxes[:, 1::2] *= info_img[1] / self.img_size[1]
            cls = outputs[:, 6]
            scores = outputs[:, 4] * outputs[:, 5]
            predictions[i] = (bboxes, cls, scores)

            if self.vis:
                o_img, _, _, _ = self.dataset.pull_item(i)
                make_vis("VOC", i, o_img, fuse_weights, fused_f)
                class_names = self.dataset._classes

                bbox = bboxes.clone()
                bbox[:, 2] = bbox[:, 2] - bbox[:, 0]
                bbox[:, 3] = bbox[:, 3] - bbox[:, 1]

                make_pred_vis("VOC", i, o_img, class_names, bbox, cls, scores)

            if is_main_process():
                o_img, _, _, _ = self.dataset.pull_item(i)
                class_names = self.dataset._classes
                bbox = bboxes.clone()
                bbox[:, 2] = bbox[:, 2] - bbox[:, 0]
                bbox[:, 3] = bbox[:, 3] - bbox[:, 1]
                make_pred_vis("VOC", i, o_img, class_names, bbox, cls, scores)

        synchronize()
        predictions = _accumulate_predictions_from_multiple_gpus(predictions)
        if not is_main_process():
            return 0, 0

        print("Main process Evaluating...")

        a_infer_time = 1000 * inference_time / (n_samples - 10)
        a_nms_time = 1000 * nms_time / (n_samples - 10)

        print(
            "Average forward time: %.2f ms, Average NMS time: %.2f ms, Average inference time: %.2f ms"
            % (a_infer_time, a_nms_time, (a_infer_time + a_nms_time))
        )

        all_boxes = [[[] for _ in range(self.num_images)] for _ in range(num_classes)]
        for img_num in range(self.num_images):
            bboxes, cls, scores = predictions[img_num]
            if bboxes is None:
                for j in range(num_classes):
                    all_boxes[j][img_num] = np.empty([0, 5], dtype=np.float32)
                continue
            for j in range(num_classes):
                mask_c = cls == j
                if sum(mask_c) == 0:
                    all_boxes[j][img_num] = np.empty([0, 5], dtype=np.float32)
                    continue

                c_dets = torch.cat((bboxes, scores.unsqueeze(1)), dim=1)
                all_boxes[j][img_num] = c_dets[mask_c].numpy()

            sys.stdout.write(
                "im_eval: {:d}/{:d} \r".format(img_num + 1, self.num_images)
            )
            sys.stdout.flush()

        with tempfile.TemporaryDirectory() as tempdir:
            mAP50, mAP70 = self.dataset.evaluate_detections(all_boxes, tempdir)
            return mAP50, mAP70
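The deleted evaluator relied on a custom `_accumulate_predictions_from_multiple_gpus` built on `dist.scatter_gather`; the replacement above instead gathers the per-rank prediction dicts and flattens them with `collections.ChainMap`. A single-process sketch of that merge step, with fabricated per-rank dicts standing in for the gathered results:

```python
from collections import ChainMap

# Stand-ins for the image_id -> (bboxes, cls, scores) dicts produced on each rank.
rank0_preds = {0: ("boxes0", "cls0", "scores0"), 2: ("boxes2", "cls2", "scores2")}
rank1_preds = {1: ("boxes1", "cls1", "scores1"), 3: ("boxes3", "cls3", "scores3")}

# After gather(data_list, dst=0) the main process holds a list of such dicts;
# ChainMap presents them as one mapping keyed by image id, without copying.
merged = ChainMap(rank0_preds, rank1_preds)
print(sorted(merged.keys()))   # [0, 1, 2, 3]
print(merged[3])               # ('boxes3', 'cls3', 'scores3')
```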
yolox/models/yolo_head.py
CHANGED
@@ -166,6 +166,13 @@ class YOLOXHead(nn.Module):
                     torch.zeros(1, grid.shape[1]).fill_(stride_this_level).type_as(xin[0])
                 )
                 if self.use_l1:
+                    batch_size = reg_output.shape[0]
+                    hsize, wsize = reg_output.shape[-2:]
+                    reg_output = reg_output.view(batch_size, self.n_anchors, 4, hsize, wsize)
+                    reg_output = (
+                        reg_output.permute(0, 1, 3, 4, 2)
+                        .reshape(batch_size, -1, 4)
+                    )
                     origin_preds.append(reg_output.clone())
 
             else:
@@ -193,7 +200,7 @@ class YOLOXHead(nn.Module):
         batch_size = output.shape[0]
         n_ch = 5 + self.num_classes
         hsize, wsize = output.shape[-2:]
-        if grid.shape[2:
+        if grid.shape[2:4] != output.shape[2:4]:
             yv, xv = torch.meshgrid([torch.arange(hsize), torch.arange(wsize)])
             grid = torch.stack((xv, yv), 2).view(1, 1, hsize, wsize, 2).type(dtype)
             self.grids[k] = grid
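The lines added under `if self.use_l1:` reshape the raw regression map from `(batch, n_anchors * 4, h, w)` to `(batch, n_anchors * h * w, 4)`, so the L1 targets line up with the flattened predictions used in the loss. A standalone shape check of the same permute/reshape; the toy sizes are ours for illustration:

```python
import torch

batch_size, n_anchors, hsize, wsize = 2, 1, 3, 4
reg_output = torch.randn(batch_size, n_anchors * 4, hsize, wsize)   # raw regression map

reg_output = reg_output.view(batch_size, n_anchors, 4, hsize, wsize)
reg_output = reg_output.permute(0, 1, 3, 4, 2).reshape(batch_size, -1, 4)

# One 4-vector of box regression values per anchor position, in grid order.
assert reg_output.shape == (batch_size, n_anchors * hsize * wsize, 4)
print(reg_output.shape)   # torch.Size([2, 12, 4])
```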
yolox/utils/visualize.py
CHANGED
@@ -18,8 +18,8 @@ def vis(img, boxes, scores, cls_ids, conf=0.5, class_names=None):
             continue
         x0 = int(box[0])
         y0 = int(box[1])
-        x1 = int(box[
-        y1 = int(box[
+        x1 = int(box[2])
+        y1 = int(box[3])
 
         color = (_COLORS[cls_id] * 255).astype(np.uint8).tolist()
         text = '{}:{:.1f}%'.format(class_names[cls_id], score * 100)
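With this change `vis` treats each box as `(x0, y0, x1, y1)` rather than `(x, y, w, h)`, so columns 2 and 3 are already the bottom-right corner. A minimal drawing sketch under that assumption, using OpenCV directly; the image, colour, and label are placeholders:

```python
import cv2
import numpy as np

img = np.zeros((240, 320, 3), dtype=np.uint8)
box = [40.0, 30.0, 200.0, 180.0]           # xyxy box

x0, y0 = int(box[0]), int(box[1])
x1, y1 = int(box[2]), int(box[3])          # no width/height offset needed for xyxy boxes

cv2.rectangle(img, (x0, y0), (x1, y1), (0, 255, 0), 2)
cv2.putText(img, "person:87.5%", (x0, y0 - 4), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (0, 255, 0), 1)
cv2.imwrite("vis_check.png", img)
```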