ηŽ‹ζž«02-Base Detection committed on
Commit
5b8ab8f
Β·
1 Parent(s): e9faa7e

feat(YOLOX): update README and fix several bugs.

Browse files
README.md CHANGED
@@ -1,37 +1,35 @@
1
- <div align="center"><img src="assets/logo.png" width="600"></div>
2
-
3
  <img src="assets/demo.png" >
4
 
5
- ## <div align="center">Introduction</div>
6
  YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and industrial communities.
7
 
 
8
 
9
- ## <div align="center">Why YOLOX?</div>
10
-
11
- <div align="center"><img src="assets/fig1.png" width="400" ><img src="assets/fig2.png" width="400"></div>
12
-
13
- ## <div align="center">News!!</div>
14
- * 【2020/07/19】 We have released our technical report on [Arxiv](xxx)!!
15
 
16
- ## <div align="center">Benchmark</div>
17
 
18
- ### Standard Models.
19
  |Model |size |mAP<sup>test<br>0.5:0.95 | Speed V100<br>(ms) | Params<br>(M) |FLOPs<br>(B)| weights |
20
  | ------ |:---: | :---: |:---: |:---: | :---: | :----: |
21
- |[YOLOX-s]() |640 |39.6 |9.8 |9.0 | 26.8 | - |
22
- |[YOLOX-m]() |640 |46.4 |12.3 |25.3 |73.8| - |
23
- |[YOLOX-l]() |640 |50.0 |14.5 |54.2| 155.6 | - |
24
- |[YOLOX-x]() |640 |**51.2** | 17.3 |99.1 |281.9 | - |
 
25
 
26
- ### Light Models.
27
- |Model |size |mAP<sup>val<br>0.5:0.95 | Speed V100<br>(ms) | Params<br>(M) |FLOPs<br>(B)| weights |
28
- | ------ |:---: | :---: |:---: |:---: | :---: | :----: |
29
- |[YOLOX-Nano]() |416 |25.3 |- | 0.91 |1.08 | - |
30
- |[YOLOX-Tiny]() |416 |31.7 |- | 5.06 |6.45 | - |
31
 
32
- ## <div align="center">Quick Start</div>
33
 
34
- ### Installation
 
35
 
36
  Step1. Install [apex](https://github.com/NVIDIA/apex).
37
 
@@ -47,25 +45,41 @@ $ cd yolox
47
  $ pip3 install -v -e .  # or "python3 setup.py develop"
48
  ```
49
 
50
- ### Demo
 
 
 
 
 
51
 
52
- You can use either -n or -f to specify your detector's config:
53
 
54
  ```shell
55
- python tools/demo.py -n yolox-s -c <MODEL_PATH> --conf 0.3 --nms 0.65 --tsize 640
56
  ```
57
  or
58
  ```shell
59
- python tools/demo.py -f exps/base/yolox_s.py -c <MODEL_PATH> --conf 0.3 --nms 0.65 --tsize 640
 
 
 
 
60
  ```
61
 
62
 
63
- <details open>
 
 
64
  <summary>Reproduce our results on COCO</summary>
65
 
66
- Step1.
 
 
 
 
 
67
 
68
- * Reproduce our results on COCO by specifying -n:
69
 
70
  ```shell
71
  python tools/train.py -n yolox-s -d 8 -b 64 --fp16 -o
@@ -73,12 +87,11 @@ python tools/train.py -n yolox-s -d 8 -b 64 --fp16 -o
73
  yolox-l
74
  yolox-x
75
  ```
76
- Notes:
77
  * -d: number of gpu devices
78
- * -b: total batch size, the recommended number for -b equals to num_gpu * 8
79
  * --fp16: mixed precision training
80
 
81
- The above commands are equivalent to:
82
 
83
  ```shell
84
  python tools/train.py -f exps/base/yolox-s.py -d 8 -b 64 --fp16 -o
@@ -87,42 +100,49 @@ python tools/train.py -f exps/base/yolox-s.py -d 8 -b 64 --fp16 -o
87
  exps/base/yolox-x.py
88
  ```
89
 
90
- * Customize your training.
91
-
92
- * Finetune your datset on COCO pretrained models.
93
  </details>
94
 
95
- <details open>
 
96
  <summary>Evaluation</summary>
 
97
  We support batch testing for fast evaluation:
98
 
99
  ```shell
100
- python tools/eval.py -n yolox-s -b 64 --conf 0.001 --fp16 (optional) --fuse (optional) --test (for test-dev set)
101
  yolox-m
102
  yolox-l
103
  yolox-x
104
  ```
 
 
 
105
 
106
  To reproduce speed test, we use the following command:
107
  ```shell
108
- python tools/eval.py -n yolox-s -b 1 -d 0 --conf 0.001 --fp16 --fuse --test (for test-dev set)
109
  yolox-m
110
  yolox-l
111
  yolox-x
112
  ```
113
 
114
- ## <div align="center">Deployment</div>
115
-
116
  </details>
117
 
118
- 1. [ONNX: Including ONNX export and an ONNXRuntime demo.]()
119
- 2. [TensorRT in both C++ and Python]()
120
- 3. [NCNN in C++]()
121
- 4. [OpenVINO in both C++ and Python]()
122
 
123
- ## <div align="center">Cite Our Work</div>
 
 
 
 
 
 
 
124
 
125
 
126
- If you find this project useful for you, please use the following BibTeX entry.
 
 
 
127
 
128
- TODO
 
 
1
+ <div align="center"><img src="assets/logo.png" width="350"></div>
 
2
  <img src="assets/demo.png" >
3
 
4
+ ## Introduction
5
  YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and industrial communities.
6
 
7
+ <img src="assets/git_fig.png" width="1000" >
8
 
9
+ ## Updates!!
10
+ * 【2021/07/19】 We have released our technical report on Arxiv.
 
 
 
 
11
 
12
+ ## Benchmark
13
 
14
+ #### Standard Models.
15
  |Model |size |mAP<sup>test<br>0.5:0.95 | Speed V100<br>(ms) | Params<br>(M) |FLOPs<br>(B)| weights |
16
  | ------ |:---: | :---: |:---: |:---: | :---: | :----: |
17
+ |[YOLOX-s](./exps/yolox_s.py) |640 |39.6 |9.8 |9.0 | 26.8 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EW62gmO2vnNNs5npxjzunVwB9p307qqygaCkXdTO88BLUg?e=NMTQYw) |
18
+ |[YOLOX-m](./exps/yolox_m.py) |640 |46.4 |12.3 |25.3 |73.8| [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/ERMTP7VFqrVBrXKMU7Vl4TcBQs0SUeCT7kvc-JdIbej4tQ?e=1MDo9y) |
19
+ |[YOLOX-l](./exps/yolox_l.py) |640 |50.0 |14.5 |54.2| 155.6 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EWA8w_IEOzBKvuueBqfaZh0BeoG5sVzR-XYbOJO4YlOkRw?e=wHWOBE) |
20
+ |[YOLOX-x](./exps/yolox_x.py) |640 |**51.2** | 17.3 |99.1 |281.9 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EdgVPHBziOVBtGAXHfeHI5kBza0q9yyueMGdT0wXZfI1rQ?e=tABO5u) |
21
+ |[YOLOX-Darknet53](./exps/yolov3.py) |640 | 47.4 | 11.1 |63.7 | 185.3 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EZ-MV1r_fMFPkPrNjvbJEMoBLOLAnXH-XKEB77w8LhXL6Q?e=mf6wOc) |
22
 
23
+ #### Light Models.
24
+ |Model |size |mAP<sup>val<br>0.5:0.95 | Params<br>(M) |FLOPs<br>(B)| weights |
25
+ | ------ |:---: | :---: |:---: |:---: | :---: |
26
+ |[YOLOX-Nano](./exps/nano.py) |416 |25.3 | 0.91 |1.08 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EdcREey-krhLtdtSnxolxiUBjWMy6EFdiaO9bdOwZ5ygCQ?e=yQpdds) |
27
+ |[YOLOX-Tiny](./exps/yolox_tiny.py) |416 |31.7 | 5.06 |6.45 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EYtjNFPqvZBBrQ-VowLcSr4B6Z5TdTflUsr_gO2CwhC3bQ?e=SBTwXj) |
28
 
29
+ ## Quick Start
30
 
31
+ <details>
32
+ <summary>Installation</summary>
33
 
34
  Step1. Install [apex](https://github.com/NVIDIA/apex).
35
 
 
45
  $ pip3 install -v -e .  # or "python3 setup.py develop"
46
  ```
47
 
48
+ </details>
49
+
50
+ <details>
51
+ <summary>Demo</summary>
52
+
53
+ Step1. Download a pretrained model from the benchmark table.
54
 
55
+ Step2. Use either -n or -f to specify your detector's config. For example:
56
 
57
  ```shell
58
+ python tools/demo.py image -n yolox-s -c /path/to/your/yolox_s.pth.tar --path assets/dog.jpg --conf 0.3 --nms 0.65 --tsize 640 --save_result
59
  ```
60
  or
61
  ```shell
62
+ python tools/demo.py image -f exps/yolox_s.py -c /path/to/your/yolox_s.pth.tar --path assets/dog.jpg --conf 0.3 --nms 0.65 --tsize 640 --save_result
63
+ ```
64
+ Demo for video:
65
+ ```shell
66
+ python tools/demo.py video -n yolox-s -c /path/to/your/yolox_s.pth.tar --path /path/to/your/video --conf 0.3 --nms 0.65 --tsize 640 --save_result
67
  ```
68
 
69
 
70
+ </details>
71
+
72
+ <details>
73
  <summary>Reproduce our results on COCO</summary>
74
 
75
+ Step1. Prepare dataset
76
+ ```shell
77
+ cd <YOLOX_HOME>
78
+ mkdir datasets
79
+ ln -s /path/to/your/COCO ./datasets/COCO
80
+ ```
81
 
82
+ Step2. Reproduce our results on COCO by specifying -n:
83
 
84
  ```shell
85
  python tools/train.py -n yolox-s -d 8 -b 64 --fp16 -o
 
87
  yolox-l
88
  yolox-x
89
  ```
 
90
  * -d: number of gpu devices
91
+ * -b: total batch size, the recommended number for -b is num_gpu * 8
92
  * --fp16: mixed precision training
93
 
94
+ When using -f, the above commands are equivalent to:
95
 
96
  ```shell
97
  python tools/train.py -f exps/base/yolox-s.py -d 8 -b 64 --fp16 -o
 
100
  exps/base/yolox-x.py
101
  ```
102
 
 
 
 
103
  </details>
104
 
105
+
106
+ <details>
107
  <summary>Evaluation</summary>
108
+
109
  We support batch testing for fast evaluation:
110
 
111
  ```shell
112
+ python tools/eval.py -n yolox-s -c yolox_s.pth.tar -b 64 -d 8 --conf 0.001 [--fp16] [--fuse]
113
  yolox-m
114
  yolox-l
115
  yolox-x
116
  ```
117
+ * --fuse: fuse conv and bn
118
+ * -d: number of GPUs used for evaluation. DEFAULT: All GPUs available will be used.
119
+ * -b: total batch size across all GPUs
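For illustration, a single-GPU evaluation run might look like the following (the batch size of 8 simply follows the num_gpu * 8 rule of thumb from the training notes):

```shell
python tools/eval.py -n yolox-s -c yolox_s.pth.tar -b 8 -d 1 --conf 0.001 --fuse
```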
120
 
121
  To reproduce speed test, we use the following command:
122
  ```shell
123
+ python tools/eval.py -n yolox-s -c yolox_s.pth.tar -b 1 -d 1 --conf 0.001 --fp16 --fuse
124
  yolox-m
125
  yolox-l
126
  yolox-x
127
  ```
128
 
 
 
129
  </details>
130
 
 
 
 
 
131
 
132
+ <details open>
133
+ <summary>Tutorials</summary>
134
+
135
+ * [Training on custom data](docs/train_custom_data.md).
136
+
137
+ </details>
138
+
139
+ ## Deployment
140
 
141
 
142
+ 1. [ONNX: Including ONNX export and an ONNXRuntime demo.](./demo/ONNXRuntime)
143
+ 2. [TensorRT in both C++ and Python](./demo/TensorRT)
144
+ 3. [NCNN in C++](./demo/ncnn/android)
145
+ 4. [OpenVINO in both C++ and Python](./demo/OpenVINO)
146
 
147
+ ## Citing YOLOX
148
+ If you use YOLOX in your research, please cite our work by using the following BibTeX entry:
demo/ONNXRuntime/README.md CHANGED
@@ -1,17 +1,18 @@
1
- ## ONNXRuntime Demo in Python
2
 
3
  This doc introduces how to convert your PyTorch model into ONNX and how to run an ONNXRuntime demo to verify your conversion.
4
 
5
  ### Download ONNX models.
6
- | Model | Parameters | GFLOPs | Test Size | mAP |
7
- |:------| :----: | :----: | :---: | :---: |
8
- | [YOLOX-Nano](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.res101.fpn.coco.800size.1x) | 0.91M | 1.08 | 416x416 | 25.3 |
9
- | [YOLOX-Tiny](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.fpn.coco.800size.1x) | 5.06M | 6.45 | 416x416 |31.7 |
10
- | [YOLOX-S](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 9.0M | 26.8 | 640x640 |39.6 |
11
- | [YOLOX-M](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 25.3M | 73.8 | 640x640 |46.4 |
12
- | [YOLOX-L](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 54.2M | 155.6 | 640x640 |50.0 |
13
- | [YOLOX-X](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 99.1M | 281.9 | 640x640 |51.2 |
14
- | [YOLOX-Darknet53](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 63.72M | 185.3 | 640x640 |47.3 |
 
15
 
16
  ### Convert Your Model to ONNX
17
 
@@ -28,7 +29,7 @@ python3 tools/export_onnx.py --output-name yolox_s.onnx -n yolox-s -c yolox_s.pt
28
  Notes:
29
  * -n: specify a model name. The model name must be one of [yolox-s, yolox-m, yolox-l, yolox-x, yolox-nano, yolox-tiny, yolov3]
30
  * -c: the model you have trained
31
- * -o: opset version, default 11. **However, if you will further convert your onnx model to [OpenVINO](), please specify the opset version to 10.**
32
  * --no-onnxsim: disable onnxsim
33
  * To customize an input shape for onnx model, modify the following code in tools/export.py:
34
 
@@ -36,7 +37,7 @@ Notes:
36
  dummy_input = torch.randn(1, 3, exp.test_size[0], exp.test_size[1])
37
  ```
38
 
39
- 2. Convert a standard YOLOX model by -f. By using -f, the above command is equivalent to:
40
 
41
  ```shell
42
  python3 tools/export_onnx.py --output-name yolox_s.onnx -f exps/yolox_s.py -c yolox_s.pth.tar
@@ -52,7 +53,7 @@ python3 tools/export_onnx.py --output-name your_yolox.onnx -f exps/your_yolox.py
52
 
53
  Step1.
54
  ```shell
55
- cd <YOLOX_HOME>/yolox/deploy/demo_onnxruntime/
56
  ```
57
 
58
  Step2.
 
1
+ ## YOLOX-ONNXRuntime in Python
2
 
3
  This doc introduces how to convert your PyTorch model into ONNX and how to run an ONNXRuntime demo to verify your conversion.
4
 
5
  ### Download ONNX models.
6
+ | Model | Parameters | GFLOPs | Test Size | mAP | Weights |
7
+ |:------| :----: | :----: | :---: | :---: | :---: |
8
+ | YOLOX-Nano | 0.91M | 1.08 | 416x416 | 25.3 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EfAGwvevU-lNhW5OqFAyHbwBJdI_7EaKu5yU04fgF5BU7w?e=gvq4hf) |
9
+ | YOLOX-Tiny | 5.06M | 6.45 | 416x416 |31.7 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EVigCszU1ilDn-MwLwHCF1ABsgTy06xFdVgZ04Yyo4lHVA?e=hVKiCw) |
10
+ | YOLOX-S | 9.0M | 26.8 | 640x640 |39.6 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/Ec0L1d1x2UtIpbfiahgxhtgBZVjb1NCXbotO8SCOdMqpQQ?e=siyIsK) |
11
+ | YOLOX-M | 25.3M | 73.8 | 640x640 |46.4 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/ERUKlQe-nlxBoTKPy1ynbxsBmAZ_h-VBEV-nnfPdzUIkZQ?e=hyQQtl) |
12
+ | YOLOX-L | 54.2M | 155.6 | 640x640 |50.0 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/ET5w926jCA5GlVfg9ixB4KEBiW0HYl7SzaHNRaRG9dYO_A?e=ISmCYX) |
13
+ | YOLOX-Darknet53| 63.72M | 185.3 | 640x640 |47.3 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/ESArloSW-MlPlLuemLh9zKkBdovgweKbfu4zkvzKAp7pPQ?e=f81Ikw) |
14
+ | YOLOX-X | 99.1M | 281.9 | 640x640 |51.2 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/ERjqoeMJlFdGuM3tQfXQmhABmGHlIHydWCwhlugeWLE9AA) |
15
+
16
 
17
  ### Convert Your Model to ONNX
18
 
 
29
  Notes:
30
  * -n: specify a model name. The model name must be one of [yolox-s, yolox-m, yolox-l, yolox-x, yolox-nano, yolox-tiny, yolov3]
31
  * -c: the model you have trained
32
+ * -o: opset version, default 11. **However, if you will further convert your onnx model to [OpenVINO](../OpenVINO/), please specify the opset version to 10.**
33
  * --no-onnxsim: disable onnxsim
34
  * To customize an input shape for onnx model, modify the following code in tools/export.py:
35
 
 
37
  dummy_input = torch.randn(1, 3, exp.test_size[0], exp.test_size[1])
38
  ```
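For example, to export with a fixed 416x416 input instead of the exp's test size (a hypothetical tweak, not part of this commit), that line could become:

```python
dummy_input = torch.randn(1, 3, 416, 416)  # (batch, channels, height, width)
```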
39
 
40
+ 2. Convert a standard YOLOX model by -f. When using -f, the above command is equivalent to:
41
 
42
  ```shell
43
  python3 tools/export_onnx.py --output-name yolox_s.onnx -f exps/yolox_s.py -c yolox_s.pth.tar
 
53
 
54
  Step1.
55
  ```shell
56
+ cd <YOLOX_HOME>/demo/ONNXRuntime
57
  ```
58
 
59
  Step2.
demo/OpenVINO/README.md CHANGED
@@ -1,4 +1,4 @@
1
- ## YOLOX on OpenVINO
2
 
3
- * [C++ Demo]()
4
- * [Python Demo]()
 
1
+ ## YOLOX for OpenVINO
2
 
3
+ * [C++ Demo](./cpp)
4
+ * [Python Demo](./python)
demo/OpenVINO/cpp/README.md CHANGED
@@ -1,17 +1,17 @@
1
- # User Guide for Deploy YOLOX on OpenVINO
2
 
3
  This tutorial includes a C++ demo for OpenVINO, as well as some converted models.
4
 
5
  ### Download OpenVINO models.
6
- | Model | Parameters | GFLOPs | Test Size | mAP |
7
- |:------| :----: | :----: | :---: | :---: |
8
- | [YOLOX-Nano](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.res101.fpn.coco.800size.1x) | 0.91M | 1.08 | 416x416 | 25.3 |
9
- | [YOLOX-Tiny](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.fpn.coco.800size.1x) | 5.06M | 6.45 | 416x416 |31.7 |
10
- | [YOLOX-S](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 9.0M | 26.8 | 640x640 |39.6 |
11
- | [YOLOX-M](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 25.3M | 73.8 | 640x640 |46.4 |
12
- | [YOLOX-L](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 54.2M | 155.6 | 640x640 |50.0 |
13
- | [YOLOX-X](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 99.1M | 281.9 | 640x640 |51.2 |
14
- | [YOLOX-Darknet53](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 63.72M | 185.3 | 640x640 |47.3 |
15
 
16
  ## Install OpenVINO Toolkit
17
 
@@ -51,7 +51,7 @@ source ~/.bashrc
51
 
52
  1. Export ONNX model
53
 
54
- Please refer to the [ONNX toturial]() for more details. **Note that you should set --opset to 10, otherwise your next step will fail.**
55
 
56
  2. Convert ONNX to OpenVINO
57
 
 
1
+ # YOLOX-OpenVINO in C++
2
 
3
  This tutorial includes a C++ demo for OpenVINO, as well as some converted models.
4
 
5
  ### Download OpenVINO models.
6
+ | Model | Parameters | GFLOPs | Test Size | mAP | Weights |
7
+ |:------| :----: | :----: | :---: | :---: | :---: |
8
+ | [YOLOX-Nano](../../../exps/nano.py) | 0.91M | 1.08 | 416x416 | 25.3 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EeWY57o5wQZFtXYd1KJw6Z8B4vxZru649XxQHYIFgio3Qw?e=ZS81ce) |
9
+ | [YOLOX-Tiny](../../../exps/yolox_tiny.py) | 5.06M | 6.45 | 416x416 |31.7 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/ETfvOoCXdVZNinoSpKA_sEYBIQVqfjjF5_M6VvHRnLVcsA?e=STL1pi) |
10
+ | [YOLOX-S](../../../exps/yolox_s.py) | 9.0M | 26.8 | 640x640 |39.6 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EXUjf3PQnbBLrxNrXPueqaIBzVZOrYQOnJpLK1Fytj5ssA?e=GK0LOM) |
11
+ | [YOLOX-M](../../../exps/yolox_m.py) | 25.3M | 73.8 | 640x640 |46.4 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EcoT1BPpeRpLvE_4c441zn8BVNCQ2naxDH3rho7WqdlgLQ?e=95VaM9) |
12
+ | [YOLOX-L](../../../exps/yolox_l.py) | 54.2M | 155.6 | 640x640 |50.0 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EZvmn-YLRuVPh0GAP_w3xHMB2VGvrKqQXyK_Cv5yi_DXUg?e=YRh6Eq) |
13
+ | [YOLOX-Darknet53](../../../exps/yolov3.py) | 63.72M | 185.3 | 640x640 |47.3 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EQP8LSroikFHuwX0jFRetmcBOCDWSFmylHxolV7ezUPXGw?e=bEw5iq) |
14
+ | [YOLOX-X](../../../exps/yolox_x.py) | 99.1M | 281.9 | 640x640 |51.2 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EZFPnLqiD-xIlt7rcZYDjQgB4YXE9wnq1qaSXQwJrsKbdg?e=83nwEz) |
15
 
16
  ## Install OpenVINO Toolkit
17
 
 
51
 
52
  1. Export ONNX model
53
 
54
+ Please refer to the [ONNX tutorial](../../ONNXRuntime). **Note that you should set --opset to 10, otherwise your next step will fail.**
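For example, combining the export command from the ONNX tutorial with the opset note above gives a sketch like this (double-check the exact opset flag against tools/export_onnx.py):

```shell
python3 tools/export_onnx.py --output-name yolox_s.onnx -n yolox-s -c yolox_s.pth.tar --opset 10
```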
55
 
56
  2. Convert ONNX to OpenVINO
57
 
demo/OpenVINO/python/README.md CHANGED
@@ -1,17 +1,17 @@
1
- # User Guide for Deploy YOLOX on OpenVINO
2
 
3
  This tutorial includes a Python demo for OpenVINO, as well as some converted models.
4
 
5
  ### Download OpenVINO models.
6
- | Model | Parameters | GFLOPs | Test Size | mAP |
7
- |:------| :----: | :----: | :---: | :---: |
8
- | [YOLOX-Nano](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.res101.fpn.coco.800size.1x) | 0.91M | 1.08 | 416x416 | 25.3 |
9
- | [YOLOX-Tiny](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.fpn.coco.800size.1x) | 5.06M | 6.45 | 416x416 |31.7 |
10
- | [YOLOX-S](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 9.0M | 26.8 | 640x640 |39.6 |
11
- | [YOLOX-M](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 25.3M | 73.8 | 640x640 |46.4 |
12
- | [YOLOX-L](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 54.2M | 155.6 | 640x640 |50.0 |
13
- | [YOLOX-X](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 99.1M | 281.9 | 640x640 |51.2 |
14
- | [YOLOX-Darknet53](https://github.com/Joker316701882/OTA/tree/main/playground/detection/coco/ota.x101.dcnv2.fpn.coco.800size.1x) | 63.72M | 185.3 | 640x640 |47.3 |
15
 
16
  ## Install OpenVINO Toolkit
17
 
@@ -51,7 +51,7 @@ source ~/.bashrc
51
 
52
  1. Export ONNX model
53
 
54
- Please refer to the [ONNX toturial]() for more details. **Note that you should set --opset to 10, otherwise your next step will fail.**
55
 
56
  2. Convert ONNX to OpenVINO
57
 
@@ -71,7 +71,7 @@ source ~/.bashrc
71
  ```
72
  For example:
73
  ```shell
74
- python3 mo.py --input_model yolox.onnx --input_shape (1,3,640,640) --data_type FP16
75
  ```
76
 
77
  ## Demo
 
1
+ # YOLOX-OpenVINO in Python
2
 
3
  This tutorial includes a Python demo for OpenVINO, as well as some converted models.
4
 
5
  ### Download OpenVINO models.
6
+ | Model | Parameters | GFLOPs | Test Size | mAP | Weights |
7
+ |:------| :----: | :----: | :---: | :---: | :---: |
8
+ | [YOLOX-Nano](../../../exps/nano.py) | 0.91M | 1.08 | 416x416 | 25.3 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EeWY57o5wQZFtXYd1KJw6Z8B4vxZru649XxQHYIFgio3Qw?e=ZS81ce) |
9
+ | [YOLOX-Tiny](../../../exps/yolox_tiny.py) | 5.06M | 6.45 | 416x416 |31.7 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/ETfvOoCXdVZNinoSpKA_sEYBIQVqfjjF5_M6VvHRnLVcsA?e=STL1pi) |
10
+ | [YOLOX-S](../../../exps/yolox_s.py) | 9.0M | 26.8 | 640x640 |39.6 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EXUjf3PQnbBLrxNrXPueqaIBzVZOrYQOnJpLK1Fytj5ssA?e=GK0LOM) |
11
+ | [YOLOX-M](../../../exps/yolox_m.py) | 25.3M | 73.8 | 640x640 |46.4 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EcoT1BPpeRpLvE_4c441zn8BVNCQ2naxDH3rho7WqdlgLQ?e=95VaM9) |
12
+ | [YOLOX-L](../../../exps/yolox_l.py) | 54.2M | 155.6 | 640x640 |50.0 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EZvmn-YLRuVPh0GAP_w3xHMB2VGvrKqQXyK_Cv5yi_DXUg?e=YRh6Eq) |
13
+ | [YOLOX-Darknet53](../../../exps/yolov3.py) | 63.72M | 185.3 | 640x640 |47.3 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EQP8LSroikFHuwX0jFRetmcBOCDWSFmylHxolV7ezUPXGw?e=bEw5iq) |
14
+ | [YOLOX-X](../../../exps/yolox_x.py) | 99.1M | 281.9 | 640x640 |51.2 | [Download](https://megvii-my.sharepoint.cn/:u:/g/personal/gezheng_megvii_com/EZFPnLqiD-xIlt7rcZYDjQgB4YXE9wnq1qaSXQwJrsKbdg?e=83nwEz) |
15
 
16
  ## Install OpenVINO Toolkit
17
 
 
51
 
52
  1. Export ONNX model
53
 
54
+ Please refer to the [ONNX tutorial](../../ONNXRuntime). **Note that you should set --opset to 10, otherwise your next step will fail.**
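For example, combining the export command from the ONNX tutorial with the opset note above gives a sketch like this (double-check the exact opset flag against tools/export_onnx.py):

```shell
python3 tools/export_onnx.py --output-name yolox_s.onnx -n yolox-s -c yolox_s.pth.tar --opset 10
```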
55
 
56
  2. Convert ONNX to OpenVINO
57
 
 
71
  ```
72
  For example:
73
  ```shell
74
+ python3 mo.py --input_model yolox.onnx --input_shape [1,3,640,640] --data_type FP16 --output_dir converted_output
75
  ```
76
 
77
  ## Demo
demo/TensorRT/cpp/README.md CHANGED
@@ -1,4 +1,4 @@
1
- # User Guide for Deploy YOLOX on TensorRT C++
2
 
3
  As YOLOX models are easy to convert to TensorRT using the [torch2trt repo](https://github.com/NVIDIA-AI-IOT/torch2trt),
4
  our C++ demo does not include model conversion or construction, unlike other TensorRT demos.
@@ -6,7 +6,7 @@ our C++ demo will not include the model converting or constructing like other te
6
 
7
  ## Step 1: Prepare serialized engine file
8
 
9
- Follow the trt [python demo README](../Python/README.md) to convert and save the serialized engine file.
10
 
11
 
12
  ## Step 2: build the demo
 
1
+ # YOLOX-TensorRT in C++
2
 
3
  As YOLOX models are easy to convert to TensorRT using the [torch2trt repo](https://github.com/NVIDIA-AI-IOT/torch2trt),
4
  our C++ demo does not include model conversion or construction, unlike other TensorRT demos.
 
6
 
7
  ## Step 1: Prepare serialized engine file
8
 
9
+ Follow the trt [python demo README](../python/README.md) to convert and save the serialized engine file.
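For reference, the conversion command documented in that Python README (also updated in this commit) is:

```shell
python tools/trt.py -n yolox-s -c your_ckpt.pth.tar
```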
10
 
11
 
12
  ## Step 2: build the demo
demo/TensorRT/python/README.md CHANGED
@@ -1,4 +1,4 @@
1
- # User Guide for Deploy YOLOX on TensorRT
2
 
3
  This tutorial includes a Python demo for TensorRT.
4
 
@@ -12,21 +12,21 @@ YOLOX models can be easily conveted to TensorRT models using torch2trt
12
 
13
  If you want to convert our model, use the flag -n to specify a model name:
14
  ```shell
15
- python tools/deploy/trt.py -n <YOLOX_MODEL_NAME> -c <YOLOX_CHECKPOINT>
16
  ```
17
  For example:
18
  ```shell
19
- python tools/deploy/trt.py -n yolox-s -c your_ckpt.pth.tar
20
  ```
21
  <YOLOX_MODEL_NAME> can be: yolox-nano, yolox-tiny, yolox-s, yolox-m, yolox-l, yolox-x.
22
 
23
  If you want to convert your customized model, use the flag -f to specify your exp file:
24
  ```shell
25
- python tools/deploy/trt.py -f <YOLOX_EXP_FILE> -c <YOLOX_CHECKPOINT>
26
  ```
27
  For example:
28
  ```shell
29
- python tools/deploy/trt.py -f /path/to/your/yolox/exps/yolox_s.py -c your_ckpt.pth.tar
30
  ```
31
  *yolox_s.py* can be any exp file modified by you.
32
 
 
1
+ # YOLOX-TensorRT in Python
2
 
3
  This tutorial includes a Python demo for TensorRT.
4
 
 
12
 
13
  If you want to convert our model, use the flag -n to specify a model name:
14
  ```shell
15
+ python tools/trt.py -n <YOLOX_MODEL_NAME> -c <YOLOX_CHECKPOINT>
16
  ```
17
  For example:
18
  ```shell
19
+ python tools/trt.py -n yolox-s -c your_ckpt.pth.tar
20
  ```
21
  <YOLOX_MODEL_NAME> can be: yolox-nano, yolox-tiny, yolox-s, yolox-m, yolox-l, yolox-x.
22
 
23
  If you want to convert your customized model, use the flag -f to specify your exp file:
24
  ```shell
25
+ python tools/trt.py -f <YOLOX_EXP_FILE> -c <YOLOX_CHECKPOINT>
26
  ```
27
  For example:
28
  ```shell
29
+ python tools/trt.py -f /path/to/your/yolox/exps/yolox_s.py -c your_ckpt.pth.tar
30
  ```
31
  *yolox_s.py* can be any exp file modified by you.
32
 
docs/.gitkeep ADDED
File without changes
docs/train_custom_data.md ADDED
@@ -0,0 +1,118 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Training on Custom Data
2
+ This page explains how to train YOLOX on your own custom data.
3
+
4
+ As a concrete example, we fine-tune the YOLOX-S model on the VOC dataset to give a clearer guide.
5
+
6
+ ## 0. Before you start
7
+ Clone this repo and follow the [README](../README.md) to install YOLOX.
8
+
9
+ ## 1. Create your own dataset
10
+ **Step 1** Prepare your own dataset with images and labels first. For labeling images, you may use a tool like [Labelme](https://github.com/wkentaro/labelme) or [CVAT](https://github.com/openvinotoolkit/cvat).
11
+
12
+ **Step 2** Write the corresponding Dataset class, which loads images and labels through the "\_\_getitem\_\_" method. We currently support the COCO format and the VOC format.
13
+
14
+ You can also write your own Dataset class. Let's take the [VOC](../yolox/data/datasets/voc.py#L151) Dataset file as an example:
15
+ ```python
16
+ @Dataset.resize_getitem
17
+ def __getitem__(self, index):
18
+ img, target, img_info, img_id = self.pull_item(index)
19
+
20
+ if self.preproc is not None:
21
+ img, target = self.preproc(img, target, self.input_dim)
22
+
23
+ return img, target, img_info, img_id
24
+
25
+ ```
26
+
27
+ One more thing worth noting: you should also implement the "[pull_item](../yolox/data/datasets/voc.py#L129)" and "[load_anno](../yolox/data/datasets/voc.py#L121)" methods for the Mosaic and MixUp augmentations, as sketched below.
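A minimal sketch of what these two methods provide, simplified from the VOC dataset (yolox/data/datasets/voc.py) as updated in this commit (error handling omitted):

```python
def load_anno(self, index):
    # parse the annotation only (no image decoding); Mosaic/MixUp use this to
    # pick paste candidates without paying the image-loading cost
    img_id = self.ids[index]
    target = ET.parse(self._annopath % img_id).getroot()
    if self.target_transform is not None:
        target = self.target_transform(target)
    return target

def pull_item(self, index):
    # return the raw (un-preprocessed) image plus its annotations and meta info,
    # which Mosaic/MixUp stitch together before the final preproc is applied
    img_id = self.ids[index]
    img = cv2.imread(self._imgpath % img_id, cv2.IMREAD_COLOR)
    height, width, _ = img.shape
    target = self.load_anno(index)
    return img, target, (width, height), index
```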
28
+
29
+ **Step 3** Prepare the evaluator. We currently have [COCO evaluator](../yolox/evaluators/coco_evaluator.py) and [VOC evaluator](../yolox/evaluators/voc_evaluator.py).
30
+ If you have your own format data or evaluation metric, you may write your own evaluator.
31
+
32
+ ## 2. Create your Exp file to control everything
33
+ We put everything involved in a model into one single Exp file, including model settings, training settings, and testing settings.
34
+
35
+ A complete Exp file is at [yolox_base.py](../yolox/exp/yolox_base.py). It may be too long to write for every exp, so you can inherit the base Exp file and only override the parts you change.
36
+
37
+ Let's again take the [VOC Exp file](../exps/example/yolox_voc/yolox_voc_s.py) as an example.
38
+
39
+ We select the YOLOX-S model here, so we should change the network depth and width. VOC has only 20 classes, so we should also change num_classes.
40
+
41
+ These configs are changed in the \_\_init\_\_() method:
42
+ ```python
43
+ class Exp(MyExp):
44
+ def __init__(self):
45
+ super(Exp, self).__init__()
46
+ self.num_classes = 20
47
+ self.depth = 0.33
48
+ self.width = 0.50
49
+ self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
50
+ ```
51
+
52
+ Besides, you should also override the dataset and evaluator prepared above before training the model on your own data.
53
+
54
+ Please see "[get_data_loader](../exps/example/yolox_voc/yolox_voc_s.py#L20)", "[get_eval_loader](../exps/example/yolox_voc/yolox_voc_s.py#L82)", and "[get_evaluator](../exps/example/yolox_voc/yolox_voc_s.py#L113)" for more details.
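For instance, the evaluator override in the VOC Exp file added in this commit boils down to the following sketch:

```python
def get_evaluator(self, batch_size, is_distributed, testdev=False):
    from yolox.evaluators import VOCEvaluator

    # reuse the VOC validation loader defined above and wrap it in the VOC evaluator
    val_loader = self.get_eval_loader(batch_size, is_distributed, testdev=testdev)
    return VOCEvaluator(
        dataloader=val_loader,
        img_size=self.test_size,
        confthre=self.test_conf,
        nmsthre=self.nmsthre,
        num_classes=self.num_classes,
    )
```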
55
+
56
+ ## 3. Train
57
+ Except for special cases, we always recommend using our [COCO pretrained weights](../README.md) for initialization.
58
+
59
+ Once you get the Exp file and the COCO pretrained weights we provided, you can train your own model by the following command:
60
+ ```bash
61
+ python tools/train.py -f /path/to/your/Exp/file -d 8 -b 64 --fp16 -o -c /path/to/the/pretrained/weights
62
+ ```
63
+
64
+ or take the YOLOX-S VOC training for example:
65
+ ```bash
66
+ python tools/train.py -f exps/example/yolox_voc/yolox_voc_s.py -d 8 -b 64 --fp16 -o -c /path/to/yolox_s.pth.tar
67
+ ```
68
+
69
+ (Don't worry about the different detection-head shapes between the pretrained weights and your own model; we handle that automatically.)
70
+
71
+ ## 4. Tips for Best Training Results
72
+
73
+ As YOLOX is an anchor-free detector with only a few hyper-parameters, most of the time good results can be obtained with no changes to the model or training settings.
74
+ We thus always recommend you first train with all default training settings.
75
+
76
+ If you don't get good results at first, there are steps you can consider taking to improve them.
77
+
78
+ **Model Selection** We provide YOLOX-Nano, YOLOX-Tiny, and YOLOX-S for mobile deployments, and YOLOX-M/L/X for cloud or high-performance GPU deployments.
79
+
80
+ If your deployment runs into compatibility issues, we recommend YOLOX-Darknet53.
81
+
82
+ **Training Configs** If your training overfits early, then you can reduce max\_epochs or decrease the base\_lr and min\_lr\_ratio in your Exp file:
83
+ ```python
84
+ # -------------- training config --------------------- #
85
+ self.warmup_epochs = 5
86
+ self.max_epoch = 300
87
+ self.warmup_lr = 0
88
+ self.basic_lr_per_img = 0.01 / 64.0
89
+ self.scheduler = "yoloxwarmcos"
90
+ self.no_aug_epochs = 15
91
+ self.min_lr_ratio = 0.05
92
+ self.ema = True
93
+
94
+ self.weight_decay = 5e-4
95
+ self.momentum = 0.9
96
+ ```
97
+
98
+ **Aug Configs** You may also change the degree of the augmentations.
99
+
100
+ Generally, for small models you should weaken the augmentation, while for large models or small datasets you may enhance the augmentation in your Exp file:
101
+ ```python
102
+ # --------------- transform config ----------------- #
103
+ self.degrees = 10.0
104
+ self.translate = 0.1
105
+ self.scale = (0.1, 2)
106
+ self.mscale = (0.8, 1.6)
107
+ self.shear = 2.0
108
+ self.perspective = 0.0
109
+ self.enable_mixup = True
110
+ ```
111
+
112
+ **Design your own detector** You may refer to our [Arxiv]() paper for details and suggestions for designing your own detector.
113
+
114
+
115
+
116
+
117
+
118
+
exps/example/yolox_voc/yolox_voc_s.py ADDED
@@ -0,0 +1,124 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # encoding: utf-8
2
+ import os
3
+ import random
4
+ import torch
5
+ import torch.nn as nn
6
+ import torch.distributed as dist
7
+
8
+ from yolox.exp import Exp as MyExp
9
+
10
+
11
+ class Exp(MyExp):
12
+ def __init__(self):
13
+ super(Exp, self).__init__()
14
+ self.num_classes = 20
15
+ self.depth = 0.33
16
+ self.width = 0.50
17
+ self.eval_interval = 2
18
+ self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
19
+
20
+ def get_data_loader(self, batch_size, is_distributed, no_aug=False):
21
+ from yolox.data import (
22
+ VOCDetection,
23
+ TrainTransform,
24
+ YoloBatchSampler,
25
+ DataLoader,
26
+ InfiniteSampler,
27
+ MosaicDetection,
28
+ )
29
+
30
+ dataset = VOCDetection(
31
+ data_dir='/data/Datasets/VOCdevkit',
32
+ image_sets=[('2007', 'trainval'), ('2012', 'trainval')],
33
+ img_size=self.input_size,
34
+ preproc=TrainTransform(
35
+ rgb_means=(0.485, 0.456, 0.406),
36
+ std=(0.229, 0.224, 0.225),
37
+ max_labels=50,
38
+ ),
39
+ )
40
+
41
+ dataset = MosaicDetection(
42
+ dataset,
43
+ mosaic=not no_aug,
44
+ img_size=self.input_size,
45
+ preproc=TrainTransform(
46
+ rgb_means=(0.485, 0.456, 0.406),
47
+ std=(0.229, 0.224, 0.225),
48
+ max_labels=120,
49
+ ),
50
+ degrees=self.degrees,
51
+ translate=self.translate,
52
+ scale=self.scale,
53
+ shear=self.shear,
54
+ perspective=self.perspective,
55
+ enable_mixup=self.enable_mixup,
56
+ )
57
+
58
+ self.dataset = dataset
59
+
60
+ if is_distributed:
61
+ batch_size = batch_size // dist.get_world_size()
62
+ sampler = InfiniteSampler(
63
+ len(self.dataset), seed=self.seed if self.seed else 0
64
+ )
65
+ else:
66
+ sampler = torch.utils.data.RandomSampler(self.dataset)
67
+
68
+ batch_sampler = YoloBatchSampler(
69
+ sampler=sampler,
70
+ batch_size=batch_size,
71
+ drop_last=False,
72
+ input_dimension=self.input_size,
73
+ mosaic=not no_aug,
74
+ )
75
+
76
+ dataloader_kwargs = {"num_workers": self.data_num_workers, "pin_memory": True}
77
+ dataloader_kwargs["batch_sampler"] = batch_sampler
78
+ train_loader = DataLoader(self.dataset, **dataloader_kwargs)
79
+
80
+ return train_loader
81
+
82
+ def get_eval_loader(self, batch_size, is_distributed, testdev=False):
83
+ from yolox.data import VOCDetection, ValTransform
84
+
85
+ valdataset = VOCDetection(
86
+ data_dir='/data/Datasets/VOCdevkit',
87
+ image_sets=[('2007', 'test')],
88
+ img_size=self.test_size,
89
+ preproc=ValTransform(
90
+ rgb_means=(0.485, 0.456, 0.406),
91
+ std=(0.229, 0.224, 0.225),
92
+ ),
93
+ )
94
+
95
+ if is_distributed:
96
+ batch_size = batch_size // dist.get_world_size()
97
+ sampler = torch.utils.data.distributed.DistributedSampler(
98
+ valdataset, shuffle=False
99
+ )
100
+ else:
101
+ sampler = torch.utils.data.SequentialSampler(valdataset)
102
+
103
+ dataloader_kwargs = {
104
+ "num_workers": self.data_num_workers,
105
+ "pin_memory": True,
106
+ "sampler": sampler,
107
+ }
108
+ dataloader_kwargs["batch_size"] = batch_size
109
+ val_loader = torch.utils.data.DataLoader(valdataset, **dataloader_kwargs)
110
+
111
+ return val_loader
112
+
113
+ def get_evaluator(self, batch_size, is_distributed, testdev=False):
114
+ from yolox.evaluators import VOCEvaluator
115
+
116
+ val_loader = self.get_eval_loader(batch_size, is_distributed, testdev=testdev)
117
+ evaluator = VOCEvaluator(
118
+ dataloader=val_loader,
119
+ img_size=self.test_size,
120
+ confthre=self.test_conf,
121
+ nmsthre=self.nmsthre,
122
+ num_classes=self.num_classes,
123
+ )
124
+ return evaluator
requirements.txt CHANGED
@@ -12,3 +12,6 @@ Pillow
12
  skimage
13
  thop
14
  ninja
 
 
 
 
12
  skimage
13
  thop
14
  ninja
15
+ tabulate
16
+ tensorboard
17
+ onnxruntime
tools/demo.py CHANGED
@@ -66,12 +66,6 @@ def make_parser():
66
  action="store_true",
67
  help="Using TensorRT model for testing.",
68
  )
69
- parser.add_argument(
70
- "opts",
71
- help="Modify config options using the command-line",
72
- default=None,
73
- nargs=argparse.REMAINDER,
74
- )
75
  return parser
76
 
77
 
@@ -137,13 +131,14 @@ class Predictor(object):
137
  def visual(self, output, img_info, cls_conf=0.35):
138
  ratio = img_info['ratio']
139
  img = img_info['raw_img']
 
 
140
  output = output.cpu()
141
 
142
  bboxes = output[:, 0:4]
143
 
144
  # preprocessing: resize
145
  bboxes /= ratio
146
- bboxes = xyxy2xywh(bboxes)
147
 
148
  cls = output[:, 6]
149
  scores = output[:, 4] * output[:, 5]
@@ -193,7 +188,7 @@ def imageflow_demo(predictor, vis_folder, current_time, args):
193
  ret_val, frame = cap.read()
194
  if ret_val:
195
  outputs, img_info = predictor.inference(frame)
196
- result_frame = predictor.visualize(outputs[0], img_info)
197
  if args.save_result:
198
  vid_writer.write(result_frame)
199
  ch = cv2.waitKey(1)
@@ -258,7 +253,7 @@ def main(exp, args):
258
  "TensorRT model is not support model fusing!"
259
  trt_file = os.path.join(file_name, "model_trt.pth")
260
  assert os.path.exists(trt_file), (
261
- "TensorRT model is not found!\n Run python3 yolox/deploy/trt.py first!"
262
  )
263
  model.head.decode_in_inference = False
264
  decoder = model.head.decode_outputs
 
66
  action="store_true",
67
  help="Using TensorRT model for testing.",
68
  )
 
 
 
 
 
 
69
  return parser
70
 
71
 
 
131
  def visual(self, output, img_info, cls_conf=0.35):
132
  ratio = img_info['ratio']
133
  img = img_info['raw_img']
134
+ if output is None:
135
+ return img
136
  output = output.cpu()
137
 
138
  bboxes = output[:, 0:4]
139
 
140
  # preprocessing: resize
141
  bboxes /= ratio
 
142
 
143
  cls = output[:, 6]
144
  scores = output[:, 4] * output[:, 5]
 
188
  ret_val, frame = cap.read()
189
  if ret_val:
190
  outputs, img_info = predictor.inference(frame)
191
+ result_frame = predictor.visual(outputs[0], img_info)
192
  if args.save_result:
193
  vid_writer.write(result_frame)
194
  ch = cv2.waitKey(1)
 
253
  "TensorRT model is not support model fusing!"
254
  trt_file = os.path.join(file_name, "model_trt.pth")
255
  assert os.path.exists(trt_file), (
256
+ "TensorRT model is not found!\n Run python3 tools/trt.py first!"
257
  )
258
  model.head.decode_in_inference = False
259
  decoder = model.head.decode_outputs
yolox/data/datasets/coco.py CHANGED
@@ -46,29 +46,20 @@ class COCODataset(Dataset):
46
  cats = self.coco.loadCats(self.coco.getCatIds())
47
  self._classes = tuple([c["name"] for c in cats])
48
  self.name = name
49
- self.max_labels = 50
50
  self.img_size = img_size
51
  self.preproc = preproc
52
 
53
  def __len__(self):
54
  return len(self.ids)
55
 
56
- def pull_item(self, index):
57
  id_ = self.ids[index]
 
 
58
 
59
  im_ann = self.coco.loadImgs(id_)[0]
60
  width = im_ann["width"]
61
  height = im_ann["height"]
62
- anno_ids = self.coco.getAnnIds(imgIds=[int(id_)], iscrowd=False)
63
- annotations = self.coco.loadAnns(anno_ids)
64
-
65
- # load image and preprocess
66
- img_file = os.path.join(
67
- self.data_dir, self.name, "{:012}".format(id_) + ".jpg"
68
- )
69
-
70
- img = cv2.imread(img_file)
71
- assert img is not None
72
 
73
  # load labels
74
  valid_objs = []
@@ -90,6 +81,25 @@ class COCODataset(Dataset):
90
  res[ix, 0:4] = obj["clean_bbox"]
91
  res[ix, 4] = cls
92
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
93
  img_info = (height, width)
94
 
95
  return img, res, img_info, id_
@@ -105,7 +115,7 @@ class COCODataset(Dataset):
105
  Returns:
106
  img (numpy.ndarray): pre-processed image
107
  padded_labels (torch.Tensor): pre-processed label data.
108
- The shape is :math:`[self.max_labels, 5]`.
109
  each label consists of [class, xc, yc, w, h]:
110
  class (float): class index.
111
  xc, yc (float) : center of bbox whose values range from 0 to 1.
 
46
  cats = self.coco.loadCats(self.coco.getCatIds())
47
  self._classes = tuple([c["name"] for c in cats])
48
  self.name = name
 
49
  self.img_size = img_size
50
  self.preproc = preproc
51
 
52
  def __len__(self):
53
  return len(self.ids)
54
 
55
+ def load_anno(self, index):
56
  id_ = self.ids[index]
57
+ anno_ids = self.coco.getAnnIds(imgIds=[int(id_)], iscrowd=False)
58
+ annotations = self.coco.loadAnns(anno_ids)
59
 
60
  im_ann = self.coco.loadImgs(id_)[0]
61
  width = im_ann["width"]
62
  height = im_ann["height"]
 
 
 
 
 
 
 
 
 
 
63
 
64
  # load labels
65
  valid_objs = []
 
81
  res[ix, 0:4] = obj["clean_bbox"]
82
  res[ix, 4] = cls
83
 
84
+ return res
85
+
86
+ def pull_item(self, index):
87
+ id_ = self.ids[index]
88
+
89
+ im_ann = self.coco.loadImgs(id_)[0]
90
+ width = im_ann["width"]
91
+ height = im_ann["height"]
92
+
93
+ # load image and preprocess
94
+ img_file = os.path.join(
95
+ self.data_dir, self.name, "{:012}".format(id_) + ".jpg"
96
+ )
97
+
98
+ img = cv2.imread(img_file)
99
+ assert img is not None
100
+
101
+ # load anno
102
+ res = self.load_anno(index)
103
  img_info = (height, width)
104
 
105
  return img, res, img_info, id_
 
115
  Returns:
116
  img (numpy.ndarray): pre-processed image
117
  padded_labels (torch.Tensor): pre-processed label data.
118
+ The shape is :math:`[max_labels, 5]`.
119
  each label consists of [class, xc, yc, w, h]:
120
  class (float): class index.
121
  xc, yc (float) : center of bbox whose values range from 0 to 1.
yolox/data/datasets/mosaicdetection.py CHANGED
@@ -93,7 +93,6 @@ class MosaicDetection(Dataset):
93
  labels[:, 1] = scale * _labels[:, 1] + padh
94
  labels[:, 2] = scale * _labels[:, 2] + padw
95
  labels[:, 3] = scale * _labels[:, 3] + padh
96
-
97
  labels4.append(labels)
98
 
99
  if len(labels4):
@@ -136,9 +135,7 @@ class MosaicDetection(Dataset):
136
  cp_labels = []
137
  while len(cp_labels) == 0:
138
  cp_index = random.randint(0, self.__len__() - 1)
139
- id_ = self._dataset.ids[cp_index]
140
- anno_ids = self._dataset.coco.getAnnIds(imgIds=[int(id_)], iscrowd=False)
141
- cp_labels = self._dataset.coco.loadAnns(anno_ids)
142
  img, cp_labels, _, _ = self._dataset.pull_item(cp_index)
143
 
144
  if len(img.shape) == 3:
 
93
  labels[:, 1] = scale * _labels[:, 1] + padh
94
  labels[:, 2] = scale * _labels[:, 2] + padw
95
  labels[:, 3] = scale * _labels[:, 3] + padh
 
96
  labels4.append(labels)
97
 
98
  if len(labels4):
 
135
  cp_labels = []
136
  while len(cp_labels) == 0:
137
  cp_index = random.randint(0, self.__len__() - 1)
138
+ cp_labels = self._dataset.load_anno(cp_index)
 
 
139
  img, cp_labels, _, _ = self._dataset.pull_item(cp_index)
140
 
141
  if len(img.shape) == 3:
yolox/data/datasets/voc.py CHANGED
@@ -19,16 +19,6 @@ from yolox.evalutors.voc_eval import voc_eval
19
  from .datasets_wrapper import Dataset
20
  from .voc_classes import VOC_CLASSES
21
 
22
- # for making bounding boxes pretty
23
- COLORS = (
24
- (255, 0, 0, 128),
25
- (0, 255, 0, 128),
26
- (0, 0, 255, 128),
27
- (0, 255, 255, 128),
28
- (255, 0, 255, 128),
29
- (255, 255, 0, 128),
30
- )
31
-
32
 
33
  class AnnotationTransform(object):
34
 
@@ -100,16 +90,17 @@ class VOCDetection(Dataset):
100
 
101
  def __init__(
102
  self,
103
- root,
104
- image_sets,
 
105
  preproc=None,
106
  target_transform=AnnotationTransform(),
107
- input_dim=(416, 416),
108
  dataset_name="VOC0712",
109
  ):
110
- super().__init__(input_dim)
111
- self.root = root
112
  self.image_set = image_sets
 
113
  self.preproc = preproc
114
  self.target_transform = target_transform
115
  self.name = dataset_name
@@ -125,59 +116,16 @@ class VOCDetection(Dataset):
125
  ):
126
  self.ids.append((rootpath, line.strip()))
127
 
128
- @Dataset.resize_getitem
129
- def __getitem__(self, index):
130
- img_id = self.ids[index]
131
- target = ET.parse(self._annopath % img_id).getroot()
132
- img = cv2.imread(self._imgpath % img_id, cv2.IMREAD_COLOR)
133
- # img = Image.open(self._imgpath % img_id).convert('RGB')
134
-
135
- height, width, _ = img.shape
136
-
137
- if self.target_transform is not None:
138
- target = self.target_transform(target)
139
-
140
- if self.preproc is not None:
141
- img, target = self.preproc(img, target, self.input_dim)
142
- # print(img.size())
143
-
144
- img_info = (width, height)
145
-
146
- return img, target, img_info, img_id
147
-
148
  def __len__(self):
149
  return len(self.ids)
150
 
151
- def pull_image(self, index):
152
- """Returns the original image object at index in PIL form
153
-
154
- Note: not using self.__getitem__(), as any transformations passed in
155
- could mess up this functionality.
156
-
157
- Argument:
158
- index (int): index of img to show
159
- Return:
160
- PIL img
161
- """
162
  img_id = self.ids[index]
163
- return cv2.imread(self._imgpath % img_id, cv2.IMREAD_COLOR)
164
-
165
- def pull_anno(self, index):
166
- """Returns the original annotation of image at index
167
-
168
- Note: not using self.__getitem__(), as any transformations passed in
169
- could mess up this functionality.
170
 
171
- Argument:
172
- index (int): index of img to get annotation of
173
- Return:
174
- list: [img_id, [(label, bbox coords),...]]
175
- eg: ('001718', [('dog', (96, 13, 438, 332))])
176
- """
177
- img_id = self.ids[index]
178
- anno = ET.parse(self._annopath % img_id).getroot()
179
- gt = self.target_transform(anno, 1, 1)
180
- return img_id[1], gt
181
 
182
  def pull_item(self, index):
183
  """Returns the original image and target at an index for mixup
@@ -191,14 +139,21 @@ class VOCDetection(Dataset):
191
  img, target
192
  """
193
  img_id = self.ids[index]
194
- target = ET.parse(self._annopath % img_id).getroot()
195
  img = cv2.imread(self._imgpath % img_id, cv2.IMREAD_COLOR)
196
-
197
  height, width, _ = img.shape
198
 
 
 
199
  img_info = (width, height)
200
- if self.target_transform is not None:
201
- target = self.target_transform(target)
 
 
 
 
 
 
 
202
 
203
  return img, target, img_info, img_id
204
 
@@ -212,7 +167,7 @@ class VOCDetection(Dataset):
212
  all_boxes[class][image] = [] or np.array of shape #dets x 5
213
  """
214
  self._write_voc_results_file(all_boxes)
215
- IouTh = np.linspace(0.5, 0.95, np.round((0.95 - 0.5) / 0.05) + 1, endpoint=True)
216
  mAPs = []
217
  for iou in IouTh:
218
  mAP = self._do_python_eval(output_dir, iou)
@@ -270,7 +225,7 @@ class VOCDetection(Dataset):
270
  aps = []
271
  # The PASCAL VOC metric changed in 2010
272
  use_07_metric = True if int(self._year) < 2010 else False
273
- print("VOC07 metric? " + ("Yes" if use_07_metric else "No"))
274
  if output_dir is not None and not os.path.isdir(output_dir):
275
  os.mkdir(output_dir)
276
  for i, cls in enumerate(VOC_CLASSES):
 
19
  from .datasets_wrapper import Dataset
20
  from .voc_classes import VOC_CLASSES
21
 
 
 
 
 
 
 
 
 
 
 
22
 
23
  class AnnotationTransform(object):
24
 
 
90
 
91
  def __init__(
92
  self,
93
+ data_dir,
94
+ image_sets=[('2007', 'trainval'), ('2012', 'trainval')],
95
+ img_size=(416, 416),
96
  preproc=None,
97
  target_transform=AnnotationTransform(),
 
98
  dataset_name="VOC0712",
99
  ):
100
+ super().__init__(img_size)
101
+ self.root = data_dir
102
  self.image_set = image_sets
103
+ self.img_size = img_size
104
  self.preproc = preproc
105
  self.target_transform = target_transform
106
  self.name = dataset_name
 
116
  ):
117
  self.ids.append((rootpath, line.strip()))
118
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
119
  def __len__(self):
120
  return len(self.ids)
121
 
122
+ def load_anno(self, index):
 
 
 
 
 
 
 
 
 
 
123
  img_id = self.ids[index]
124
+ target = ET.parse(self._annopath % img_id).getroot()
125
+ if self.target_transform is not None:
126
+ target = self.target_transform(target)
 
 
 
 
127
 
128
+ return target
 
 
 
 
 
 
 
 
 
129
 
130
  def pull_item(self, index):
131
  """Returns the original image and target at an index for mixup
 
139
  img, target
140
  """
141
  img_id = self.ids[index]
 
142
  img = cv2.imread(self._imgpath % img_id, cv2.IMREAD_COLOR)
 
143
  height, width, _ = img.shape
144
 
145
+ target = self.load_anno(index)
146
+
147
  img_info = (width, height)
148
+
149
+ return img, target, img_info, index
150
+
151
+ @Dataset.resize_getitem
152
+ def __getitem__(self, index):
153
+ img, target, img_info, img_id = self.pull_item(index)
154
+
155
+ if self.preproc is not None:
156
+ img, target = self.preproc(img, target, self.input_dim)
157
 
158
  return img, target, img_info, img_id
159
 
 
167
  all_boxes[class][image] = [] or np.array of shape #dets x 5
168
  """
169
  self._write_voc_results_file(all_boxes)
170
+ IouTh = np.linspace(0.5, 0.95, int(np.round((0.95 - 0.5) / 0.05)) + 1, endpoint=True)
171
  mAPs = []
172
  for iou in IouTh:
173
  mAP = self._do_python_eval(output_dir, iou)
 
225
  aps = []
226
  # The PASCAL VOC metric changed in 2010
227
  use_07_metric = True if int(self._year) < 2010 else False
228
+ print("Eval IoU : {:.2f}".format(iou))
229
  if output_dir is not None and not os.path.isdir(output_dir):
230
  os.mkdir(output_dir)
231
  for i, cls in enumerate(VOC_CLASSES):
yolox/{evalutors β†’ evaluators}/__init__.py RENAMED
File without changes
yolox/{evalutors β†’ evaluators}/coco_evaluator.py RENAMED
File without changes
yolox/{evalutors β†’ evaluators}/voc_eval.py RENAMED
File without changes
yolox/evaluators/voc_evaluator.py ADDED
@@ -0,0 +1,183 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ # -*- coding:utf-8 -*-
3
+ # Copyright (c) Megvii, Inc. and its affiliates.
4
+
5
+ import sys
6
+ import tempfile
7
+ import time
8
+ from collections import ChainMap
9
+ from loguru import logger
10
+ from tqdm import tqdm
11
+
12
+ import numpy as np
13
+
14
+ import torch
15
+
16
+ from yolox.utils import gather, is_main_process, postprocess, synchronize, time_synchronized
17
+
18
+
19
+ class VOCEvaluator:
20
+ """
21
+ VOC AP Evaluation class.
22
+ """
23
+
24
+ def __init__(
25
+ self, dataloader, img_size, confthre, nmsthre, num_classes,
26
+ ):
27
+ """
28
+ Args:
29
+ dataloader (Dataloader): evaluate dataloader.
30
+ img_size (int): image size after preprocess. images are resized
31
+ to squares whose shape is (img_size, img_size).
32
+ confthre (float): confidence threshold ranging from 0 to 1, which
33
+ is defined in the config file.
34
+ nmsthre (float): IoU threshold of non-max supression ranging from 0 to 1.
35
+ """
36
+ self.dataloader = dataloader
37
+ self.img_size = img_size
38
+ self.confthre = confthre
39
+ self.nmsthre = nmsthre
40
+ self.num_classes = num_classes
41
+ self.num_images = len(dataloader.dataset)
42
+
43
+ def evaluate(
44
+ self, model, distributed=False, half=False, trt_file=None, decoder=None, test_size=None
45
+ ):
46
+ """
47
+ VOC average precision (AP) Evaluation. Iterate inference on the test dataset
48
+ and the results are evaluated by the VOC metric.
49
+
50
+ NOTE: This function will change training mode to False, please save states if needed.
51
+
52
+ Args:
53
+ model : model to evaluate.
54
+
55
+ Returns:
56
+ ap50_95 (float) : COCO style AP of IoU=50:95
57
+ ap50 (float) : VOC 2007 metric AP of IoU=50
58
+ summary (str): summary info of evaluation.
59
+ """
60
+ # TODO half to amp_test
61
+ tensor_type = torch.cuda.HalfTensor if half else torch.cuda.FloatTensor
62
+ model = model.eval()
63
+ if half:
64
+ model = model.half()
65
+ ids = []
66
+ data_list = {}
67
+ progress_bar = tqdm if is_main_process() else iter
68
+
69
+ inference_time = 0
70
+ nms_time = 0
71
+ n_samples = len(self.dataloader) - 1
72
+
73
+ if trt_file is not None:
74
+ from torch2trt import TRTModule
75
+ model_trt = TRTModule()
76
+ model_trt.load_state_dict(torch.load(trt_file))
77
+
78
+ x = torch.ones(1, 3, test_size[0], test_size[1]).cuda()
79
+ model(x)
80
+ model = model_trt
81
+
82
+ for cur_iter, (imgs, _, info_imgs, ids) in enumerate(progress_bar(self.dataloader)):
83
+ with torch.no_grad():
84
+ imgs = imgs.type(tensor_type)
85
+
86
+ # skip the last iters since the batch size might not be enough for batch inference
87
+ is_time_record = cur_iter < len(self.dataloader) - 1
88
+ if is_time_record:
89
+ start = time.time()
90
+
91
+ outputs = model(imgs)
92
+ if decoder is not None:
93
+ outputs = decoder(outputs, dtype=outputs.type())
94
+
95
+ if is_time_record:
96
+ infer_end = time_synchronized()
97
+ inference_time += infer_end - start
98
+
99
+ outputs = postprocess(
100
+ outputs, self.num_classes, self.confthre, self.nmsthre
101
+ )
102
+ if is_time_record:
103
+ nms_end = time_synchronized()
104
+ nms_time += nms_end - infer_end
105
+
106
+ data_list.update(self.convert_to_voc_format(outputs, info_imgs, ids))
107
+
108
+ statistics = torch.cuda.FloatTensor([inference_time, nms_time, n_samples])
109
+ if distributed:
110
+ data_list = gather(data_list, dst=0)
111
+ data_list = ChainMap(*data_list)
112
+ torch.distributed.reduce(statistics, dst=0)
113
+
114
+ eval_results = self.evaluate_prediction(data_list, statistics)
115
+ synchronize()
116
+ return eval_results
117
+
118
+ def convert_to_voc_format(self, outputs, info_imgs, ids):
119
+ predictions = {}
120
+ for (output, img_h, img_w, img_id) in zip(outputs, info_imgs[0], info_imgs[1], ids):
121
+ if output is None:
122
+ predictions[int(img_id)] = (None, None, None)
123
+ continue
124
+ output = output.cpu()
125
+
126
+ bboxes = output[:, 0:4]
127
+
128
+ # preprocessing: resize
129
+ scale = min(self.img_size[0] / float(img_h), self.img_size[1] / float(img_w))
130
+ bboxes /= scale
131
+
132
+ cls = output[:, 6]
133
+ scores = output[:, 4] * output[:, 5]
134
+
135
+ predictions[int(img_id)] = (bboxes, cls, scores)
136
+ return predictions
137
+
138
+ def evaluate_prediction(self, data_dict, statistics):
139
+ if not is_main_process():
140
+ return 0, 0, None
141
+
142
+ logger.info("Evaluate in main process...")
143
+
144
+ inference_time = statistics[0].item()
145
+ nms_time = statistics[1].item()
146
+ n_samples = statistics[2].item()
147
+
148
+ a_infer_time = 1000 * inference_time / (n_samples * self.dataloader.batch_size)
149
+ a_nms_time = 1000 * nms_time / (n_samples * self.dataloader.batch_size)
150
+
151
+ time_info = ", ".join(
152
+ ["Average {} time: {:.2f} ms".format(k, v) for k, v in zip(
153
+ ["forward", "NMS", "inference"],
154
+ [a_infer_time, a_nms_time, (a_infer_time + a_nms_time)]
155
+ )]
156
+ )
157
+
158
+ info = time_info + "\n"
159
+
160
+ all_boxes = [[[] for _ in range(self.num_images)] for _ in range(self.num_classes)]
161
+ for img_num in range(self.num_images):
162
+ bboxes, cls, scores = data_dict[img_num]
163
+ if bboxes is None:
164
+ for j in range(self.num_classes):
165
+ all_boxes[j][img_num] = np.empty([0, 5], dtype=np.float32)
166
+ continue
167
+ for j in range(self.num_classes):
168
+ mask_c = cls == j
169
+ if sum(mask_c) == 0:
170
+ all_boxes[j][img_num] = np.empty([0, 5], dtype=np.float32)
171
+ continue
172
+
173
+ c_dets = torch.cat((bboxes, scores.unsqueeze(1)), dim=1)
174
+ all_boxes[j][img_num] = c_dets[mask_c].numpy()
175
+
176
+ sys.stdout.write(
177
+ "im_eval: {:d}/{:d} \r".format(img_num + 1, self.num_images)
178
+ )
179
+ sys.stdout.flush()
180
+
181
+ with tempfile.TemporaryDirectory() as tempdir:
182
+ mAP50, mAP70 = self.dataloader.dataset.evaluate_detections(all_boxes, tempdir)
183
+ return mAP50, mAP70, info
yolox/evalutors/voc_evaluator.py DELETED
@@ -1,202 +0,0 @@
-#!/usr/bin/env python3
-# -*- coding:utf-8 -*-
-# Copyright (c) Megvii, Inc. and its affiliates.
-
-# NOTE: this file is not finished.
-import sys
-import tempfile
-import time
-from tqdm import tqdm
-
-import torch
-
-from yolox.data.dataset.vocdataset import ValTransform
-from yolox.utils import get_rank, is_main_process, make_pred_vis, make_vis, synchronize
-
-
-def _accumulate_predictions_from_multiple_gpus(predictions_per_gpu):
-    all_predictions = dist.scatter_gather(predictions_per_gpu)
-    if not is_main_process():
-        return
-    # merge the list of dicts
-    predictions = {}
-    for p in all_predictions:
-        predictions.update(p)
-    # convert a dict where the key is the index in a list
-    image_ids = list(sorted(predictions.keys()))
-    if len(image_ids) != image_ids[-1] + 1:
-        print("num_imgs: ", len(image_ids))
-        print("last img_id: ", image_ids[-1])
-        print(
-            "Number of images that were gathered from multiple processes is not "
-            "a contiguous set. Some images might be missing from the evaluation"
-        )
-
-    # convert to a list
-    predictions = [predictions[i] for i in image_ids]
-    return predictions
-
-
-class VOCEvaluator:
-    """
-    COCO AP Evaluation class.
-    All the data in the val2017 dataset are processed \
-    and evaluated by COCO API.
-    """
-
-    def __init__(self, data_dir, img_size, confthre, nmsthre, vis=False):
-        """
-        Args:
-            data_dir (str): dataset root directory
-            img_size (int): image size after preprocess. images are resized \
-                to squares whose shape is (img_size, img_size).
-            confthre (float):
-                confidence threshold ranging from 0 to 1, \
-                which is defined in the config file.
-            nmsthre (float):
-                IoU threshold of non-max supression ranging from 0 to 1.
-        """
-        test_sets = [("2007", "test")]
-        self.dataset = VOCDetection(
-            root=data_dir,
-            image_sets=test_sets,
-            input_dim=img_size,
-            preproc=ValTransform(
-                rgb_means=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)
-            ),
-        )
-        self.num_images = len(self.dataset)
-        self.dataloader = torch.utils.data.DataLoader(
-            self.dataset, batch_size=1, shuffle=False, num_workers=0
-        )
-        self.img_size = img_size
-        self.confthre = confthre
-        self.nmsthre = nmsthre
-        self.vis = vis
-
-    def evaluate(self, model, distributed=False):
-        """
-        COCO average precision (AP) Evaluation. Iterate inference on the test dataset
-        and the results are evaluated by COCO API.
-        Args:
-            model : model object
-        Returns:
-            ap50_95 (float) : calculated COCO AP for IoU=50:95
-            ap50 (float) : calculated COCO AP for IoU=50
-        """
-        if isinstance(model, torch.nn.parallel.DistributedDataParallel):
-            model = model.module
-        model.eval()
-        cuda = torch.cuda.is_available()
-        Tensor = torch.cuda.FloatTensor if cuda else torch.FloatTensor
-
-        ids = []
-        data_dict = []
-        dataiterator = iter(self.dataloader)
-        img_num = 0
-        indices = list(range(self.num_images))
-        dis_indices = indices[get_rank() :: distributed_util.get_world_size()]
-        progress_bar = tqdm if distributed_util.is_main_process() else iter
-        num_classes = 20
-        predictions = {}
-
-        if is_main_process():
-            inference_time = 0
-            nms_time = 0
-            n_samples = len(dis_indices)
-
-        for i in progress_bar(dis_indices):
-            img, _, info_img, id_ = self.dataset[i] # load a batch
-            info_img = [float(info) for info in info_img]
-            ids.append(id_)
-            with torch.no_grad():
-                img = Variable(img.type(Tensor).unsqueeze(0))
-
-                if is_main_process() and i > 9:
-                    start = time.time()
-
-                if self.vis:
-                    outputs, fuse_weights, fused_f = model(img)
-                else:
-                    outputs = model(img)
-
-                if is_main_process() and i > 9:
-                    infer_end = time.time()
-                    inference_time += infer_end - start
-
-                outputs = postprocess(outputs, 20, self.confthre, self.nmsthre)
-
-                if is_main_process() and i > 9:
-                    nms_end = time.time()
-                    nms_time += nms_end - infer_end
-
-            if outputs[0] is None:
-                predictions[i] = (None, None, None)
-                continue
-            outputs = outputs[0].cpu().data
-
-            bboxes = outputs[:, 0:4]
-            bboxes[:, 0::2] *= info_img[0] / self.img_size[0]
-            bboxes[:, 1::2] *= info_img[1] / self.img_size[1]
-            cls = outputs[:, 6]
-            scores = outputs[:, 4] * outputs[:, 5]
-            predictions[i] = (bboxes, cls, scores)
-
-            if self.vis:
-                o_img, _, _, _ = self.dataset.pull_item(i)
-                make_vis("VOC", i, o_img, fuse_weights, fused_f)
-                class_names = self.dataset._classes
-
-                bbox = bboxes.clone()
-                bbox[:, 2] = bbox[:, 2] - bbox[:, 0]
-                bbox[:, 3] = bbox[:, 3] - bbox[:, 1]
-
-                make_pred_vis("VOC", i, o_img, class_names, bbox, cls, scores)
-
-            if is_main_process():
-                o_img, _, _, _ = self.dataset.pull_item(i)
-                class_names = self.dataset._classes
-                bbox = bboxes.clone()
-                bbox[:, 2] = bbox[:, 2] - bbox[:, 0]
-                bbox[:, 3] = bbox[:, 3] - bbox[:, 1]
-                make_pred_vis("VOC", i, o_img, class_names, bbox, cls, scores)
-
-        synchronize()
-        predictions = _accumulate_predictions_from_multiple_gpus(predictions)
-        if not is_main_process():
-            return 0, 0
-
-        print("Main process Evaluating...")
-
-        a_infer_time = 1000 * inference_time / (n_samples - 10)
-        a_nms_time = 1000 * nms_time / (n_samples - 10)
-
-        print(
-            "Average forward time: %.2f ms, Average NMS time: %.2f ms, Average inference time: %.2f ms"
-            % (a_infer_time, a_nms_time, (a_infer_time + a_nms_time))
-        )
-
-        all_boxes = [[[] for _ in range(self.num_images)] for _ in range(num_classes)]
-        for img_num in range(self.num_images):
-            bboxes, cls, scores = predictions[img_num]
-            if bboxes is None:
-                for j in range(num_classes):
-                    all_boxes[j][img_num] = np.empty([0, 5], dtype=np.float32)
-                continue
-            for j in range(num_classes):
-                mask_c = cls == j
-                if sum(mask_c) == 0:
-                    all_boxes[j][img_num] = np.empty([0, 5], dtype=np.float32)
-                    continue
-
-                c_dets = torch.cat((bboxes, scores.unsqueeze(1)), dim=1)
-                all_boxes[j][img_num] = c_dets[mask_c].numpy()
-
-            sys.stdout.write(
-                "im_eval: {:d}/{:d} \r".format(img_num + 1, self.num_images)
-            )
-            sys.stdout.flush()
-
-        with tempfile.TemporaryDirectory() as tempdir:
-            mAP50, mAP70 = self.dataset.evaluate_detections(all_boxes, tempdir)
-        return mAP50, mAP70
 
yolox/models/yolo_head.py CHANGED
@@ -166,6 +166,13 @@ class YOLOXHead(nn.Module):
                     torch.zeros(1, grid.shape[1]).fill_(stride_this_level).type_as(xin[0])
                 )
                 if self.use_l1:
+                    batch_size = reg_output.shape[0]
+                    hsize, wsize = reg_output.shape[-2:]
+                    reg_output = reg_output.view(batch_size, self.n_anchors, 4, hsize, wsize)
+                    reg_output = (
+                        reg_output.permute(0, 1, 3, 4, 2)
+                        .reshape(batch_size, -1, 4)
+                    )
                     origin_preds.append(reg_output.clone())
 
             else:
@@ -193,7 +200,7 @@ class YOLOXHead(nn.Module):
         batch_size = output.shape[0]
         n_ch = 5 + self.num_classes
         hsize, wsize = output.shape[-2:]
-        if grid.shape[2:3] != output.shape[2:3]:
+        if grid.shape[2:4] != output.shape[2:4]:
             yv, xv = torch.meshgrid([torch.arange(hsize), torch.arange(wsize)])
             grid = torch.stack((xv, yv), 2).view(1, 1, hsize, wsize, 2).type(dtype)
             self.grids[k] = grid
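
Two fixes land in `yolo_head.py`. The grid-cache check now compares both spatial dimensions (`shape[2:4]`) instead of only the height (`shape[2:3]`), so a cached grid is rebuilt whenever either the height or the width of the feature map changes. The block added under `if self.use_l1:` reshapes the raw regression map from `(B, n_anchors * 4, H, W)` to `(B, n_anchors * H * W, 4)` so the L1 branch sees one 4-vector per grid cell. A small sketch of that reshape with illustrative shapes, not tied to any particular model size:

```python
import torch

# Illustrative shapes only: batch of 2, one anchor per location, a 3x4 feature map.
batch_size, n_anchors, hsize, wsize = 2, 1, 3, 4
reg_output = torch.randn(batch_size, n_anchors * 4, hsize, wsize)

# (B, A*4, H, W) -> (B, A, 4, H, W) -> (B, A*H*W, 4):
# each row ends up being the raw (x, y, w, h) prediction for one grid cell.
reg_output = reg_output.view(batch_size, n_anchors, 4, hsize, wsize)
reg_output = reg_output.permute(0, 1, 3, 4, 2).reshape(batch_size, -1, 4)
print(reg_output.shape)  # torch.Size([2, 12, 4])
```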
yolox/utils/visualize.py CHANGED
@@ -18,8 +18,8 @@ def vis(img, boxes, scores, cls_ids, conf=0.5, class_names=None):
             continue
         x0 = int(box[0])
         y0 = int(box[1])
-        x1 = int(box[0] + box[2])
-        y1 = int(box[1] + box[3])
+        x1 = int(box[2])
+        y1 = int(box[3])
 
         color = (_COLORS[cls_id] * 255).astype(np.uint8).tolist()
         text = '{}:{:.1f}%'.format(class_names[cls_id], score * 100)
  text = '{}:{:.1f}%'.format(class_names[cls_id], score * 100)