|
# Object Detection Models on TensorFlow 2 |
|
|
|
**Note**: This repository is still under construction. |
|
More features and instructions will be added soon. |
|
|
|
## Prerequisites
|
To get started, download the code from the TensorFlow models GitHub repository or
use the pre-installed Google Cloud VM.
|
|
|
```bash
git clone https://github.com/tensorflow/models.git
```
|
|
|
Next, make sure you are using TensorFlow 2.1+ on Google Cloud. You also need to
install a few packages to get started:
|
|
|
```bash
sudo apt-get install -y python-tk && \
pip3 install -r ~/models/official/requirements.txt
```
|
|
|
## Train RetinaNet on TPU |
|
|
|
### Train a vanilla ResNet-50 based RetinaNet. |
|
|
|
```bash
TPU_NAME="<your GCP TPU name>"
MODEL_DIR="<path to the directory to store model files>"
RESNET_CHECKPOINT="<path to the pre-trained Resnet-50 checkpoint>"
TRAIN_FILE_PATTERN="<path to the TFRecord training data>"
EVAL_FILE_PATTERN="<path to the TFRecord validation data>"
VAL_JSON_FILE="<path to the validation annotation JSON file>"
python3 ~/models/official/vision/detection/main.py \
  --strategy_type=tpu \
  --tpu="${TPU_NAME?}" \
  --model_dir="${MODEL_DIR?}" \
  --mode=train \
  --params_override="{ type: retinanet, train: { checkpoint: { path: ${RESNET_CHECKPOINT?}, prefix: resnet50/ }, train_file_pattern: ${TRAIN_FILE_PATTERN?} }, eval: { val_json_file: ${VAL_JSON_FILE?}, eval_file_pattern: ${EVAL_FILE_PATTERN?} } }"
```
|
|
|
The pre-trained ResNet-50 checkpoint can be downloaded [here](https://storage.cloud.google.com/cloud-tpu-checkpoints/model-garden-vision/detection/resnet50-2018-02-07.tar.gz). |
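If you are working on a Google Cloud VM, one way to fetch and unpack the checkpoint is with `gsutil` (a sketch; the `gs://` path mirrors the link above, and the destination directory is an assumption, so adjust it to your environment):

```shell
# Sketch: fetch and unpack the pre-trained checkpoint (requires gsutil
# and Google Cloud access). The destination directory is an assumption.
mkdir -p ~/checkpoints
gsutil cp gs://cloud-tpu-checkpoints/model-garden-vision/detection/resnet50-2018-02-07.tar.gz ~/checkpoints/
tar -xzf ~/checkpoints/resnet50-2018-02-07.tar.gz -C ~/checkpoints
# Point RESNET_CHECKPOINT at the extracted checkpoint directory (its
# exact name depends on the archive contents).
```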
|
|
|
Note: The ResNet implementation under |
|
[detection/](https://github.com/tensorflow/models/tree/master/official/vision/detection) |
|
is currently different from the one under |
|
[classification/](https://github.com/tensorflow/models/tree/master/official/vision/image_classification), |
|
so the checkpoints are not compatible. |
|
We will unify the implementation soon. |
|
|
|
|
|
|
|
### Train a custom RetinaNet using the config file. |
|
|
|
First, create a YAML config file, e.g. *my_retinanet.yaml*. This file specifies |
|
the parameters to be overridden, which should at least include the following |
|
fields. |
|
|
|
```YAML
# my_retinanet.yaml
type: 'retinanet'
train:
  train_file_pattern: <path to the TFRecord training data>
eval:
  eval_file_pattern: <path to the TFRecord validation data>
  val_json_file: <path to the validation annotation JSON file>
```
|
|
|
Once the YAML config file is created, you can launch the training using the |
|
following command. |
|
|
|
```bash
TPU_NAME="<your GCP TPU name>"
MODEL_DIR="<path to the directory to store model files>"
python3 ~/models/official/vision/detection/main.py \
  --strategy_type=tpu \
  --tpu="${TPU_NAME?}" \
  --model_dir="${MODEL_DIR?}" \
  --mode=train \
  --config_file="my_retinanet.yaml"
```
|
|
|
## Train RetinaNet on GPU |
|
|
|
Training on GPU is similar to that on TPU. The major change is the strategy
type (use "[mirrored](https://www.tensorflow.org/api_docs/python/tf/distribute/MirroredStrategy)" for multiple GPUs and
"[one_device](https://www.tensorflow.org/api_docs/python/tf/distribute/OneDeviceStrategy)" for a single GPU).
|
|
|
Multi-GPU example (assuming there are 8 GPUs attached to the host):
|
|
|
```bash
MODEL_DIR="<path to the directory to store model files>"
python3 ~/models/official/vision/detection/main.py \
  --strategy_type=mirrored \
  --num_gpus=8 \
  --model_dir="${MODEL_DIR?}" \
  --mode=train \
  --config_file="my_retinanet.yaml"
```
|
|
|
Single-GPU example:

```bash
MODEL_DIR="<path to the directory to store model files>"
python3 ~/models/official/vision/detection/main.py \
  --strategy_type=one_device \
  --num_gpus=1 \
  --model_dir="${MODEL_DIR?}" \
  --mode=train \
  --config_file="my_retinanet.yaml"
```
|
|
|
An example with inline configuration (YAML or JSON format): |
|
|
|
```bash
python3 ~/models/official/vision/detection/main.py \
  --model_dir=<model folder> \
  --strategy_type=one_device \
  --num_gpus=1 \
  --mode=train \
  --params_override="eval:
 eval_file_pattern: <Eval TFRecord file pattern>
 batch_size: 8
 val_json_file: <COCO format groundtruth JSON file>
predict:
 predict_batch_size: 8
architecture:
 use_bfloat16: False
train:
 total_steps: 1
 batch_size: 8
 train_file_pattern: <Train TFRecord file pattern>
use_tpu: False
"
```
|
|
|
--- |
|
|
|
## Train Mask R-CNN on TPU |
|
|
|
### Train a vanilla ResNet-50 based Mask R-CNN. |
|
|
|
```bash
TPU_NAME="<your GCP TPU name>"
MODEL_DIR="<path to the directory to store model files>"
RESNET_CHECKPOINT="<path to the pre-trained Resnet-50 checkpoint>"
TRAIN_FILE_PATTERN="<path to the TFRecord training data>"
EVAL_FILE_PATTERN="<path to the TFRecord validation data>"
VAL_JSON_FILE="<path to the validation annotation JSON file>"
python3 ~/models/official/vision/detection/main.py \
  --strategy_type=tpu \
  --tpu=${TPU_NAME} \
  --model_dir=${MODEL_DIR} \
  --mode=train \
  --model=mask_rcnn \
  --params_override="{train: { checkpoint: { path: ${RESNET_CHECKPOINT}, prefix: resnet50/ }, train_file_pattern: ${TRAIN_FILE_PATTERN} }, eval: { val_json_file: ${VAL_JSON_FILE}, eval_file_pattern: ${EVAL_FILE_PATTERN} } }"
```
|
|
|
The pre-trained ResNet-50 checkpoint can be downloaded [here](https://storage.cloud.google.com/cloud-tpu-checkpoints/model-garden-vision/detection/resnet50-2018-02-07.tar.gz). |
|
|
|
Note: The ResNet implementation under |
|
[detection/](https://github.com/tensorflow/models/tree/master/official/vision/detection) |
|
is currently different from the one under |
|
[classification/](https://github.com/tensorflow/models/tree/master/official/vision/image_classification), |
|
so the checkpoints are not compatible. |
|
We will unify the implementation soon. |
|
|
|
|
|
### Train a custom Mask R-CNN using the config file. |
|
|
|
First, create a YAML config file, e.g. *my_maskrcnn.yaml*. |
|
This file specifies the parameters to be overridden, |
|
which should at least include the following fields. |
|
|
|
```YAML
# my_maskrcnn.yaml
train:
  train_file_pattern: <path to the TFRecord training data>
eval:
  eval_file_pattern: <path to the TFRecord validation data>
  val_json_file: <path to the validation annotation JSON file>
```
|
|
|
Once the YAML config file is created, you can launch the training using the |
|
following command. |
|
|
|
```bash
TPU_NAME="<your GCP TPU name>"
MODEL_DIR="<path to the directory to store model files>"
python3 ~/models/official/vision/detection/main.py \
  --strategy_type=tpu \
  --tpu=${TPU_NAME} \
  --model_dir=${MODEL_DIR} \
  --mode=train \
  --model=mask_rcnn \
  --config_file="my_maskrcnn.yaml"
```
|
|
|
## Train Mask R-CNN on GPU |
|
|
|
Training on GPU is similar to that on TPU. The major change is the strategy type
(use
"[mirrored](https://www.tensorflow.org/api_docs/python/tf/distribute/MirroredStrategy)"
for multiple GPUs and
"[one_device](https://www.tensorflow.org/api_docs/python/tf/distribute/OneDeviceStrategy)"
for a single GPU).
|
|
|
Multi-GPU example (assuming there are 8 GPUs attached to the host):
|
|
|
```bash
MODEL_DIR="<path to the directory to store model files>"
python3 ~/models/official/vision/detection/main.py \
  --strategy_type=mirrored \
  --num_gpus=8 \
  --model_dir=${MODEL_DIR} \
  --mode=train \
  --model=mask_rcnn \
  --config_file="my_maskrcnn.yaml"
```
|
|
|
Single-GPU example:

```bash
MODEL_DIR="<path to the directory to store model files>"
python3 ~/models/official/vision/detection/main.py \
  --strategy_type=one_device \
  --num_gpus=1 \
  --model_dir=${MODEL_DIR} \
  --mode=train \
  --model=mask_rcnn \
  --config_file="my_maskrcnn.yaml"
```
|
|
|
An example with inline configuration (YAML or JSON format): |
|
|
|
```bash
python3 ~/models/official/vision/detection/main.py \
  --model_dir=<model folder> \
  --strategy_type=one_device \
  --num_gpus=1 \
  --mode=train \
  --model=mask_rcnn \
  --params_override="eval:
 eval_file_pattern: <Eval TFRecord file pattern>
 batch_size: 8
 val_json_file: <COCO format groundtruth JSON file>
predict:
 predict_batch_size: 8
architecture:
 use_bfloat16: False
train:
 total_steps: 1000
 batch_size: 8
 train_file_pattern: <Train TFRecord file pattern>
use_tpu: False
"
```
|
|
|
## Train ShapeMask on TPU |
|
|
|
### Train a ResNet-50 based ShapeMask. |
|
|
|
```bash
TPU_NAME="<your GCP TPU name>"
MODEL_DIR="<path to the directory to store model files>"
RESNET_CHECKPOINT="<path to the pre-trained Resnet-50 checkpoint>"
TRAIN_FILE_PATTERN="<path to the TFRecord training data>"
EVAL_FILE_PATTERN="<path to the TFRecord validation data>"
VAL_JSON_FILE="<path to the validation annotation JSON file>"
SHAPE_PRIOR_PATH="<path to shape priors>"
python3 ~/models/official/vision/detection/main.py \
  --strategy_type=tpu \
  --tpu=${TPU_NAME} \
  --model_dir=${MODEL_DIR} \
  --mode=train \
  --model=shapemask \
  --params_override="{train: { checkpoint: { path: ${RESNET_CHECKPOINT}, prefix: resnet50/ }, train_file_pattern: ${TRAIN_FILE_PATTERN} }, eval: { val_json_file: ${VAL_JSON_FILE}, eval_file_pattern: ${EVAL_FILE_PATTERN} }, shapemask_head: {use_category_for_mask: true, shape_prior_path: ${SHAPE_PRIOR_PATH}} }"
```
|
|
|
The pre-trained ResNet-50 checkpoint can be downloaded [here](https://storage.cloud.google.com/cloud-tpu-checkpoints/model-garden-vision/detection/resnet50-2018-02-07.tar.gz). |
|
|
|
The shape priors can be downloaded [here](https://storage.googleapis.com/cloud-tpu-checkpoints/shapemask/kmeans_class_priors_91x20x32x32.npy).
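As with the checkpoint, the priors can be fetched with `gsutil` (a sketch; the `gs://` path mirrors the download link above, and the destination directory is an assumption):

```shell
# Sketch: fetch the ShapeMask class priors (requires gsutil and network
# access). The destination directory is an assumption.
mkdir -p ~/shapemask
gsutil cp gs://cloud-tpu-checkpoints/shapemask/kmeans_class_priors_91x20x32x32.npy ~/shapemask/
# Point SHAPE_PRIOR_PATH (or shape_prior_path in the YAML config) at
# the copied .npy file.
```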
|
|
|
|
|
### Train a custom ShapeMask using the config file. |
|
|
|
First, create a YAML config file, e.g. *my_shapemask.yaml*. |
|
This file specifies the parameters to be overridden: |
|
|
|
```YAML
# my_shapemask.yaml
train:
  train_file_pattern: <path to the TFRecord training data>
  total_steps: <total steps to train>
  batch_size: <training batch size>
eval:
  eval_file_pattern: <path to the TFRecord validation data>
  val_json_file: <path to the validation annotation JSON file>
  batch_size: <evaluation batch size>
shapemask_head:
  shape_prior_path: <path to shape priors>
```
|
|
|
Once the YAML config file is created, you can launch the training using the |
|
following command. |
|
|
|
```bash
TPU_NAME="<your GCP TPU name>"
MODEL_DIR="<path to the directory to store model files>"
python3 ~/models/official/vision/detection/main.py \
  --strategy_type=tpu \
  --tpu=${TPU_NAME} \
  --model_dir=${MODEL_DIR} \
  --mode=train \
  --model=shapemask \
  --config_file="my_shapemask.yaml"
```
|
|
|
## Train ShapeMask on GPU |
|
|
|
Training on GPU is similar to that on TPU. The major change is the strategy type
(use
"[mirrored](https://www.tensorflow.org/api_docs/python/tf/distribute/MirroredStrategy)"
for multiple GPUs and
"[one_device](https://www.tensorflow.org/api_docs/python/tf/distribute/OneDeviceStrategy)"
for a single GPU).
|
|
|
Multi-GPU example (assuming there are 8 GPUs attached to the host):
|
|
|
```bash
MODEL_DIR="<path to the directory to store model files>"
python3 ~/models/official/vision/detection/main.py \
  --strategy_type=mirrored \
  --num_gpus=8 \
  --model_dir=${MODEL_DIR} \
  --mode=train \
  --model=shapemask \
  --config_file="my_shapemask.yaml"
```
|
|
|
A single-GPU example:
|
|
|
```bash
MODEL_DIR="<path to the directory to store model files>"
python3 ~/models/official/vision/detection/main.py \
  --strategy_type=one_device \
  --num_gpus=1 \
  --model_dir=${MODEL_DIR} \
  --mode=train \
  --model=shapemask \
  --config_file="my_shapemask.yaml"
```
|
|
|
|
|
An example with inline configuration (YAML or JSON format): |
|
|
|
```bash
python3 ~/models/official/vision/detection/main.py \
  --model_dir=<model folder> \
  --strategy_type=one_device \
  --num_gpus=1 \
  --mode=train \
  --model=shapemask \
  --params_override="eval:
 eval_file_pattern: <Eval TFRecord file pattern>
 batch_size: 8
 val_json_file: <COCO format groundtruth JSON file>
train:
 total_steps: 1000
 batch_size: 8
 train_file_pattern: <Train TFRecord file pattern>
use_tpu: False
"
```
|
|
|
|
|
### Run the evaluation (after training) |
|
|
|
```bash
python3 /usr/share/models/official/vision/detection/main.py \
  --strategy_type=tpu \
  --tpu=${TPU_NAME} \
  --model_dir=${MODEL_DIR} \
  --mode=eval \
  --model=shapemask \
  --params_override="{eval: { val_json_file: ${VAL_JSON_FILE}, eval_file_pattern: ${EVAL_FILE_PATTERN}, eval_samples: 5000 } }"
```
|
|
|
`MODEL_DIR` needs to point to the directory containing the trained ShapeMask model.
To run the evaluation on GPU instead, change to `--strategy_type=mirrored` and `--num_gpus=1`.
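For instance, a single-GPU evaluation run might look like the following sketch (the paths are placeholders to fill in):

```shell
MODEL_DIR="<path to the trained ShapeMask model directory>"
EVAL_FILE_PATTERN="<path to the TFRecord validation data>"
VAL_JSON_FILE="<path to the validation annotation JSON file>"
python3 /usr/share/models/official/vision/detection/main.py \
  --strategy_type=mirrored \
  --num_gpus=1 \
  --model_dir=${MODEL_DIR} \
  --mode=eval \
  --model=shapemask \
  --params_override="{eval: { val_json_file: ${VAL_JSON_FILE}, eval_file_pattern: ${EVAL_FILE_PATTERN}, eval_samples: 5000 } }"
```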
|
|
|
Note: The JSON groundtruth file is useful for the [COCO dataset](http://cocodataset.org/#home) and can be
downloaded from the [COCO website](http://cocodataset.org/#download). For a custom dataset, it is unnecessary because the groundtruth annotations can be included in the TFRecord files.
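For COCO, one way to obtain the validation annotations is the following sketch (the URL is the standard COCO download location; the 2017 split is an assumption, so substitute the split you actually use):

```shell
# Sketch: download and unpack the COCO 2017 annotations (requires
# network access; the archive is a few hundred MB).
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
unzip annotations_trainval2017.zip
# annotations/instances_val2017.json is the file to pass as val_json_file.
```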
|
|
|
## References |
|
|
|
1. [Focal Loss for Dense Object Detection](https://arxiv.org/abs/1708.02002). |
|
Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. IEEE |
|
International Conference on Computer Vision (ICCV), 2017. |
|
|