|
# Quantize DeepLab model for faster on-device inference |
|
|
|
This page describes the steps required to quantize a DeepLab model and convert
it to TFLite for on-device inference. The main steps include:
|
|
|
1. Quantization-aware training |
|
1. Exporting the model
|
1. Converting to TFLite FlatBuffer |
|
|
|
We provide details for each step below. |
|
|
|
## Quantization-aware training |
|
|
|
DeepLab supports two approaches to quantizing your model:
|
|
|
1. **[Recommended]** Training a non-quantized model until convergence. Then
   fine-tune the trained float model with quantization using a small learning
   rate (on PASCAL we use 3e-5). This fine-tuning step usually takes 2k to 5k
   steps to converge.
|
|
|
1. Training a DeepLab float model with delayed quantization. Usually we delay
   quantization until the last few thousand steps of training (i.e., set
   `--quantize_delay_step` to a late training step).
|
|
|
In the current implementation, quantization is only supported with 1)
`num_clones=1` for training and 2) single-scale inference for evaluation,
visualization, and model export. To get the best performance for the quantized
model, we strongly recommend training the float model with a larger
`num_clones` and then fine-tuning the model with a single clone.
|
|
|
The command line below quantizes a DeepLab model trained on the PASCAL VOC
dataset via fine-tuning:
|
|
|
```
# From tensorflow/models/research/
python deeplab/train.py \
    --logtostderr \
    --training_number_of_steps=3000 \
    --train_split="train" \
    --model_variant="mobilenet_v2" \
    --output_stride=16 \
    --train_crop_size="513,513" \
    --train_batch_size=8 \
    --base_learning_rate=3e-5 \
    --dataset="pascal_voc_seg" \
    --initialize_last_layer \
    --quantize_delay_step=0 \
    --tf_initial_checkpoint=${PATH_TO_TRAINED_FLOAT_MODEL} \
    --train_logdir=${PATH_TO_TRAIN_DIR} \
    --dataset_dir=${PATH_TO_DATASET}
```
|
|
|
## Converting to TFLite FlatBuffer |
|
|
|
First, use the following command line to export your trained model:
|
|
|
```
# From tensorflow/models/research/
python deeplab/export_model.py \
    --checkpoint_path=${CHECKPOINT_PATH} \
    --quantize_delay_step=0 \
    --export_path=${OUTPUT_DIR}/frozen_inference_graph.pb
```
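Before converting, you can optionally sanity-check the exported graph, e.g. to
verify that the input/output array names used in the conversion step below
exist and that quantization-aware training actually inserted fake-quant ops.
A minimal sketch (assuming the TF 1.x-compatible API and the export path
above):

```python
import tensorflow.compat.v1 as tf

# Load the frozen GraphDef produced by export_model.py.
graph_def = tf.GraphDef()
with tf.gfile.GFile('frozen_inference_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

# Confirm the arrays referenced by tflite_convert below exist.
for node in graph_def.node:
    if node.name in ('MobilenetV2/MobilenetV2/input', 'ArgMax'):
        print(node.op, node.name)

# Quantization-aware training inserts FakeQuant* ops; expect a non-zero count.
n_fake_quant = sum(node.op.startswith('FakeQuant') for node in graph_def.node)
print('FakeQuant ops:', n_fake_quant)
```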
|
|
|
The command line below shows how to convert the exported GraphDef to a TFLite
model:
|
|
|
```
tflite_convert \
  --graph_def_file=${OUTPUT_DIR}/frozen_inference_graph.pb \
  --output_file=${OUTPUT_DIR}/frozen_inference_graph.tflite \
  --output_format=TFLITE \
  --input_shape=1,513,513,3 \
  --input_arrays="MobilenetV2/MobilenetV2/input" \
  --inference_type=QUANTIZED_UINT8 \
  --inference_input_type=QUANTIZED_UINT8 \
  --std_dev_values=128 \
  --mean_values=128 \
  --change_concat_input_ranges=true \
  --output_arrays="ArgMax"
```
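For context on the `--mean_values`/`--std_dev_values` pair: the converter
interprets a uint8 input value `q` as the real value `(q - mean) / std`, so
with both set to 128 the pixel range [0, 255] maps to roughly [-1, 1],
matching MobileNet-v2's input normalization. A quick check of that mapping:

```python
import numpy as np

mean_values, std_dev_values = 128.0, 128.0

# real_value = (quantized_value - mean_values) / std_dev_values
pixels = np.array([0, 128, 255], dtype=np.float32)
print((pixels - mean_values) / std_dev_values)  # [-1.  0.  0.9921875]
```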
|
|
|
**[Important]** Note that the converted model expects a 513x513 RGB input and
does not include preprocessing (resizing and padding the input image) or
post-processing (cropping the padded region and resizing to the original input
size). These steps can be implemented outside of the TFLite model, as in the
sketch below.
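For illustration, a minimal pre/post-processing sketch around
`tf.lite.Interpreter` might look as follows. This is not part of the released
code; it pads with zeros for brevity (a fuller implementation would pad with
the mean pixel value used during training), and the file names are
placeholders:

```python
import numpy as np
from PIL import Image
import tensorflow as tf

INPUT_SIZE = 513

interpreter = tf.lite.Interpreter(model_path='frozen_inference_graph.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

image = Image.open('input.jpg').convert('RGB')
width, height = image.size

# Preprocessing: resize so the longer side is 513, then pad to 513x513.
scale = INPUT_SIZE / max(width, height)
new_w, new_h = int(width * scale), int(height * scale)
resized = np.asarray(image.resize((new_w, new_h)), dtype=np.uint8)
padded = np.zeros((INPUT_SIZE, INPUT_SIZE, 3), dtype=np.uint8)
padded[:new_h, :new_w, :] = resized

# Run inference; the quantized model takes raw uint8 pixels directly.
interpreter.set_tensor(input_details['index'], padded[np.newaxis, ...])
interpreter.invoke()
labels = interpreter.get_tensor(output_details['index'])[0]  # 513x513 label map

# Post-processing: crop the padded region, resize back to the original size.
labels = labels[:new_h, :new_w].astype(np.uint8)
labels = np.asarray(
    Image.fromarray(labels).resize((width, height), Image.NEAREST))
```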
|
|
|
## Quantized model on PASCAL VOC |
|
|
|
We provide float and quantized checkpoints that have been pretrained on the
VOC 2012 train_aug set, using a MobileNet-v2 backbone with different depth
multipliers. The quantized models usually show about a 1% drop in mIoU.
|
|
|
For the quantized (8-bit) models, the un-tar'ed directory includes:
|
|
|
* a frozen inference graph (frozen_inference_graph.pb) |
|
|
|
* a checkpoint (model.ckpt.data*, model.ckpt.index) |
|
|
|
* a converted TFlite FlatBuffer file (frozen_inference_graph.tflite) |
|
|
|
Checkpoint name | Eval OS | Eval scales | Left-right Flip | Multiply-Adds | Quantize | PASCAL mIOU | Folder Size | TFLite File Size
--------------- | :-----: | :---------: | :-------------: | :-----------: | :------: | :---------: | :---------: | :--------------:
[mobilenetv2_dm05_coco_voc_trainaug](http://download.tensorflow.org/models/deeplabv3_mnv2_dm05_pascal_trainaug_2018_10_01.tar.gz) | 16 | [1.0] | No | 0.88B | No | 70.19% (val) | 7.6MB | N/A
[mobilenetv2_dm05_coco_voc_trainaug_8bit](http://download.tensorflow.org/models/deeplabv3_mnv2_dm05_pascal_train_aug_8bit_2019_04_26.tar.gz) | 16 | [1.0] | No | 0.88B | Yes | 69.65% (val) | 8.2MB | 751.1KB
[mobilenetv2_coco_voc_trainaug](http://download.tensorflow.org/models/deeplabv3_mnv2_pascal_train_aug_2018_01_29.tar.gz) | 16 | [1.0] | No | 2.75B | No | 75.32% (val) | 23MB | N/A
[mobilenetv2_coco_voc_trainaug_8bit](http://download.tensorflow.org/models/deeplabv3_mnv2_pascal_train_aug_8bit_2019_04_26.tar.gz) | 16 | [1.0] | No | 2.75B | Yes | 74.26% (val) | 24MB | 2.2MB
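As a convenience, a small Python sketch for fetching and unpacking one of the
quantized checkpoints above (the URL is taken from the table; the others work
the same way):

```python
import tarfile
import urllib.request

# URL from the table above (mobilenetv2_coco_voc_trainaug_8bit).
URL = ('http://download.tensorflow.org/models/'
       'deeplabv3_mnv2_pascal_train_aug_8bit_2019_04_26.tar.gz')

filename, _ = urllib.request.urlretrieve(URL, 'deeplab_mnv2_8bit.tar.gz')
with tarfile.open(filename) as tar:
    tar.extractall('deeplab_mnv2_8bit')
    # Expect frozen_inference_graph.pb, model.ckpt.*, and
    # frozen_inference_graph.tflite, as listed above.
    print(tar.getnames())
```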
|
|
|
Note that you might need the nightly build of TensorFlow (see
[here](https://www.tensorflow.org/install) for install instructions) to convert
the above quantized models to TFLite.
|
|