Convert TensorFlow EfficientDet Models
This tutorial explains how to convert EfficientDet* public object detection models to the Intermediate Representation (IR).
Convert EfficientDet Model to IR
Several public implementations of the EfficientDet model are available on GitHub*. This tutorial explains how to convert models from the https://github.com/google/automl/tree/master/efficientdet repository (commit 96e1fee) to IR.
Get Frozen TensorFlow* Model
Follow the instructions below to get a frozen TensorFlow EfficientDet model. The EfficientDet-D4 model is used as an example:
Clone the repository:
git clone https://github.com/google/automl
cd automl/efficientdet
(Optional) Check out the commit that the conversion was tested on:
git checkout 96e1fee
Install required dependencies:
python3 -m pip install --upgrade pip
python3 -m pip install -r requirements.txt
python3 -m pip install --upgrade tensorflow-model-optimization
Download and extract the model checkpoint efficientdet-d4.tar.gz referenced in the “Pre-trained EfficientDet Checkpoints” section of the model repository:
wget https://storage.googleapis.com/cloud-tpu-checkpoints/efficientdet/coco2/efficientdet-d4.tar.gz
tar zxvf efficientdet-d4.tar.gz
Freeze the model:
python3 model_inspect.py --runmode=saved_model --model_name=efficientdet-d4 --ckpt_path=efficientdet-d4 --saved_model_dir=savedmodeldir
As a result, the frozen model file savedmodeldir/efficientdet-d4_frozen.pb will be generated.
Note
For custom-trained models, use the --hparams flag to pass the config.yaml file that was used during training.
Note
If you see the error AttributeError: module 'tensorflow_core.python.keras.api._v2.keras.initializers' has no attribute 'variance_scaling', apply the fix from the patch.
Convert EfficientDet TensorFlow Model to the IR
To generate the IR of the EfficientDet TensorFlow model, run:
mo \
--input_model savedmodeldir/efficientdet-d4_frozen.pb \
--transformations_config front/tf/automl_efficientdet.json \
--input_shape [1,$IMAGE_SIZE,$IMAGE_SIZE,3] \
--reverse_input_channels
Where $IMAGE_SIZE is the size that the input image of the original TensorFlow model will be resized to. Different EfficientDet models were trained with different input image sizes. To determine the right one, refer to the efficientdet_model_param_dict dictionary in the hparams_config.py file. The image_size attribute of the model entry specifies the value to use for the model conversion.
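The following is a minimal sketch of how the conversion command could be assembled once the image size is known. The helper name build_mo_command is hypothetical, and the value 1024 is only assumed to be the image_size of EfficientDet-D4; confirm it in hparams_config.py before use. The sketch only builds the argument list and does not run the Model Optimizer.

```python
def build_mo_command(frozen_pb, image_size,
                     transformations_config="front/tf/automl_efficientdet.json"):
    """Assemble the mo argument list for a square input of image_size pixels.

    build_mo_command is a hypothetical helper; the flags are the ones used
    in the conversion command of this tutorial.
    """
    size = int(image_size)
    return [
        "mo",
        "--input_model", frozen_pb,
        "--transformations_config", transformations_config,
        "--input_shape", f"[1,{size},{size},3]",
        "--reverse_input_channels",
    ]


# 1024 is assumed here for EfficientDet-D4; check hparams_config.py.
cmd = build_mo_command("savedmodeldir/efficientdet-d4_frozen.pb", 1024)
print(" ".join(cmd))
```

The returned list can be passed to subprocess.run if you prefer to launch the conversion from a Python script rather than from the shell.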
The transformations_config command-line parameter specifies the configuration JSON file containing hints for the Model Optimizer on how to convert the model and trigger transformations implemented in <PYTHON_SITE_PACKAGES>/openvino/tools/mo/front/tf/AutomlEfficientDet.py. The JSON file contains some parameters that must be changed if you trained the model yourself and modified the hparams_config file, or if the parameters differ from the ones used for EfficientDet-D4. The attribute names are self-explanatory or match the names in the hparams_config file.
Note
The color channel order (RGB or BGR) of the input data should match the channel order of the model training dataset. If they are different, perform the RGB<->BGR conversion by specifying the command-line parameter --reverse_input_channels. Otherwise, inference results may be incorrect. For more information about the parameter, refer to the When to Reverse Input Channels section of Converting a Model to Intermediate Representation (IR).
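To illustrate what an RGB<->BGR conversion means for the input data, here is a small sketch that reverses the channel order of an image stored as nested Python lists (height x width x channels). The function name reverse_channels is hypothetical and is not part of OpenVINO. With --reverse_input_channels the swap is handled inside the converted model, so the application does not need to perform a step like this itself.

```python
def reverse_channels(image):
    """Reverse the channel axis of an HxWxC image given as nested lists."""
    return [[pixel[::-1] for pixel in row] for row in image]


# A 1x2 image: one pure-red pixel and one pure-green pixel in RGB order.
rgb_image = [[[255, 0, 0], [0, 255, 0]]]
bgr_image = reverse_channels(rgb_image)
print(bgr_image)
```

Applying the function twice returns the original image, which is why the same operation serves for both RGB to BGR and BGR to RGB.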
OpenVINO™ Toolkit Samples and Open Model Zoo Demos
OpenVINO toolkit provides samples that can be used to infer EfficientDet models. For more information, refer to the following pages:
Interpreting Results of the TensorFlow Model and the IR
The TensorFlow model produces as output a list of 7-element tuples: [image_id, y_min, x_min, y_max, x_max, confidence, class_id], where:
image_id: image batch index.
y_min: absolute y coordinate of the lower left corner of the detected object.
x_min: absolute x coordinate of the lower left corner of the detected object.
y_max: absolute y coordinate of the upper right corner of the detected object.
x_max: absolute x coordinate of the upper right corner of the detected object.
confidence: the confidence of the detected object.
class_id: the id of the detected object class, counted from 1.
The output of the IR is a list of 7-element tuples: [image_id, class_id, confidence, x_min, y_min, x_max, y_max], where:
image_id: image batch index.
class_id: the id of the detected object class, counted from 0.
confidence: the confidence of the detected object.
x_min: normalized x coordinate of the lower left corner of the detected object.
y_min: normalized y coordinate of the lower left corner of the detected object.
x_max: normalized x coordinate of the upper right corner of the detected object.
y_max: normalized y coordinate of the upper right corner of the detected object.
The first element with image_id = -1 means end of data.
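The differences between the two output layouts can be sketched as a small conversion routine. The function name ir_to_tf_detections is hypothetical, and the sketch assumes that "normalized" means coordinates in the [0, 1] range relative to the width and height of the image that the absolute coordinates should refer to. It reorders the fields, scales the coordinates, shifts class_id by one, and stops at the first row with image_id = -1.

```python
def ir_to_tf_detections(ir_rows, image_width, image_height):
    """Convert IR-style rows to TensorFlow-style rows.

    IR row: [image_id, class_id, confidence, x_min, y_min, x_max, y_max]
    TF row: [image_id, y_min, x_min, y_max, x_max, confidence, class_id]
    Assumes normalized coordinates lie in [0, 1] (an assumption of this sketch).
    """
    converted = []
    for image_id, class_id, confidence, x_min, y_min, x_max, y_max in ir_rows:
        if image_id == -1:  # end-of-data marker, ignore everything after it
            break
        converted.append([
            int(image_id),
            y_min * image_height,
            x_min * image_width,
            y_max * image_height,
            x_max * image_width,
            confidence,
            int(class_id) + 1,  # IR counts classes from 0, TensorFlow from 1
        ])
    return converted


# Illustrative rows only, not real model output.
sample_rows = [
    [0, 0, 0.9, 0.25, 0.5, 0.75, 1.0],
    [-1, 0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0, 5, 0.8, 0.1, 0.1, 0.2, 0.2],
]
detections = ir_to_tf_detections(sample_rows, image_width=200, image_height=100)
print(detections)
```

Only the first sample row is converted, because the second row carries the end-of-data marker.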