This tutorial explains how to convert public EfficientDet* detection models to the Intermediate Representation (IR).
Several public implementations of the EfficientDet model are available on GitHub*. This tutorial explains how to convert models from the https://github.com/google/automl/tree/master/efficientdet repository (commit 96e1fee) to IR.
Follow the instructions below to get a frozen TensorFlow EfficientDet model. The EfficientDet-D4 model is used as an example.
As a result, the frozen model file savedmodeldir/efficientdet-d4_frozen.pb is generated.
NOTE: If you see the error AttributeError: module 'tensorflow_core.python.keras.api._v2.keras.initializers' has no attribute 'variance_scaling', apply the fix from the patch.
To generate the IR of the EfficientDet TensorFlow model, run:
Where $IMAGE_SIZE is the size to which the input image of the original TensorFlow model is resized. Different EfficientDet models were trained with different input image sizes. To determine the right one, refer to the efficientdet_model_param_dict dictionary in the hparams_config.py file. The image_size attribute specifies the shape to use for the model conversion.
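
As an illustration of how an image_size value can be turned into a shape for conversion, here is a minimal Python sketch. The helper name input_shape_from_image_size is hypothetical. The sketch assumes that image_size is either a single integer (square input) or a "HEIGHTxWIDTH" string, and that the model expects an NHWC layout with batch size 1 and 3 color channels. Verify these assumptions against hparams_config.py for the model you use.

```python
def input_shape_from_image_size(image_size, batch=1, channels=3):
    """Return an NHWC shape string such as "[1,1024,1024,3]".

    Assumption: image_size is an int (square input) or a
    "HEIGHTxWIDTH" string; both forms are illustrative only.
    """
    if isinstance(image_size, int):
        height = width = image_size
    else:
        # Assumed string form, for example "640x512" (height first).
        height, width = (int(part) for part in str(image_size).lower().split("x"))
    return "[{},{},{},{}]".format(batch, height, width, channels)


# 1024 is used here only as an example value; look up the real
# image_size for your model in efficientdet_model_param_dict.
print(input_shape_from_image_size(1024))
```

The string is built without spaces so that it can be passed on a command line as a single argument.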
The transformations_config command-line parameter specifies the configuration JSON file that contains hints for the Model Optimizer on how to convert the model and triggers the transformations implemented in $MO_ROOT/extensions/front/tf/AutomlEfficientDet.py. Some parameters in the JSON file must be changed if you trained the model yourself and modified the hparams_config.py file, or if the parameters differ from the ones used for EfficientDet-D4. The attribute names are self-explanatory or match the names in the hparams_config.py file.
NOTE: The color channel order (RGB or BGR) of the input data should match the channel order of the model training dataset. If they differ, perform the RGB<->BGR conversion by specifying the command-line parameter --reverse_input_channels. Otherwise, inference results may be incorrect. For more information about the parameter, refer to the When to Reverse Input Channels section of Converting a Model Using General Conversion Parameters.
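
The conversion parameters described above can be assembled programmatically. The following Python sketch only builds and prints an argument list; it does not run the Model Optimizer. The helper build_mo_command, the mo.py script location, the JSON file path, and the example input shape are assumptions made for illustration, so substitute the values from your own installation and model.

```python
import shlex


def build_mo_command(frozen_model, transformations_config, input_shape,
                     reverse_input_channels=False, mo_script="mo.py"):
    """Assemble a Model Optimizer argument list (hypothetical helper).

    mo_script is an assumed location of the Model Optimizer entry
    point; adjust it to your installation.
    """
    args = [
        "python3", mo_script,
        "--input_model", frozen_model,
        "--transformations_config", transformations_config,
        "--input_shape", input_shape,
    ]
    if reverse_input_channels:
        # Add only when the input channel order differs from the
        # channel order of the training dataset.
        args.append("--reverse_input_channels")
    return args


command = build_mo_command(
    frozen_model="savedmodeldir/efficientdet-d4_frozen.pb",
    transformations_config="path/to/efficientdet_config.json",  # placeholder path
    input_shape="[1,1024,1024,3]",  # example value, check hparams_config.py
    reverse_input_channels=True,
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in command))
```

Returning a list rather than a single string keeps each argument separate, so the result can be passed to subprocess.run without shell quoting issues.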
The OpenVINO™ toolkit provides samples that can be used to run inference on an EfficientDet model. For more information, refer to Object Detection for SSD C++ Sample and Object Detection for SSD Python Sample.
The TensorFlow model produces as output a list of 7-element tuples: [image_id, y_min, x_min, y_max, x_max, confidence, class_id], where:
- image_id – image batch index.
- y_min – absolute y coordinate of the lower left corner of the detected object.
- x_min – absolute x coordinate of the lower left corner of the detected object.
- y_max – absolute y coordinate of the upper right corner of the detected object.
- x_max – absolute x coordinate of the upper right corner of the detected object.
- confidence – the confidence of the detected object.
- class_id – the id of the detected object class, counted from 1.
The output of the IR is a list of 7-element tuples: [image_id, class_id, confidence, x_min, y_min, x_max, y_max], where:
- image_id – image batch index.
- class_id – the id of the detected object class, counted from 0.
- confidence – the confidence of the detected object.
- x_min – normalized x coordinate of the lower left corner of the detected object.
- y_min – normalized y coordinate of the lower left corner of the detected object.
- x_max – normalized x coordinate of the upper right corner of the detected object.
- y_max – normalized y coordinate of the upper right corner of the detected object.
The first element with image_id = -1 means end of data.
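
To make the difference between the two output layouts concrete, here is a minimal Python sketch that reads IR-style rows and converts them to the TensorFlow-style layout. The function name ir_to_tf_style_detections and the min_confidence threshold are hypothetical. The sketch assumes that "normalized" means coordinates in the [0, 1] range relative to the image width and height, and that "absolute" means pixel units.

```python
def ir_to_tf_style_detections(rows, image_width, image_height, min_confidence=0.5):
    """Convert IR rows to [image_id, y_min, x_min, y_max, x_max, confidence, class_id].

    rows: iterable of 7-element sequences in the IR order
          [image_id, class_id, confidence, x_min, y_min, x_max, y_max].
    min_confidence is an illustrative filter, not part of either format.
    """
    detections = []
    for image_id, class_id, confidence, x_min, y_min, x_max, y_max in rows:
        if image_id == -1:
            # The first row with image_id = -1 marks the end of data.
            break
        if confidence < min_confidence:
            continue
        detections.append([
            int(image_id),
            y_min * image_height,   # normalized -> pixels (assumed convention)
            x_min * image_width,
            y_max * image_height,
            x_max * image_width,
            float(confidence),
            int(class_id) + 1,      # IR counts classes from 0, TensorFlow from 1
        ])
    return detections


example_rows = [
    [0, 16, 0.9, 0.25, 0.5, 0.75, 1.0],
    [0, 3, 0.2, 0.0, 0.0, 1.0, 1.0],    # dropped: below min_confidence
    [-1, 0, 0.0, 0.0, 0.0, 0.0, 0.0],   # end-of-data marker
    [0, 1, 0.99, 0.0, 0.0, 1.0, 1.0],   # ignored: after the marker
]
converted = ir_to_tf_style_detections(example_rows, image_width=200, image_height=100)
print(converted)
```

Stopping at the first row with image_id = -1, instead of filtering such rows out, matters because the rows after the marker are not valid detections.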