Converting TensorFlow Object Detection API Models#
Danger
The code described here has been deprecated! Avoid using it, as it is now a legacy solution. It will be kept for some time to ensure backwards compatibility, but you should not use it in contemporary applications.
This guide describes a deprecated conversion method. The guide on the new and recommended method can be found in the Python tutorials.
Starting with the 2022.1 release, model conversion API can convert the TensorFlow Object Detection API Faster and Mask R-CNN topologies in two different ways. By default, model conversion adds the `Proposal` operation to the generated IR. This operation requires an additional model input named `image_info`, which should be fed with several values describing the preprocessing applied to the input image (refer to the Proposal operation specification for more information). However, this input is redundant for models trained and inferred with equally sized images. For such models, model conversion API can generate an IR with the `DetectionOutput` operation inserted instead of `Proposal`. The `DetectionOutput` operation does not require the additional `image_info` input. Moreover, for some models the produced inference results are closer to those of the original TensorFlow model. To trigger the new behavior, set the `operation_to_add` attribute in the corresponding JSON transformation configuration file to `DetectionOutput` instead of the default `Proposal`.
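The snippet below sketches what such a configuration entry might look like. It is illustrative only: apart from the `operation_to_add` attribute itself, the `id` and other field names are assumptions modeled on the structure of the shipped transformation configuration files.

```json
[
    {
        "id": "ObjectDetectionAPIProposalReplacement",
        "match_kind": "general",
        "custom_attributes": {
            "operation_to_add": "DetectionOutput"
        }
    }
]
```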
Starting with the 2021.1 release, model conversion API converts the TensorFlow Object Detection API SSD, Faster, and Mask R-CNN topologies keeping shape-calculating sub-graphs by default, so the topologies can be reshaped in the OpenVINO Runtime using the dedicated reshape API. Refer to the Using Shape Inference guide for more information on how to use this feature. It is possible to change both the spatial dimensions of the input image and the batch size.

To generate IRs for TF 1 SSD topologies, model conversion API creates a number of `PriorBoxClustered` operations instead of a constant node with prior boxes calculated for the particular input image size. This change allows you to reshape the topology in the OpenVINO Runtime using the dedicated API, as shown in the sketch below. Reshaping is supported for all SSD topologies except FPNs, which contain hardcoded shapes for some operations, preventing the input shape of the topology from being changed.
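For illustration, here is a minimal sketch of the reshape flow with the OpenVINO Runtime Python API. The IR path and target dimensions are placeholders, and the input layout of your converted model should be verified first:

```python
from openvino.runtime import Core

core = Core()
model = core.read_model("ssd_inception_v2.xml")  # placeholder path to the converted IR
print(model.input().get_partial_shape())         # verify the actual input layout first

# Legacy model conversion usually transposes TensorFlow models to NCHW layout;
# adjust the dimension order below if the printout shows otherwise.
model.reshape([1, 3, 512, 512])                  # new batch size and spatial dimensions
compiled_model = core.compile_model(model, "CPU")
```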
Converting a Model#
You can download TensorFlow Object Detection API models from the TensorFlow 1 Detection Model Zoo or TensorFlow 2 Detection Model Zoo.
Note
Before converting, make sure you have configured model conversion API. For configuration steps, refer to the Convert a Model guide.
To convert a TensorFlow Object Detection API model, run the `mo` command with the following required parameters:
- `--input_model <path_to_frozen.pb>` - File with a pretrained model (binary or text .pb file after freezing) OR `--saved_model_dir <path_to_saved_model>` for the TensorFlow 2 models
- `--transformations_config <path_to_subgraph_replacement_configuration_file.json>` - A subgraph replacement configuration file with transformations description. For the models downloaded from the TensorFlow Object Detection API zoo, you can find the configuration files in the `<PYTHON_SITE_PACKAGES>/openvino/tools/mo/front/tf` directory. Use:
  - `ssd_v2_support.json` - for frozen SSD topologies from the models zoo version up to 1.13.X inclusively
  - `ssd_support_api_v.1.14.json` - for SSD topologies trained using the TensorFlow Object Detection API version 1.14 up to 1.14.X inclusively
  - `ssd_support_api_v.1.15.json` - for SSD topologies trained using the TensorFlow Object Detection API version 1.15 up to 2.0
  - `ssd_support_api_v.2.0.json` - for SSD topologies trained using the TensorFlow Object Detection API version 2.0 up to 2.3.X inclusively
  - `ssd_support_api_v.2.4.json` - for SSD topologies trained using the TensorFlow Object Detection API version 2.4 or higher
  - `efficient_det_support_api_v.2.0.json` - for EfficientDet topologies trained using the TensorFlow Object Detection API version 2.0 up to 2.3.X inclusively
  - `efficient_det_support_api_v.2.4.json` - for EfficientDet topologies trained using the TensorFlow Object Detection API version 2.4 or higher
  - `faster_rcnn_support.json` - for Faster R-CNN topologies from the TF 1.X models zoo trained with TensorFlow version up to 1.6.X inclusively
  - `faster_rcnn_support_api_v1.7.json` - for Faster R-CNN topologies trained using the TensorFlow Object Detection API version 1.7.0 up to 1.9.X inclusively
  - `faster_rcnn_support_api_v1.10.json` - for Faster R-CNN topologies trained using the TensorFlow Object Detection API version 1.10.0 up to 1.12.X inclusively
  - `faster_rcnn_support_api_v1.13.json` - for Faster R-CNN topologies trained using the TensorFlow Object Detection API version 1.13.X
  - `faster_rcnn_support_api_v1.14.json` - for Faster R-CNN topologies trained using the TensorFlow Object Detection API version 1.14.0 up to 1.14.X inclusively
  - `faster_rcnn_support_api_v1.15.json` - for Faster R-CNN topologies trained using the TensorFlow Object Detection API version 1.15.0 up to 2.0
  - `faster_rcnn_support_api_v2.0.json` - for Faster R-CNN topologies trained using the TensorFlow Object Detection API version 2.0 up to 2.3.X inclusively
  - `faster_rcnn_support_api_v2.4.json` - for Faster R-CNN topologies trained using the TensorFlow Object Detection API version 2.4 or higher
  - `mask_rcnn_support.json` - for Mask R-CNN topologies from the TF 1.X models zoo trained with TensorFlow version 1.9.0 or lower
  - `mask_rcnn_support_api_v1.7.json` - for Mask R-CNN topologies trained using the TensorFlow Object Detection API version 1.7.0 up to 1.9.X inclusively
  - `mask_rcnn_support_api_v1.11.json` - for Mask R-CNN topologies trained using the TensorFlow Object Detection API version 1.11.0 up to 1.12.X inclusively
  - `mask_rcnn_support_api_v1.13.json` - for Mask R-CNN topologies trained using the TensorFlow Object Detection API version 1.13.0 up to 1.13.X inclusively
  - `mask_rcnn_support_api_v1.14.json` - for Mask R-CNN topologies trained using the TensorFlow Object Detection API version 1.14.0 up to 1.14.X inclusively
  - `mask_rcnn_support_api_v1.15.json` - for Mask R-CNN topologies trained using the TensorFlow Object Detection API version 1.15.0 up to 2.0
  - `mask_rcnn_support_api_v2.0.json` - for Mask R-CNN topologies trained using the TensorFlow Object Detection API version 2.0 up to 2.3.X inclusively
  - `mask_rcnn_support_api_v2.4.json` - for Mask R-CNN topologies trained using the TensorFlow Object Detection API version 2.4 or higher
  - `rfcn_support.json` - for RFCN topology from the models zoo trained with TensorFlow version up to 1.9.X inclusively
  - `rfcn_support_api_v1.10.json` - for RFCN topology from the models zoo frozen with TensorFlow version 1.10.0 up to 1.12.X inclusively
  - `rfcn_support_api_v1.13.json` - for RFCN topology from the models zoo frozen with TensorFlow version 1.13.X
  - `rfcn_support_api_v1.14.json` - for RFCN topology from the models zoo frozen with TensorFlow version 1.14.0 or higher
- `--tensorflow_object_detection_api_pipeline_config <path_to_pipeline.config>` - A special configuration file that describes the topology hyper-parameters and structure of the TensorFlow Object Detection API model. For the models downloaded from the TensorFlow Object Detection API zoo, the configuration file is named `pipeline.config`. If you plan to train a model yourself, you can find templates for these files in the models repository.
- `--input_shape` (optional) - A custom input image shape. For more information on how the `input_shape` parameter is handled for the TensorFlow Object Detection API models, refer to the Custom Input Shape guide.
Note
The color channel order (RGB or BGR) of the input data should match the channel order of the model training dataset. If they differ, perform the RGB<->BGR conversion by specifying the `--reverse_input_channels` command-line parameter. Otherwise, inference results may be incorrect. If you convert a TensorFlow Object Detection API model to use with the OpenVINO sample applications, you must specify the `--reverse_input_channels` parameter. For more information about the parameter, refer to the When to Reverse Input Channels section of the Converting a Model to Intermediate Representation (IR) guide.
In addition to the mandatory parameters listed above, you can use optional conversion parameters if needed. A full list of parameters is available in the Converting a TensorFlow Model guide.
For example, if you downloaded the pre-trained SSD InceptionV2 topology and extracted the archive to the `/tmp/ssd_inception_v2_coco_2018_01_28` directory, the sample command line to convert the model looks as follows:

```sh
mo --input_model=/tmp/ssd_inception_v2_coco_2018_01_28/frozen_inference_graph.pb --transformations_config front/tf/ssd_v2_support.json --tensorflow_object_detection_api_pipeline_config /tmp/ssd_inception_v2_coco_2018_01_28/pipeline.config --reverse_input_channels
```
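For a TensorFlow 2 model exported in the SavedModel format, the command is analogous, with `--saved_model_dir` instead of `--input_model`. The directory paths below are placeholders:

```sh
mo --saved_model_dir=/tmp/my_tf2_ssd_model/saved_model --transformations_config front/tf/ssd_support_api_v.2.4.json --tensorflow_object_detection_api_pipeline_config /tmp/my_tf2_ssd_model/pipeline.config --reverse_input_channels
```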
OpenVINO™ Toolkit Samples and Open Model Zoo Demos#
OpenVINO comes with a number of samples that demonstrate the use of the OpenVINO Runtime API. Additionally, Open Model Zoo provides a set of demo applications showing the implementation of close-to-real-life applications based on deep learning for various tasks, including Image Classification, Visual Object Detection, Text Recognition, Speech Recognition, Natural Language Processing, and others. Refer to the links below for more details.
Important
Due to the deprecation of Open Model Zoo, models in the OpenVINO IR format are now published on Hugging Face.
Feeding Input Images to the Samples#
There are several important notes about feeding input images to the samples:
- OpenVINO samples stretch the input image to the size of the input operation without preserving aspect ratio. This behavior is usually correct for most topologies (including SSDs), but incorrect for other models like Faster R-CNN, Mask R-CNN, and R-FCN, which usually use a keep-aspect-ratio resizer. The type of preprocessing is defined in the `image_resizer` section of the pipeline configuration file. If keeping the aspect ratio is used, it is necessary to resize the image before passing it to the sample and, optionally, pad the resized image with zeros (if the attribute `pad_to_max_dimension` in the pipeline.config is equal to `true`); see the sketch after this list.
- The TensorFlow implementation of image resizing may differ from the one implemented in the sample. Even reading the input image from a compressed format (like `.jpg`) can give different results in the sample and in TensorFlow. If it is necessary to compare accuracy between TensorFlow and OpenVINO, it is recommended to pass a pre-resized input image in a non-compressed format (like `.bmp`).
- If you want to infer the model with the OpenVINO samples, convert the model specifying the `--reverse_input_channels` command line parameter. The samples load images in BGR channel order, while TensorFlow models were trained with images in RGB order. When the `--reverse_input_channels` command line parameter is specified, model conversion API modifies the weights of the first convolution (or another channel-dependent operation) so that the output is the same as if the image were passed in RGB channel order.
- Read carefully the messages printed by model conversion API. They contain important instructions on how to prepare input data before running the inference and how to interpret the output.
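As an illustration of the first two notes, below is a minimal sketch of pre-resizing an image for a keep-aspect-ratio model before feeding it to a sample. It assumes OpenCV is available and reuses the resize logic described in the Custom Input Shape section; the file names and the padding side are assumptions:

```python
import cv2  # assumption: OpenCV is used for image I/O and resizing

def resize_keeping_aspect_ratio(image, min_dimension, max_dimension, pad_to_max=False):
    """Mimics the keep_aspect_ratio_resizer preprocessing described in this guide."""
    h, w = image.shape[:2]
    ratio = min(min_dimension / min(h, w), max_dimension / max(h, w))
    resized = cv2.resize(image, (int(round(w * ratio)), int(round(h * ratio))))
    if pad_to_max:
        # Zero-pad to a square max_dimension x max_dimension image, mirroring
        # pad_to_max_dimension: true (padding side assumed to be bottom/right).
        resized = cv2.copyMakeBorder(resized, 0, max_dimension - resized.shape[0],
                                     0, max_dimension - resized.shape[1],
                                     cv2.BORDER_CONSTANT, value=0)
    return resized

image = cv2.imread("input.jpg")  # placeholder input file
prepared = resize_keeping_aspect_ratio(image, 600, 1024)
cv2.imwrite("input_resized.bmp", prepared)  # uncompressed format for accuracy comparisons
```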
Custom Input Shape#
Model conversion handles the `--input_shape` command line parameter for TensorFlow Object Detection API models in a special way, depending on the image resizer type defined in the `pipeline.config` file. The TensorFlow Object Detection API generates a different `Preprocessor` sub-graph based on the image resizer type. Model conversion API supports two types of image resizer:
- `fixed_shape_resizer` - Stretches the input image to a specific height and width. The `pipeline.config` snippet below shows a `fixed_shape_resizer` sample definition:

```
image_resizer {
  fixed_shape_resizer {
    height: 300
    width: 300
  }
}
```
- `keep_aspect_ratio_resizer` - Resizes the input image keeping the aspect ratio to satisfy the minimum and maximum size constraints. The `pipeline.config` snippet below shows a `keep_aspect_ratio_resizer` sample definition:

```
image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 600
    max_dimension: 1024
  }
}
```
If the additional parameter `pad_to_max_dimension` is equal to `true`, the resized image is padded with zeros to a square image of size `max_dimension`.
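With padding enabled, the resizer definition in `pipeline.config` looks, for example, as follows:

```
image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 600
    max_dimension: 1024
    pad_to_max_dimension: true
  }
}
```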
Fixed Shape Resizer Replacement#
- If the `--input_shape` command line parameter is not specified, model conversion generates an input operation with the height and width as defined in the `pipeline.config`.
- If the `--input_shape [1, H, W, 3]` command line parameter is specified, model conversion sets the input operation height to `H` and width to `W`, and converts the model. However, the conversion may fail for the following reasons:
  - The model is not reshape-able, meaning that it is not possible to change the size of the model input image. For example, SSD FPN models have `Reshape` operations with hard-coded output shapes, but the input size to these `Reshape` instances depends on the input image size. In this case, model conversion API shows an error during the shape inference phase. Run model conversion with `--log_level DEBUG` to see the inferred output shapes of operations and find where the mismatch occurs, as shown in the example after this list.
  - The custom input shape is too small. For example, if you specify `--input_shape [1,100,100,3]` to convert an SSD Inception V2 model, one of the convolution or pooling nodes decreases the input tensor spatial dimensions to non-positive values. In this case, model conversion API shows an error message like this: `[ ERROR ] Shape [ 1 -1 -1 256] is not fully defined for output X of "node_name".`
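For example, reusing the SSD InceptionV2 paths from the conversion example above, a custom input shape with debug logging can be requested as follows:

```sh
mo --input_model=/tmp/ssd_inception_v2_coco_2018_01_28/frozen_inference_graph.pb --transformations_config front/tf/ssd_v2_support.json --tensorflow_object_detection_api_pipeline_config /tmp/ssd_inception_v2_coco_2018_01_28/pipeline.config --input_shape [1,512,512,3] --log_level DEBUG
```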
Keeping Aspect Ratio Resizer Replacement#
- If the `--input_shape` command line parameter is not specified, model conversion API generates an input operation with both height and width equal to the value of the `min_dimension` parameter in the `keep_aspect_ratio_resizer`.
- If the `--input_shape [1, H, W, 3]` command line parameter is specified, model conversion API scales the specified input image height `H` and width `W` to satisfy the `min_dimension` and `max_dimension` constraints defined in the `keep_aspect_ratio_resizer`. The following function calculates the input operation height and width:

```python
def calculate_shape_keeping_aspect_ratio(H: int, W: int, min_dimension: int, max_dimension: int):
    ratio_min = min_dimension / min(H, W)
    ratio_max = max_dimension / max(H, W)
    ratio = min(ratio_min, ratio_max)
    return int(round(H * ratio)), int(round(W * ratio))
```
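For instance, for an original 1200x800 image and the constraints from the snippet above (`min_dimension: 600`, `max_dimension: 1024`), the function yields:

```python
h, w = calculate_shape_keeping_aspect_ratio(1200, 800, 600, 1024)
print(h, w)  # 900 600 - the smaller side is scaled to min_dimension (600),
             # while the larger side (900) still fits under max_dimension (1024)
```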
The `--input_shape` command line parameter should be specified only if `pad_to_max_dimension` does not exist or is set to `false` in the `keep_aspect_ratio_resizer`.
Models with `keep_aspect_ratio_resizer` were trained to recognize objects in their real aspect ratio, in contrast with most classification topologies, which are trained to recognize objects stretched vertically and horizontally as well. By default, topologies with `keep_aspect_ratio_resizer` are converted to consume a square input image. If a non-square image is provided as input, it is stretched without keeping the aspect ratio, which degrades object detection quality.
Note
It is highly recommended to specify the `--input_shape` command line parameter for models with `keep_aspect_ratio_resizer` if the input image dimensions are known in advance.
Model Conversion Process in Detail#
This section is intended for users who want to understand in detail how model conversion API converts Object Detection API models. The information in this section is also useful for users whose complex models are not converted by model conversion API out of the box. It is highly recommended to read the Graph Transformation Extensions section in the [Legacy] Model Optimizer Extensibility documentation first, to understand the sub-graph replacement concepts used here.
It is also important to open the model in TensorBoard to see the topology structure. Model conversion API can create an event file that can then be fed to the TensorBoard tool. To do so, run model conversion, providing two command line parameters:
- `--input_model <path_to_frozen.pb>` - Path to the frozen model.
- `--tensorboard_logdir` - Path to the directory where TensorBoard looks for the event files.
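For example (the log directory below is a placeholder, and the `tensorboard` command assumes TensorBoard is installed):

```sh
mo --input_model=/tmp/ssd_inception_v2_coco_2018_01_28/frozen_inference_graph.pb --tensorboard_logdir=/tmp/tensorboard_logs
tensorboard --logdir=/tmp/tensorboard_logs
```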
The transformations for Object Detection API models are implemented in a dedicated file; refer to its code to understand the details of the conversion process.