Image Classification Async C++ Sample

This sample demonstrates how to do inference of image classification models using Asynchronous Inference Request API.

Models with only one input and output are supported.

In addition to regular images, the sample also supports single-channel ubyte images as an input for LeNet model.

Options

Values

Validated Models

alexnet, googlenet-v1

Model Format

OpenVINO™ toolkit Intermediate Representation (*.xml + *.bin), ONNX (*.onnx)

Supported devices

All

Other language realization

Python

The following C++ API is used in the application:

Feature

API

Description

Asynchronous Infer

ov::InferRequest::start_async, ov::InferRequest::set_callback

Do asynchronous inference with callback.

Model Operations

ov::Output::get_shape, ov::set_batch

Manage the model, operate with its batch size. Set batch size using input image count.

Infer Request Operations

ov::InferRequest::get_input_tensor

Get an input tensor.

Tensor Operations

ov::shape_size, ov::Tensor::data

Get a tensor shape size and its data.

Basic OpenVINO™ Runtime API is covered by Hello Classification C++ sample.

// Copyright (C) 2018-2023 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

/**
 * @brief The entry point the OpenVINO Runtime sample application
 * @file classification_sample_async/main.cpp
 * @example classification_sample_async/main.cpp
 */

#include <sys/stat.h>

#include <condition_variable>
#include <fstream>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// clang-format off
#include "openvino/openvino.hpp"

#include "samples/args_helper.hpp"
#include "samples/common.hpp"
#include "samples/classification_results.h"
#include "samples/slog.hpp"
#include "format_reader_ptr.h"

#include "classification_sample_async.h"
// clang-format on

constexpr auto N_TOP_RESULTS = 10;

using namespace ov::preprocess;

/**
 * @brief Checks input args
 * @param argc number of args
 * @param argv list of input arguments
 * @return bool status true(Success) or false(Fail)
 */
bool parse_and_check_command_line(int argc, char* argv[]) {
    gflags::ParseCommandLineNonHelpFlags(&argc, &argv, true);
    if (FLAGS_h) {
        show_usage();
        showAvailableDevices();
        return false;
    }
    slog::info << "Parsing input parameters" << slog::endl;

    if (FLAGS_m.empty()) {
        show_usage();
        throw std::logic_error("Model is required but not set. Please set -m option.");
    }

    if (FLAGS_i.empty()) {
        show_usage();
        throw std::logic_error("Input is required but not set. Please set -i option.");
    }

    return true;
}

int main(int argc, char* argv[]) {
    try {
        // -------- Get OpenVINO Runtime version --------
        slog::info << ov::get_openvino_version() << slog::endl;

        // -------- Parsing and validation of input arguments --------
        if (!parse_and_check_command_line(argc, argv)) {
            return EXIT_SUCCESS;
        }

        // -------- Read input --------
        // This vector stores paths to the processed images
        std::vector<std::string> image_names;
        parseInputFilesArguments(image_names);
        if (image_names.empty())
            throw std::logic_error("No suitable images were found");

        // -------- Step 1. Initialize OpenVINO Runtime Core --------
        ov::Core core;

        // -------- Step 2. Read a model --------
        slog::info << "Loading model files:" << slog::endl << FLAGS_m << slog::endl;
        std::shared_ptr<ov::Model> model = core.read_model(FLAGS_m);
        printInputAndOutputsInfo(*model);

        OPENVINO_ASSERT(model->inputs().size() == 1, "Sample supports models with 1 input only");
        OPENVINO_ASSERT(model->outputs().size() == 1, "Sample supports models with 1 output only");

        // -------- Step 3. Configure preprocessing --------
        const ov::Layout tensor_layout{"NHWC"};

        ov::preprocess::PrePostProcessor ppp(model);
        // 1) input() with no args assumes a model has a single input
        ov::preprocess::InputInfo& input_info = ppp.input();
        // 2) Set input tensor information:
        // - precision of tensor is supposed to be 'u8'
        // - layout of data is 'NHWC'
        input_info.tensor().set_element_type(ov::element::u8).set_layout(tensor_layout);
        // 3) Here we suppose model has 'NCHW' layout for input
        input_info.model().set_layout("NCHW");
        // 4) output() with no args assumes a model has a single result
        // - output() with no args assumes a model has a single result
        // - precision of tensor is supposed to be 'f32'
        ppp.output().tensor().set_element_type(ov::element::f32);

        // 5) Once the build() method is called, the pre(post)processing steps
        // for layout and precision conversions are inserted automatically
        model = ppp.build();

        // -------- Step 4. read input images --------
        slog::info << "Read input images" << slog::endl;

        ov::Shape input_shape = model->input().get_shape();
        const size_t width = input_shape[ov::layout::width_idx(tensor_layout)];
        const size_t height = input_shape[ov::layout::height_idx(tensor_layout)];

        std::vector<std::shared_ptr<unsigned char>> images_data;
        std::vector<std::string> valid_image_names;
        for (const auto& i : image_names) {
            FormatReader::ReaderPtr reader(i.c_str());
            if (reader.get() == nullptr) {
                slog::warn << "Image " + i + " cannot be read!" << slog::endl;
                continue;
            }
            // Collect image data
            std::shared_ptr<unsigned char> data(reader->getData(width, height));
            if (data != nullptr) {
                images_data.push_back(data);
                valid_image_names.push_back(i);
            }
        }
        if (images_data.empty() || valid_image_names.empty())
            throw std::logic_error("Valid input images were not found!");

        // -------- Step 5. Loading model to the device --------
        // Setting batch size using image count
        const size_t batchSize = images_data.size();
        slog::info << "Set batch size " << std::to_string(batchSize) << slog::endl;
        ov::set_batch(model, batchSize);
        printInputAndOutputsInfo(*model);

        // -------- Step 6. Loading model to the device --------
        slog::info << "Loading model to the device " << FLAGS_d << slog::endl;
        ov::CompiledModel compiled_model = core.compile_model(model, FLAGS_d);

        // -------- Step 7. Create infer request --------
        slog::info << "Create infer request" << slog::endl;
        ov::InferRequest infer_request = compiled_model.create_infer_request();

        // -------- Step 8. Combine multiple input images as batch --------
        ov::Tensor input_tensor = infer_request.get_input_tensor();

        for (size_t image_id = 0; image_id < images_data.size(); ++image_id) {
            const size_t image_size = shape_size(model->input().get_shape()) / batchSize;
            std::memcpy(input_tensor.data<std::uint8_t>() + image_id * image_size,
                        images_data[image_id].get(),
                        image_size);
        }

        // -------- Step 9. Do asynchronous inference --------
        size_t num_iterations = 10;
        size_t cur_iteration = 0;
        std::condition_variable condVar;
        std::mutex mutex;
        std::exception_ptr exception_var;
        // -------- Step 10. Do asynchronous inference --------
        infer_request.set_callback([&](std::exception_ptr ex) {
            std::lock_guard<std::mutex> l(mutex);
            if (ex) {
                exception_var = ex;
                condVar.notify_all();
                return;
            }

            cur_iteration++;
            slog::info << "Completed " << cur_iteration << " async request execution" << slog::endl;
            if (cur_iteration < num_iterations) {
                // here a user can read output containing inference results and put new
                // input to repeat async request again
                infer_request.start_async();
            } else {
                // continue sample execution after last Asynchronous inference request
                // execution
                condVar.notify_one();
            }
        });

        // Start async request for the first time
        slog::info << "Start inference (asynchronous executions)" << slog::endl;
        infer_request.start_async();

        // Wait all iterations of the async request
        std::unique_lock<std::mutex> lock(mutex);
        condVar.wait(lock, [&] {
            if (exception_var) {
                std::rethrow_exception(exception_var);
            }

            return cur_iteration == num_iterations;
        });

        slog::info << "Completed async requests execution" << slog::endl;

        // -------- Step 11. Process output --------
        ov::Tensor output = infer_request.get_output_tensor();

        // Read labels from file (e.x. AlexNet.labels)
        std::string labelFileName = fileNameNoExt(FLAGS_m) + ".labels";
        std::vector<std::string> labels;

        std::ifstream inputFile;
        inputFile.open(labelFileName, std::ios::in);
        if (inputFile.is_open()) {
            std::string strLine;
            while (std::getline(inputFile, strLine)) {
                trim(strLine);
                labels.push_back(strLine);
            }
        }

        // Prints formatted classification results
        ClassificationResult classificationResult(output, valid_image_names, batchSize, N_TOP_RESULTS, labels);
        classificationResult.show();
    } catch (const std::exception& ex) {
        slog::err << ex.what() << slog::endl;
        return EXIT_FAILURE;
    } catch (...) {
        slog::err << "Unknown/internal exception happened." << slog::endl;
        return EXIT_FAILURE;
    }

    return EXIT_SUCCESS;
}

How It Works

At startup, the sample application reads command line parameters and loads the specified model and input images (or a folder with images) to the OpenVINO™ Runtime plugin. The batch size of the model is set according to the number of read images. The batch mode is an independent attribute on the asynchronous mode. Asynchronous mode works efficiently with any batch size.

Then, the sample creates an inference request object and assigns completion callback for it. In scope of the completion callback handling the inference request is executed again.

After that, the application starts inference for the first infer request and waits of 10th inference request execution being completed. The asynchronous mode might increase the throughput of the pictures.

When inference is done, the application outputs data to the standard output stream. You can place labels in .labels file near the model to get pretty output.

You can see the explicit description of each sample step at Integration Steps section of “Integrate OpenVINO™ Runtime with Your Application” guide.

Building

To build the sample, please use instructions available at Build the Sample Applications section in OpenVINO™ Toolkit Samples guide.

Running

Run the application with the -h option to see the usage instructions:

classification_sample_async -h

Usage instructions:

[ INFO ] OpenVINO Runtime version ......... <version>
[ INFO ] Build ........... <build>

classification_sample_async [OPTION]
Options:

    -h                      Print usage instructions.
    -m "<path>"             Required. Path to an .xml file with a trained model.
    -i "<path>"             Required. Path to a folder with images or path to image files: a .ubyte file for LeNet and a .bmp file for other models.
    -d "<device>"           Optional. Specify the target device to infer on (the list of available devices is shown below). Default value is CPU. Use "-d HETERO:<comma_separated_devices_list>" format to specify the HETERO plugin. Sample will look for a suitable plugin for the device specified.

Available target devices: <devices>

To run the sample, you need to specify a model and image:

  • You can use public or Intel’s pre-trained models from the Open Model Zoo. The models can be downloaded using the Model Downloader.

  • You can use images from the media files collection available here <https://storage.openvinotoolkit.org/data/test_data>.

Note

  • By default, OpenVINO™ Toolkit Samples and Demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using mo with reverse_input_channels argument specified. For more information about the argument, refer to When to Reverse Input Channels section of Embedding Preprocessing Computation.

  • Before running the sample with a trained model, make sure the model is converted to the intermediate representation (IR) format (*.xml + *.bin) using the model conversion API.

  • The sample accepts models in ONNX format (.onnx) that do not require preprocessing.

  • Stating flags that take only single option like -m multiple times, for example ./classification_sample_async -m model.xml -m model2.xml, results in only the first value being used.

Example

  1. Install the openvino-dev Python package to use Open Model Zoo Tools:

    python -m pip install openvino-dev[caffe]
    
  2. Download a pre-trained model using:

    omz_downloader --name googlenet-v1
    
  3. If a model is not in the IR or ONNX format, it must be converted. You can do this using the model converter:

    omz_converter --name googlenet-v1
    
  4. Perform inference of dog.bmp using googlenet-v1 model on a GPU, for example:

    classification_sample_async -m googlenet-v1.xml -i dog.bmp -d GPU
    

Sample Output

[ INFO ] OpenVINO Runtime version ......... <version>
[ INFO ] Build ........... <build>
[ INFO ]
[ INFO ] Parsing input parameters
[ INFO ] Files were added: 1
[ INFO ]     /images/dog.bmp
[ INFO ] Loading model files:
[ INFO ] /models/googlenet-v1.xml
[ INFO ] model name: GoogleNet
[ INFO ]     inputs
[ INFO ]         input name: data
[ INFO ]         input type: f32
[ INFO ]         input shape: {1, 3, 224, 224}
[ INFO ]     outputs
[ INFO ]         output name: prob
[ INFO ]         output type: f32
[ INFO ]         output shape: {1, 1000}
[ INFO ] Read input images
[ INFO ] Set batch size 1
[ INFO ] model name: GoogleNet
[ INFO ]     inputs
[ INFO ]         input name: data
[ INFO ]         input type: u8
[ INFO ]         input shape: {1, 224, 224, 3}
[ INFO ]     outputs
[ INFO ]         output name: prob
[ INFO ]         output type: f32
[ INFO ]         output shape: {1, 1000}
[ INFO ] Loading model to the device GPU
[ INFO ] Create infer request
[ INFO ] Start inference (asynchronous executions)
[ INFO ] Completed 1 async request execution
[ INFO ] Completed 2 async request execution
[ INFO ] Completed 3 async request execution
[ INFO ] Completed 4 async request execution
[ INFO ] Completed 5 async request execution
[ INFO ] Completed 6 async request execution
[ INFO ] Completed 7 async request execution
[ INFO ] Completed 8 async request execution
[ INFO ] Completed 9 async request execution
[ INFO ] Completed 10 async request execution
[ INFO ] Completed async requests execution

Top 10 results:

Image /images/dog.bmp

classid probability
------- -----------
156     0.8935547
218     0.0608215
215     0.0217133
219     0.0105667
212     0.0018835
217     0.0018730
152     0.0018730
157     0.0015745
154     0.0012817
220     0.0010099