The Intel® Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN) is the primary performance vehicle for the CPU codepath in the Inference Engine, and new CPU kernels extend the Inference Engine plugin for Intel MKL-DNN. Implementing the InferenceEngine::ILayerExecImpl interface defines a general CPU-side extension; there are no Intel MKL-DNN specifics in the way you need to implement a kernel.
Implementation Class
All custom kernels for the CPU plugin should be inherited from the InferenceEngine::ILayerExecImpl interface. Based on that, declaration of a kernel implementation class can look as follows:
class OpImplementation : public InferenceEngine::ILayerExecImpl {
public:
    explicit OpImplementation(const std::shared_ptr<ngraph::Node>& node);
    InferenceEngine::StatusCode getSupportedConfigurations(std::vector<InferenceEngine::LayerConfig>& conf,
                                                           InferenceEngine::ResponseDesc* resp) noexcept override;
    InferenceEngine::StatusCode init(InferenceEngine::LayerConfig& config,
                                     InferenceEngine::ResponseDesc* resp) noexcept override;
    InferenceEngine::StatusCode execute(std::vector<InferenceEngine::Blob::Ptr>& inputs,
                                        std::vector<InferenceEngine::Blob::Ptr>& outputs,
                                        InferenceEngine::ResponseDesc* resp) noexcept override;

private:
    int64_t add;
    ngraph::Shape inShape;
    ngraph::Shape outShape;
    std::string error;
};
Class Fields
The provided implementation has several fields:
- `add` of the type `int64_t` is an attribute of a custom operation.
- `inShape` of the type `ngraph::Shape` is an input shape.
- `outShape` of the type `ngraph::Shape` is an output shape.
- `error` of the type `std::string` is a field to handle errors from a constructor.
Constructor of Implementation
An implementation constructor checks parameters of nGraph operation, stores needed attributes, and stores an error message in the case of an error.
OpImplementation::OpImplementation(const std::shared_ptr<ngraph::Node>& node) {
    try {
        auto castedNode = std::dynamic_pointer_cast<Operation>(node);
        if (!castedNode)
            THROW_IE_EXCEPTION << "Cannot create implementation for unknown operation!";
        if (castedNode->inputs().size() != 1 || castedNode->outputs().size() != 1)
            THROW_IE_EXCEPTION <<
                "Cannot create implementation for operation with incorrect number of inputs or outputs!";
        if (castedNode->get_input_partial_shape(0).is_dynamic() || castedNode->get_output_partial_shape(0).is_dynamic())
            THROW_IE_EXCEPTION << "Cannot create implementation for operation with dynamic shapes!";
        if (castedNode->get_input_shape(0).size() != 4 || castedNode->get_output_shape(0).size() != 4)
            THROW_IE_EXCEPTION << "Operation supports only 4d tensors for input and output!";
        if (castedNode->get_input_element_type(0) != ngraph::element::f32 || castedNode->get_output_element_type(0) != ngraph::element::f32)
            THROW_IE_EXCEPTION << "Operation supports only FP32 tensors!";
        add = castedNode->getAddAttr();
        inShape = castedNode->get_input_shape(0);
        outShape = castedNode->get_output_shape(0);
    } catch (InferenceEngine::details::InferenceEngineException& ex) {
        error = ex.what();
    }
}
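Because the plugin later calls the implementation through noexcept methods, the constructor above does not let validation exceptions escape: it catches them and stores the message in the `error` field, which is reported from `getSupportedConfigurations`. A minimal self-contained sketch of this error-deferral pattern (the class and method names here are illustrative, not part of the Inference Engine API):

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Validation errors thrown in the constructor are remembered,
// not propagated, so that noexcept methods can report them later.
class DeferredErrorImpl {
public:
    explicit DeferredErrorImpl(int numInputs) {
        try {
            if (numInputs != 1)
                throw std::runtime_error("Operation expects exactly one input!");
        } catch (const std::exception& ex) {
            error = ex.what();  // store the message instead of rethrowing
        }
    }
    // Later, noexcept calls check the stored error instead of throwing.
    bool ok() const noexcept { return error.empty(); }
    const std::string& message() const noexcept { return error; }

private:
    std::string error;
};
```

A caller would construct the object, then check `ok()` before use, exactly as `getSupportedConfigurations` checks `error.empty()` below.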
getSupportedConfigurations
InferenceEngine::ILayerExecImpl::getSupportedConfigurations method returns all supported configuration formats (input/output tensor layouts) for your implementation. To specify formats of data, use InferenceEngine::TensorDesc. Refer to the Memory Primitives section for instructions on how to do it.
InferenceEngine::StatusCode OpImplementation::getSupportedConfigurations(std::vector<InferenceEngine::LayerConfig>& conf,
                                                                         InferenceEngine::ResponseDesc* resp) noexcept {
    auto createConfig = [](const InferenceEngine::SizeVector& inShape, const InferenceEngine::SizeVector& outShape, bool planar) {
        InferenceEngine::LayerConfig config;
        config.dynBatchSupport = false;
        InferenceEngine::DataConfig inData;
        InferenceEngine::DataConfig outData;
        InferenceEngine::SizeVector order = {0, 1, 2, 3};
        // Allow any offset before data
        size_t offset((std::numeric_limits<size_t>::max)());
        if (planar) {
            inData.desc = InferenceEngine::TensorDesc(InferenceEngine::Precision::FP32, inShape, {inShape, order, offset});
            config.inConfs.push_back(inData);
            outData.desc = InferenceEngine::TensorDesc(InferenceEngine::Precision::FP32, outShape, {outShape, order, offset});
            config.outConfs.push_back(outData);
        } else {
            // Blocked (nChw8c) layout
            auto div_up = [](const int a, const int b) -> int {
                if (!b)
                    return 0;
                return (a + b - 1) / b;
            };

            order.push_back(1);
            InferenceEngine::SizeVector inBlkDims = inShape;
            inBlkDims[1] = div_up(inBlkDims[1], 8);
            inBlkDims.push_back(8);
            InferenceEngine::SizeVector outBlkDims = outShape;
            outBlkDims[1] = div_up(outBlkDims[1], 8);
            outBlkDims.push_back(8);
            inData.desc = InferenceEngine::TensorDesc(InferenceEngine::Precision::FP32, inShape, {inBlkDims, order, offset});
            config.inConfs.push_back(inData);
            outData.desc = InferenceEngine::TensorDesc(InferenceEngine::Precision::FP32, outShape, {outBlkDims, order, offset});
            config.outConfs.push_back(outData);
        }
        return config;
    };
    if (!error.empty()) {
        if (resp) {
            strncpy(resp->msg, error.c_str(), sizeof(resp->msg) - 1);
            resp->msg[sizeof(resp->msg) - 1] = 0;
        }
        return InferenceEngine::GENERAL_ERROR;
    }
    // Add planar format
    conf.emplace_back(createConfig(inShape, outShape, true));
    // Add blocked format nChw8c
    conf.emplace_back(createConfig(inShape, outShape, false));
    return InferenceEngine::OK;
}
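The blocked `nChw8c` configuration above rounds the channel dimension up to a multiple of 8 with `div_up` and appends the block size as an extra innermost dimension. Stripped of the Inference Engine types, the dimension arithmetic can be sketched in a self-contained form (the `to_nChw8c` helper name is hypothetical):

```cpp
#include <cstddef>
#include <vector>

// Round-up integer division, as in the div_up lambda above.
int div_up(int a, int b) {
    return b ? (a + b - 1) / b : 0;
}

// Given planar NCHW dims, compute the blocked nChw8c dims:
// the channel dimension is split into blocks of 8, and the
// block size becomes an extra innermost dimension.
std::vector<size_t> to_nChw8c(const std::vector<size_t>& nchw) {
    std::vector<size_t> blocked = nchw;
    blocked[1] = static_cast<size_t>(div_up(static_cast<int>(nchw[1]), 8));
    blocked.push_back(8);
    return blocked;
}
```

For example, a 1x6x2x2 tensor maps to blocked dims {1, 1, 2, 2, 8}: the 6 channels occupy one block of 8, with the remaining 2 lanes padded.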
init
InferenceEngine::ILayerExecImpl::init method gets a runtime-selected configuration from a vector that is populated from the getSupportedConfigurations method and checks the parameters:
InferenceEngine::StatusCode OpImplementation::init(InferenceEngine::LayerConfig& config, InferenceEngine::ResponseDesc* resp) noexcept {
    try {
        if (config.inConfs.size() != 1 || config.outConfs.size() != 1) {
            THROW_IE_EXCEPTION <<
                "Operation cannot be initialized with incorrect number of inputs/outputs!";
        }
        if (config.inConfs[0].desc.getDims().size() != 4 || config.outConfs[0].desc.getDims().size() != 4) {
            THROW_IE_EXCEPTION << "Operation can be initialized only with 4d input/output tensors!";
        }
        if (config.inConfs[0].desc.getPrecision() != InferenceEngine::Precision::FP32 ||
            config.outConfs[0].desc.getPrecision() != InferenceEngine::Precision::FP32) {
            THROW_IE_EXCEPTION << "Operation supports only FP32 precisions!";
        }
    } catch (InferenceEngine::details::InferenceEngineException& ex) {
        if (resp) {
            strncpy(resp->msg, ex.what(), sizeof(resp->msg) - 1);
            resp->msg[sizeof(resp->msg) - 1] = 0;
        }
        return InferenceEngine::GENERAL_ERROR;
    }
    return InferenceEngine::OK;
}
execute
InferenceEngine::ILayerExecImpl::execute method accepts and processes the actual tensors as input/output blobs:
InferenceEngine::StatusCode OpImplementation::execute(std::vector<InferenceEngine::Blob::Ptr>& inputs,
                                                      std::vector<InferenceEngine::Blob::Ptr>& outputs,
                                                      InferenceEngine::ResponseDesc* resp) noexcept {
    const float* src_data = inputs[0]->cbuffer().as<const float*>() + inputs[0]->getTensorDesc().getBlockingDesc().getOffsetPadding();
    float* dst_data = outputs[0]->buffer().as<float*>() + outputs[0]->getTensorDesc().getBlockingDesc().getOffsetPadding();

    for (size_t i = 0; i < inputs[0]->size(); i++) {
        dst_data[i] = src_data[i] + add;
    }
    return InferenceEngine::OK;
}
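Stripped of the blob plumbing, the kernel body above is a plain element-wise addition of the scalar `add` attribute over a float buffer. A minimal standalone sketch (the `add_scalar` helper name is hypothetical, not part of the Inference Engine API):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Element-wise addition of a scalar attribute to every element,
// mirroring the loop in OpImplementation::execute above.
void add_scalar(const float* src, float* dst, size_t count, int64_t add) {
    for (size_t i = 0; i < count; ++i) {
        dst[i] = src[i] + static_cast<float>(add);
    }
}
```

Note that the real kernel works the same way for both the planar and the blocked configuration: because it is purely element-wise, iterating over the raw buffer is correct regardless of the memory layout selected in `init`.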
Register Implementation in Extension Class
To register custom kernel implementation in the Extension class, implement the following methods:
getImplTypes
InferenceEngine::IExtension::getImplTypes returns a vector of implementation types for an operation.
std::vector<std::string> Extension::getImplTypes(const std::shared_ptr<ngraph::Node> &node) {
if (std::dynamic_pointer_cast<Operation>(node)) {
return {"CPU"};
}
return {};
}
getImplementation
InferenceEngine::IExtension::getImplementation returns the kernel implementation with a specified type for an operation.
InferenceEngine::ILayerImpl::Ptr Extension::getImplementation(const std::shared_ptr<ngraph::Node>& node, const std::string& implType) {
    if (std::dynamic_pointer_cast<Operation>(node) && implType == "CPU") {
        return std::make_shared<OpImplementation>(node);
    }
    return nullptr;
}
Load Extension with Executable Kernels to Plugin
Use the AddExtension method of the general plugin interface to load your primitives:
auto extension_ptr = make_so_pointer<InferenceEngine::IExtension>("<shared lib path>");
core.AddExtension(extension_ptr, "CPU");