Using Encrypted Models with OpenVINO™

Deploying deep-learning capabilities to edge devices can present security challenges, such as ensuring inference integrity or protecting the copyright of your deep-learning models.

One possible solution is to use cryptography to protect models as they are deployed and stored on edge devices. Model encryption, decryption, and authentication are not provided by OpenVINO™ but can be implemented with third-party tools, such as OpenSSL*. When implementing encryption, use the latest versions of these tools and follow cryptography best practices.

This guide demonstrates how to use OpenVINO securely with protected models.

Secure Model Deployment

After a model is optimized by the OpenVINO Model Optimizer, it is deployed to target devices in the Intermediate Representation (IR) format. The optimized model is stored on an edge device and executed by the Inference Engine.

To protect deep-learning models, you can encrypt an optimized model before deploying it to the edge device. The edge device should keep the stored model encrypted at all times and decrypt it only at runtime, in memory, for use by the Inference Engine.

[Image: deploy_encrypted_model.png]
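One way to limit how long the decrypted model stays in memory is to overwrite the temporary buffer once it is no longer needed. The following is a minimal sketch under stated assumptions, not part of OpenVINO: `secure_wipe` and `ScopedPlaintext` are hypothetical names introduced here for illustration. The writes go through a `volatile` pointer so that the compiler is less likely to remove them as dead stores; platform-specific primitives may give stronger guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: overwrite a buffer with zeros through a volatile
// pointer so the stores are less likely to be optimized away.
void secure_wipe(std::vector<uint8_t>& buf) {
    volatile uint8_t* p = buf.data();
    for (std::size_t i = 0; i < buf.size(); ++i) {
        p[i] = 0;
    }
    // Debug-only sanity check that the wipe took effect.
    assert(buf.empty() || buf.front() == 0);
}

// Hypothetical RAII holder: owns decrypted bytes and wipes them when it
// goes out of scope. Copying is disabled to avoid stray plaintext copies.
class ScopedPlaintext {
public:
    explicit ScopedPlaintext(std::vector<uint8_t> bytes)
        : bytes_(std::move(bytes)) {}
    ~ScopedPlaintext() { secure_wipe(bytes_); }

    ScopedPlaintext(const ScopedPlaintext&) = delete;
    ScopedPlaintext& operator=(const ScopedPlaintext&) = delete;

    // Read-only access, so the buffer never reallocates.
    const std::vector<uint8_t>& bytes() const { return bytes_; }

private:
    std::vector<uint8_t> bytes_;
};
```

The holder must stay alive for as long as any consumer still reads from its memory. If a buffer is wrapped by another object without copying, wipe it only after that object has been released.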

Loading Encrypted Models

The OpenVINO Inference Engine requires a model to be decrypted before it is loaded. Allocate a temporary memory block for the decrypted model, and use the InferenceEngine::Core::ReadNetwork method to load the model from that memory buffer. For more information, see the InferenceEngine::Core Class Reference Documentation.

// decrypt_file is a user-implemented helper; it is not provided by OpenVINO
std::vector<uint8_t> model;
std::vector<uint8_t> weights;
// Read the encrypted model files and decrypt them into temporary memory blocks
decrypt_file(model_file, password, model);
decrypt_file(weights_file, password, weights);
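The `decrypt_file` helper used above is left to the implementer. The sketch below shows one possible shape for it, assuming the signature `decrypt_file(path, password, out)` implied by the snippet. It reads the encrypted file into a memory buffer and decrypts it in place, so the plaintext is never written to disk. The `placeholder_cipher` function is a hypothetical stand-in (a plain XOR with the password bytes) used only to keep the example self-contained; it is not secure and should be replaced with an authenticated cipher from a vetted library such as OpenSSL*.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical placeholder transform: XOR each byte with the password.
// NOT SECURE. It only marks where a real authenticated cipher would run.
// XOR is symmetric, so the same call both "encrypts" and "decrypts".
void placeholder_cipher(std::vector<uint8_t>& data,
                        const std::string& password) {
    assert(!password.empty());
    for (std::size_t i = 0; i < data.size(); ++i) {
        data[i] ^= static_cast<uint8_t>(password[i % password.size()]);
    }
}

// Reads the encrypted file into `out` and decrypts it in memory.
// Signature assumed from the snippet above; no plaintext touches the disk.
void decrypt_file(const std::string& path,
                  const std::string& password,
                  std::vector<uint8_t>& out) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    out.assign(std::istreambuf_iterator<char>(in),
               std::istreambuf_iterator<char>());
    placeholder_cipher(out, password);
}
```

Throwing on a missing file keeps a failed read from being mistaken for an empty model. A real implementation would also verify an authentication tag before using the decrypted bytes.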

Hardware-based protection, such as Intel® Software Guard Extensions (Intel® SGX), can be used to protect the secrets involved in decryption and bind them to a device. For more information, go to Intel® Software Guard Extensions.

Pass the decrypted model representation and the weights to InferenceEngine::Core::ReadNetwork() as the first and second arguments, respectively.

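// Assumes #include <inference_engine.hpp> and using namespace InferenceEngine;
Core core;
// Load the model from the temporary memory block
std::string strModel(model.begin(), model.end());
// Wrap the decrypted weights in a 1-D U8 blob; `weights` must outlive the blob
CNNNetwork network = core.ReadNetwork(strModel,
    make_shared_blob<uint8_t>({Precision::U8, {weights.size()}, C}, weights.data()));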

Additional Resources