KServe compatible gRPC API

Introduction

This document describes the OpenVINO Model Server gRPC API that is compatible with KServe. The API itself is documented in the KServe repository. The gRPC interface is recommended for optimal performance because of its faster implementation of input data deserialization. gRPC achieves lower latency, especially with larger input messages such as images.

The API includes the following endpoints: Server Live, Server Ready, Server Metadata, Model Ready, Model Metadata, and Inference.

Note

Examples of using each of the above endpoints can be found in the KServe samples.

Server Live API

Gets information about server liveness. The server is alive when a communication channel can be established successfully.

Check KServe documentation for more details.

Server Ready API

Gets information about server readiness. The server is ready when the initial configuration has been loaded. The server enters the ready state only once and remains in that state for the rest of its lifetime, regardless of the outcome of the initial loading phase. If some of the models have not been loaded successfully, the server still becomes ready when the loading procedure finishes.

Check KServe documentation for more details.
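Because the server can report ready even when a model failed to load, a client usually needs to check model readiness separately. The following is a minimal sketch of that idea, not an OpenVINO Model Server or KServe API: `wait_for_model`, `server_ready`, and `model_ready` are hypothetical names, and the two callables are assumed to wrap the Server Ready and Model Ready calls of whichever gRPC client is in use.

```python
import time
from typing import Callable


def wait_for_model(
    server_ready: Callable[[], bool],
    model_ready: Callable[[], bool],
    timeout_s: float = 30.0,
    interval_s: float = 0.5,
) -> bool:
    """Return True once both the server and the model report ready.

    server_ready and model_ready are hypothetical callables supplied by the
    caller; they are assumed to wrap the Server Ready and Model Ready calls.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        # Server readiness only says that initial loading has finished,
        # so the model is checked separately.
        if server_ready() and model_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Passing the checks in as callables keeps the polling logic independent of any particular client library and makes it easy to exercise with stubs.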

Server Metadata API

Gets information about the server itself.

Check KServe documentation for more details.

Model Ready API

Gets information about the readiness of a specific model. A model is ready when it is fully capable of running inference.

Check KServe documentation for more details.

Model Metadata API

Gets information about a specific model.

Check KServe documentation for more details.

Inference API

Runs inference with the requested model or DAG.

Check KServe documentation for more details.
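To make the shape of an inference request more concrete, here is a rough sketch that builds a plain Python dict mirroring a few fields of the KServe ModelInferRequest message (`model_name`, `inputs`, `raw_input_contents`). It does not use a gRPC client. The helper name `fp32_raw_request`, the model name `resnet`, and the input name `input` are hypothetical, and the byte layout (row-major, 4-byte little-endian floats) is an assumption that should be checked against the KServe documentation.

```python
import struct
from typing import Any, Dict, List, Sequence


def fp32_raw_request(
    model_name: str, input_name: str, rows: Sequence[Sequence[float]]
) -> Dict[str, Any]:
    """Build a dict shaped like a ModelInferRequest for one 2-D FP32 input."""
    shape = [len(rows), len(rows[0])]
    # Flatten in row-major order.
    flat: List[float] = [value for row in rows for value in row]
    # Assumption: 4-byte little-endian floats, no padding between elements.
    raw = struct.pack(f"<{len(flat)}f", *flat)
    return {
        "model_name": model_name,
        "inputs": [{"name": input_name, "datatype": "FP32", "shape": shape}],
        "raw_input_contents": [raw],
    }


# Hypothetical model and input names, used only for illustration.
request = fp32_raw_request("resnet", "input", [[1.0, 2.0], [3.0, 4.0]])
```

In a real client the same values would populate the generated protobuf message or be set through a client library helper rather than a dict.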

Note

Inference supports placing tensor buffers either in the InferTensorContents of ModelInferRequest or in raw_input_contents. The BF16 data type is not supported, and FP16 is not supported in InferTensorContents. When sending raw images such as JPEG files, the BYTES data type should be used and the data should be placed in the bytes_contents field of InferTensorContents, or in raw_input_contents when the batch size is 1.

Check how binary data is handled in OpenVINO Model Server
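The restrictions in this note can be summarized as a small client-side check. This is only an illustrative sketch: `check_placement` is a hypothetical helper, not part of OpenVINO Model Server or KServe, and it reads the batch size restriction as applying to BYTES data in raw_input_contents only.

```python
def check_placement(datatype: str, field: str, batch_size: int = 1) -> None:
    """Raise ValueError if the datatype/field combination is listed as unsupported.

    field is "contents" (InferTensorContents) or "raw" (raw_input_contents).
    """
    if field not in ("contents", "raw"):
        raise ValueError(f"unknown field: {field}")
    if datatype == "BF16":
        raise ValueError("BF16 is not supported")
    if datatype == "FP16" and field == "contents":
        raise ValueError("FP16 must be sent in raw_input_contents")
    # Assumption: the batch size limit applies only to raw_input_contents.
    if datatype == "BYTES" and field == "raw" and batch_size != 1:
        raise ValueError("BYTES in raw_input_contents requires batch size 1")


check_placement("FP32", "contents")            # accepted
check_placement("FP16", "raw")                 # accepted
check_placement("BYTES", "raw", batch_size=1)  # accepted
```

Running such a check before building the request surfaces an unsupported combination locally instead of as a server-side error.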

See Also