KServe compatible gRPC API
Introduction
This document describes the OpenVINO Model Server gRPC API compatible with KServe. The API itself is documented in the KServe repository. The gRPC interface is recommended for optimal performance because its input data deserialization is faster. gRPC achieves lower latency, especially with larger input messages such as images.
The API includes the following endpoints: Server Live API, Server Ready API, Server Metadata API, Model Ready API, Model Metadata API, and Inference API.
Note
Examples of using each of the above endpoints can be found in KServe samples.
Server Live API
Gets information about server liveness. The server is alive when a communication channel can be established successfully.
Check KServe documentation for more details.
Server Ready API
Gets information about server readiness. The server is ready when the initial configuration has been loaded. The server enters the ready state only once and remains in that state for the rest of its lifetime, regardless of the outcome of the initial loading phase. If some of the models have not been loaded successfully, the server still becomes ready when the loading procedure finishes.
Check KServe documentation for more details.
Server Metadata API
Gets information about the server itself.
Check KServe documentation for more details.
Model Ready API
Gets information about the readiness of a specific model. A model is ready when it is fully capable of running inference.
Check KServe documentation for more details.
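Because the server reports ready even when some models failed to load, a client that needs a particular model usually has to check model readiness as well. The following is a minimal sketch of such a startup check, not the server's own tooling. It assumes a client object that exposes `is_server_live()`, `is_server_ready()` and `is_model_ready(model_name)`, which is the method set offered by `tritonclient.grpc.InferenceServerClient`; the function name `wait_until_model_ready` and its parameters are hypothetical.

```python
import time


def wait_until_model_ready(client, model_name, timeout_s=30.0, interval_s=0.5,
                           sleep=time.sleep, clock=time.monotonic):
    """Poll server liveness, server readiness and model readiness.

    Returns True once all three checks pass, or False if timeout_s
    elapses first. `client` is any object with is_server_live(),
    is_server_ready() and is_model_ready(name) methods.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            if (client.is_server_live()
                    and client.is_server_ready()
                    and client.is_model_ready(model_name)):
                return True
        except Exception:
            # Assumption: errors raised while the server is still starting
            # (for example a refused connection) are transient, so retry.
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

With a real client this could be called as `wait_until_model_ready(client, "my_model")`, where the model name is a placeholder. Checking the model last keeps the distinction from the text above visible: a ready server does not imply that every model loaded.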
Model Metadata API
Gets information about a specific model.
Check KServe documentation for more details.
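Model metadata is typically used to validate an input before sending it. The sketch below assumes the metadata has already been fetched and converted to a plain dictionary with an `inputs` list of `name`, `datatype` and `shape` entries, mirroring the KServe metadata response. It also assumes that a dimension of -1 marks a dynamic size and that shape values may arrive as strings after JSON conversion. The helper name `check_input` is hypothetical.

```python
def check_input(metadata, name, datatype, shape):
    """Return a list of mismatches between a planned input and the metadata.

    An empty list means the input name, datatype and shape all agree
    with what the model declares.
    """
    declared = next(
        (i for i in metadata.get("inputs", []) if i["name"] == name), None)
    if declared is None:
        return [f"no input named {name!r}"]

    problems = []
    if declared["datatype"] != datatype:
        problems.append(
            f"datatype {datatype} does not match {declared['datatype']}")

    # int() tolerates shapes serialized as strings; -1 is treated as "any size".
    declared_shape = [int(d) for d in declared["shape"]]
    if len(declared_shape) != len(shape) or any(
            d != -1 and d != s for d, s in zip(declared_shape, shape)):
        problems.append(f"shape {list(shape)} does not match {declared_shape}")
    return problems
```

For example, a model declaring shape `[-1, 3, 224, 224]` would accept a planned input of shape `[1, 3, 224, 224]` under these assumptions and reject `[1, 3, 224]`.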
Inference API
Runs inference with the requested model or DAG.
Check KServe documentation for more details.
Note
Inference supports putting tensor buffers either in ModelInferRequest's InferTensorContents or in raw_input_contents. The BF16 data type is not supported, and FP16 cannot be used in InferTensorContents. When sending raw JPEG image files, use the BYTES data type and put the data in InferTensorContents's bytes_contents or in raw_input_contents for batch size equal to 1.
Also, using the BYTES datatype, it is possible to send binary-encoded images that are preprocessed by OVMS using OpenCV and converted to an OpenVINO-friendly format. For more information, check how binary data is handled in OpenVINO Model Server.
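The placement rules in this note can be sketched as a small helper that decides which request field a flattened tensor goes into. This is an illustration under stated assumptions, not OVMS code: the function name `place_tensor` is hypothetical, raw buffers are assumed to be little-endian, and the batch-size-1 limit is read as applying to BYTES data in raw_input_contents. The typed field names follow the InferTensorContents message in the KServe proto.

```python
import struct

# Typed InferTensorContents field used for each datatype (KServe proto).
CONTENTS_FIELD = {
    "BOOL": "bool_contents",
    "INT8": "int_contents", "INT16": "int_contents", "INT32": "int_contents",
    "INT64": "int64_contents",
    "UINT8": "uint_contents", "UINT16": "uint_contents",
    "UINT32": "uint_contents",
    "UINT64": "uint64_contents",
    "FP32": "fp32_contents",
    "FP64": "fp64_contents",
    "BYTES": "bytes_contents",
}

# struct format codes used to serialize values into raw_input_contents.
STRUCT_CODE = {
    "BOOL": "?", "INT8": "b", "INT16": "h", "INT32": "i", "INT64": "q",
    "UINT8": "B", "UINT16": "H", "UINT32": "I", "UINT64": "Q",
    "FP16": "e", "FP32": "f", "FP64": "d",
}


def place_tensor(datatype, values, use_raw):
    """Return (field_name, payload) for one flattened input tensor."""
    if datatype == "BF16":
        raise ValueError("BF16 is not supported")
    if use_raw:
        if datatype == "BYTES":
            # Assumed reading of the note: raw BYTES only for batch size 1.
            if len(values) != 1:
                raise ValueError("raw BYTES input is limited to batch size 1")
            return "raw_input_contents", bytes(values[0])
        # Assumption: raw buffers are little-endian, row-major.
        fmt = f"<{len(values)}{STRUCT_CODE[datatype]}"
        return "raw_input_contents", struct.pack(fmt, *values)
    if datatype == "FP16":
        raise ValueError("FP16 must be sent through raw_input_contents")
    return CONTENTS_FIELD[datatype], list(values)
```

Calling `place_tensor("FP16", values, use_raw=True)` returns two bytes per value, while the same call with `use_raw=False` raises, which mirrors the FP16 restriction described above.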
See Also
Example client code shows how to use the gRPC API and the REST API.