KServe API Clients

Python Client

To build a Python client application, you can use the Triton client library, tritonclient.

Install the Package

pip3 install tritonclient[all]

Request Health Endpoints

import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient("localhost:9000")

# Check server liveness
server_live = client.is_server_live()

# Check server readiness
server_ready = client.is_server_ready()

# Check model readiness
model_ready = client.is_model_ready("model_name")
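
In a startup script it can be useful to wait until both the server and the model report ready before sending requests. The following is a minimal sketch rather than part of tritonclient: the helper name `wait_until_ready` and its `timeout_s` and `poll_interval_s` parameters are hypothetical, and it assumes only that the client exposes the `is_server_ready()` and `is_model_ready()` calls shown above.

```python
import time


def wait_until_ready(client, model_name, timeout_s=30.0, poll_interval_s=0.5):
    """Poll the readiness endpoints until both report ready or the timeout expires.

    `client` is any object exposing is_server_ready() and is_model_ready(name),
    for example grpcclient.InferenceServerClient.
    Returns True once server and model are ready, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if client.is_server_ready() and client.is_model_ready(model_name):
                return True
        except Exception:
            # Assumption: an error here means the server is not reachable yet,
            # so the loop keeps polling instead of failing immediately.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)
```

With the client created above, this could be called as `wait_until_ready(client, "model_name")` before the first inference request.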

Request Server Metadata

import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient("localhost:9000")
server_metadata = client.get_server_metadata()

Request Model Metadata

import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient("localhost:9000")
model_metadata = client.get_model_metadata("model_name")
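
The model metadata lists the model's input and output tensors, which is a convenient way to discover the names, datatypes, and shapes needed for an inference request. The sketch below summarizes them; the helper name `describe_tensors` is hypothetical, and it assumes the gRPC response exposes `inputs` and `outputs` entries with `name`, `datatype`, and `shape` fields, as defined by the KServe v2 model metadata message.

```python
def describe_tensors(model_metadata):
    """Summarize a model metadata response.

    Returns {"inputs": {...}, "outputs": {...}}, where each inner dict maps
    a tensor name to a (datatype, shape) tuple.
    """
    def summarize(tensors):
        # Convert the repeated shape field to a plain list for easy printing.
        return {t.name: (t.datatype, list(t.shape)) for t in tensors}

    return {
        "inputs": summarize(model_metadata.inputs),
        "outputs": summarize(model_metadata.outputs),
    }
```

For example, `describe_tensors(model_metadata)["inputs"]` gives the input names to pass to `InferInput`.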

Request Prediction on a NumPy Array

import numpy as np
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient("localhost:9000")
# 1000 values from 1.0 to 1000.0; the dtype must match the "FP32" datatype declared below
data = np.arange(1.0, 1001.0, dtype=np.float32)
infer_input = grpcclient.InferInput("input_name", data.shape, "FP32")
infer_input.set_data_from_numpy(data)
results = client.infer("model_name", [infer_input])
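
The datatype string passed to `InferInput` has to match the dtype of the array set with `set_data_from_numpy`; for instance, a float32 array corresponds to "FP32". The sketch below derives the `InferInput` arguments from an array so the two cannot drift apart. The names `NUMPY_TO_KSERVE` and `input_spec` are hypothetical and the table covers only a subset of the KServe v2 datatypes; tritonclient also provides `tritonclient.utils.np_to_triton_dtype` for this conversion.

```python
# Subset of KServe v2 datatype names, keyed by NumPy dtype name.
NUMPY_TO_KSERVE = {
    "bool": "BOOL",
    "uint8": "UINT8",
    "int8": "INT8",
    "int16": "INT16",
    "int32": "INT32",
    "int64": "INT64",
    "float16": "FP16",
    "float32": "FP32",
    "float64": "FP64",
}


def input_spec(name, array):
    """Return the (name, shape, datatype) arguments for InferInput.

    `array` only needs `.dtype` and `.shape`, as a NumPy array provides.
    """
    dtype_name = str(array.dtype)
    if dtype_name not in NUMPY_TO_KSERVE:
        raise ValueError(f"unsupported dtype for this sketch: {dtype_name}")
    return name, list(array.shape), NUMPY_TO_KSERVE[dtype_name]
```

With the request above, this would be used as `infer_input = grpcclient.InferInput(*input_spec("input_name", data))`. After the call to `client.infer`, the prediction can be read back as a NumPy array with `results.as_numpy("output_name")`, where "output_name" is a placeholder for the model's actual output name.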

For complete usage examples, see the KServe samples.

C++ Client

Creating a client application in C++ follows the same principles as in Python: you can use the Triton client library, tritonclient.

See our C++ samples to learn how to build a sample C++ client application.