OpenAI API embeddings endpoint#

API Reference#

OpenVINO Model Server now includes an embeddings endpoint compatible with the OpenAI API. Please see the OpenAI API Reference for more information about the API. The endpoint is exposed at the following path:

http://server_name:port/v3/embeddings

Example request#

curl http://localhost/v3/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gte-large",
    "input": ["This is a test"],
    "encoding_format": "float"
  }'
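
The same request can also be sent with the official `openai` Python client by pointing its base URL at the model server. The snippet below is a minimal sketch, assuming the `openai` package is installed and using the server address and model name from the example above; the placeholder API key is only there because the client requires one.

```python
# Minimal sketch: calling the OVMS embeddings endpoint through the openai client.
# Assumes the `openai` package is installed and the server from the example above.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost/v3",  # OpenAI-compatible API is served under /v3
    api_key="unused",                # placeholder; the model server does not check it
)

response = client.embeddings.create(
    model="gte-large",
    input=["This is a test"],
    encoding_format="float",
)

print(len(response.data[0].embedding))  # dimensionality of the returned vector
```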

Example response#

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [
        -0.03440694510936737,
        -0.02553200162947178,
        -0.010130723007023335,
        -0.013917984440922737,
...
        0.02722850814461708,
        -0.017527244985103607,
        -0.0053995149210095406
      ],
      "index": 0
    }
  ]
}

Request#

Generic#

| Param | OpenVINO Model Server | OpenAI /embeddings API | Type | Description |
|-------|-----------------------|------------------------|------|-------------|
| model | ✅ | ✅ | string (required) | Name of the model to use. Name assigned to a MediaPipe graph configured to schedule generation using the desired embedding model. |
| input | ✅ | ✅ | string/list of strings (required) | Input text to embed, encoded as a string or a list of strings. |
| encoding_format | ✅ | ✅ | float or base64 (default: float) | The format to return the embeddings in. |
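
When `encoding_format` is set to `base64`, the embedding values are returned as a base64-encoded string instead of a JSON array of floats. The sketch below assumes the OpenAI convention of base64-encoding the raw little-endian float32 bytes and shows how such a response could be requested and decoded using only the Python standard library.

```python
# Sketch: request base64-encoded embeddings and decode them.
# Assumes the payload follows the OpenAI convention of base64-encoded
# little-endian float32 bytes, and the server/model from the earlier example.
import base64
import json
import struct
import urllib.request

payload = json.dumps({
    "model": "gte-large",
    "input": ["This is a test"],
    "encoding_format": "base64",
}).encode()

req = urllib.request.Request(
    "http://localhost/v3/embeddings",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

raw = base64.b64decode(body["data"][0]["embedding"])
vector = struct.unpack(f"<{len(raw) // 4}f", raw)  # tuple of float32 values
print(len(vector))
```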

Unsupported params from OpenAI service:#

  • user

  • dimensions

Response#

| Param | OpenVINO Model Server | OpenAI /embeddings API | Type | Description |
|-------|-----------------------|------------------------|------|-------------|
| data | ✅ | ✅ | array | A list of responses, one for each input string |
| data.embedding | ✅ | ✅ | array of float or base64 string | Vector of embeddings for a string |
| data.index | ✅ | ✅ | integer | Response index |
| model | ✅ | ✅ | string | Model name |
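
The `index` field identifies which input string each embedding corresponds to. Below is a short sketch of consuming these response fields, assuming the `requests` package is installed and using the server and model from the earlier examples.

```python
# Sketch: map each returned embedding back to its input string via data.index.
# Assumes the `requests` package and the server/model from the earlier examples.
import requests

inputs = ["This is a test", "This is another test"]
resp = requests.post(
    "http://localhost/v3/embeddings",
    json={"model": "gte-large", "input": inputs, "encoding_format": "float"},
)
resp.raise_for_status()

for item in resp.json()["data"]:
    text = inputs[item["index"]]   # input string this embedding belongs to
    vector = item["embedding"]     # list of floats
    print(f"{text!r} -> {len(vector)}-dimensional embedding")
```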

Unsupported params from OpenAI service:#

  • usage

References#

  • End to end demo with embeddings endpoint

  • Code snippets

  • Developer guide for writing custom calculators with REST API extension