OpenAI API embeddings endpoint#
API Reference#
OpenVINO Model Server now includes an embeddings
endpoint compatible with the OpenAI API.
Please see the OpenAI API Reference for more information on the API.
The endpoint is exposed via a path:
http://server_name:port/v3/embeddings
Example request#
curl http://localhost/v3/embeddings \
-H "Content-Type: application/json" \
-d '{
"model": "gte-large",
"input": ["This is a test"],
"encoding_format": "float"
}'
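The same request can also be issued from Python. The sketch below uses only the standard library and mirrors the curl call above. The helper names `build_embeddings_request` and `send_request` are hypothetical, and the base URL is assumed to match the one used in the curl example.

```python
import json
import urllib.request


def build_embeddings_request(base_url, model, inputs, encoding_format="float"):
    """Build (but do not send) a POST request for the /v3/embeddings path."""
    payload = {
        "model": model,
        "input": inputs,
        "encoding_format": encoding_format,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v3/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_request(request, timeout=30):
    """Send the request and return the parsed JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, assuming a server is listening on localhost as in the curl example:
# request = build_embeddings_request("http://localhost", "gte-large", ["This is a test"])
# result = send_request(request)
```

Building the request separately from sending it makes it possible to inspect the URL and body before any network traffic occurs.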
Example response#
{
"object": "list",
"data": [
{
"object": "embedding",
"embedding": [
-0.03440694510936737,
-0.02553200162947178,
-0.010130723007023335,
-0.013917984440922737,
...
0.02722850814461708,
-0.017527244985103607,
-0.0053995149210095406
],
"index": 0
}
]
}
Request#
Generic#
| Param | OpenVINO Model Server | OpenAI /embeddings API | Type | Description |
|---|---|---|---|---|
| model | ✅ | ✅ | string (required) | Name of the model to use. Name assigned to a MediaPipe graph configured to schedule generation using the desired embedding model. |
| input | ✅ | ✅ | string/list of strings (required) | Input text to embed, encoded as a string or a list of strings. |
| encoding_format | ✅ | ✅ | float or base64 (default: float) | The format in which to return the embeddings. |
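When `encoding_format` is set to `base64`, each embedding arrives as a string rather than a list of numbers. The following is a minimal decoding sketch. It assumes the string encodes a packed array of little-endian 32-bit floats, which is the layout used by the OpenAI service; this page does not state the layout, so verify it against your server before relying on it. The function name `decode_base64_embedding` is hypothetical.

```python
import base64
import struct


def decode_base64_embedding(encoded):
    """Decode a base64 string into a list of floats.

    Assumption: the payload is a packed array of little-endian float32 values.
    """
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    # "<" selects little-endian byte order, "f" is a 4-byte float.
    return list(struct.unpack("<%df" % count, raw))
```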
Unsupported params from OpenAI service:#
user
dimensions
Response#
| Param | OpenVINO Model Server | OpenAI /embeddings API | Type | Description |
|---|---|---|---|---|
| data | ✅ | ✅ | array | A list of responses, one for each input string |
| data.embedding | ✅ | ✅ | array of float or base64 string | Embedding vector for a string |
| data.index | ✅ | ✅ | integer | Response index |
| model | ✅ | ✅ | string | Model name |
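As a minimal sketch of consuming these fields, the helper below collects the embedding vectors from a parsed response and orders them by `data.index`. The function name `embeddings_by_index` is hypothetical, and the input is assumed to be the response body already parsed from JSON into a Python dictionary.

```python
def embeddings_by_index(response):
    """Return the embeddings from a parsed response, sorted by their index field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```

Sorting by `index` keeps the result stable even if the entries in `data` are not listed in ascending order.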
Unsupported params from OpenAI service:#
usage
References#
End to end demo with embeddings endpoint
Developer guide for writing custom calculators with REST API extension