OpenAI API text to speech endpoints#
API Reference#
OpenVINO Model Server includes now the audio/speech endpoint using OpenAI API.
It is used to execute text to speech task with OpenVINO GenAI pipeline.
Please see the OpenAI API Create Speech Reference for more information on the API.
The endpoint is exposed via a path:
http://server_name:port/v3/audio/speech
Request body must be in JSON format, and the request must have Content-Type: application/json header.
Example request#
curl http://localhost:8000/v3/audio/speech \
-H "Content-Type: application/json" \
-d '{
"model": "microsoft/speecht5_tts",
"input": "The quick brown fox jumped over the lazy dog.",
}' \
-o speech.wav
Example response#
speech.wav - audio file in wav format.
Request#
Param |
OpenVINO Model Server |
OpenAI /audio//speech API |
Type |
Description |
|---|---|---|---|---|
model |
✅ |
✅ |
string (required) |
Name of the model to use. Name assigned to a MediaPipe graph configured to schedule generation using desired embedding model. Note: This can also be omitted to fall back to URI based routing. Read more on routing topic TODO |
input |
✅ |
✅ |
string (required) |
The text to generate audio for. |
voice |
❌ |
✅ |
string (required) |
The voice to use when generating the audio. |
instructions |
❌ |
✅ |
string |
Control the voice of your generated audio with additional instructions. |
response_format |
❌ |
✅ |
string |
The format to audio in. |
speed |
❌ |
✅ |
number |
The speed of the generated audio. |
stream_format |
❌ |
✅ |
string |
The format to stream the audio in. |
Error handling#
Endpoint can raise an error related to incorrect request in the following conditions:
Incorrect format of any of the fields based on the schema