openvino_genai.Text2SpeechPipeline

class openvino_genai.Text2SpeechPipeline

Bases: pybind11_object

Text-to-speech pipeline

__init__(self: openvino_genai.py_openvino_genai.Text2SpeechPipeline, models_path: os.PathLike | str | bytes, device: str, **kwargs) -> None

Text2SpeechPipeline class constructor.

models_path (os.PathLike): Path to the model file.
device (str): Device to run the model on (e.g., CPU, GPU).
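As a rough illustration of the constructor, the sketch below wraps it in a small helper. The helper name `load_tts_pipeline` is hypothetical and not part of openvino_genai; only the `Text2SpeechPipeline(models_path, device, **kwargs)` call follows the signature documented above. The import is deferred so that the path check runs before the package is needed.

```python
from pathlib import Path


def load_tts_pipeline(models_path, device="CPU", **properties):
    """Check that models_path exists, then construct a Text2SpeechPipeline.

    Hypothetical helper for illustration; not part of openvino_genai.
    """
    models_path = Path(models_path)
    if not models_path.exists():
        raise FileNotFoundError(f"models_path does not exist: {models_path}")

    # Deferred import: the package is only required once the path is valid.
    import openvino_genai

    # Extra keyword arguments are forwarded unchanged as **kwargs.
    return openvino_genai.Text2SpeechPipeline(models_path, device, **properties)
```

A call such as `load_tts_pipeline("./speecht5_tts", "GPU")` would then build the pipeline on the GPU device, where `./speecht5_tts` is a placeholder for a locally available model.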

Methods

__delattr__(name, /)

Implement delattr(self, name).

__dir__()

Default dir() implementation.

__eq__(value, /)

Return self==value.

__format__(format_spec, /)

Default object formatter.

__ge__(value, /)

Return self>=value.

__getattribute__(name, /)

Return getattr(self, name).

__getstate__()

Helper for pickle.

__gt__(value, /)

Return self>value.

__hash__()

Return hash(self).

__init__(self, models_path, device, **kwargs)

Text2SpeechPipeline class constructor.

__init_subclass__

This method is called when a class is subclassed.

__le__(value, /)

Return self<=value.

__lt__(value, /)

Return self<value.

__ne__(value, /)

Return self!=value.

__new__(**kwargs)

__reduce__()

Helper for pickle.

__reduce_ex__(protocol, /)

Helper for pickle.

__repr__()

Return repr(self).

__setattr__(name, value, /)

Implement setattr(self, name, value).

__sizeof__()

Size of object in memory, in bytes.

__str__()

Return str(self).

__subclasshook__

Abstract classes can override this to customize issubclass().

_pybind11_conduit_v1_

generate(*args, **kwargs)

Overloaded function.

get_generation_config(self)

set_generation_config(self, config)

Attributes

__annotations__

__annotations__ = {}

__class__

alias of pybind11_type

__delattr__(name, /)

Implement delattr(self, name).

__dir__()

Default dir() implementation.

__eq__(value, /)

Return self==value.

__format__(format_spec, /)

Default object formatter.

Return str(self) if format_spec is empty. Raise TypeError otherwise.

__ge__(value, /)

Return self>=value.

__getattribute__(name, /)

Return getattr(self, name).

__getstate__()

Helper for pickle.

__gt__(value, /)

Return self>value.

__hash__()

Return hash(self).

__init__(self: openvino_genai.py_openvino_genai.Text2SpeechPipeline, models_path: os.PathLike | str | bytes, device: str, **kwargs) -> None

Text2SpeechPipeline class constructor.

models_path (os.PathLike): Path to the model file.
device (str): Device to run the model on (e.g., CPU, GPU).

__init_subclass__()

This method is called when a class is subclassed.

The default implementation does nothing. It may be overridden to extend subclasses.

__le__(value, /)

Return self<=value.

__lt__(value, /)

Return self<value.

__ne__(value, /)

Return self!=value.

__new__(**kwargs)

__reduce__()

Helper for pickle.

__reduce_ex__(protocol, /)

Helper for pickle.

__repr__()

Return repr(self).

__setattr__(name, value, /)

Implement setattr(self, name, value).

__sizeof__()

Size of object in memory, in bytes.

__str__()

Return str(self).

__subclasshook__()

Abstract classes can override this to customize issubclass().

This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).

_pybind11_conduit_v1_()

generate(*args, **kwargs)

Overloaded function.

  1. generate(self: openvino_genai.py_openvino_genai.Text2SpeechPipeline, text: str, speaker_embedding: object = None, **kwargs) -> openvino_genai.py_openvino_genai.Text2SpeechDecodedResults

    Generates speech for the input text.

    param text:

    input text for which to generate speech

    type text:

    str

    param speaker_embedding:

    optional speaker embedding tensor representing the unique characteristics of a speaker’s voice. If not provided for the SpeechT5 TTS model, the 7306th vector from the validation set of the Matthijs/cmu-arctic-xvectors dataset is used by default.

    type speaker_embedding:

    openvino.Tensor or None

    param properties:

    speech generation parameters specified as properties

    type properties:

    dict

    returns:

    raw audio of the input text spoken in the specified speaker’s voice, with a sample rate of 16 kHz

    rtype:

    Text2SpeechDecodedResults

    SpeechGenerationConfig

    Speech-generation specific parameters:

    param minlenratio:

    minimum ratio of output length to input text length; prevents output that is too short.

    type minlenratio:

    float

    param maxlenratio:

    maximum ratio of output length to input text length; prevents excessively long output.

    type maxlenratio:

    float

    param threshold:

    probability threshold for stopping decoding; when the output probability exceeds this value, generation stops.

    type threshold:

    float

  2. generate(self: openvino_genai.py_openvino_genai.Text2SpeechPipeline, texts: collections.abc.Sequence[str], speaker_embedding: object = None, **kwargs) -> openvino_genai.py_openvino_genai.Text2SpeechDecodedResults

    Generates speech for each of the input texts.

    param texts:

    input texts for which to generate speech

    type texts:

    list[str]

    param speaker_embedding:

    optional speaker embedding tensor representing the unique characteristics of a speaker’s voice. If not provided for the SpeechT5 TTS model, the 7306th vector from the validation set of the Matthijs/cmu-arctic-xvectors dataset is used by default.

    type speaker_embedding:

    openvino.Tensor or None

    param properties:

    speech generation parameters specified as properties

    type properties:

    dict

    returns:

    raw audios of the input texts spoken in the specified speaker’s voice, with a sample rate of 16 kHz

    rtype:

    Text2SpeechDecodedResults

    SpeechGenerationConfig

    Speech-generation specific parameters:

    param minlenratio:

    minimum ratio of output length to input text length; prevents output that is too short.

    type minlenratio:

    float

    param maxlenratio:

    maximum ratio of output length to input text length; prevents excessively long output.

    type maxlenratio:

    float

    param threshold:

    probability threshold for stopping decoding; when the output probability exceeds this value, generation stops.

    type threshold:

    float
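The two generate() overloads and the keyword parameters described above can be exercised with a sketch along the following lines. The helper names `speech_params` and `synthesize` are hypothetical, and the range checks are this sketch's own sanity checks rather than rules enforced by the library. `pipe` is assumed to be an already constructed Text2SpeechPipeline, and `speaker_embedding`, when given, is assumed to be an openvino.Tensor.

```python
def speech_params(minlenratio=None, maxlenratio=None, threshold=None):
    """Return the speech-generation parameters that were set, as a dict.

    The range checks are this sketch's own sanity checks, not library rules.
    """
    if threshold is not None and not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold is a probability and should lie in [0, 1]")
    if (minlenratio is not None and maxlenratio is not None
            and minlenratio > maxlenratio):
        raise ValueError("minlenratio should not exceed maxlenratio")
    candidates = {
        "minlenratio": minlenratio,
        "maxlenratio": maxlenratio,
        "threshold": threshold,
    }
    # Leave out anything unset so the pipeline keeps its own defaults.
    return {name: value for name, value in candidates.items() if value is not None}


def synthesize(pipe, text, speaker_embedding=None, **params):
    """Call pipe.generate with a str (overload 1) or a list of str (overload 2)."""
    if speaker_embedding is None:
        # No embedding passed: the pipeline uses its default speaker vector.
        return pipe.generate(text, **params)
    return pipe.generate(text, speaker_embedding, **params)
```

For example, `synthesize(pipe, ["Hello.", "Goodbye."], **speech_params(threshold=0.5))` would route to the second overload and forward `threshold` as a generation property.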

get_generation_config(self: openvino_genai.py_openvino_genai.Text2SpeechPipeline) -> openvino_genai.py_openvino_genai.SpeechGenerationConfig

set_generation_config(self: openvino_genai.py_openvino_genai.Text2SpeechPipeline, config: openvino_genai.py_openvino_genai.SpeechGenerationConfig) -> None
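
A typical use of this getter and setter pair is a read, modify, write cycle. The sketch below shows one way to do that. The helper name `update_generation_config` is hypothetical, and it assumes that SpeechGenerationConfig exposes its parameters (such as minlenratio, maxlenratio, and threshold) as writable attributes, which this page does not state explicitly.

```python
def update_generation_config(pipe, **changes):
    """Read the current config, apply attribute changes, and write it back.

    Assumes the config object exposes its parameters as writable attributes.
    """
    config = pipe.get_generation_config()
    for name, value in changes.items():
        # Reject unknown names rather than silently adding new attributes.
        if not hasattr(config, name):
            raise AttributeError(f"config has no field named {name!r}")
        setattr(config, name, value)
    pipe.set_generation_config(config)
    return config
```

Unlike passing properties to a single generate() call, a change made this way would presumably persist for later calls on the same pipeline.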