openvino_genai.WhisperPipeline#
- class openvino_genai.WhisperPipeline#
Bases: pybind11_object
Automatic speech recognition pipeline
- __init__(self: openvino_genai.py_openvino_genai.WhisperPipeline, models_path: os.PathLike, device: str, **kwargs) None #
WhisperPipeline class constructor.
models_path (os.PathLike) – Path to the directory with the exported model files.
device (str) – Device to run the model on (e.g., CPU, GPU).
Methods

__delattr__(name, /) – Implement delattr(self, name).
__dir__() – Default dir() implementation.
__eq__(value, /) – Return self==value.
__format__(format_spec, /) – Default object formatter.
__ge__(value, /) – Return self>=value.
__getattribute__(name, /) – Return getattr(self, name).
__gt__(value, /) – Return self>value.
__hash__() – Return hash(self).
__init__(self, models_path, device, **kwargs) – WhisperPipeline class constructor.
__init_subclass__() – This method is called when a class is subclassed.
__le__(value, /) – Return self<=value.
__lt__(value, /) – Return self<value.
__ne__(value, /) – Return self!=value.
__new__(**kwargs)
__reduce__() – Helper for pickle.
__reduce_ex__(protocol, /) – Helper for pickle.
__repr__() – Return repr(self).
__setattr__(name, value, /) – Implement setattr(self, name, value).
__sizeof__() – Size of object in memory, in bytes.
__str__() – Return str(self).
__subclasshook__() – Abstract classes can override this to customize issubclass().
generate(self, raw_speech_input[, ...]) – High level generate that receives raw speech as a vector of floats and returns decoded output.
get_generation_config(self)
get_tokenizer(self)
set_generation_config(self, config)
- __class__#
alias of pybind11_type
- __delattr__(name, /)#
Implement delattr(self, name).
- __dir__()#
Default dir() implementation.
- __eq__(value, /)#
Return self==value.
- __format__(format_spec, /)#
Default object formatter.
- __ge__(value, /)#
Return self>=value.
- __getattribute__(name, /)#
Return getattr(self, name).
- __gt__(value, /)#
Return self>value.
- __hash__()#
Return hash(self).
- __init__(self: openvino_genai.py_openvino_genai.WhisperPipeline, models_path: os.PathLike, device: str, **kwargs) None #
WhisperPipeline class constructor.
models_path (os.PathLike) – Path to the directory with the exported model files.
device (str) – Device to run the model on (e.g., CPU, GPU).
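A minimal construction sketch, assuming a Whisper model already exported to OpenVINO format in a local directory (the directory name and device below are placeholders):

import openvino_genai

# "whisper-base-ov" is a placeholder: a local directory with the exported Whisper model files.
# Other OpenVINO device names, e.g. "GPU", can be passed instead of "CPU".
pipe = openvino_genai.WhisperPipeline("whisper-base-ov", device="CPU")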
- __init_subclass__()#
This method is called when a class is subclassed.
The default implementation does nothing. It may be overridden to extend subclasses.
- __le__(value, /)#
Return self<=value.
- __lt__(value, /)#
Return self<value.
- __ne__(value, /)#
Return self!=value.
- __new__(**kwargs)#
- __reduce__()#
Helper for pickle.
- __reduce_ex__(protocol, /)#
Helper for pickle.
- __repr__()#
Return repr(self).
- __setattr__(name, value, /)#
Implement setattr(self, name, value).
- __sizeof__()#
Size of object in memory, in bytes.
- __str__()#
Return str(self).
- __subclasshook__()#
Abstract classes can override this to customize issubclass().
This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).
- generate(self: openvino_genai.py_openvino_genai.WhisperPipeline, raw_speech_input: list[float], generation_config: openvino_genai.py_openvino_genai.WhisperGenerationConfig | None = None, streamer: Callable[[str], bool] | openvino_genai.py_openvino_genai.StreamerBase | None = None, **kwargs) openvino_genai.py_openvino_genai.DecodedResults #
High level generate that receives raw speech as a vector of floats and returns decoded output.
- Parameters:
raw_speech_input (List[float]) – inputs in the form of a list of floats. Required to be normalized to the near [-1, 1] range and to have a 16 kHz sampling rate.
generation_config (WhisperGenerationConfig or a Dict) – generation_config
streamer (Callable[[str], bool] or openvino_genai.StreamerBase) – streamer that receives decoded text chunks and returns a boolean flag indicating whether generation should be stopped. Streaming is supported only for short-form audio (< 30 seconds) with return_timestamps=False (see the usage sketch below).
kwargs (Dict) – arbitrary keyword arguments with keys corresponding to WhisperGenerationConfig fields.
- Returns:
results in decoded form
- Return type:
DecodedResults
WhisperGenerationConfig parameters:
- Parameters:
max_length (int) – the maximum length the generated tokens can have. Corresponds to the length of the input prompt + max_new_tokens. Its effect is overridden by max_new_tokens, if also set.
max_new_tokens (int) – the maximum numbers of tokens to generate, excluding the number of tokens in the prompt. max_new_tokens has priority over max_length.
eos_token_id (int) – End of stream token id.
Whisper specific parameters:
- Parameters:
decoder_start_token_id (int) – Corresponds to the “<|startoftranscript|>” token.
pad_token_id (int) – Padding token id.
translate_token_id (int) – Translate token id.
transcribe_token_id (int) – Transcribe token id.
no_timestamps_token_id (int) – No timestamps token id.
is_multilingual (bool)
begin_suppress_tokens (list[int]) – A list containing tokens that will be suppressed at the beginning of the sampling process.
suppress_tokens (list[int]) – A list containing the non-speech tokens that will be suppressed during generation.
language (Optional[str]) – Language token to use for generation in the form of <|en|>. You can find all the possible language tokens in the generation_config.json lang_to_id dictionary.
lang_to_id (Dict[str, int]) – Language token to token_id map. Initialized from the generation_config.json lang_to_id dictionary.
task (str) – Task to use for generation, either “translate” or “transcribe”.
return_timestamps (bool) –
If true the pipeline will return timestamps along with the text for segments of words in the text. For instance, if you get a WhisperDecodedResultChunk with
start_ts = 0.5
end_ts = 1.5
text = “ Hi there!”
then the model predicts that the segment “ Hi there!” was spoken after 0.5 and before 1.5 seconds. Note that a segment of text refers to a sequence of one or more words, rather than individual words.
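A usage sketch for generate. It assumes librosa (not part of openvino_genai) is available to load and resample audio; any method that yields mono float samples near [-1, 1] at 16 kHz works, and the file and directory names are placeholders:

import librosa
import openvino_genai

# Placeholder model directory; see the constructor documentation above.
pipe = openvino_genai.WhisperPipeline("whisper-base-ov", "CPU")

# Whisper expects mono samples normalized to roughly [-1, 1] at a 16 kHz sampling rate;
# librosa.load resamples and returns float32 samples in that range.
raw_speech, _ = librosa.load("sample.wav", sr=16000)

# Optional streamer: print decoded text as it arrives; returning False means "do not stop".
def streamer(text: str) -> bool:
    print(text, end="", flush=True)
    return False

result = pipe.generate(
    raw_speech.tolist(),
    max_new_tokens=100,   # any WhisperGenerationConfig field can be passed as a keyword argument
    streamer=streamer,
)
print(result)

A WhisperGenerationConfig object can be passed as generation_config instead of individual keyword arguments.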
- get_generation_config(self: openvino_genai.py_openvino_genai.WhisperPipeline) openvino_genai.py_openvino_genai.WhisperGenerationConfig #
- get_tokenizer(self: openvino_genai.py_openvino_genai.WhisperPipeline) openvino_genai.py_openvino_genai.Tokenizer #
- set_generation_config(self: openvino_genai.py_openvino_genai.WhisperPipeline, config: openvino_genai.py_openvino_genai.WhisperGenerationConfig) None #
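A short sketch of the get_generation_config / set_generation_config pair (and get_tokenizer), using field names from the WhisperGenerationConfig parameters documented above; the model directory is a placeholder:

import openvino_genai

# Placeholder model directory.
pipe = openvino_genai.WhisperPipeline("whisper-base-ov", "CPU")

# Read the current generation config, adjust fields documented above, and write it back.
config = pipe.get_generation_config()
config.max_new_tokens = 256
config.return_timestamps = True
pipe.set_generation_config(config)

# The tokenizer used by the pipeline is also available.
tokenizer = pipe.get_tokenizer()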