openvino_genai

OpenVINO GenAI module namespace, exposing pipelines and the configuration classes used to create them.

Functions

`draft_model`(models_path[, device])	Creates a draft model for speculative decoding; `device` is the device on which inference will be performed
`get_version`()	OpenVINO GenAI version
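
As a quick check that the package is importable, the version function listed above can be called directly. The following is a minimal sketch: the helper name `genai_version` is hypothetical (not part of the library), and it assumes only that `openvino_genai.get_version()` exists as listed. The result is wrapped in `str()` so the sketch does not depend on the exact return type.

```python
import importlib.util


def genai_version():
    """Return the OpenVINO GenAI version as a string, or None if the package is not installed.

    `genai_version` is a hypothetical helper for illustration only.
    """
    # find_spec returns None for a missing top-level package instead of raising.
    if importlib.util.find_spec("openvino_genai") is None:
        return None
    import openvino_genai  # imported lazily so the sketch also runs without the package

    # get_version() is the module-level function listed in the table above.
    return str(openvino_genai.get_version())


print(genai_version())
```

When the package is not installed, the helper prints `None` rather than failing, which makes it safe to use in environment diagnostics.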

Classes

`Adapter`	Immutable LoRA Adapter that carries the adaptation matrices and serves as unique adapter identifier.
`AdapterConfig`	Adapter config that defines a combination of LoRA adapters with blending parameters.
`AggregationMode`	Represents the mode of per-token score aggregation when determining least important tokens for eviction from cache
`AutoencoderKL`	AutoencoderKL class.
`AutoencoderKLLTXVideo`	AutoencoderKLLTXVideo class for LTX-Video VAE decoding.
`CLIPTextModel`	CLIPTextModel class.
`CLIPTextModelWithProjection`	CLIPTextModelWithProjection class.
`CacheEvictionConfig`	Configuration struct for the cache eviction algorithm.
`ChatHistory`	ChatHistory stores conversation messages and optional metadata for chat templates.
`ContinuousBatchingPipeline`	This class is used for generation with LLMs with continuous batching
`CppStdGenerator`	This class wraps std::mt19937 pseudo-random generator.
`DecodedResults`	Structure to store resulting batched text outputs and scores for each batch.
`DeepSeekR1ReasoningIncrementalParser`
`DeepSeekR1ReasoningParser`
`EncodedResults`	Structure to store resulting batched tokens and scores for each batch sequence.
`FluxTransformer2DModel`	FluxTransformer2DModel class.
`GenerationConfig`	Structure to keep generation config parameters.
`GenerationFinishReason`	Members:
`GenerationResult`	GenerationResult stores resulting batched tokens and scores.
`GenerationStatus`	Members:
`Generator`	This class is used for storing pseudo-random generator.
`Image2ImagePipeline`	This class is used for generation with image-to-image models.
`ImageGenerationConfig`	This class is used for storing generation config for image generation pipeline.
`ImageGenerationPerfMetrics`	Holds performance metrics for each generate call.
`IncrementalParser`
`InpaintingPipeline`	This class is used for generation with inpainting models.
`KVCrushAnchorPointMode`	Represents the anchor point types for KVCrush cache eviction
`KVCrushConfig`	Configuration for KVCrush cache eviction algorithm
`LLMPipeline`	This class is used for generation with LLMs
`LTXVideoTransformer3DModel`	LTXVideoTransformer3DModel class for LTX-Video denoising.
`Llama3JsonToolParser`
`Llama3PythonicToolParser`
`Parser`
`PerfMetrics`	Holds performance metrics for each generate call.
`Phi4ReasoningIncrementalParser`
`Phi4ReasoningParser`
`RawImageGenerationPerfMetrics`	Structure with raw performance metrics for each generation before any statistics are calculated.
`RawPerfMetrics`	Structure with raw performance metrics for each generation before any statistics are calculated.
`ReasoningIncrementalParser`
`ReasoningParser`
`SD3Transformer2DModel`	SD3Transformer2DModel class.
`Scheduler`	Scheduler for image generation pipelines.
`SchedulerConfig`	SchedulerConfig to construct ContinuousBatchingPipeline
`SparseAttentionConfig`	Configuration struct for the sparse attention functionality.
`SparseAttentionMode`	Represents the mode of sparse attention applied during generation.
`SpeechGenerationConfig`	Speech-generation specific parameters, for example `minlenratio`: minimum ratio of output length to input text length; prevents output that's too short.
`SpeechGenerationPerfMetrics`	Structure with raw performance metrics for each generation before any statistics are calculated.
`StopCriteria`	StopCriteria controls the stopping condition for grouped beam search.
`StreamerBase`	Base class for streamers.
`StreamingStatus`	Members:
`StructuralTagItem`	Structure to keep generation config parameters for structural tags in structured output generation.
`StructuralTagsConfig`	Configures structured output generation by combining regular sampling with structural tags.
`StructuredOutputConfig`	Structure to keep generation config parameters for structured output generation.
`T5EncoderModel`	T5EncoderModel class.
`Text2ImagePipeline`	This class is used for generation with text-to-image models.
`Text2SpeechDecodedResults`	Structure that stores the result from the generate method, including a list of waveform tensors sampled at 16 kHz, along with performance metrics
`Text2SpeechPipeline`	Text-to-speech pipeline
`Text2VideoPipeline`
`TextEmbeddingPipeline`	Text embedding pipeline
`TextParserStreamer`	Base class for text streamers that work with parsed messages.
`TextRerankPipeline`	Text rerank pipeline
`TextStreamer`	TextStreamer is used to decode tokens into text and call a user-defined callback function.
`TokenizedInputs`
`Tokenizer`	The class is used to encode prompts and decode resulting tokens
`TorchGenerator`	This class provides OpenVINO GenAI Generator wrapper for torch.Generator
`UNet2DConditionModel`	UNet2DConditionModel class.
`VLLMParserWrapper`
`VLMPipeline`	This class is used for generation with VLMs
`VideoGenerationConfig`
`VideoGenerationPerfMetrics`
`VideoGenerationResult`
`WhisperGenerationConfig`	Whisper-specific parameters, for example `decoder_start_token_id`: corresponds to the `<|startoftranscript|>` token.
`WhisperPerfMetrics`	Structure with raw performance metrics for each generation before any statistics are calculated.
`WhisperPipeline`	Automatic speech recognition pipeline
`WhisperRawPerfMetrics`	Structure with whisper specific raw performance metrics for each generation before any statistics are calculated.
`WhisperWordTiming`	Structure to store word-level timestamps
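
The Functions and Classes listings above can be compared against an installed build by introspecting the module namespace. The sketch below is illustrative only: `summarize_namespace` is a hypothetical helper, and it is demonstrated on the standard-library `json` module so that it runs without OpenVINO GenAI installed. It uses `inspect.isroutine` rather than `inspect.isfunction` so that functions exposed by compiled extension modules are also counted.

```python
import importlib
import inspect


def summarize_namespace(module_name):
    """Split a module's public names into (functions, classes), sorted by name.

    `summarize_namespace` is a hypothetical helper for illustration only.
    """
    module = importlib.import_module(module_name)
    functions, classes = [], []
    # inspect.getmembers returns (name, value) pairs sorted by name.
    for name, obj in inspect.getmembers(module):
        if name.startswith("_"):
            continue  # skip private and dunder names
        if inspect.isclass(obj):
            classes.append(name)
        elif inspect.isroutine(obj):
            # isroutine also matches built-in functions from extension modules
            functions.append(name)
    return functions, classes


# Demonstrated on a standard-library module; pass "openvino_genai" when it is installed.
functions, classes = summarize_namespace("json")
print(functions)
print(classes)
```

Names present in an installed build but absent from this page (or the reverse) usually indicate a version mismatch between the documentation and the package.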