openvino_genai.GenerationConfig

class openvino_genai.GenerationConfig

Bases: pybind11_object

Structure to keep generation config parameters. For a selected method of decoding, only parameters from that group and generic parameters are used. For example, if do_sample is set to true, then only generic parameters and random sampling parameters will be used while greedy and beam search parameters will not affect decoding at all.
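The grouping described above can be sketched in plain Python. This is an illustration only, not the library's implementation: the function name active_parameter_groups is hypothetical, and rejecting do_sample combined with num_beams > 1 is an assumption made to keep the sketch unambiguous.

```python
def active_parameter_groups(do_sample=False, num_beams=1):
    """Return the names of the parameter groups that influence decoding.

    Hypothetical helper: it mirrors the documented rule that only the
    generic parameters plus the group of the selected decoding method
    are used.
    """
    if num_beams < 1:
        raise ValueError("num_beams must be >= 1")
    if do_sample and num_beams > 1:
        # Assumption: this sketch treats the two methods as mutually exclusive.
        raise ValueError("sketch does not model do_sample with num_beams > 1")
    if do_sample:
        return ["generic", "random sampling"]
    if num_beams > 1:
        return ["generic", "beam search"]
    return ["generic", "greedy"]


print(active_parameter_groups(do_sample=True))
print(active_parameter_groups(num_beams=4))
```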

Parameters:

max_length: the maximum length of the whole sequence, i.e. the length of the input prompt plus max_new_tokens. Overridden by max_new_tokens if that is also set.

max_new_tokens: the maximum number of tokens to generate, excluding the tokens in the prompt. max_new_tokens has priority over max_length.

ignore_eos: if set to true, generation does not stop even if the <eos> token is generated.

eos_token_id: token id of <eos> (end of sentence).

min_new_tokens: sets the probability of eos_token_id to 0 for the first min_new_tokens generated tokens. Ignored when continuous batching is not used.

stop_strings: a set of strings that cause the pipeline to stop generating further tokens.

include_stop_str_in_output: if set to true, the stop string that matched is included in the generation output (default: false).

stop_token_ids: a set of tokens that cause the pipeline to stop generating further tokens.

echo: if set to true, the model echoes the prompt in the output.

logprobs: number of top logprobs computed for each position. If set to 0, logprobs are not computed and the value 0.0 is returned. Currently only a single top logprob can be returned, so any logprobs > 1 is treated as logprobs == 1 (default: 0).
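How the length limits and stop strings interact can be sketched as follows. Both helpers are hypothetical and written only for illustration. Using None to mean "not set" is an assumption of this sketch; the library may use different sentinel values and different matching logic.

```python
def resolve_max_new_tokens(prompt_len, max_length=None, max_new_tokens=None):
    """Return the generation budget, or None if neither limit is set.

    max_new_tokens wins when both are given; otherwise the budget is
    what remains of max_length after the prompt.
    """
    if max_new_tokens is not None:
        return max_new_tokens
    if max_length is not None:
        return max(max_length - prompt_len, 0)
    return None


def apply_stop_strings(text, stop_strings, include_stop_str_in_output=False):
    """Cut text at the earliest stop string; return (text, stopped)."""
    earliest = None  # (start index, matched stop string)
    for stop in stop_strings:
        if not stop:
            continue  # an empty string would match everywhere
        index = text.find(stop)
        if index != -1 and (earliest is None or index < earliest[0]):
            earliest = (index, stop)
    if earliest is None:
        return text, False
    index, stop = earliest
    end = index + len(stop) if include_stop_str_in_output else index
    return text[:end], True


print(resolve_max_new_tokens(10, max_length=30, max_new_tokens=5))
print(apply_stop_strings("Hello.\nUser: hi", {"\nUser:"}))
```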

Beam search specific parameters:

num_beams: number of beams for beam search. 1 disables beam search.

num_beam_groups: number of groups to divide num_beams into in order to ensure diversity among different groups of beams.

diversity_penalty: this value is subtracted from a beam's score if it generates the same token as any beam from another group at a particular time step.

length_penalty: exponential penalty to the length, used with beam-based generation. It is applied as an exponent to the sequence length, which in turn is used to divide the score of the sequence. Since the score is the log likelihood of the sequence (i.e. negative), length_penalty > 0.0 promotes longer sequences, while length_penalty < 0.0 encourages shorter sequences.

num_return_sequences: the number of sequences to return for grouped beam search decoding.

no_repeat_ngram_size: if set to an int > 0, all ngrams of that size can only occur once.

stop_criteria: controls the stopping condition for grouped beam search. It accepts the following values:

openvino_genai.StopCriteria.EARLY: generation stops as soon as there are num_beams complete candidates.

openvino_genai.StopCriteria.HEURISTIC: a heuristic is applied and generation stops when it is very unlikely that better candidates will be found.

openvino_genai.StopCriteria.NEVER: the beam search procedure only stops when there cannot be better candidates (canonical beam search algorithm).
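The effect of length_penalty can be checked with a small worked example. The sketch below follows the description above (sum of log likelihoods divided by length raised to length_penalty). The function name beam_score is hypothetical, and the library's actual scoring code may differ in detail.

```python
def beam_score(token_logprobs, length_penalty=1.0):
    """Score a finished beam: total log likelihood / length ** length_penalty."""
    if not token_logprobs:
        raise ValueError("a beam needs at least one token")
    total = sum(token_logprobs)
    return total / (len(token_logprobs) ** length_penalty)


short = [-1.0, -1.0]             # 2 tokens, total log likelihood -2.0
long = [-1.0, -1.0, -1.0, -1.0]  # 4 tokens, total log likelihood -4.0

for penalty in (-1.0, 0.0, 1.0, 2.0):
    print(penalty, beam_score(short, penalty), beam_score(long, penalty))
```

With a penalty of 2.0 the longer beam scores higher (-0.25 against -0.5), while with -1.0 the shorter beam is clearly preferred (-4.0 against -16.0).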

Random sampling parameters:

temperature: the value used to modulate token probabilities for random sampling.

top_p: if set to a float < 1, only the smallest set of most probable tokens whose probabilities add up to top_p or higher is kept for generation.

top_k: the number of highest probability vocabulary tokens to keep for top-k filtering.

do_sample: whether or not to use multinomial random sampling.

repetition_penalty: the parameter for repetition penalty. 1.0 means no penalty.
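A minimal sketch of how temperature, top_k and top_p shape the sampling distribution is given below. It is a generic illustration of these techniques, not the library's code: the function names are hypothetical, and the order of the filters (top-k first, then top-p, renormalizing after each) is an assumption.

```python
import math
import random


def _renormalize(pairs):
    """Rescale (token_id, probability) pairs so the probabilities sum to 1."""
    mass = sum(prob for _, prob in pairs)
    return [(token, prob / mass) for token, prob in pairs]


def filtered_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_id: probability} after temperature, top-k and top-p."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in scaled]
    pairs = _renormalize(list(enumerate(weights)))
    pairs.sort(key=lambda pair: pair[1], reverse=True)  # most probable first
    if top_k > 0:
        pairs = _renormalize(pairs[:top_k])
    if top_p < 1.0:
        kept, cumulative = [], 0.0
        for token, prob in pairs:
            kept.append((token, prob))
            cumulative += prob
            if cumulative >= top_p:
                break  # smallest prefix whose mass reaches top_p
        pairs = _renormalize(kept)
    return dict(pairs)


def sample_token(distribution, rng):
    """Draw one token id from the filtered distribution."""
    tokens = list(distribution)
    weights = [distribution[token] for token in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


logits = [math.log(p) for p in (0.5, 0.3, 0.15, 0.05)]
nucleus = filtered_distribution(logits, top_p=0.7)
print(sorted(nucleus), sample_token(nucleus, random.Random(0)))
```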

__init__(*args, **kwargs)

Overloaded function.

  1. __init__(self: openvino_genai.py_openvino_genai.GenerationConfig, json_path: os.PathLike) -> None

path where generation_config.json is stored

  2. __init__(self: openvino_genai.py_openvino_genai.GenerationConfig, **kwargs) -> None
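The two overloads might be used as sketched below. Several points are assumptions rather than documented behavior: that json_path points at the generation_config.json file itself, that the JSON keys match the attribute names, and that keyword arguments are named after the attributes listed on this page. The helper write_generation_config is hypothetical and exists only to make the example self-contained. The import is guarded so the snippet still runs where the package is not installed.

```python
import json
import tempfile
from pathlib import Path

try:
    import openvino_genai  # optional dependency for this sketch
except ImportError:
    openvino_genai = None


def write_generation_config(directory, **params):
    """Write params to <directory>/generation_config.json and return its path."""
    path = Path(directory) / "generation_config.json"
    path.write_text(json.dumps(params, indent=2))
    return path


def build_configs():
    """Build one config per overload; returns None without openvino_genai."""
    if openvino_genai is None:
        return None
    # Overload 2: keyword arguments (assumed to match attribute names).
    from_kwargs = openvino_genai.GenerationConfig(max_new_tokens=64, num_beams=4)
    # Overload 1: a JSON file (assumed to be the path of the file itself).
    with tempfile.TemporaryDirectory() as tmp:
        json_path = write_generation_config(tmp, max_new_tokens=32)
        from_json = openvino_genai.GenerationConfig(json_path)
    return from_kwargs, from_json


configs = build_configs()
if configs is not None:
    from_kwargs, from_json = configs
    print(from_kwargs.is_beam_search(), from_json.is_greedy_decoding())
```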

Methods

__delattr__(name, /): Implement delattr(self, name).
__dir__(): Default dir() implementation.
__eq__(value, /): Return self==value.
__format__(format_spec, /): Default object formatter.
__ge__(value, /): Return self>=value.
__getattribute__(name, /): Return getattr(self, name).
__gt__(value, /): Return self>value.
__hash__(): Return hash(self).
__init__(*args, **kwargs): Overloaded function.
__init_subclass__: This method is called when a class is subclassed.
__le__(value, /): Return self<=value.
__lt__(value, /): Return self<value.
__ne__(value, /): Return self!=value.
__new__(**kwargs)
__reduce__(): Helper for pickle.
__reduce_ex__(protocol, /): Helper for pickle.
__repr__(): Return repr(self).
__setattr__(name, value, /): Implement setattr(self, name, value).
__sizeof__(): Size of object in memory, in bytes.
__str__(): Return str(self).
__subclasshook__: Abstract classes can override this to customize issubclass().
is_beam_search(self)
is_greedy_decoding(self)
is_speculative_decoding(self)
set_eos_token_id(self, tokenizer_eos_token_id)
update_generation_config(self, config_map)

Attributes

adapters

assistant_confidence_threshold

diversity_penalty

do_sample

echo

eos_token_id

frequency_penalty

ignore_eos

include_stop_str_in_output

length_penalty

logprobs

max_length

max_new_tokens

min_new_tokens

no_repeat_ngram_size

num_assistant_tokens

num_beam_groups

num_beams

num_return_sequences

presence_penalty

repetition_penalty

rng_seed

stop_criteria

stop_strings

stop_token_ids

temperature

top_k

top_p

__class__

alias of pybind11_type

__delattr__(name, /)

Implement delattr(self, name).

__dir__()

Default dir() implementation.

__eq__(value, /)

Return self==value.

__format__(format_spec, /)

Default object formatter.

__ge__(value, /)

Return self>=value.

__getattribute__(name, /)

Return getattr(self, name).

__gt__(value, /)

Return self>value.

__hash__()

Return hash(self).

__init__(*args, **kwargs)

Overloaded function.

  1. __init__(self: openvino_genai.py_openvino_genai.GenerationConfig, json_path: os.PathLike) -> None

path where generation_config.json is stored

  2. __init__(self: openvino_genai.py_openvino_genai.GenerationConfig, **kwargs) -> None

__init_subclass__()

This method is called when a class is subclassed.

The default implementation does nothing. It may be overridden to extend subclasses.

__le__(value, /)

Return self<=value.

__lt__(value, /)

Return self<value.

__ne__(value, /)

Return self!=value.

__new__(**kwargs)

__reduce__()

Helper for pickle.

__reduce_ex__(protocol, /)

Helper for pickle.

__repr__()

Return repr(self).

__setattr__(name, value, /)

Implement setattr(self, name, value).

__sizeof__()

Size of object in memory, in bytes.

__str__()

Return str(self).

__subclasshook__()

Abstract classes can override this to customize issubclass().

This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).

property adapters
property assistant_confidence_threshold
property diversity_penalty
property do_sample
property echo
property eos_token_id
property frequency_penalty
property ignore_eos
property include_stop_str_in_output
is_greedy_decoding(self: openvino_genai.py_openvino_genai.GenerationConfig) -> bool
is_speculative_decoding(self: openvino_genai.py_openvino_genai.GenerationConfig) -> bool
property length_penalty
property logprobs
property max_length
property max_new_tokens
property min_new_tokens
property no_repeat_ngram_size
property num_assistant_tokens
property num_beam_groups
property num_beams
property num_return_sequences
property presence_penalty
property repetition_penalty
property rng_seed
set_eos_token_id(self: openvino_genai.py_openvino_genai.GenerationConfig, tokenizer_eos_token_id: int) -> None
property stop_criteria
property stop_strings
property stop_token_ids
property temperature
property top_k
property top_p
update_generation_config(self: openvino_genai.py_openvino_genai.GenerationConfig, config_map: dict[str, openvino._pyopenvino.OVAny]) -> None