openvino_genai.SparseAttentionMode
- class openvino_genai.SparseAttentionMode
 Bases: pybind11_object
 Represents the mode of sparse attention applied during generation.
 - param SparseAttentionMode.TRISHAPE:
 Sparse attention is applied to the prefill stage only, retaining a configurable number of start and recent cache tokens. A configurable number of prefill tokens at the end of the prompt can instead receive dense attention, to preserve generation accuracy.
 - param SparseAttentionMode.XATTENTION:
 Following https://arxiv.org/pdf/2503.16428, introduces block sparsity into the prefill stage, based on an importance-score threshold. Computing the importance scores adds overhead, but the savings from sparsity are expected to outweigh it and reduce total inference time.
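The two modes can be pictured with a toy sketch in plain Python. This is only an illustration of the ideas described above, not the OpenVINO GenAI implementation: the function names and the parameters `prompt_len`, `num_start`, `num_recent`, `num_last_dense`, `block_scores` and `threshold` are hypothetical, and the block selection rule (keep the highest-scoring blocks until they cover a share of the total score) is an assumed reading of "threshold-based block sparsity".

```python
def trishape_mask(prompt_len, num_start, num_recent, num_last_dense):
    """Toy tri-shape prefill mask: mask[q][k] is True when query q may attend to key k."""
    mask = []
    for q in range(prompt_len):
        # The last `num_last_dense` prompt tokens keep full (dense) causal attention.
        dense_row = q >= prompt_len - num_last_dense
        row = []
        for k in range(prompt_len):
            causal = k <= q
            # Sparse rows retain only the first `num_start` and the most recent keys.
            retained = k < num_start or (q - k) < num_recent
            row.append(causal and (dense_row or retained))
        mask.append(row)
    return mask


def select_blocks(block_scores, threshold):
    """Keep the highest-scoring blocks until they cover `threshold` of the total score."""
    total = sum(block_scores)
    order = sorted(range(len(block_scores)), key=lambda i: block_scores[i], reverse=True)
    kept, covered = [], 0
    for i in order:
        kept.append(i)
        covered += block_scores[i]
        if covered >= threshold * total:
            break
    return sorted(kept)


if __name__ == "__main__":
    # Print the toy mask: "x" marks an attended key, "." a skipped one.
    for row in trishape_mask(prompt_len=8, num_start=2, num_recent=2, num_last_dense=1):
        print("".join("x" if keep else "." for keep in row))
    print(select_blocks([5, 1, 3, 1], threshold=0.8))
```

In the printed mask, every row except the last keeps only the leading and trailing keys, while the last row is fully dense. The second function shows why score computation can pay off: only the selected blocks would need attention computed.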
Members:
TRISHAPE
XATTENTION
- __init__(self: openvino_genai.py_openvino_genai.SparseAttentionMode, value: SupportsInt) -> None
 
Methods
__delattr__(name, /): Implement delattr(self, name).
__dir__(): Default dir() implementation.
__eq__(self, other, /)
__format__(format_spec, /): Default object formatter.
__ge__(value, /): Return self>=value.
__getattribute__(name, /): Return getattr(self, name).
__getstate__(self, /)
__gt__(value, /): Return self>value.
__hash__(self, /)
__index__(self, /)
__init__(self, value)
__init_subclass__(): This method is called when a class is subclassed.
__int__(self, /)
__le__(value, /): Return self<=value.
__lt__(value, /): Return self<value.
__ne__(self, other, /)
__new__(**kwargs)
__reduce__(): Helper for pickle.
__reduce_ex__(protocol, /): Helper for pickle.
__repr__(self, /)
__setattr__(name, value, /): Implement setattr(self, name, value).
__setstate__(self, state, /)
__sizeof__(): Size of object in memory, in bytes.
__str__(self, /)
__subclasshook__(): Abstract classes can override this to customize issubclass().
Attributes
__entries
- TRISHAPE = <SparseAttentionMode.TRISHAPE: 0>

- XATTENTION = <SparseAttentionMode.XATTENTION: 1>

- __annotations__ = {}

- __class__
 alias of pybind11_type
- __delattr__(name, /)
 Implement delattr(self, name).
- __dir__()
 Default dir() implementation.
- __eq__(self: object, other: object, /) -> bool

- __format__(format_spec, /)
 Default object formatter.
 Return str(self) if format_spec is empty. Raise TypeError otherwise.
- __ge__(value, /)
 Return self>=value.
- __getattribute__(name, /)
 Return getattr(self, name).
- __getstate__(self: object, /) -> int

- __gt__(value, /)
 Return self>value.
- __hash__(self: object, /) -> int

- __index__(self: openvino_genai.py_openvino_genai.SparseAttentionMode, /) -> int

- __init__(self: openvino_genai.py_openvino_genai.SparseAttentionMode, value: SupportsInt) -> None
 
- __init_subclass__()
 This method is called when a class is subclassed.
 The default implementation does nothing. It may be overridden to extend subclasses.
- __int__(self: openvino_genai.py_openvino_genai.SparseAttentionMode, /) -> int

- __le__(value, /)
 Return self<=value.
- __lt__(value, /)
 Return self<value.
- __members__ = {'TRISHAPE': <SparseAttentionMode.TRISHAPE: 0>, 'XATTENTION': <SparseAttentionMode.XATTENTION: 1>}

- __ne__(self: object, other: object, /) -> bool
 
- __new__(**kwargs)

- __reduce__()
 Helper for pickle.
- __reduce_ex__(protocol, /)
 Helper for pickle.
- __repr__(self: object, /) -> str

- __setattr__(name, value, /)
 Implement setattr(self, name, value).
- __setstate__(self: openvino_genai.py_openvino_genai.SparseAttentionMode, state: SupportsInt, /) -> None

- __sizeof__()
 Size of object in memory, in bytes.
- __str__(self: object, /) -> str

- __subclasshook__()
 Abstract classes can override this to customize issubclass().
 This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).
- _pybind11_conduit_v1_()

- property name

- property value
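
The members listed above (`__members__`, `name`, `value`, integer construction and `__int__`) can be exercised as in the sketch below. So that it runs without the package installed, it uses a stand-in class built on Python's `enum.Enum` that mirrors only the documented names and values. The stand-in and the helper `mode_from_name` are hypothetical and not part of openvino_genai. With the package installed, the class would instead be imported as `openvino_genai.SparseAttentionMode`, and its behaviour may differ in details from this stand-in.

```python
from enum import Enum


class SparseAttentionMode(Enum):
    """Stand-in mirroring the documented members; not the real pybind11 binding."""

    TRISHAPE = 0
    XATTENTION = 1

    def __int__(self):
        return self.value

    def __index__(self):
        return self.value


def mode_from_name(name):
    """Hypothetical helper: look a mode up by (case-insensitive) member name."""
    try:
        return SparseAttentionMode.__members__[name.upper()]
    except KeyError:
        valid = ", ".join(SparseAttentionMode.__members__)
        raise ValueError(f"unknown mode {name!r}; expected one of: {valid}") from None


if __name__ == "__main__":
    mode = mode_from_name("xattention")
    print(mode.name, mode.value, int(mode))
    # Construction from an integer, as in the documented __init__(value) signature.
    print(SparseAttentionMode(0) is SparseAttentionMode.TRISHAPE)
```

Looking modes up through `__members__` keeps configuration parsing in step with whatever members the enum actually exposes, rather than hard-coding the integer values.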