openvino.runtime.opset13.scaled_dot_product_attention

openvino.runtime.opset13.scaled_dot_product_attention(query: Union[openvino._pyopenvino.Node, int, float, numpy.ndarray], key: Union[openvino._pyopenvino.Node, int, float, numpy.ndarray], value: Union[openvino._pyopenvino.Node, int, float, numpy.ndarray], attention_mask: Optional[Union[openvino._pyopenvino.Node, int, float, numpy.ndarray]] = None, scale: Optional[Union[openvino._pyopenvino.Node, int, float, numpy.ndarray]] = None, causal: bool = False, name: Optional[str] = None) → openvino._pyopenvino.Node

Return a node that implements the Scaled Dot Product Attention operation.

Parameters
  • query – Query tensor of shape [N, …, L, E] and floating-point datatype.

  • key – Key tensor of shape [N, …, S, E] and floating-point datatype.

  • value – Value tensor of shape [N, …, S, Ev] and floating-point datatype.

  • attention_mask – Optional attention mask: either a tensor of shape [N, …, L, S] or a scalar zero of floating-point type. Refer to the operation specification for a complete description.

  • scale – Optional scale to use in place of the default one; a scalar of floating-point type.

  • causal – If True, a causal attention mask is generated automatically and the attention_mask input is ignored.

  • name – Optional name for the output node.

Returns

The new node performing the Scaled Dot Product Attention operation.
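
Example

To make the shapes and the roles of attention_mask, scale and causal concrete, the following is a minimal NumPy sketch of the computation this operation is generally understood to perform: softmax(query · keyᵀ · scale + mask) · value. It is not the OpenVINO implementation. The helper name sdpa_reference is hypothetical, and two behaviours are assumptions rather than statements from this page: the default scale of 1 / sqrt(E), and a floating-point mask being added to the attention scores. Consult the operation specification for the authoritative definition.

```python
import math

import numpy as np


def sdpa_reference(query, key, value, attention_mask=None, scale=None, causal=False):
    """NumPy sketch of scaled dot product attention (hypothetical helper)."""
    L, S = query.shape[-2], key.shape[-2]
    E = query.shape[-1]

    # Assumption: without an explicit scale, 1 / sqrt(E) is used.
    if scale is None:
        scale = 1.0 / math.sqrt(E)

    # Additive bias of shape [L, S]; zeros mean "no masking".
    bias = np.zeros((L, S), dtype=query.dtype)
    if causal:
        # Query position i may only attend to key positions j <= i.
        # attention_mask is ignored in this branch.
        future = np.triu(np.ones((L, S), dtype=bool), k=1)
        bias = np.where(future, -np.inf, 0.0).astype(query.dtype)
    elif attention_mask is not None:
        # Assumption: a floating-point mask is added to the scores.
        bias = bias + attention_mask

    scores = query @ np.swapaxes(key, -1, -2) * scale + bias  # [N, ..., L, S]
    scores = scores - scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)   # softmax over S
    return weights @ value                                    # [N, ..., L, Ev]


rng = np.random.default_rng(0)
q = rng.standard_normal((2, 4, 5, 8)).astype(np.float32)   # [N, heads, L, E]
k = rng.standard_normal((2, 4, 7, 8)).astype(np.float32)   # [N, heads, S, E]
v = rng.standard_normal((2, 4, 7, 16)).astype(np.float32)  # [N, heads, S, Ev]

out = sdpa_reference(q, k, v, causal=True)
print(out.shape)  # (2, 4, 5, 16)
```

With causal=True the first query position can only see the first key position, so the first output row equals the first row of value. That gives a quick sanity check when comparing against a compiled model.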