openvino.runtime.opset13.scaled_dot_product_attention

openvino.runtime.opset13.scaled_dot_product_attention(query: Node | int | float | ndarray, key: Node | int | float | ndarray, value: Node | int | float | ndarray, attention_mask: Node | int | float | ndarray | None = None, scale: Node | int | float | ndarray | None = None, causal: bool = False, name: str | None = None) → Node

Return a node which implements Scaled Dot Product Attention.

Parameters:
  • query – Query tensor of shape [N, …, L, E] and floating-point datatype.

  • key – Key tensor of shape [N, …, S, E] and floating-point datatype.

  • value – Value tensor of shape [N, …, S, Ev] and floating-point datatype.

  • attention_mask – Optional attention mask: either a tensor of shape [N, …, L, S] or a scalar zero value of floating-point type. Refer to the operation specification for a complete description.

  • scale – Optional floating-point scalar used in place of the default scale factor, 1/sqrt(E).

  • causal – If True, a causal attention mask is generated automatically and the attention_mask input is ignored.

  • name – Optional name for the output node.
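
To make the roles of scale, attention_mask and causal concrete, here is a minimal NumPy sketch of the computation these parameters describe. It is an illustration only, not the OpenVINO implementation: the helper name sdpa_reference is hypothetical, the mask is assumed to be additive and floating-point, the default scale is assumed to be 1/sqrt(E), and the causal mask is assumed to be lower-triangular over the last two dimensions. The operation specification remains the authoritative description.

```python
import numpy as np


def sdpa_reference(query, key, value, attention_mask=None, scale=None, causal=False):
    """Hypothetical NumPy sketch of scaled dot product attention."""
    L, E = query.shape[-2], query.shape[-1]
    S = key.shape[-2]
    if scale is None:
        # Assumed default scale factor: 1/sqrt(E).
        scale = float(E) ** -0.5

    # Attention scores of shape [N, ..., L, S].
    scores = (query @ np.swapaxes(key, -1, -2)) * scale

    if causal:
        # Assumption: position i may attend only to positions j <= i.
        # attention_mask is ignored in this branch.
        allowed = np.tril(np.ones((L, S), dtype=bool))
        scores = np.where(allowed, scores, -np.inf)
    elif attention_mask is not None:
        # Assumption: floating-point mask added to the scores
        # (a scalar zero leaves them unchanged).
        scores = scores + attention_mask

    # Numerically stable softmax over the last axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    # Weighted sum of values, shape [N, ..., L, Ev].
    return weights @ value


rng = np.random.default_rng(0)
q = rng.standard_normal((2, 3, 4, 8)).astype(np.float32)  # [N, H, L, E]
k = rng.standard_normal((2, 3, 6, 8)).astype(np.float32)  # [N, H, S, E]
v = rng.standard_normal((2, 3, 6, 5)).astype(np.float32)  # [N, H, S, Ev]

out = sdpa_reference(q, k, v)
out_causal = sdpa_reference(q, k, v, causal=True)
print(out.shape)  # (2, 3, 4, 5)
```

In this sketch, passing a scalar zero as attention_mask gives the same result as passing no mask, and with causal=True the first query position can only attend to the first key position.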

Returns:

A new node that performs the Scaled Dot Product Attention operation.