Class ov::pass::PackMultiHeadAttention

class PackMultiHeadAttention : public ov::pass::ModelPass

Common: model-level transformation that orchestrates packing and canonicalization of multi-head attention (MHA) and grouped-query attention (GQA).

Runs a sequence of merging passes step by step (projections, RoPE, SDPA, KV cache, DQ) to produce a compact packed form.
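
The orchestration pattern described above can be sketched in isolation. The following is a minimal, self-contained illustration and not the OpenVINO implementation: `ToyModel`, `Step`, `fuse_pair`, `ToyPackPipeline`, `make_toy_pipeline`, and the op names are all hypothetical. The only detail borrowed from the real API is the general shape of `ov::pass::ModelPass`, whose `run_on_model` returns `true` when the model was modified.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a model graph: an ordered list of op names.
struct ToyModel {
    std::vector<std::string> ops;
};

// One merging step: a label plus a callable that returns true if it changed the model.
struct Step {
    std::string name;
    std::function<bool(ToyModel&)> apply;
};

// Hypothetical helper: replaces each adjacent (first, second) pair with a single `fused` op.
bool fuse_pair(ToyModel& m, const std::string& first, const std::string& second,
               const std::string& fused) {
    bool changed = false;
    for (std::size_t i = 0; i + 1 < m.ops.size(); ++i) {
        if (m.ops[i] == first && m.ops[i + 1] == second) {
            m.ops[i] = fused;
            m.ops.erase(m.ops.begin() + static_cast<std::ptrdiff_t>(i + 1));
            changed = true;
        }
    }
    return changed;
}

// Model-level pass that runs its steps in a fixed order (assumed design, for illustration only).
class ToyPackPipeline {
public:
    explicit ToyPackPipeline(std::vector<Step> steps) : steps_(std::move(steps)) {}

    // Returns true if any step modified the model.
    bool run_on_model(ToyModel& m) const {
        bool changed = false;
        for (const auto& step : steps_) {
            assert(step.apply && "every step needs a callable");
            changed = step.apply(m) || changed;
        }
        return changed;
    }

private:
    std::vector<Step> steps_;
};

// Two illustrative steps; the second only matches the output of the first.
ToyPackPipeline make_toy_pipeline() {
    std::vector<Step> steps;
    steps.push_back(Step{"projections", [](ToyModel& m) {
        return fuse_pair(m, "MatMulQ", "MatMulK", "PackedQK");
    }});
    steps.push_back(Step{"sdpa", [](ToyModel& m) {
        return fuse_pair(m, "PackedQK", "Softmax", "PackedSDPA");
    }});
    return ToyPackPipeline(std::move(steps));
}
```

In this sketch the order of steps matters: the `sdpa` step can only fire once the `projections` step has produced `PackedQK`, which is the reason for running the merges one after another rather than independently.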