class ngraph::snippets::pass::FakeQuantizeDecomposition

Overview

The FakeQuantizeDecomposition transformation decomposes a FakeQuantize layer.

#include <fq_decomposition.hpp>

class FakeQuantizeDecomposition: public ov::pass::MatcherPass
{
    // methods

    static bool isAllScalarConstant(const std::shared_ptr<const ngraph::Node>& node);

    static bool getScalesAndShifts(
        const std::shared_ptr<const ngraph::op::v0::FakeQuantize>& fq_node,
        std::vector<float>& cl,
        std::vector<float>& ch,
        std::vector<float>& isc,
        std::vector<float>& ish,
        std::vector<float>& osc,
        std::vector<float>& osh
        );

    static std::vector<float> calculateScales(
        const ngraph::element::Type& out_type,
        const std::vector<float>& cl,
        const std::vector<float>& ch,
        const std::vector<float>& isc,
        const std::vector<float>& ish,
        const std::vector<float>& osc,
        const std::vector<float>& osh
        );
};

Inherited Members

public:
    // typedefs

    typedef DiscreteTypeInfo type_info_t;

    // methods

    bool get_property(const PassPropertyMask& prop_mask) const;
    void set_name(const std::string& name);
    std::string get_name() const;
    void set_callback(const param_callback& callback);
    virtual void set_pass_config(const std::shared_ptr<PassConfig>& pass_config);
    std::shared_ptr<PassConfig> get_pass_config();
    bool m_transformation_callback(const std::shared_ptr<const Node>& node);
    bool transformation_callback(const std::shared_ptr<const Node>& node);
    virtual const type_info_t& get_type_info() const = 0;
    OPENVINO_RTTI("ov::pass::MatcherPass");
    MatcherPass& operator = (const MatcherPass&);
    bool apply(std::shared_ptr<ov::Node> node);

    template <typename T, class... Args>
    std::shared_ptr<T> register_new_node(Args&&... args);

    template <typename T>
    std::shared_ptr<T> register_new_node(const std::shared_ptr<T>& node);

    std::shared_ptr<ov::Node> register_new_node_(const std::shared_ptr<ov::Node>& node);
    const std::vector<std::shared_ptr<ov::Node>>& get_new_nodes();
    void clear_new_nodes();
    std::shared_ptr<pattern::Matcher> get_matcher();

Detailed Documentation

The FakeQuantizeDecomposition transformation decomposes a FakeQuantize layer into the scale, shift, and rounding operations derived below.

Expression from specification:

    if x <= min(il, ih):
        output = ol
    elif x > max(il, ih):
        output = oh
    else:
        output = round((x - il) / (ih - il) * (levels-1)) / (levels-1) * (oh - ol) + ol

Expand brackets:

    round(x * (levels-1) / (ih - il) - il * (levels-1) / (ih - il)) * (oh - ol) / (levels-1) + ol

Notation:

  • isc := (levels-1) / (ih - il)

  • ish := -il * isc

  • osc := (oh - ol) / (levels-1)

  • osh := ol

Final expression:

    round(x * isc + ish) * osc + osh
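The derivation above can be checked numerically for the scalar case. The following is a minimal sketch in plain C++, not the OpenVINO implementation: `FqParams`, `make_fq_params`, `fq_reference`, and `fq_decomposed` are hypothetical names introduced for illustration only. The clamp of x to [il, ih] before scaling is an assumption that stands in for the two outer branches of the specification (it yields ol for x <= il and oh for x > ih when il < ih).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper type, not part of the OpenVINO API.
struct FqParams {
    float isc;  // input scale:  (levels-1) / (ih - il)
    float ish;  // input shift:  -il * isc
    float osc;  // output scale: (oh - ol) / (levels-1)
    float osh;  // output shift: ol
};

// Derive the four scalars from the FakeQuantize ranges (scalar case only).
inline FqParams make_fq_params(float il, float ih, float ol, float oh, std::size_t levels) {
    assert(levels > 1 && il < ih);  // il >= ih is listed as unsupported
    const float steps = static_cast<float>(levels - 1);
    FqParams p;
    p.isc = steps / (ih - il);
    p.ish = -il * p.isc;
    p.osc = (oh - ol) / steps;
    p.osh = ol;
    return p;
}

// Expression from the specification, all three branches (assumes il < ih).
inline float fq_reference(float x, float il, float ih, float ol, float oh, std::size_t levels) {
    if (x <= il) return ol;
    if (x > ih) return oh;
    const float steps = static_cast<float>(levels - 1);
    return std::round((x - il) / (ih - il) * steps) / steps * (oh - ol) + ol;
}

// Decomposed form: clamp, then round(x * isc + ish) * osc + osh.
inline float fq_decomposed(float x, float il, float ih, const FqParams& p) {
    const float clamped = std::min(std::max(x, il), ih);
    return std::round(clamped * p.isc + p.ish) * p.osc + p.osh;
}
```

For example, with il = -1, ih = 1, ol = -1, oh = 1, and levels = 3, the parameters come out as isc = 1, ish = 1, osc = 1, osh = -1, and both functions map x = 0.3 to 0 and x = 0.6 to 1.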

Some optimizations (examples for scalars):

  1. If the output element type of FQ is U8 and il = 0, ish = 0, osc = 1, osh = 0, the expression x * isc is sufficient.

  2. If the output element type of FQ is I8 and ish ~= 128, osc = 1, osh ~= -128, il * isc ~= -128, and ih * isc ~= 127, the expression x * isc is sufficient.

  3. If osc = 1 and osh = 0, no dequantization is needed.

  4. If no dequantization is needed and the output element type of FQ is not FP32, no rounding is needed.
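One possible reading of the four scalar rules above, written as a decision function. This is an illustrative sketch only: `OutType`, `FqPlan`, `approx_equal`, and `plan_scalar_fq` are hypothetical names, and the tolerance used for the "~=" comparisons is an assumed value, since the rules do not state one.

```cpp
#include <cmath>

// Hypothetical stand-in for the FQ output element type.
enum class OutType { U8, I8, FP32 };

// Hypothetical summary of which stages remain after simplification.
struct FqPlan {
    bool scale_only;  // rules 1 and 2: x * isc is sufficient
    bool dequantize;  // rule 3: false when osc = 1 and osh = 0
    bool round;       // rule 4: false when not dequantizing and output is not FP32
};

// Assumed tolerance for the "~=" comparisons; the source gives no value.
inline bool approx_equal(float a, float b, float eps = 1e-3f) {
    return std::fabs(a - b) <= eps;
}

inline FqPlan plan_scalar_fq(OutType type, float il, float ih,
                             float isc, float ish, float osc, float osh) {
    const bool u8_case = type == OutType::U8 && il == 0.0f && ish == 0.0f &&
                         osc == 1.0f && osh == 0.0f;
    const bool i8_case = type == OutType::I8 && approx_equal(ish, 128.0f) &&
                         osc == 1.0f && approx_equal(osh, -128.0f) &&
                         approx_equal(il * isc, -128.0f) &&
                         approx_equal(ih * isc, 127.0f);
    FqPlan plan;
    plan.scale_only = u8_case || i8_case;
    if (plan.scale_only) {
        // Rules 1 and 2: only the input scale is kept.
        plan.dequantize = false;
        plan.round = false;
        return plan;
    }
    plan.dequantize = !(osc == 1.0f && osh == 0.0f);               // rule 3
    plan.round = plan.dequantize || type == OutType::FP32;         // rule 4
    return plan;
}
```

Under these assumptions, a U8 output with il = 0, ish = 0, osc = 1, osh = 0 yields a scale-only plan, while an FP32 output with osc = 1 and osh = 0 skips dequantization but keeps the rounding step.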

This transformation does not support the following cases:

  1. At least one ‘range’ input is not a Constant.

  2. At least one ‘il’ input value is greater than or equal to the corresponding ‘ih’ input value.
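The second condition can be illustrated on already-extracted constant values. The sketch below is hypothetical and uses only the standard library: `input_ranges_supported` is not an OpenVINO function, and treating a one-element vector as a scalar broadcast over the other vector is an assumption. The first condition (non-Constant range inputs) depends on the graph and is not modeled here.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true when every il value is strictly below
// the matching ih value. A one-element vector is broadcast (an assumption).
inline bool input_ranges_supported(const std::vector<float>& il,
                                   const std::vector<float>& ih) {
    if (il.empty() || ih.empty()) return false;
    if (il.size() != ih.size() && il.size() != 1 && ih.size() != 1) return false;
    const std::size_t n = std::max(il.size(), ih.size());
    for (std::size_t i = 0; i < n; ++i) {
        const float lo = il[il.size() == 1 ? 0 : i];
        const float hi = ih[ih.size() == 1 ? 0 : i];
        if (lo >= hi) return false;  // il >= ih makes (ih - il) zero or negative
    }
    return true;
}
```

A range pair such as il = {0, 2} with ih = {1} is rejected because the second il value is not below ih.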