class ngraph::snippets::pass::FakeQuantizeDecomposition¶
Overview¶
FakeQuantizeDecomposition transformation decomposes the FakeQuantize layer. More…
#include <fq_decomposition.hpp>
class FakeQuantizeDecomposition: public ov::pass::MatcherPass
{
// methods
static bool isAllScalarConstant(const std::shared_ptr<const ngraph::Node>& node);
static bool getScalesAndShifts(
const std::shared_ptr<const ngraph::op::v0::FakeQuantize>& fq_node,
std::vector<float>& cl,
std::vector<float>& ch,
std::vector<float>& isc,
std::vector<float>& ish,
std::vector<float>& osc,
std::vector<float>& osh
);
static std::vector<float> calculateScales(
const ngraph::element::Type& out_type,
const std::vector<float>& cl,
const std::vector<float>& ch,
const std::vector<float>& isc,
const std::vector<float>& ish,
const std::vector<float>& osc,
const std::vector<float>& osh
);
};
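A minimal usage sketch is shown below. It assumes the standard ov::pass::Manager flow and the header path from this page; the helper function name and the model argument are illustrative, not part of the pass.

#include <fq_decomposition.hpp>
#include <memory>
#include <openvino/core/model.hpp>
#include <openvino/pass/manager.hpp>

// Illustrative helper: register the transformation and run it on a model.
void run_fq_decomposition(const std::shared_ptr<ov::Model>& model) {
    ov::pass::Manager manager;
    manager.register_pass<ngraph::snippets::pass::FakeQuantizeDecomposition>();
    manager.run_passes(model);
}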
Inherited Members¶
public:
// typedefs
typedef DiscreteTypeInfo type_info_t;
// methods
bool get_property(const PassPropertyMask& prop_mask) const;
void set_name(const std::string& name);
std::string get_name() const;
void set_callback(const param_callback& callback);
virtual void set_pass_config(const std::shared_ptr<PassConfig>& pass_config);
std::shared_ptr<PassConfig> get_pass_config();
bool m_transformation_callback(const std::shared_ptr<const Node>& node);
bool transformation_callback(const std::shared_ptr<const Node>& node);
virtual const type_info_t& get_type_info() const = 0;
OPENVINO_RTTI("ov::pass::MatcherPass");
MatcherPass& operator = (const MatcherPass&);
bool apply(std::shared_ptr<ov::Node> node);
template <typename T, class... Args>
std::shared_ptr<T> register_new_node(Args&&... args);
template <typename T>
std::shared_ptr<T> register_new_node(const std::shared_ptr<T>& node);
std::shared_ptr<ov::Node> register_new_node_(const std::shared_ptr<ov::Node>& node);
const std::vector<std::shared_ptr<ov::Node>>& get_new_nodes();
void clear_new_nodes();
std::shared_ptr<pattern::Matcher> get_matcher();
Detailed Documentation¶
FakeQuantizeDecomposition transformation decomposes the FakeQuantize layer.
Expression from specification:

if x <= min(il, ih): output = ol
elif x > max(il, ih): output = oh
else: output = round((x - il) / (ih - il) * (levels-1)) / (levels-1) * (oh - ol) + ol

Expand brackets:

round(x * (levels-1) / (ih - il) - il * (levels-1) / (ih - il)) * (oh - ol) / (levels-1) + ol

Marking:

isc := (levels-1) / (ih - il)
ish := -il * isc
osc := (oh - ol) / (levels-1)
osh := ol

Final expression:

round(x * isc + ish) * osc + osh
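The marking translates directly into a scalar reference computation. Below is a minimal sketch of that expression in plain C++; it is purely illustrative (the pass itself builds an equivalent subgraph of operations rather than calling such a function).

#include <algorithm>
#include <cmath>

// Scalar reference for the decomposed FakeQuantize expression above.
// il/ih are the input low/high bounds, ol/oh the output low/high bounds.
float fake_quantize_reference(float x, float il, float ih, float ol, float oh, int levels) {
    if (x <= std::min(il, ih))
        return ol;                               // below the input range
    if (x > std::max(il, ih))
        return oh;                               // above the input range
    const float isc = (levels - 1) / (ih - il);  // input scale
    const float ish = -il * isc;                 // input shift
    const float osc = (oh - ol) / (levels - 1);  // output scale
    const float osh = ol;                        // output shift
    return std::round(x * isc + ish) * osc + osh;
}

For example, with il = 0, ih = 2.55, ol = 0, oh = 2.55 and levels = 256: isc = 100, ish = 0, osc = 0.01, osh = 0, so x = 1.234 maps to round(123.4) * 0.01 = 1.23.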
Some optimizations (example for scalars; a numeric check of the first case follows this list):

If the output element type of FQ is U8 and il = 0, ish = 0, osc = 1, osh = 0, the expression x * isc is sufficient.
If the output element type of FQ is I8 and ish ~= 128, osc = 1, osh ~= -128, il * isc ~= -128, ih * isc ~= 127, the expression x * isc is sufficient.
If osc = 1 and osh = 0, there is no dequantization.
If there is no dequantization and the output element type of FQ is not FP32, there is no rounding.
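The first shortcut can be checked numerically; the values below are hypothetical and chosen only to satisfy its conditions for a U8 output.

#include <cassert>
#include <cmath>

int main() {
    // Hypothetical U8-style ranges: il = 0, ih = 2.55, ol = 0, oh = 255, levels = 256.
    const float il = 0.f, ih = 2.55f, ol = 0.f, oh = 255.f;
    const int levels = 256;
    const float isc = (levels - 1) / (ih - il);  // 100
    const float ish = -il * isc;                 // 0
    const float osc = (oh - ol) / (levels - 1);  // 1
    const float osh = ol;                        // 0
    const float x = 1.234f;
    // The full expression reduces to round(x * isc); the explicit rounding is
    // also dropped when the output type is not FP32 (see the last point above).
    assert(std::round(x * isc + ish) * osc + osh == std::round(x * isc));
    return 0;
}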
This transformation does not support the following cases:

At least one ‘range’ input is not a Constant.
At least one ‘il’ input value is greater than or equal to the corresponding ‘ih’ input value.