Group Intel GNA specific properties

group ov_runtime_gna_prop_cpp_api

Set of Intel GNA specific properties.

Enums

enum class ExecutionMode

Enum to define software acceleration mode.

Values:

enumerator AUTO: Uses Intel GNA if available, otherwise uses software execution mode on CPU.

enumerator HW: Uses Intel GNA if available, otherwise raises an error.

enumerator HW_WITH_SW_FBACK: Uses Intel GNA if available, otherwise raises an error. If the hardware queue is not empty, automatically falls back to CPU in the bit-exact mode.

enumerator SW_EXACT: Executes the GNA-compiled graph on CPU, performing calculations in the same precision as Intel GNA (bit-exact mode).

enumerator SW_FP32: Executes the GNA-compiled graph on CPU, but substitutes parameters and calculations from low precision to floating point.
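The fallback rules described for the execution modes can be sketched as a small decision function. This is only an illustration of the documented semantics, not the plugin's implementation: the enum below is a local stand-in for ov::intel_gna::ExecutionMode, and resolve_backend, hw_available and hw_queue_busy are hypothetical names introduced here.

```cpp
#include <stdexcept>
#include <string>

// Local stand-in for ov::intel_gna::ExecutionMode (assumption: this sketch
// does not include the OpenVINO headers).
enum class ExecutionMode { AUTO, HW, HW_WITH_SW_FBACK, SW_EXACT, SW_FP32 };

// Hypothetical helper: names the backend a request would run on under the
// documented rules. hw_available and hw_queue_busy are assumed inputs; the
// real plugin determines these internally.
std::string resolve_backend(ExecutionMode mode, bool hw_available, bool hw_queue_busy) {
    switch (mode) {
    case ExecutionMode::AUTO:
        // Prefer the hardware, silently fall back to software on CPU.
        return hw_available ? "GNA hardware" : "CPU (software)";
    case ExecutionMode::HW:
        if (!hw_available) throw std::runtime_error("Intel GNA is not available");
        return "GNA hardware";
    case ExecutionMode::HW_WITH_SW_FBACK:
        if (!hw_available) throw std::runtime_error("Intel GNA is not available");
        // A non-empty hardware queue triggers the bit-exact CPU fallback.
        return hw_queue_busy ? "CPU (bit-exact)" : "GNA hardware";
    case ExecutionMode::SW_EXACT:
        return "CPU (bit-exact)";
    case ExecutionMode::SW_FP32:
        return "CPU (FP32)";
    }
    throw std::logic_error("unknown ExecutionMode");
}
```

Note that in this sketch HW and HW_WITH_SW_FBACK differ only when the hardware is present but busy; both raise an error when Intel GNA is missing.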

enum class HWGeneration

Enum to define HW compile and execution targets.

Values:

enumerator UNDEFINED: GNA HW generation is undefined.

enumerator GNA_1_0: GNA HW generation 1.0.

enumerator GNA_1_0_E: GNA HW generation 1.0 embedded.

enumerator GNA_2_0: GNA HW generation 2.0.

enumerator GNA_3_0: GNA HW generation 3.0.

enumerator GNA_3_1: GNA HW generation 3.1.

enumerator GNA_3_5: GNA HW generation 3.5.

enumerator GNA_3_5_E: GNA HW generation 3.5 embedded.

enumerator GNA_3_6: GNA HW generation 3.6.

enumerator GNA_4_0: GNA HW generation 4.0.

enum class PWLDesignAlgorithm

Enum to define PWL design algorithm.

Values:

enumerator UNDEFINED: PWL approximation algorithm is undefined.

enumerator RECURSIVE_DESCENT: Recursive Descent Algorithm.

enumerator UNIFORM_DISTRIBUTION: Uniform distribution algorithm.

Variables

static constexpr Property<std::string, PropertyMutability::RO> library_full_version = {"GNA_LIBRARY_FULL_VERSION"}: Read-only property that returns the GNA Library version as a std::string, usually in the form <API_REVISION>.<RELEASE_LINE>.<RELEASE>.<BUILD>.

static constexpr Property<std::map<std::string, float>> scale_factors_per_input = {"GNA_SCALE_FACTOR_PER_INPUT"}

Scale factor provided by the user for static quantization. Each value is a floating-point number; when it is serialized to a string, use . (dot) as the decimal separator.

In the case of multiple inputs, individual scale factors can be provided using the map, where the key is the layer name and the value is the scale factor. The input name must not contain the “:” symbol. Example:

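#include <map>
#include <string>
#include <openvino/openvino.hpp>
#include <openvino/runtime/intel_gna/properties.hpp>

// model_path is the path to the model file, supplied by the caller.
ov::Core core;
auto model = core.read_model(model_path);
// Assign a scale factor of 1.0 to every model input, keyed by input name.
std::map<std::string, float> scale_factors;
for (auto& input : model->inputs()) {
    scale_factors[input.get_any_name()] = 1.0f;
}
core.set_property("GNA", ov::intel_gna::scale_factors_per_input(scale_factors));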
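When a scale factor has to be written out as a string, the dot-separator requirement can be met independently of the process locale. The following is a minimal sketch using only the C++ standard library; scale_factor_to_string and is_valid_input_name are hypothetical helper names and are not part of OpenVINO.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: formats a scale factor with "." as the decimal
// separator, regardless of the global locale of the process.
std::string scale_factor_to_string(float value) {
    std::ostringstream out;
    out.imbue(std::locale::classic());  // "C" locale: dot separator, no digit grouping
    out << value;
    return out.str();
}

// Hypothetical helper: checks the documented restriction that an input name
// used as a map key must not contain the ':' symbol.
bool is_valid_input_name(const std::string& name) {
    return name.find(':') == std::string::npos;
}
```

Imbuing the stream with the classic locale avoids output such as "1,5" on systems whose global locale uses a comma as the decimal separator.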

static constexpr Property<std::string> firmware_model_image_path = {"GNA_FIRMWARE_MODEL_IMAGE"}: If set, dumps the GNA firmware model into the specified file.

static constexpr Property<ExecutionMode> execution_mode = {"GNA_DEVICE_MODE"}: GNA proc_type setting. Must be one of AUTO, HW, HW_WITH_SW_FBACK, SW_EXACT or SW_FP32.

static constexpr Property<HWGeneration> execution_target = {"GNA_HW_EXECUTION_TARGET"}: The option to override the GNA HW execution target. May be one of GNA_2_0, GNA_3_0, GNA_3_5. By default (when no value is set), the behavior depends on GNA HW availability: if GNA HW is present, the option corresponding to that HW is used; if HW is not present, the option corresponding to the latest fully supported GNA HW generation is used. A fully supported GNA HW generation is one that is supported by both the OV GNA Plugin and the core GNA Library. Currently, the latest supported GNA HW generation corresponds to GNA_3_5.

static constexpr Property<HWGeneration> compile_target = {"GNA_HW_COMPILE_TARGET"}: The option to override the GNA HW compile target. May be one of GNA_2_0, GNA_3_0, GNA_3_5. By default, the same as execution_target.
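The default-resolution rules for the execution and compile targets can be summarized in a few lines. This is a sketch of the documented behavior only, not the plugin's code: the enum is a local stand-in for ov::intel_gna::HWGeneration, the resolve_* functions and kLatestFullySupported are hypothetical names, and using UNDEFINED to mean "no value set" or "no hardware detected" is an assumption made for the example.

```cpp
// Local stand-in for ov::intel_gna::HWGeneration (assumption: this sketch
// does not include the OpenVINO headers).
enum class HWGeneration {
    UNDEFINED, GNA_1_0, GNA_1_0_E, GNA_2_0, GNA_3_0,
    GNA_3_1, GNA_3_5, GNA_3_5_E, GNA_3_6, GNA_4_0
};

// Latest fully supported generation according to the text above.
constexpr HWGeneration kLatestFullySupported = HWGeneration::GNA_3_5;

// Hypothetical helper. Assumption: UNDEFINED stands for "no value set" in
// `requested` and for "no GNA HW present" in `detected_hw`.
HWGeneration resolve_execution_target(HWGeneration requested, HWGeneration detected_hw) {
    if (requested != HWGeneration::UNDEFINED) {
        return requested;               // explicit override wins
    }
    if (detected_hw != HWGeneration::UNDEFINED) {
        return detected_hw;             // match the hardware that is present
    }
    return kLatestFullySupported;       // no hardware: latest fully supported
}

// Hypothetical helper: the compile target defaults to the execution target.
HWGeneration resolve_compile_target(HWGeneration requested, HWGeneration execution_target) {
    return requested != HWGeneration::UNDEFINED ? requested : execution_target;
}
```

For instance, with no overrides and no hardware detected, both functions in this sketch end up at GNA_3_5.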

static constexpr Property<bool> memory_reuse = {"GNA_COMPACT_MODE"}: If enabled, produces the minimum memory footprint for the compiled model in GNA memory. The default value is true.

static constexpr Property<PWLDesignAlgorithm> pwl_design_algorithm = {"GNA_PWL_DESIGN_ALGORITHM"}

The option to set the PWL design algorithm. By default, the optimized algorithm called “Recursive Descent Algorithm for Finding the Optimal Minimax Piecewise Linear Approximation of Convex Functions” is used. If the value is UNIFORM_DISTRIBUTION, a simple uniform distribution is used to create the PWL approximation of activation functions. Uniform distribution usually gives a poorer approximation for the same number of segments.

static constexpr Property<float> pwl_max_error_percent = {"GNA_PWL_MAX_ERROR_PERCENT"}: The option to specify the maximum error percent that the optimized algorithm is allowed when finding PWL functions. By default (when no value is set), 1.0 is used.
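To make the notions of a uniform PWL approximation and an error percentage more concrete, here is a minimal standalone sketch. It is not the plugin's algorithm: the sigmoid activation, the interval, the function names, and the error metric (maximum absolute error as a percentage of the function's output range, sampled on a dense grid) are all assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cmath>

// Activation function used for illustration only.
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Evaluates a piecewise linear approximation of sigmoid on [lo, hi] built
// from `segments` equal-width pieces. Breakpoints lie on the curve and the
// value between two breakpoints is linearly interpolated.
double pwl_uniform(double x, double lo, double hi, int segments) {
    const double width = (hi - lo) / segments;
    int i = static_cast<int>((x - lo) / width);
    i = std::max(0, std::min(i, segments - 1));  // keep x == hi in the last segment
    const double x0 = lo + i * width;
    const double x1 = x0 + width;
    const double t = (x - x0) / width;
    return sigmoid(x0) + t * (sigmoid(x1) - sigmoid(x0));
}

// Assumed metric: maximum absolute error over a dense sample grid, expressed
// as a percentage of the output range of sigmoid on [lo, hi].
double max_error_percent(double lo, double hi, int segments, int samples = 10000) {
    const double range = sigmoid(hi) - sigmoid(lo);
    double worst = 0.0;
    for (int k = 0; k <= samples; ++k) {
        const double x = lo + (hi - lo) * k / samples;
        worst = std::max(worst, std::abs(pwl_uniform(x, lo, hi, segments) - sigmoid(x)));
    }
    return 100.0 * worst / range;
}
```

Under these assumptions, 8 uniform segments on [-8, 8] exceed a 1.0 percent bound while 32 segments stay below it, which illustrates why a uniform layout tends to need more segments than an optimized one to meet the same error limit.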