Group Intel GNA specific properties

group ov_runtime_gna_prop_cpp_api

Set of Intel GNA specific properties.

Enums

enum class ExecutionMode

Enum to define software acceleration mode.

Values:

enumerator AUTO

Uses Intel GNA if available, otherwise uses software execution mode on CPU.

enumerator HW

Uses Intel GNA if available, otherwise raises an error.

enumerator HW_WITH_SW_FBACK

Uses Intel GNA if available, otherwise raises an error. If the hardware queue is not empty, automatically falls back to CPU in the bit-exact mode.

enumerator SW_EXACT

Executes the GNA-compiled graph on CPU, performing calculations in the same precision as Intel GNA (bit-exact mode).

enumerator SW_FP32

Executes the GNA-compiled graph on CPU but converts parameters and calculations from low precision to floating point.
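The fallback rules described for each enumerator can be summarized in a small decision function. This is only an illustrative sketch: the enum below is a local mirror of the documented values, and resolve_backend, gna_available and hw_queue_busy are hypothetical names that are not part of the OpenVINO API or the plugin implementation.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Local mirror of the documented enumerators; this is not the OpenVINO header.
enum class ExecutionMode { AUTO, HW, HW_WITH_SW_FBACK, SW_EXACT, SW_FP32 };

// Hypothetical helper: names the backend a request would run on.
// gna_available and hw_queue_busy are illustrative inputs, not plugin API.
std::string resolve_backend(ExecutionMode mode, bool gna_available, bool hw_queue_busy) {
    switch (mode) {
    case ExecutionMode::AUTO:
        // Hardware if present, otherwise software execution on CPU.
        return gna_available ? "GNA" : "CPU";
    case ExecutionMode::HW:
        if (!gna_available) throw std::runtime_error("Intel GNA is not available");
        return "GNA";
    case ExecutionMode::HW_WITH_SW_FBACK:
        if (!gna_available) throw std::runtime_error("Intel GNA is not available");
        // A non-empty hardware queue triggers the bit-exact CPU fallback.
        return hw_queue_busy ? "CPU (bit-exact)" : "GNA";
    case ExecutionMode::SW_EXACT:
        return "CPU (bit-exact)";
    case ExecutionMode::SW_FP32:
        return "CPU (FP32)";
    }
    assert(false && "unhandled ExecutionMode");
    return "";
}
```

For instance, under these assumptions HW_WITH_SW_FBACK with available hardware and a busy queue resolves to the bit-exact CPU path, while HW without hardware raises an error.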

enum class HWGeneration

Enum to define HW compile and execution targets.

Values:

enumerator UNDEFINED

GNA HW generation is undefined.

enumerator GNA_1_0

GNA HW generation 1.0.

enumerator GNA_1_0_E

GNA HW generation 1.0 embedded.

enumerator GNA_2_0

GNA HW generation 2.0.

enumerator GNA_3_0

GNA HW generation 3.0.

enumerator GNA_3_1

GNA HW generation 3.1.

enumerator GNA_3_5

GNA HW generation 3.5.

enumerator GNA_3_5_E

GNA HW generation 3.5 embedded.

enumerator GNA_3_6

GNA HW generation 3.6.

enumerator GNA_4_0

GNA HW generation 4.0.

enum class PWLDesignAlgorithm

Enum to define PWL design algorithm.

Values:

enumerator UNDEFINED

PWL approximation algorithm is undefined.

enumerator RECURSIVE_DESCENT

Recursive Descent Algorithm.

enumerator UNIFORM_DISTRIBUTION

Uniform distribution algorithm.

Variables

static constexpr Property<std::string, PropertyMutability::RO> library_full_version = {"GNA_LIBRARY_FULL_VERSION"}

Property to get a std::string with the GNA Library version, usually in the form <API_REVISION>.<RELEASE_LINE>.<RELEASE>.<BUILD>.
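Since the version string is usually dot-separated, it can be split into its four documented fields with the standard library alone. The helper name split_gna_version is hypothetical, and because the format is only "usually" followed, missing fields are left empty rather than treated as an error.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: splits a version string into up to four fields,
// in the order API_REVISION, RELEASE_LINE, RELEASE, BUILD.
// Fields that are absent from the input stay empty.
std::array<std::string, 4> split_gna_version(const std::string& version) {
    std::array<std::string, 4> fields{};
    std::istringstream stream(version);
    std::string part;
    std::size_t i = 0;
    while (i < fields.size() && std::getline(stream, part, '.')) {
        fields[i++] = part;
    }
    return fields;
}
```

The string passed in would typically be the value read from library_full_version; any concrete version number used to try the helper is a made-up sample.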

static constexpr Property<std::map<std::string, float>> scale_factors_per_input = {"GNA_SCALE_FACTOR_PER_INPUT"}

Scale factors provided by the user for static quantization. Each value should be a floating-point number serialized to a string with . (dot) as the decimal separator.

With multiple inputs, individual scale factors can be provided in a map whose key is the layer name and whose value is the scale factor. The input name must not contain the “:” symbol. Example:

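#include <openvino/openvino.hpp>
#include <openvino/runtime/intel_gna/properties.hpp>

// model_path is the path to the model file, supplied by the caller.
ov::Core core;
auto model = core.read_model(model_path);
// Assign a scale factor of 1.0 to every model input, keyed by input name.
std::map<std::string, float> scale_factors;
for (auto& input : model->inputs()) {
    scale_factors[input.get_any_name()] = 1.0f;
}
core.set_property("GNA", ov::intel_gna::scale_factors_per_input(scale_factors));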
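To illustrate what a scale factor does in static quantization, the sketch below maps a floating-point input to a 16-bit integer and back. The rule q = round(x * scale), saturated to the int16 range, is an assumption made for illustration; the text above does not specify the plugin's exact arithmetic, and quantize_input and dequantize_input are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Assumed quantization rule: q = round(x * scale), saturated to int16.
std::int16_t quantize_input(float value, float scale_factor) {
    assert(scale_factor > 0.0f);
    const float lo = static_cast<float>(std::numeric_limits<std::int16_t>::min());
    const float hi = static_cast<float>(std::numeric_limits<std::int16_t>::max());
    const float scaled = std::round(value * scale_factor);
    // Saturate instead of wrapping when the scaled value leaves the int16 range.
    return static_cast<std::int16_t>(std::max(lo, std::min(scaled, hi)));
}

// Inverse mapping from the quantized value back to floating point.
float dequantize_input(std::int16_t quantized, float scale_factor) {
    assert(scale_factor > 0.0f);
    return static_cast<float>(quantized) / scale_factor;
}
```

Under this assumed rule, a larger scale factor gives finer resolution for small inputs but saturates sooner, which is why a per-input factor can be useful when inputs have different dynamic ranges.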

static constexpr Property<std::string> firmware_model_image_path = {"GNA_FIRMWARE_MODEL_IMAGE"}

If set, dumps the GNA firmware model into the specified file.

static constexpr Property<ExecutionMode> execution_mode = {"GNA_DEVICE_MODE"}

GNA proc_type setting that should be one of AUTO, HW, HW_WITH_SW_FBACK, SW_EXACT, or SW_FP32.

static constexpr Property<HWGeneration> execution_target = {"GNA_HW_EXECUTION_TARGET"}

The option to override the GNA HW execution target. May be one of GNA_2_0, GNA_3_0, GNA_3_5. By default (when no value is set), the behavior depends on GNA HW availability: if GNA HW is present, the option corresponding to that HW is used; if HW is not present, the option corresponding to the latest fully supported GNA HW generation is used. A fully supported GNA HW generation is one that is supported by both the OV GNA Plugin and the core GNA Library. Currently, the latest supported GNA HW generation corresponds to GNA_3_5.

static constexpr Property<HWGeneration> compile_target = {"GNA_HW_COMPILE_TARGET"}

The option to override the GNA HW compile target. May be one of GNA_2_0, GNA_3_0, GNA_3_5. By default, it is the same as execution_target.
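The default-resolution rules for the execution and compile targets can be sketched as two small functions. This is a simplified model, not the plugin's code: the enum is a reduced local mirror, UNDEFINED is assumed to stand for "no value set" or "no hardware detected", and both function names are hypothetical.

```cpp
#include <cassert>

// Reduced local mirror of the documented enum; not the OpenVINO header.
enum class HWGeneration { UNDEFINED, GNA_2_0, GNA_3_0, GNA_3_5 };

// Hypothetical helper: an explicit override wins, then the detected hardware,
// then the latest fully supported generation named in the text (GNA_3_5).
HWGeneration resolve_execution_target(HWGeneration requested, HWGeneration detected_hw) {
    if (requested != HWGeneration::UNDEFINED) return requested;
    if (detected_hw != HWGeneration::UNDEFINED) return detected_hw;
    return HWGeneration::GNA_3_5;
}

// Hypothetical helper: the compile target follows the execution target
// unless it is overridden explicitly.
HWGeneration resolve_compile_target(HWGeneration requested, HWGeneration execution_target) {
    assert(execution_target != HWGeneration::UNDEFINED);
    return requested != HWGeneration::UNDEFINED ? requested : execution_target;
}
```

With no override and no hardware, this model yields GNA_3_5 for both targets, matching the default described above.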

static constexpr Property<bool> memory_reuse = {"GNA_COMPACT_MODE"}

If enabled, produces the minimum memory footprint for the compiled model in GNA memory. The default value is true.

static constexpr Property<PWLDesignAlgorithm> pwl_design_algorithm = {"GNA_PWL_DESIGN_ALGORITHM"}

The option to set the PWL design algorithm. By default, the optimized algorithm called “Recursive Descent Algorithm for Finding the Optimal Minimax Piecewise Linear Approximation of Convex Functions” is used. If the value is UNIFORM_DISTRIBUTION, a simple uniform distribution is used to create the PWL approximation of activation functions. Uniform distribution usually gives a poorer approximation with the same number of segments.

static constexpr Property<float> pwl_max_error_percent = {"GNA_PWL_MAX_ERROR_PERCENT"}

The option to specify the maximum error percent that the optimized algorithm may allow when finding PWL functions. By default (when no value is set), 1.0 is used.
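To make the uniform-distribution approach and the notion of an error percent concrete, the sketch below approximates a function with equal-width linear segments and measures the worst-case error on a sampled grid. It is a generic illustration, not the plugin's algorithm: the function names are hypothetical, and expressing the error as a percentage of the function's output range is an assumption, since the text does not define how the percent is computed.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>

using Fn = std::function<double(double)>;

// Evaluates a PWL approximation of f on [lo, hi] built from `segments`
// equal-width pieces (uniform breakpoints, linear interpolation between them).
double pwl_uniform(const Fn& f, double lo, double hi, int segments, double x) {
    assert(segments > 0 && hi > lo);
    const double width = (hi - lo) / segments;
    int idx = static_cast<int>((x - lo) / width);
    idx = std::max(0, std::min(idx, segments - 1));  // keep x == hi in the last segment
    const double x0 = lo + idx * width;
    const double t = (x - x0) / width;
    return f(x0) + t * (f(x0 + width) - f(x0));
}

// Worst absolute error over a sampled grid, as a percentage of f's output
// range on [lo, hi] (an assumed definition of "error percent").
double max_error_percent(const Fn& f, double lo, double hi, int segments, int samples = 10000) {
    assert(samples > 0);
    double f_min = f(lo), f_max = f(lo), worst = 0.0;
    for (int i = 0; i <= samples; ++i) {
        const double x = lo + (hi - lo) * i / samples;
        const double y = f(x);
        f_min = std::min(f_min, y);
        f_max = std::max(f_max, y);
        worst = std::max(worst, std::fabs(y - pwl_uniform(f, lo, hi, segments, x)));
    }
    assert(f_max > f_min);  // a constant function has no range to normalize by
    return 100.0 * worst / (f_max - f_min);
}
```

Calling max_error_percent for a sigmoid on a fixed interval with increasing segment counts shows how many uniform segments are needed before the error drops under a chosen limit such as the default of 1.0.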