Class ov#

ov : private pass::low_precision::BaseMatcherPass public ov::pass::MatcherPass , private pass::low_precision::LowPrecision public ov::pass::ModelPass

Public Types

enum Direction#

Enumerate directions.

Values:

enumerator FORWARD#

enumerator BACKWARD#

enum ColumnOfProcessorTypeTable#

This enum defines the columns of the processor type table, which is based on CPU core types. It will be extended to support other CPU core types, such as ARM.

The following are two examples of the processor type table.

Processor table of 4 numa nodes and 2 socket server

Processor table of 1 numa node desktop

ALL_PROC | MAIN_CORE | EFFICIENT_CORE | LP_EFFICIENT_CORE | HYPER_THREADING | NUMA_NODE_ID | SOCKET_ID
16 | 4 | 8 | 4 | 0 | 0 | 0

Values:

enumerator ALL_PROC#: All processors, regardless of backend cpu.

enumerator MAIN_CORE_PROC#: Processor based on physical core of Intel Performance-cores.

enumerator EFFICIENT_CORE_PROC#: Processor based on Intel Efficient-cores.

enumerator LP_EFFICIENT_CORE_PROC#: Processor based on Intel Low Power Efficient-cores.

enumerator HYPER_THREADING_PROC#: Processor based on logical core of Intel Performance-cores.

enumerator PROC_NUMA_NODE_ID#: Numa node id of processors in this row.

enumerator PROC_SOCKET_ID#: Socket id of processors in this row.

enumerator PROC_TYPE_TABLE_SIZE#: Size of processor type table.
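As a rough illustration of how these columns index a row, the sketch below reads the single-row desktop example through a local copy of the enum. The enumerator values (0 through 7, in listing order) and the helper name `core_types_sum_to_total` are assumptions made for this example; it does not include any OpenVINO header.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Local stand-in for the enum; values are assumed to follow listing order from 0.
enum ColumnOfProcessorTypeTable {
    ALL_PROC = 0,
    MAIN_CORE_PROC = 1,
    EFFICIENT_CORE_PROC = 2,
    LP_EFFICIENT_CORE_PROC = 3,
    HYPER_THREADING_PROC = 4,
    PROC_NUMA_NODE_ID = 5,
    PROC_SOCKET_ID = 6,
    PROC_TYPE_TABLE_SIZE = 7
};

// The single-row desktop example from the text above.
inline std::vector<std::vector<int>> desktop_proc_type_table() {
    return {{16, 4, 8, 4, 0, 0, 0}};
}

// Hypothetical helper: checks that the per-type counts add up to ALL_PROC.
inline bool core_types_sum_to_total(const std::vector<int>& row) {
    assert(row.size() == static_cast<std::size_t>(PROC_TYPE_TABLE_SIZE));
    const int sum = row[MAIN_CORE_PROC] + row[EFFICIENT_CORE_PROC] +
                    row[LP_EFFICIENT_CORE_PROC] + row[HYPER_THREADING_PROC];
    return sum == row[ALL_PROC];
}
```

For the desktop row, the four per-type counts (4, 8, 4, 0) add up to the ALL_PROC value of 16, so the check passes.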

enum ProcessorUseStatus#

Definition of CPU_MAP_USED_FLAG column in CPU mapping table.

Values:

enumerator CPU_BLOCKED#: Processor is blocked from use.

enumerator NOT_USED#: Processor is not bound to thread.

enumerator CPU_USED#: Processor is in use.

enum ColumnOfCPUMappingTable#

This enum defines the columns of the CPU mapping table, which uses the processor id as its index.

GROUP_ID is generated according to the following rules.

If one MAIN_CORE_PROC and one HYPER_THREADING_PROC are based on the same Performance-core, they are in one group.
If several EFFICIENT_CORE_PROC share one L2 cache, they are in one group.
There are no duplicate group IDs in the system.

The following is an example of the CPU mapping table.

Four processors of two Pcore
Four processors of four Ecores shared L2 cache

Values:

enumerator CPU_MAP_PROCESSOR_ID#: column for processor id of the processor

enumerator CPU_MAP_NUMA_NODE_ID#: column for node id of the processor

enumerator CPU_MAP_SOCKET_ID#: column for socket id of the processor

enumerator CPU_MAP_CORE_ID#: column for hardware core id of the processor

enumerator CPU_MAP_CORE_TYPE#: column for CPU core type corresponding to the processor

enumerator CPU_MAP_GROUP_ID#: column for group id to the processor. Processors in one group have dependency.

enumerator CPU_MAP_USED_FLAG#: column for resource management of the processor

enumerator CPU_MAP_TABLE_SIZE#: Size of CPU mapping table.
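The grouping rules above can be sketched with a small stand-alone example. Everything in it is hypothetical: the enumerator values (listing order from 0), the core type codes, the placeholder used flag, the table rows, and the helper `processors_by_group` are made up for illustration and are not taken from the OpenVINO headers.

```cpp
#include <map>
#include <vector>

// Local stand-in for the enum; values are assumed to follow listing order from 0.
enum ColumnOfCPUMappingTable {
    CPU_MAP_PROCESSOR_ID = 0,
    CPU_MAP_NUMA_NODE_ID = 1,
    CPU_MAP_SOCKET_ID = 2,
    CPU_MAP_CORE_ID = 3,
    CPU_MAP_CORE_TYPE = 4,
    CPU_MAP_GROUP_ID = 5,
    CPU_MAP_USED_FLAG = 6,
    CPU_MAP_TABLE_SIZE = 7
};

// Assumed core type codes and a placeholder used flag, for illustration only.
constexpr int kMainCore = 1;
constexpr int kEfficientCore = 2;
constexpr int kHyperThreading = 4;
constexpr int kFlagPlaceholder = 0;

// Hypothetical table: two Performance-cores with two processors each,
// followed by four Efficient-cores that share one L2 cache.
inline std::vector<std::vector<int>> example_cpu_mapping_table() {
    return {
        {0, 0, 0, 0, kHyperThreading, 0, kFlagPlaceholder},
        {1, 0, 0, 0, kMainCore, 0, kFlagPlaceholder},
        {2, 0, 0, 1, kHyperThreading, 1, kFlagPlaceholder},
        {3, 0, 0, 1, kMainCore, 1, kFlagPlaceholder},
        {4, 0, 0, 2, kEfficientCore, 2, kFlagPlaceholder},
        {5, 0, 0, 3, kEfficientCore, 2, kFlagPlaceholder},
        {6, 0, 0, 4, kEfficientCore, 2, kFlagPlaceholder},
        {7, 0, 0, 5, kEfficientCore, 2, kFlagPlaceholder},
    };
}

// Collects processor ids per group id by reading the two relevant columns.
inline std::map<int, std::vector<int>> processors_by_group(
        const std::vector<std::vector<int>>& table) {
    std::map<int, std::vector<int>> groups;
    for (const auto& row : table) {
        groups[row[CPU_MAP_GROUP_ID]].push_back(row[CPU_MAP_PROCESSOR_ID]);
    }
    return groups;
}
```

In this made-up table each Performance-core forms its own group of two processors, and the four Efficient-core processors fall into a single group because they share one L2 cache.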

enum ColumnOfCpuStreamsInfoTable#

This enum defines the columns of the CPU streams information table.

The following are two examples of the CPU streams information table.

1. 8 streams on hybrid platform which has 4 threads per stream (TPS).
1.1 2 streams (4 TPS) on physical core of Intel Performance-cores
1.2 4 streams (4 TPS) on Intel Efficient-cores
1.3 2 streams (4 TPS) on logic core of Intel Performance-cores

NUMBER_OF_STREAMS | PROC_TYPE | THREADS_PER_STREAM | STREAM_NUMA_NODE_ID | STREAM_SOCKET_ID
2 | 1 | 4 | 0 | 0
4 | 2 | 4 | 0 | 0
2 | 3 | 4 | 0 | 0

2. 1 stream (10 TPS) on hybrid platform which has 2 threads on physical core and 8 threads on Ecore.
2.1 1 streams (10 TPS) on multiple types of processors
2.2 2 threads on physical core of Intel Performance-cores
2.3 8 threads on Intel Efficient-cores

NUMBER_OF_STREAMS | PROC_TYPE | THREADS_PER_STREAM | STREAM_NUMA_NODE_ID | STREAM_SOCKET_ID
1 | 0 | 10 | 0 | 0
0 | 1 | 2 | 0 | 0
0 | 2 | 8 | 0 | 0

Values:

enumerator NUMBER_OF_STREAMS#: Number of streams on specific CPU core type.

enumerator PROC_TYPE#: Core type of current streams.

enumerator THREADS_PER_STREAM#: Number of threads per stream of current streams.

enumerator STREAM_NUMA_NODE_ID#: Numa node id of processors in this row.

enumerator STREAM_SOCKET_ID#: Socket id of processors in this row.

enumerator CPU_STREAMS_TABLE_SIZE#: Size of streams info table.
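To make the two example tables easier to read, the sketch below totals streams and threads from them. It uses a local copy of the enum with values assumed to follow listing order from 0; the helper names `total_streams` and `total_threads` are hypothetical. It relies on the observation that rows with NUMBER_OF_STREAMS equal to 0 (the per-type breakdown in the second example) contribute nothing when streams are multiplied by threads per stream.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Local stand-in for the enum; values are assumed to follow listing order from 0.
enum ColumnOfCpuStreamsInfoTable {
    NUMBER_OF_STREAMS = 0,
    PROC_TYPE = 1,
    THREADS_PER_STREAM = 2,
    STREAM_NUMA_NODE_ID = 3,
    STREAM_SOCKET_ID = 4,
    CPU_STREAMS_TABLE_SIZE = 5
};

using StreamsTable = std::vector<std::vector<int>>;

// First example: 8 streams with 4 threads per stream.
inline StreamsTable example_eight_streams() {
    return {{2, 1, 4, 0, 0}, {4, 2, 4, 0, 0}, {2, 3, 4, 0, 0}};
}

// Second example: 1 stream with 10 threads spread over two core types.
inline StreamsTable example_one_stream() {
    return {{1, 0, 10, 0, 0}, {0, 1, 2, 0, 0}, {0, 2, 8, 0, 0}};
}

// Sums the NUMBER_OF_STREAMS column.
inline int total_streams(const StreamsTable& table) {
    int streams = 0;
    for (const auto& row : table) {
        assert(row.size() == static_cast<std::size_t>(CPU_STREAMS_TABLE_SIZE));
        streams += row[NUMBER_OF_STREAMS];
    }
    return streams;
}

// Sums streams * threads per stream; breakdown rows with 0 streams add nothing.
inline int total_threads(const StreamsTable& table) {
    int threads = 0;
    for (const auto& row : table) {
        threads += row[NUMBER_OF_STREAMS] * row[THREADS_PER_STREAM];
    }
    return threads;
}
```

With these definitions the first table yields 8 streams and 32 threads, and the second yields 1 stream and 10 threads.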

enum class PropertyMutability#

Enum to define property value mutability.

Values:

enumerator RO#: Read-only property values can not be passed as input parameter.

enumerator RW#: Read/Write property key may change readability in runtime.

enumerator WO#: Write-only property can not be read.

enum class WorkloadType#

Enum to define possible workload types.

Workload type represents the execution priority for an inference.

Values:

enumerator DEFAULT#

enumerator EFFICIENT#

enum class CacheMode#

Enum to define possible cache mode.

Values:

enumerator OPTIMIZE_SIZE#: smaller cache size

enumerator OPTIMIZE_SPEED#: faster loading time

using TensorSymbol = std::vector<std::shared_ptr<Symbol>>#: Alias for symbol tensor.

using TensorSymbolVector = std::vector<TensorSymbol>#: Alias for vector of symbol tensors.

using TensorNames = std::unordered_set<std::string>#: Alias for set of tensor names.

using EvaluationContext = ov::RTMap#: EvaluationContext stores and manages a context (additional parameters, values and environment) for evaluating ov::Model.

using Rank = Dimension#: Alias for Dimension, used when the value represents the number of axes in a shape, rather than the size of one dimension in a shape.

using FileHandleProvider = std::function<FileHandle()>#: Type definition for file handle provider callback (cross-platform). Function that takes no arguments and returns a platform-specific file handle. The callback implementation must release ownership; the caller should close the FileHandle. On Linux/Unix: returns int (file descriptor). On Windows: returns void* (HANDLE cast to void*). This is useful for scenarios where file access needs to be controlled externally, such as Android content providers or Windows restricted file access scenarios.
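The ownership contract described for FileHandleProvider can be sketched for the Linux/Unix case as follows. This is a minimal stand-alone illustration, not OpenVINO code: `FileHandle` is declared locally as `int` per the description above, and `make_provider` and `open_and_close` are hypothetical helpers.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <functional>
#include <string>

// Local stand-ins, following the Linux/Unix description (file descriptor as int).
using FileHandle = int;
using FileHandleProvider = std::function<FileHandle()>;

// Hypothetical factory: the callback opens the file and hands the descriptor
// to the caller without keeping a copy, so ownership is released.
inline FileHandleProvider make_provider(const std::string& path) {
    return [path]() -> FileHandle { return ::open(path.c_str(), O_RDONLY); };
}

// Hypothetical consumer: obtains the handle and closes it, as the caller should.
inline bool open_and_close(const FileHandleProvider& provider) {
    const FileHandle fd = provider();
    if (fd < 0) {
        return false;  // open() failed, nothing to close
    }
    return ::close(fd) == 0;
}
```

The provider captures only the path, so each call produces a fresh descriptor that the caller is responsible for closing.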

using TensorVector = std::vector<Tensor>#: A vector of Tensors.

using SupportedOpsMap = std::map<std::string, std::string>#

This type of map is used for result of Core::query_model.

key means operation name
value means device name supporting this operation
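Since the map goes from operation name to device name, a common follow-up is to summarize it per device. The sketch below does that with a local alias of the same shape; the helper `ops_per_device` and any operation or device names passed to it are hypothetical, and no OpenVINO call is made.

```cpp
#include <map>
#include <string>

// Local alias with the same shape as the type described above.
using SupportedOpsMap = std::map<std::string, std::string>;

// Counts how many operations are mapped to each device name.
inline std::map<std::string, int> ops_per_device(const SupportedOpsMap& supported) {
    std::map<std::string, int> counts;
    for (const auto& entry : supported) {
        ++counts[entry.second];  // entry.first is the operation name
    }
    return counts;
}
```

For a map with two operations assigned to one device and one operation assigned to another, the result holds the counts 2 and 1 under the respective device names.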

Public Members

constexpr Property<std::filesystem::path> cache_path = {"CACHE_PATH"}#

This property defines the path which will be used to store any data cached by plugins.

The underlying cache structure is not defined and might differ between OpenVINO releases. Cached data might be platform / device specific and might be invalid after an OpenVINO version change.

If the path is a directory, it has the same effect as the cache_dir property. Regular caching is used in this case.

If this property is not specified or its value is an empty string, caching is disabled. The property can enable caching for a plugin using the following code:

ie.set_property("GPU", ov::cache_path("cache/")); // enables cache for GPU plugin

The following code enables caching of compiled network blobs for devices where import/export is supported

ie.set_property(ov::cache_path("cache/")); // enables models cache

constexpr Property<uint64_t, PropertyMutability::RW> cache_blob_id = {"CACHE_BLOB_ID"}#

The property allows setting a user ID for a cache entry.

This overrides the internal ID generation mechanism and the user must manage the ID. If defined by the user, the same ID must be used to restore the model from the cache; otherwise, the model will be compiled. The custom ID can allow importing a model from the cache file without the original model.

The following code compiles a model with a custom ID.

// store compiled model to cache with ID "746352"
core.compile_model(model, "NPU", ov::AnyMap{ov::cache_blob_id("746352"), ov::cache_path("cache_dir")});

The following code imports a model from the cache if the original model is not available.

core.compile_model(empty_model, "NPU", ov::AnyMap{ov::cache_blob_id("746352"), ov::cache_path("cache_dir")});

Public Static Attributes

static constexpr Property<std::vector<PropertyName>, PropertyMutability::RO> supported_properties{"SUPPORTED_PROPERTIES"}#: Read-only property to get a std::vector<PropertyName> of supported read-only properties. This can be used as a compiled model property as well.

static constexpr Property<std::vector<std::string>, PropertyMutability::RO> available_devices = {"AVAILABLE_DEVICES"}#: Read-only property to get a std::vector<std::string> of available device IDs.

static constexpr Property<std::string, PropertyMutability::RO> model_name = {"NETWORK_NAME"}#: Read-only property to get the name of a model.

static constexpr Property<uint32_t, PropertyMutability::RO> optimal_number_of_infer_requests{"OPTIMAL_NUMBER_OF_INFER_REQUESTS"}#: Read-only property to get an unsigned integer value of optimal number of compiled model infer requests.

static constexpr Property<bool> enable_profiling = {"PERF_COUNT"}#: The name for setting performance counters option.

static constexpr Property<std::string> cache_dir = {"CACHE_DIR"}#

This property defines the directory which will be used to store any data cached by plugins.

The underlying cache structure is not defined and might differ between OpenVINO releases. Cached data might be platform / device specific and might be invalid after an OpenVINO version change. If this property is not specified or its value is an empty string, caching is disabled. The property can enable caching for a plugin using the following code:

ie.set_property("GPU", ov::cache_dir("cache/")); // enables cache for GPU plugin

The following code enables caching of compiled network blobs for devices where import/export is supported

ie.set_property(ov::cache_dir("cache/")); // enables models cache

static constexpr Property<bool, PropertyMutability::RO> loaded_from_cache = {"LOADED_FROM_CACHE"}#: Read-only property to notify user that compiled model was loaded from the cache.

static constexpr Property<std::filesystem::path, PropertyMutability::WO> cache_model_path = {"CACHE_MODEL_PATH"}#

Write-only property to specify the original path of the compiled model in order to speed up cache model ID calculation.

The property has meaning when used in core::compile_model(const std::shared_ptr<const ov::Model>& model, ...) and cache feature is enabled.

static constexpr Property<WorkloadType, PropertyMutability::RW> workload_type = {"WORKLOAD_TYPE"}#: Read-write property to select in which mode the workload will be executed. This is only supported by NPU.

static constexpr Property<CacheMode, PropertyMutability::RW> cache_mode = {"CACHE_MODE"}#: Read-write property to select the cache mode between OPTIMIZE_SIZE and OPTIMIZE_SPEED. If OPTIMIZE_SPEED is selected (default), loading time will decrease but the cache file size will increase. If OPTIMIZE_SIZE is selected, smaller cache files will be created. The cache model default behaviour can be overridden by ENABLE_WEIGHTLESS property.

static constexpr Property<bool, PropertyMutability::RW> enable_weightless = {"ENABLE_WEIGHTLESS"}#: Read-write property to enable/disable weightless cache.

static constexpr Property<EncryptionCallbacks, PropertyMutability::WO> cache_encryption_callbacks{"CACHE_ENCRYPTION_CALLBACKS"}#

Write-only property to set encryption/decryption function for saving/loading model cache. If cache_encryption_callbacks is set, the model topology will be encrypted when saving to the cache and decrypted when loading from the cache. This property is set in core.compile_model only.

First value of the struct is encryption function.
Second value of the struct is decryption function.

static constexpr Property<std::tuple<unsigned int, unsigned int>, PropertyMutability::RO> range_for_streams{"RANGE_FOR_STREAMS"}#

Read-only property to provide information about a range for streams on platforms where streams are supported.

Property returns a value of std::tuple<unsigned int, unsigned int> type, where:

First value is bottom bound.
Second value is upper bound.
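One way an application might use such a range is to clamp a requested stream count to it. The sketch below works on a plain `std::tuple` of the same shape and does not query any device; the alias `StreamRange`, the helper `clamp_streams`, and the assumption that both bounds are inclusive are made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>

// Same shape as the value described above: (bottom bound, upper bound).
using StreamRange = std::tuple<unsigned int, unsigned int>;

// Hypothetical helper: clamps a requested stream count into the range,
// assuming both bounds are inclusive.
inline unsigned int clamp_streams(unsigned int requested, const StreamRange& range) {
    const unsigned int lower = std::get<0>(range);
    const unsigned int upper = std::get<1>(range);
    assert(lower <= upper);
    return std::min(std::max(requested, lower), upper);
}
```

A request below the bottom bound is raised to it, a request above the upper bound is lowered to it, and anything in between is returned unchanged.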

static constexpr Property<unsigned int, PropertyMutability::RO> optimal_batch_size = {"OPTIMAL_BATCH_SIZE"}#

Read-only property to query the optimal batch size for the given device and network.

The property returns a value of unsigned int type: the optimal batch size for a given network on the given device. The returned value is aligned to a power of 2. ov::hint::model is a required option for this metric because the optimal batch size depends on the model; if ov::hint::model is not given, the result of the metric is always 1. For the GPU, the metric is queried automatically whenever the OpenVINO performance hint for throughput is used, so that the result (>1) governs automatic batching (transparently to the application). Automatic batching can be disabled by setting ALLOW_AUTO_BATCHING to NO.

static constexpr Property<uint32_t, PropertyMutability::RO> max_batch_size = {"MAX_BATCH_SIZE"}#: Read-only property to get maximum batch size which does not cause performance degradation due to memory swap impact.

static constexpr Property<uint32_t, PropertyMutability::RW> auto_batch_timeout = {"AUTO_BATCH_TIMEOUT"}#: Read-write property to set the timeout used to collect the inputs for auto-batching.

static constexpr Property<std::tuple<unsigned int, unsigned int, unsigned int>, PropertyMutability::RO> range_for_async_infer_requests = {"RANGE_FOR_ASYNC_INFER_REQUESTS"}#

Read-only property to provide a hint for a range for the number of async infer requests. If the device supports streams, the metric provides the range for the number of infer requests per stream.

Property returns a value of std::tuple<unsigned int, unsigned int, unsigned int> type, where:

First value is bottom bound.
Second value is upper bound.
Third value is step inside this range.
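The three values can be expanded into the list of candidate request counts they describe. The sketch below does this on a plain `std::tuple` of the same shape; the alias `AsyncRequestRange`, the helper `candidate_request_counts`, the inclusive bounds, and the handling of a zero step are assumptions for illustration rather than documented behaviour.

```cpp
#include <tuple>
#include <vector>

// Same shape as the value described above: (bottom bound, upper bound, step).
using AsyncRequestRange = std::tuple<unsigned int, unsigned int, unsigned int>;

// Hypothetical helper: lists lower, lower + step, ... up to and including upper.
inline std::vector<unsigned int> candidate_request_counts(const AsyncRequestRange& range) {
    const unsigned int lower = std::get<0>(range);
    const unsigned int upper = std::get<1>(range);
    const unsigned int step = std::get<2>(range);

    std::vector<unsigned int> counts;
    if (step == 0) {
        // Assumed handling of a zero step: report only the bottom bound.
        if (lower <= upper) {
            counts.push_back(lower);
        }
        return counts;
    }
    for (unsigned int n = lower; n <= upper; n += step) {
        counts.push_back(n);
        if (upper - n < step) {
            break;  // the next value would pass the upper bound or overflow
        }
    }
    return counts;
}
```

For example, a range of (1, 10, 3) expands to the candidates 1, 4, 7 and 10.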

static constexpr Property<bool, PropertyMutability::RW> force_tbb_terminate = {"FORCE_TBB_TERMINATE"}#

Read-write property that sets whether to force TBB termination when the ov core is destroyed. Value type: boolean.

True: explicitly terminate TBB when the ov core is destroyed
False: no additional TBB operations are performed when the core is destroyed

static constexpr Property<bool, PropertyMutability::RW> enable_mmap = {"ENABLE_MMAP"}#

Read-write property to configure mmap() use for model read. Enabled by default. For the moment only IR Frontend supports the property.

value type: boolean

True: enable mmap() use and map the model
False: disable mmap() use and read the model

static constexpr Property<streams::Num, PropertyMutability::RW> num_streams = {"NUM_STREAMS"}#: The number of executor logical partitions.

static constexpr Property<int32_t, PropertyMutability::RW> inference_num_threads = {"INFERENCE_NUM_THREADS"}#: Maximum number of threads that can be used for inference tasks.

static constexpr Property<int32_t, PropertyMutability::RW> compilation_num_threads = {"COMPILATION_NUM_THREADS"}#: Maximum number of threads that can be used for compilation tasks.

static constexpr Property<std::vector<std::string>, PropertyMutability::RO> execution_devices = {"EXECUTION_DEVICES"}#: The devices on which the inference task is executed.

static constexpr Property<std::string, PropertyMutability::RW> weights_path = {"WEIGHTS_PATH"}#: Path to the file with model’s weights.

Note

This property is used for weightless caching. Only used when ov::CacheMode Property is set to “OPTIMIZE_SIZE”.

static constexpr Property<element::Type, PropertyMutability::RW> key_cache_precision = {"KEY_CACHE_PRECISION"}#: The precision of key cache compression.

static constexpr Property<element::Type, PropertyMutability::RW> value_cache_precision = {"VALUE_CACHE_PRECISION"}#: The precision of value cache compression.

static constexpr Property<uint64_t, PropertyMutability::RW> key_cache_group_size = {"KEY_CACHE_GROUP_SIZE"}#: The group_size of key cache compression.

static constexpr Property<uint64_t, PropertyMutability::RW> value_cache_group_size = {"VALUE_CACHE_GROUP_SIZE"}#: The group_size of value cache compression.