Run Group Inference

DL Workbench provides a graphical interface for finding the optimal combination of batch size and parallel requests (streams) for a model on a particular machine. To learn more about choosing optimal configurations for specific hardware, refer to Deploy and Integrate Performance Criteria into Application.
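For reference, the batch and stream values that DL Workbench profiles map to standard Inference Engine settings. Below is a minimal sketch of applying a chosen combination with the classic Inference Engine Python API; the model paths, the CPU device, and the specific values are assumptions for illustration, not output of DL Workbench:

```python
from openvino.inference_engine import IECore

STREAMS = 4   # number of parallel inference streams (hypothetical value)
BATCH = 8     # batch size (hypothetical value)

ie = IECore()
# Throughput streams are configured per device before loading the network.
ie.set_config({"CPU_THROUGHPUT_STREAMS": str(STREAMS)}, "CPU")

net = ie.read_network(model="model.xml", weights="model.bin")
net.batch_size = BATCH  # apply the batch size from the chosen configuration

# One infer request per stream keeps all streams busy.
exec_net = ie.load_network(network=net, device_name="CPU", num_requests=STREAMS)
```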

To run inferences over a range of stream and batch values, go to the Profile tab of the Selected Configuration section on the Configurations Page. Select Group Inference and click Configure. On the Configure Group Inference page, select combinations of stream and batch parameters by clicking the corresponding cells in the table. The cells you select are marked with a check mark. Dark cells represent previously executed inferences; you can select them as well.

group_inference-b.png

Click Show Next 10 Columns and Rows to expand the table:

show_next_10-b.png

Select Range 2⁰–2⁸ to see only batch values that are powers of 2:

degrees_of_2-b.png
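For reference, the 2⁰–2⁸ range covers batch sizes 1 through 256. A short sketch of the resulting grid of candidate configurations (the stream values here are assumed for illustration):

```python
# Candidate configurations in the 2^0-2^8 batch range.
# The stream values are illustrative; DL Workbench shows its own set.
stream_values = [1, 2, 3, 4]
batch_values = [2 ** i for i in range(9)]  # 1, 2, 4, ..., 256

combinations = [(s, b) for s in stream_values for b in batch_values]
print(f"{len(combinations)} candidate configurations")
```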

The estimated execution time is displayed under the table:

group_inference_time-b.png

A table listing the inferences you selected appears on the right:

selected_inferences-b.png

Once you click Execute, the inferences start, and you cannot proceed until they are done:

banner-b.png

The graph in the Inference Results section shows a point for each executed inference with its batch/parallel request configuration:

inference_results_01-b.png

Right above the graph, you can specify the maximum acceptable latency to find the configuration with the best throughput within that limit. The point corresponding to this configuration turns blue:

inference_results_02-b.png
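Conceptually, this control applies a simple rule to the measured points: discard configurations whose latency exceeds the cap, then take the one with the highest throughput. A sketch of that rule, with made-up numbers:

```python
# Selection rule behind the latency control: among all measured
# (streams, batch) points, keep those whose latency does not exceed
# the cap and pick the one with the highest throughput.
# The sample values below are illustrative only.
results = [
    # (streams, batch, latency_ms, throughput_fps)
    (1, 1, 4.0, 250.0),
    (2, 2, 7.5, 530.0),
    (4, 4, 15.0, 900.0),
    (4, 8, 28.0, 1100.0),
]

max_latency_ms = 20.0
feasible = [r for r in results if r[2] <= max_latency_ms]
best = max(feasible, key=lambda r: r[3])  # best throughput within the cap
print(f"streams={best[0]}, batch={best[1]}, "
      f"latency={best[2]} ms, throughput={best[3]} FPS")
```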

To view information about latency, throughput, batch, and parallel requests of a specific job, hover your cursor over the corresponding point on the graph.

inference_results_03-b.png

Use the Expand and Collapse buttons to resize the chart and the table.

expand-b.png
collapse-b.png

NOTE: For details about inference processes, see the Inference Engine documentation.


See Also