Deep Learning accuracy validation framework



Install prerequisites first:

1. Python

Accuracy Checker uses Python 3. Install it first:

sudo apt-get install python3 python3-dev python3-setuptools python3-pip

Python setuptools and the Python package manager (pip) install packages into a system directory by default. Installation of Accuracy Checker has been tested only inside a virtual environment.

To use a virtual environment, install it first:

python3 -m pip install virtualenv
python3 -m virtualenv -p `which python3` <directory_for_environment>

Activate the virtual environment before working inside it:

source <directory_for_environment>/bin/activate

The virtual environment can be deactivated using the command:

deactivate

2. Frameworks

The next step is installing backend frameworks for Accuracy Checker.

To evaluate some models, the required frameworks have to be installed. Accuracy Checker supports these frameworks:

You can use any one of them or several at a time. To work correctly, Accuracy Checker requires at least one. You can postpone installing the other frameworks until they become necessary.
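As a rough illustration of the "at least one framework" requirement, the sketch below checks which backends can be imported in the current environment. The mapping from launcher names to Python module names is an assumption made for this example and is not taken from this document; adjust it to your installation.

```python
import importlib.util

# Assumed mapping from launcher framework names to importable Python modules.
# The module names are common conventions, not something this README states.
ASSUMED_MODULES = {
    "caffe": "caffe",
    "mxnet": "mxnet",
    "tf": "tensorflow",
    "opencv": "cv2",
    "onnx_runtime": "onnxruntime",
}


def available_frameworks(modules=ASSUMED_MODULES):
    """Return {framework_name: bool} telling whether each module can be found."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in modules.items()
    }


if __name__ == "__main__":
    status = available_frameworks()
    for name, found in sorted(status.items()):
        print(f"{name}: {'installed' if found else 'missing'}")
    if not any(status.values()):
        print("No backend framework found; install at least one.")
```

Run it inside the activated virtual environment, since that is where the frameworks are expected to be installed.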

Install accuracy checker

If all prerequisites are installed, you are ready to install Accuracy Checker:

python3 install

Accuracy Checker is a modular tool and has some task-specific dependencies; all required task-specific modules are listed in a separate file. You can install only the core part of the tool without the additional dependencies and manage them yourself by using the following command instead of the standard installation:

python install_core


You can test your installation and get familiar with Accuracy Checker by running the sample.

Once you have installed Accuracy Checker, you can evaluate your configurations with:

accuracy_check -c path/to/configuration_file -m /path/to/models -s /path/to/source/data -a /path/to/annotation

All relative paths in config files will be prefixed with the values specified on the command line:

Refer to -h, --help for the full list of command line options. Some optional arguments are:

You can also replace some command line arguments with environment variables for path prefixing. The following variables are supported:
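The path prefixing described above can be sketched roughly as follows. This is an illustration only, not the tool's implementation: the helper names, the variable name EXAMPLE_MODELS_DIR, and the rule that a command line value takes priority over the environment are all assumptions.

```python
import os
from pathlib import Path


def resolve_prefix(cli_value, env_name):
    """Prefer the command line value; otherwise fall back to the environment.

    The priority order is an assumption made for this sketch.
    """
    if cli_value:
        return Path(cli_value)
    env_value = os.environ.get(env_name)
    return Path(env_value) if env_value else None


def apply_prefix(path, prefix):
    """Prefix relative paths; leave absolute paths and missing prefixes alone."""
    path = Path(path)
    if prefix is None or path.is_absolute():
        return path
    return prefix / path


if __name__ == "__main__":
    # EXAMPLE_MODELS_DIR is a hypothetical variable name used for illustration.
    models_prefix = resolve_prefix("/path/to/models", "EXAMPLE_MODELS_DIR")
    print(apply_prefix("public/alexnet/caffe/bvlc_alexnet.prototxt", models_prefix))
```

Absolute paths are left untouched, which matches the statement that only relative paths are prefixed.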


A config file declares the validation process. Every validated model must have its own entry in the models list, with a distinct name and the other properties described below.

There is also a definitions file, which declares global options shared across all models. The config file has priority over the definitions file.


models:
  - name: model_name
    launchers:
      - framework: caffe
        model: public/alexnet/caffe/bvlc_alexnet.prototxt
        weights: public/alexnet/caffe/bvlc_alexnet.caffemodel
        adapter: classification
        batch: 128
    datasets:
      - name: dataset_name
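The requirement that every model entry has a distinct name can be checked before launching an evaluation. The sketch below assumes the config has already been loaded into a Python dictionary with a top-level models list; the function name is hypothetical and this check is not part of the tool's documented interface.

```python
from collections import Counter


def duplicate_model_names(config):
    """Return the sorted model names that appear more than once in config['models']."""
    counts = Counter(entry["name"] for entry in config.get("models", []))
    return sorted(name for name, count in counts.items() if count > 1)


if __name__ == "__main__":
    config = {"models": [{"name": "model_name"}, {"name": "model_name"}]}
    duplicates = duplicate_model_names(config)
    if duplicates:
        print("Duplicate model names:", ", ".join(duplicates))
```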

Optionally, you can use a global configuration. It helps avoid duplication when several models should be run on the same dataset. An example of a global definitions file can be found here. At runtime, global definitions are merged with the evaluation config by dataset name. Parameters of the global configuration can be overridden by the local config: for example, if the definitions specify resize with destination size 224 and the local config uses resize with size 227, the value from the config, 227, is used as the resize parameter. You can specify the path to global definitions with the global_definitions field directly in the model config or via the command line arguments (-d, --definitions).
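The merge behaviour described above can be approximated with a small sketch. It assumes dataset entries are plain dictionaries and that a key present in the local config replaces the same key from the definitions as a whole; the real merge rules may be finer grained, and the function name is hypothetical.

```python
import copy


def merge_with_definitions(local_datasets, global_datasets):
    """Merge dataset entries by name; keys from the local config win."""
    global_by_name = {entry["name"]: entry for entry in global_datasets}
    merged = []
    for local in local_datasets:
        # Start from the global entry with the same name, if there is one.
        base = copy.deepcopy(global_by_name.get(local["name"], {}))
        # Local keys override global keys (whole-value replacement).
        base.update(copy.deepcopy(local))
        merged.append(base)
    return merged


if __name__ == "__main__":
    definitions = [{
        "name": "dataset_name",
        "annotation": "annotation.pickle",
        "preprocessing": [{"type": "resize", "dst_width": 224, "dst_height": 224}],
    }]
    local = [{
        "name": "dataset_name",
        "preprocessing": [{"type": "resize", "dst_width": 227, "dst_height": 227}],
    }]
    print(merge_with_definitions(local, definitions))
```

In this example the annotation comes from the definitions, while the resize size of 227 from the local config replaces the global value of 224.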


A launcher describes how your model should be executed. Each launcher configuration starts with the framework name. Currently caffe, dlsdk, mxnet, tf, tf_lite, opencv and onnx_runtime are supported. Launcher descriptions can differ between frameworks. Please view:


A dataset entry describes the data on which the model should be evaluated, all required preprocessing and postprocessing/filtering steps, and the metrics that will be used for evaluation.

If your dataset is a well-known competition problem (COCO, Pascal VOC, and others) and/or can potentially be reused for other models, it is reasonable to declare it in a global configuration file (definitions file). This way, your local configuration file only needs to provide the dataset name, and all required steps are picked up from the global one. To pass the path to this global configuration, use the --definitions argument of the CLI.

If you want to evaluate models using prepared config files and well-known datasets, you need to organize the folders with validation datasets in a certain way. More detailed information about dataset preparation can be found in the Dataset Preparation Guide.

Each dataset must have:

And optionally:

It must also contain annotation-related data. You can convert the annotation in place using:

or use an existing annotation file and dataset meta:

Example of a dataset definition:

- name: dataset_name
  annotation: annotation.pickle
  data_source: images_folder
  preprocessing:
    - type: resize
      dst_width: 256
      dst_height: 256
    - type: normalization
      mean: imagenet
    - type: crop
      dst_width: 227
      dst_height: 227
  metrics:
    - type: accuracy

Preprocessing, Metrics, Postprocessing

Each entry of preprocessing, metrics and postprocessing must have a type field; other options are specific to the type. If you do not provide any other option, it will be picked from the definitions file.
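To illustrate how entries are selected by their type field, the sketch below dispatches a list of preprocessing entries to handler functions and tracks only the resulting image size as (width, height). The handlers are simplified stand-ins written for this example, not the tool's preprocessors.

```python
def resize(size, dst_width, dst_height):
    """Resizing always yields the requested destination size."""
    return (dst_width, dst_height)


def normalization(size, mean=None):
    """Normalization changes pixel values, not the image size."""
    return size


def crop(size, dst_width, dst_height):
    """Cropping cannot produce an image larger than its input."""
    width, height = size
    if dst_width > width or dst_height > height:
        raise ValueError("crop is larger than the image")
    return (dst_width, dst_height)


# Registry of handlers keyed by the value of the 'type' field.
STEPS = {"resize": resize, "normalization": normalization, "crop": crop}


def output_size(size, preprocessing):
    """Apply each entry in order and return the final (width, height)."""
    for entry in preprocessing:
        options = dict(entry)
        step_type = options.pop("type", None)
        if step_type is None:
            raise ValueError("every preprocessing entry needs a 'type' field")
        if step_type not in STEPS:
            raise ValueError(f"unknown preprocessing type: {step_type}")
        # All remaining options are specific to the selected type.
        size = STEPS[step_type](size, **options)
    return size


if __name__ == "__main__":
    steps = [
        {"type": "resize", "dst_width": 256, "dst_height": 256},
        {"type": "normalization", "mean": "imagenet"},
        {"type": "crop", "dst_width": 227, "dst_height": 227},
    ]
    print(output_size((500, 375), steps))
```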

You may find the following instructions useful:

You may optionally provide a reference field for a metric if you want the calculated metric to be tested against a specific value (e.g. the one reported in the canonical paper).

Some metrics can return vector results (e.g. mAP can return the average precision for each detection class). You can change the view mode for metric results using a presenter (e.g. print_vector, print_scalar).
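The difference between a scalar and a vector view of the same metric can be sketched as below. The functions present_scalar and present_vector are hypothetical stand-ins for this example; in particular, collapsing the vector to its mean is an assumption about what a scalar view reports.

```python
def present_scalar(name, values):
    """Collapse a vector result to its mean and format it on one line."""
    mean = sum(values) / len(values)
    return f"{name}: {mean:.2%}"


def present_vector(name, values, labels):
    """Format one line per element, tagged with its class label."""
    return [f"{name}@{label}: {value:.2%}" for label, value in zip(labels, values)]


if __name__ == "__main__":
    per_class_ap = [0.5, 1.0]
    print(present_scalar("map", per_class_ap))
    for line in present_vector("map", per_class_ap, ["cat", "dog"]):
        print(line)
```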


- type: accuracy
  top_k: 5
  reference: 86.43
  threshold: 0.005
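The sketch below shows one way a top-k accuracy metric and a reference check could work together. It assumes that the metric value and the reference use the same unit and that threshold is an absolute allowed deviation; both are assumptions for illustration, and the function names are hypothetical.

```python
def top_k_accuracy(scores, labels, top_k=1):
    """Fraction of samples whose true label is among the top_k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        hits += label in ranked[:top_k]
    return hits / len(labels)


def matches_reference(value, reference, threshold):
    """True if value deviates from reference by at most threshold (absolute)."""
    return abs(value - reference) <= threshold


if __name__ == "__main__":
    scores = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
    labels = [2, 0]
    print(top_k_accuracy(scores, labels, top_k=1))
    print(top_k_accuracy(scores, labels, top_k=2))
    print(matches_reference(86.431, 86.43, 0.005))
```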

Testing new models

A typical workflow for testing a new model includes:

  1. Convert the annotation of your dataset. Use one of the converters from annotation-converters, or write your own if there is no converter for your dataset. Detailed instructions on using converters are in the Annotation Conversion Guide.
  2. Choose one of the adapters or write your own. An adapter converts the raw output produced by the framework into a high-level, problem-specific representation (e.g. ClassificationPrediction, DetectionPrediction, etc.).
  3. Reproduce the preprocessing, metrics and postprocessing from the canonical paper.
  4. Create an entry in the config file and execute.
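The adapter step in the workflow above can be illustrated with a minimal sketch. ClassificationResult is a hypothetical stand-in for a representation such as ClassificationPrediction, and the adapter interface shown here is an assumption rather than the tool's actual base class.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ClassificationResult:
    """Hypothetical high-level representation of one classification output."""
    identifier: str
    scores: List[float]

    @property
    def label(self) -> int:
        # Index of the highest score.
        return max(range(len(self.scores)), key=self.scores.__getitem__)


def classification_adapter(raw_output: Sequence[Sequence[float]],
                           identifiers: Sequence[str]) -> List[ClassificationResult]:
    """Pair each row of raw scores with the identifier of its input."""
    return [
        ClassificationResult(identifier, list(row))
        for identifier, row in zip(identifiers, raw_output)
    ]


if __name__ == "__main__":
    results = classification_adapter([[0.1, 0.9], [0.8, 0.2]], ["a.jpg", "b.jpg"])
    for result in results:
        print(result.identifier, result.label)
```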

Customizing Evaluation

The standard Accuracy Checker validation pipeline is: Annotation Reading -> Data Reading -> Preprocessing -> Inference -> Postprocessing -> Metrics. In some cases it can be unsuitable (e.g. if you have a sequence of models). You can customize the validation pipeline using your own evaluator. More details about custom evaluations can be found in the related section.
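The standard stage order can be pictured as a chain of callables, one per stage. The sketch below is a simplified illustration with hypothetical function names and toy stages; it is not the tool's evaluator interface.

```python
def run_pipeline(identifiers, read_annotation, read_data,
                 preprocess, infer, postprocess, metric):
    """Run every sample through the stages in order, then compute the metric."""
    annotations, predictions = [], []
    for identifier in identifiers:
        annotation = read_annotation(identifier)   # Annotation Reading
        data = read_data(identifier)               # Data Reading
        data = preprocess(data)                    # Preprocessing
        prediction = infer(data)                   # Inference
        # Postprocessing may adjust both the annotation and the prediction.
        annotation, prediction = postprocess(annotation, prediction)
        annotations.append(annotation)
        predictions.append(prediction)
    return metric(annotations, predictions)        # Metrics


def accuracy(annotations, predictions):
    """Fraction of predictions equal to their annotations."""
    matches = sum(a == p for a, p in zip(annotations, predictions))
    return matches / len(annotations)


if __name__ == "__main__":
    # Toy stages: the "model" recovers the parity of the sample index.
    result = run_pipeline(
        identifiers=range(4),
        read_annotation=lambda i: i % 2,
        read_data=lambda i: i,
        preprocess=lambda x: x * 2,
        infer=lambda x: (x // 2) % 2,
        postprocess=lambda a, p: (a, p),
        metric=accuracy,
    )
    print(result)
```

A sequence of models does not fit this single infer stage, which is the kind of case where a custom evaluator is needed.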