API usage sample for speech task on GNA¶
This sample demonstrates the use of the Post-training Optimization Tool API for the task of quantizing a speech model for the GNA device. Quantization for GNA differs from CPU quantization due to device specifics: GNA supports quantized inputs in INT16 and INT32 (for activations) precision and quantized weights in INT8 and INT16 precision.
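As a minimal sketch of what a GNA-targeted setup looks like in the POT API's dictionary-based configuration: the key names below follow POT conventions, while the concrete values (preset, subset size) are illustrative assumptions, not the sample's exact settings.

```python
# Sketch of a POT algorithm configuration targeting GNA.
# Key names follow Post-training Optimization Tool conventions;
# the concrete values (preset, subset size) are illustrative assumptions.

algorithms = [
    {
        "name": "DefaultQuantization",   # the pre-selected algorithm for this sample
        "params": {
            "target_device": "GNA",      # enables GNA-specific precisions:
                                         # INT16/INT32 activations, INT8/INT16 weights
            "preset": "performance",     # "performance" -> INT8 weights,
                                         # "accuracy"    -> INT16 weights
            "stat_subset_size": 300,     # calibration subset size (illustrative)
        },
    }
]
```

Setting "target_device" to "GNA" is what switches the tool away from the CPU quantization scheme described above.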
This sample contains pre-selected quantization options based on the DefaultQuantization algorithm, created for models from the Kaldi framework and its data format. A custom ArkDataLoader is implemented to load the dataset from files with the .ark extension for the speech analysis task.
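A custom loader of this kind only needs to expose the POT DataLoader interface (__len__ and __getitem__). The sketch below mirrors that interface under stated assumptions: the real sample derives from the POT DataLoader base class, read_utterances() is a hypothetical stand-in for actual .ark parsing, and the exact return convention for annotations can vary between POT versions.

```python
import numpy as np

def read_utterances(ark_path):
    # Hypothetical reader: would return {utterance_id: feature_matrix}.
    # A real implementation must parse the Kaldi .ark format.
    raise NotImplementedError

class ArkDataLoader:
    """Sketch of a DataLoader over Kaldi utterances (assumed interface)."""

    def __init__(self, utterances):
        # utterances: {utterance_id: np.ndarray of shape (frames, features)}
        self._ids = sorted(utterances)
        self._data = utterances

    def __len__(self):
        # Number of calibration samples (one per utterance).
        return len(self._ids)

    def __getitem__(self, index):
        # POT expects (annotation, data); speech calibration has no labels,
        # so the annotation here is just (index, None).
        utt_id = self._ids[index]
        return (index, None), self._data[utt_id]
```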
How to prepare the data¶
To run this sample, you will need .ark files for each model input in your <DATA_FOLDER>. To generate data in the .ark format from original formats, follow the Kaldi data preparation tutorial.
How to Run the Sample¶
In the instructions below, <POT_DIR> refers to the Post-Training Optimization Tool directory:
- <ENV>/lib/python<version>/site-packages/ in the case of PyPI installation, where <ENV> is the Python environment where OpenVINO is installed and <version> is the Python version;
- <INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit in the case of the OpenVINO distribution package, where <INSTALL_DIR> is the directory where the Intel Distribution of OpenVINO toolkit is installed.
To get started, follow the Installation Guide.
Convert the Kaldi model to the OpenVINO Intermediate Representation using the Model Optimizer:
python3 <PATH_TO_MODEL_OPTIMIZER>/mo.py --input_model <PATH_TO_KALDI_MODEL> [MODEL_OPTIMIZER_OPTIONS]
Launch the sample script:
python3 <POT_DIR>/compression/api/samples/speech/gna_sample.py -m <PATH_TO_IR_XML> -w <PATH_TO_IR_BIN> -d <DATA_FOLDER> --input_names [LIST_OF_MODEL_INPUTS] --files_for_input [LIST_OF_INPUT_FILES]
Required parameters:
- --input_names option. Defines the list of model inputs;
- --files_for_input option. Defines the list of filenames (.ark) mapped to the input names. Define the names without extension; for example, FILENAME_1, FILENAME_2 map to INPUT_1, INPUT_2.
Optional parameters:
- --preset option. Defines the preset for quantization: performance for INT8 weights, accuracy for INT16 weights;
- --subset_size option. Defines the subset size for calibration;
- --output option. Defines the output folder for the quantized model.
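As a sketch of how these options could be parsed and carried into the quantization parameters, the argparse setup below uses the option names from the usage line above; the wiring and default values are assumptions for illustration, not the actual logic of gna_sample.py.

```python
import argparse

def build_parser():
    """Sketch of the sample's CLI (option names from the usage above;
    defaults and wiring are illustrative assumptions)."""
    parser = argparse.ArgumentParser(
        description="GNA speech quantization sample")
    parser.add_argument("-m", dest="model", required=True,
                        help="path to the IR .xml file")
    parser.add_argument("-w", dest="weights", required=True,
                        help="path to the IR .bin file")
    parser.add_argument("-d", dest="data_folder", required=True,
                        help="folder with .ark files")
    parser.add_argument("--input_names", nargs="+", required=True,
                        help="list of model inputs")
    parser.add_argument("--files_for_input", nargs="+", required=True,
                        help=".ark filenames (no extension) mapped to inputs")
    parser.add_argument("--preset", choices=["performance", "accuracy"],
                        default="performance",
                        help="performance -> INT8 weights, accuracy -> INT16")
    parser.add_argument("--subset_size", type=int, default=300,
                        help="calibration subset size (default is illustrative)")
    parser.add_argument("--output", default="./results",
                        help="output folder for the quantized model")
    return parser
```

The --preset choice is what selects between the INT8 and INT16 weight precisions described above.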
Validate your INT8 model using ./speech_sample from the Inference Engine samples. Follow the speech sample description link for details.