Converting a Kaldi* Model
A summary of the steps for optimizing and deploying a model that was trained with Kaldi*:
Configure the Model Optimizer for Kaldi*.
Convert a Kaldi* Model to produce an optimized Intermediate Representation (IR) of the model based on the trained network topology, weights, and bias values.
Test the model in the Intermediate Representation format using the Inference Engine in the target environment via provided Inference Engine sample applications.
Integrate the Inference Engine in your application to deploy the model in the target environment.
Note
The Model Optimizer supports the nnet1 and nnet2 formats of Kaldi models. Support of the nnet3 format is limited.
Supported Topologies
Convolutional Neural Networks (CNN):
Wall Street Journal CNN (wsj_cnn4b)
Resource Management CNN (rm_cnn4a_smbr)
Long Short Term Memory (LSTM) Networks:
Resource Management LSTM (rm_lstm4f)
TED-LIUM LSTM (ted_lstm4f)
Deep Neural Networks (DNN):
Wall Street Journal DNN (wsj_dnn5b_smbr)
TED-LIUM DNN (ted_dnn_smbr)
Time delay neural network (TDNN)
TDNN-LSTM model
Convert a Kaldi* Model
To convert a Kaldi* model, run the Model Optimizer with the path to the input model .nnet or .mdl file and an output directory where you have write permissions:
cd <INSTALL_DIR>/deployment_tools/model_optimizer/
python3 mo.py --input_model <INPUT_MODEL>.nnet --output_dir <OUTPUT_MODEL_DIR>
mo --input_model <INPUT_MODEL>.nnet --output_dir <OUTPUT_MODEL_DIR>
Two groups of parameters are available to convert your model:
Framework-agnostic parameters are used to convert a model trained with any supported framework. For details, see the General Conversion Parameters section on the Converting a Model to Intermediate Representation (IR) page.
Kaldi-specific parameters are used to convert only Kaldi* models.
Using Kaldi*-Specific Conversion Parameters
The following list provides the Kaldi*-specific parameters.
Kaldi-specific parameters:
  --counts COUNTS           A file name with full path to the counts file
  --remove_output_softmax   Removes the Softmax that is the output layer
  --remove_memory           Remove the Memory layer and add new inputs and outputs instead
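When conversion is scripted, the parameters above can be assembled programmatically. The following is a minimal sketch, not part of the Model Optimizer: the helper name `build_mo_command` and the output directory `ir_out` are hypothetical, and the sketch assumes the `mo` entry point is available on the PATH. It uses only the flags listed above.

```python
import shlex


def build_mo_command(input_model, output_dir, counts=None,
                     remove_output_softmax=False, remove_memory=False):
    """Return the argument list for converting a Kaldi model (hypothetical helper)."""
    cmd = ["mo", "--input_model", input_model, "--output_dir", output_dir]
    if counts is not None:
        # Optional counts file, passed with its path.
        cmd += ["--counts", counts]
    if remove_output_softmax:
        cmd.append("--remove_output_softmax")
    if remove_memory:
        cmd.append("--remove_memory")
    return cmd


# "ir_out" is a placeholder output directory chosen for this example.
cmd = build_mo_command("wsj_dnn5b_smbr.nnet", "ir_out",
                       counts="wsj_dnn5b_smbr.counts",
                       remove_output_softmax=True)
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to start the conversion. Building a list rather than a single string avoids shell quoting problems with paths that contain spaces.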
Examples of CLI Commands
To launch the Model Optimizer for the wsj_dnn5b_smbr model with the specified .nnet file and an output directory where you have write permissions:
python3 mo.py --input_model wsj_dnn5b_smbr.nnet --output_dir <OUTPUT_MODEL_DIR>
mo --input_model wsj_dnn5b_smbr.nnet --output_dir <OUTPUT_MODEL_DIR>
To launch the Model Optimizer for the wsj_dnn5b_smbr model with an existing file that contains counts for the last layer with biases, and a writable output directory:
python3 mo.py --input_model wsj_dnn5b_smbr.nnet --counts wsj_dnn5b_smbr.counts --output_dir <OUTPUT_MODEL_DIR>
mo --input_model wsj_dnn5b_smbr.nnet --counts wsj_dnn5b_smbr.counts --output_dir <OUTPUT_MODEL_DIR>
The Model Optimizer normalizes counts in the following way:
\[S = \frac{1}{\sum_{j = 0}^{|C| - 1} C_{j}}\]
\[C_{i} = \log(S \cdot C_{i})\]
where \(C\) is the counts array, \(C_{i}\) is the \(i^{th}\) element of the counts array, and \(|C|\) is the number of elements in the counts array.
The normalized counts are subtracted from the biases of the last layer, or of the next-to-last layer if the last layer is SoftMax.
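The normalization above can be reproduced outside the Model Optimizer to check the values that end up in the biases. The following is a minimal sketch in plain Python, not the Model Optimizer's own code. It assumes that every count is strictly positive (a zero count would make the logarithm undefined) and that the counts and biases have the same length. The function names and the sample values are made up for illustration.

```python
import math


def normalize_counts(counts):
    """Return log(S * C_i) for each element, where S = 1 / sum(C)."""
    scale = 1.0 / sum(counts)  # S in the formula above
    return [math.log(scale * c) for c in counts]


def subtract_from_biases(biases, counts):
    """Subtract the normalized counts from the biases element-wise."""
    normalized = normalize_counts(counts)
    return [b - n for b, n in zip(biases, normalized)]


counts = [1.0, 1.0, 2.0]   # made-up counts, all positive
biases = [0.0, 0.5, -0.5]  # made-up biases of the affected layer

print(normalize_counts(counts))
print(subtract_from_biases(biases, counts))
```

Because the normalized counts are logarithms of values that sum to 1, their exponentials sum to 1, which gives a quick sanity check on a counts file.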
If you want to remove the last SoftMax layer in the topology, launch the Model Optimizer with the --remove_output_softmax flag:
python3 mo.py --input_model wsj_dnn5b_smbr.nnet --counts wsj_dnn5b_smbr.counts --remove_output_softmax --output_dir <OUTPUT_MODEL_DIR>
mo --input_model wsj_dnn5b_smbr.nnet --counts wsj_dnn5b_smbr.counts --remove_output_softmax --output_dir <OUTPUT_MODEL_DIR>
The Model Optimizer finds the last layer of the topology and removes this layer only if it is a SoftMax layer.
Note
The Model Optimizer can remove the SoftMax layer only if the topology has one output.
Note
For sample inference of Kaldi models, you can use the Inference Engine Speech Recognition sample application. The sample supports models with one output. If your model has several outputs, specify the desired one with the --output option.
If you want to convert a model for inference on Intel® Movidius™ Myriad™, use the --remove_memory option. It removes Memory layers from the IR, and additional inputs and outputs appear in the IR in their place. The Model Optimizer outputs the mapping between these inputs and outputs. For example:
[ WARNING ] Add input/output mapped Parameter_0_for_Offset_fastlstm2.r_trunc__2Offset_fastlstm2.r_trunc__2_out -> Result_for_Offset_fastlstm2.r_trunc__2Offset_fastlstm2.r_trunc__2_out
[ WARNING ] Add input/output mapped Parameter_1_for_Offset_fastlstm2.r_trunc__2Offset_fastlstm2.r_trunc__2_out -> Result_for_Offset_fastlstm2.r_trunc__2Offset_fastlstm2.r_trunc__2_out
[ WARNING ] Add input/output mapped Parameter_0_for_iteration_Offset_fastlstm3.c_trunc__3390 -> Result_for_iteration_Offset_fastlstm3.c_trunc__3390
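If the conversion log is captured to a string, the mapping can be extracted from it instead of being copied by hand. The following is a minimal sketch. The regular expression is derived only from the warning lines shown above, so it is an assumption that may not hold for other Model Optimizer versions, and `parse_memory_mapping` is a hypothetical helper name.

```python
import re

# Pattern inferred from the example warning lines; the exact log format is an
# assumption and may differ between Model Optimizer versions.
MAPPING_RE = re.compile(r"Add input/output mapped (\S+) -> (\S+)")


def parse_memory_mapping(log_text):
    """Return {input_name: output_name} parsed from the conversion log."""
    mapping = {}
    for line in log_text.splitlines():
        match = MAPPING_RE.search(line)
        if match:
            mapping[match.group(1)] = match.group(2)
    return mapping


log = (
    "[ WARNING ] Add input/output mapped "
    "Parameter_0_for_iteration_Offset_fastlstm3.c_trunc__3390 -> "
    "Result_for_iteration_Offset_fastlstm3.c_trunc__3390\n"
    "[ INFO ] an unrelated line that is ignored\n"
)
mapping = parse_memory_mapping(log)
print(mapping)
```

The dictionary is keyed by input name because several inputs may map to the same output, as the first two warning lines in the example show.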
Based on this mapping, link inputs and outputs in your application manually as follows:
Initialize inputs from the mapping as zeros in the first frame of an utterance.
Copy output blobs from the mapping to the corresponding inputs. For example, data from Result_for_Offset_fastlstm2.r_trunc__2Offset_fastlstm2.r_trunc__2_out must be copied to Parameter_0_for_Offset_fastlstm2.r_trunc__2Offset_fastlstm2.r_trunc__2_out.
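The two steps above can be sketched as a frame-by-frame loop. This is an illustration under stated assumptions, not Inference Engine code: `infer` stands in for whatever call runs one inference request in your application, and the names `input`, `output`, `state_in`, and `state_out` are hypothetical. A toy `infer` that keeps a running sum is used so that the sketch runs on its own.

```python
def run_utterance(infer, frames, mapping, state_sizes,
                  data_input="input", data_output="output"):
    """Run one utterance, feeding mapped outputs back into their inputs.

    infer:       hypothetical callable, {input_name: value} -> {output_name: value}
    mapping:     {state_input_name: state_output_name} from the conversion log
    state_sizes: {state_input_name: number of elements in that input}
    """
    # Step 1: initialize the mapped inputs with zeros for the first frame.
    state = {name: [0.0] * state_sizes[name] for name in mapping}
    results = []
    for frame in frames:
        outputs = infer({data_input: frame, **state})
        # Step 2: copy each mapped output to its corresponding input.
        state = {inp: list(outputs[out]) for inp, out in mapping.items()}
        results.append(outputs[data_output])
    return results


def toy_infer(inputs):
    """Stand-in for a real inference call: a running sum held in the state."""
    total = inputs["state_in"][0] + inputs["input"]
    return {"output": total, "state_out": [total]}


results = run_utterance(toy_infer, [1.0, 2.0, 3.0],
                        mapping={"state_in": "state_out"},
                        state_sizes={"state_in": 1})
print(results)
```

Calling `run_utterance` again for the next utterance starts from zeros, which matches the requirement to reset the mapped inputs in the first frame of each utterance.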
Supported Kaldi* Layers
Refer to Supported Framework Layers for the list of supported standard layers.