Converting a TensorFlow Language Model on One Billion Word Benchmark¶
Downloading a Pre-trained Language Model on One Billion Word Benchmark¶
TensorFlow provides a pretrained Language Model on One Billion Word Benchmark.
To download the model for IR conversion, follow the instructions:
Create new directory to store the model:
mkdir lm_1b
Go to the
lm_1b
directory:cd lm_1b
Download the model GraphDef file:
wget http://download.tensorflow.org/models/LM_LSTM_CNN/graph-2016-09-10.pbtxt
Create new directory to store 12 checkpoint shared files:
mkdir ckpt
Go to the
ckpt
directory:cd ckpt
Download 12 checkpoint shared files:
wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-base wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-char-embedding wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-lstm wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax0 wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax1 wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax2 wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax3 wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax4 wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax5 wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax6 wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax7 wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax8
Once you have downloaded the pretrained model files, you will have the lm_1b
directory with the following hierarchy:
lm_1b/
graph-2016-09-10.pbtxt
ckpt/
ckpt-base
ckpt-char-embedding
ckpt-lstm
ckpt-softmax0
ckpt-softmax1
ckpt-softmax2
ckpt-softmax3
ckpt-softmax4
ckpt-softmax5
ckpt-softmax6
ckpt-softmax7
ckpt-softmax8
The frozen model still has two variables: Variable
and Variable_1
. It means that the model keeps training those variables at each inference.
At the first inference of this graph, the variables are initialized by initial values. After executing the lstm
nodes, results of execution are assigned to these two variables.
With each inference of the lm_1b
graph, lstm
initial states data is taken from previous inference from variables, and states of current inference of lstm
is reassigned to the same variables.
It helps the model to remember the context of the words that it takes as input.
Converting a TensorFlow Language Model on One Billion Word Benchmark to IR¶
Model Optimizer assumes that output model is for inference only. Therefore, you should cut those variables off and resolve keeping cell and hidden states on application level.
There is a certain limitation for the model conversion: the original model cannot be reshaped, so you should keep original shapes.
To generate the lm_1b
Intermediate Representation (IR), provide TensorFlow lm_1b
model to the Model Optimizer with parameters:
mo
--input_model lm_1b/graph-2016-09-10.pbtxt \
--input_checkpoint lm_1b/ckpt \
--input_model_is_text \
--input_shape [50],[50],[1,9216],[1,9216] \
--output softmax_out,lstm/lstm_0/concat_2,lstm/lstm_1/concat_2 \
--input char_embedding/EmbeddingLookupUnique/Unique:0,char_embedding/EmbeddingLookupUnique/Unique:1,Variable/read,Variable_1/read
Where:
--input char_embedding/EmbeddingLookupUnique/Unique:0,char_embedding/EmbeddingLookupUnique/Unique:1,Variable/read,Variable_1/read
and--input_shape [50],[50],[1,9216],[1,9216]
replace the variables with a placeholder.--output softmax_out,lstm/lstm_0/concat_2,lstm/lstm_1/concat_2
specifies output node name and names of LSTM cell states.