Download the Pre-trained Language Model on One Billion Word Benchmark
TensorFlow* provides a pre-trained Language Model on One Billion Word Benchmark.
To download the model for IR conversion, follow these steps:
- Create a new directory to store the model:
mkdir lm_1b
- Go to the lm_1b directory:
cd lm_1b
- Download the model GraphDef file:
wget http://download.tensorflow.org/models/LM_LSTM_CNN/graph-2016-09-10.pbtxt
- Create a new directory to store the 12 checkpoint shard files:
mkdir ckpt
- Go to the ckpt directory:
cd ckpt
- Download the 12 checkpoint shard files:
wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-base
wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-char-embedding
wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-lstm
wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax0
wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax1
wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax2
wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax3
wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax4
wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax5
wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax6
wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax7
wget http://download.tensorflow.org/models/LM_LSTM_CNN/all_shards-2016-09-10/ckpt-softmax8
After you download the pre-trained model files, you will have the lm_1b directory with the following hierarchy:
lm_1b/
    graph-2016-09-10.pbtxt
    ckpt/
        ckpt-base
        ckpt-char-embedding
        ckpt-lstm
        ckpt-softmax0
        ckpt-softmax1
        ckpt-softmax2
        ckpt-softmax3
        ckpt-softmax4
        ckpt-softmax5
        ckpt-softmax6
        ckpt-softmax7
        ckpt-softmax8
The frozen model still contains two variables: Variable and Variable_1. This means that the model updates these variables at each inference.
At the first inference of the graph, the variables are initialized with their initial values. After the lstm nodes are executed, the resulting states are assigned to these two variables.
On each subsequent inference of the lm_1b graph, the lstm initial states are read from the variables (that is, from the previous inference), and the states produced by the current inference are assigned back to the same variables.
This is how the model remembers the context of the words it receives as input.
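The following is a minimal illustration of this stateful-variable pattern, not the actual lm_1b graph, written with the TensorFlow 1.x-style API: the state lives in a non-trainable variable, is read at the start of each session run, and the new state is assigned back so the next run continues from it. The state_size value and all names in this sketch are illustrative assumptions.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

state_size = 9216  # per-layer LSTM state width used by lm_1b

# State variable analogous to Variable / Variable_1 in the frozen graph.
state = tf.get_variable("state", shape=[1, state_size],
                        initializer=tf.zeros_initializer(), trainable=False)
inputs = tf.placeholder(tf.float32, shape=[1, state_size], name="inputs")

# Stand-in for the lstm/... subgraph: produce a new state from input + old state.
new_state = tf.tanh(inputs + state)

# Assign the new state back to the variable, as the lm_1b graph does.
with tf.control_dependencies([tf.assign(state, new_state)]):
    output = tf.identity(new_state, name="output")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(3):
        # Each run starts from the state left behind by the previous run.
        result = sess.run(output, {inputs: np.full((1, state_size), 0.1, np.float32)})
```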
Convert TensorFlow Language Model on One Billion Word Benchmark to IR
The Model Optimizer assumes that the output model is used for inference only. That is why you should cut these variables off the graph and keep the cell and hidden states at the application level (a sketch of such a loop is shown after the conversion command below).
There is one limitation for the model conversion:
- The original model cannot be reshaped, so you must keep the original input shapes.
To generate the lm_1b Intermediate Representation (IR), provide the TensorFlow lm_1b model to the Model Optimizer with the following parameters:
python3 ./mo_tf.py \
--input_model lm_1b/graph-2016-09-10.pbtxt \
--input_checkpoint lm_1b/ckpt \
--input_model_is_text \
--input_shape [50],[50],[1,9216],[1,9216] \
--output softmax_out,lstm/lstm_0/concat_2,lstm/lstm_1/concat_2 \
--input char_embedding/EmbeddingLookupUnique/Unique:0,char_embedding/EmbeddingLookupUnique/Unique:1,Variable/read,Variable_1/read
Where:
- --input char_embedding/EmbeddingLookupUnique/Unique:0,char_embedding/EmbeddingLookupUnique/Unique:1,Variable/read,Variable_1/read together with --input_shape [50],[50],[1,9216],[1,9216] replaces the state variables with placeholders (regular network inputs).
- --output softmax_out,lstm/lstm_0/concat_2,lstm/lstm_1/concat_2 specifies the output node name and the names of the LSTM cell state nodes.
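After conversion, the cell and hidden states are ordinary inputs and outputs of the IR, so the application has to carry them from one inference to the next. Below is a minimal sketch of such a loop, assuming the classic Inference Engine Python API (openvino.inference_engine). The IR file names, the token_stream iterable, and the exact input/output tensor names are assumptions for illustration; check the actual names with net.input_info and net.outputs before running.

```python
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
# Assumed IR file names produced by the Model Optimizer from graph-2016-09-10.pbtxt.
net = ie.read_network(model="graph-2016-09-10.xml", weights="graph-2016-09-10.bin")
exec_net = ie.load_network(network=net, device_name="CPU")

# Zero initial states for both LSTM layers; the shapes match --input_shape.
state_0 = np.zeros((1, 9216), dtype=np.float32)
state_1 = np.zeros((1, 9216), dtype=np.float32)

# token_stream is a hypothetical iterable yielding the two character-embedding inputs.
for char_ids, char_counts in token_stream:
    results = exec_net.infer(inputs={
        "char_embedding/EmbeddingLookupUnique/Unique:0": char_ids,
        "char_embedding/EmbeddingLookupUnique/Unique:1": char_counts,
        "Variable/read": state_0,
        "Variable_1/read": state_1,
    })
    probabilities = results["softmax_out"]
    # Feed the produced LSTM states back on the next inference so the model
    # keeps the context of the previously processed words.
    state_0 = results["lstm/lstm_0/concat_2"]
    state_1 = results["lstm/lstm_1/concat_2"]
```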