GPT-2 Text Prediction Python* Demo¶

This demo shows how to use gpt-2 model for inference to perform interactive conditional text prediction, where content is generated based on text provided by user.

How It Works¶

On startup the demo application reads command line parameters and loads a model to OpenVINO™ Runtime plugin. It also encodes a user input prompt received via command line arguments or user input, and then uses it to predict the output sequence.

Preparing to Run¶

The list of models supported by the demo is in <omz_dir>/demos/gpt2_text_prediction_demo/python/models.lst file. This file can be used as a parameter for Model Downloader and Converter to download and, if necessary, convert models to OpenVINO IR format (*.xml + *.bin).

An example of using the Model Downloader:

omz_downloader --list models.lst

An example of using the Model Converter:

omz_converter --list models.lst

Supported Models¶

gpt-2

NOTE: Refer to the tables Intel’s Pre-Trained Models Device Support and Public Pre-Trained Models Device Support for the details on models inference support at different devices.

Running¶

Running the application with the -h option yields the following usage message:

usage: gpt2_text_prediction_demo.py [-h] -m MODEL -v VOCAB --merges MERGES
                                    [-i INPUT]
                                    [--max_sample_token_num MAX_SAMPLE_TOKEN_NUM]
                                    [--top_k TOP_K] [--top_p TOP_P]
                                    [-d DEVICE]

Options:
  -h, --help            Show this help message and exit.
  -m MODEL, --model MODEL
                        Required. Path to an .xml file with a trained model
  -v VOCAB, --vocab VOCAB
                        Required. Path to the vocabulary file with tokens
  --merges MERGES       Required. Path to the merges file
  -i INPUT, --input INPUT
                        Optional. Input prompt
  --max_sample_token_num MAX_SAMPLE_TOKEN_NUM
                        Optional. Maximum number of tokens in generated sample
  --top_k TOP_K         Optional. Number of tokens with the highest
                        probability which will be kept for generation
  --top_p TOP_P         Optional. Maximum probability, tokens with such a
                        probability and lower will be kept for generation
  -d DEVICE, --device DEVICE
                        Optional. Target device to perform inference
                        on. Default value is CPU
  --dynamic_shape       Run model with dynamic input sequence. If not
                        provided, input sequence will be padded to
                        max_seq_len
  --max_seq_len MAX_SEQ_LEN
                        Optional. Maximum sequence length for processing. Default value is 1024

Demo Inputs¶

The application reads and encodes text from input string, then performs transformations and uses it as model input.

Demo Outputs¶

The application outputs predicted text, continuing input string for each input string.

Example Demo Cmd-Line¶

You can use the following command to try the demo (assuming the used model from the Open Model Zoo, downloaded and converted with the Model Downloader):

    python3 gpt2_text_prediction_demo.py
            --model=<path_to_model>/gpt-2.xml
            --vocab=<models_dir>/models/public/gpt-2/gpt2/vocab.json
            --merges=<models_dir>/models/public/gpt-2/gpt2/merges.txt