GPT-2 Text Prediction Python* Demo#

This demo shows how to use gpt-2 model for inference to perform interactive conditional text prediction, where content is generated based on text provided by user.

How It Works#

On startup the demo application reads command line parameters and loads a model to OpenVINO™ Runtime plugin. It also encodes a user input prompt received via command line arguments or user input, and then uses it to predict the output sequence.

Preparing to Run#

The list of models supported by the demo is in <omz_dir>/demos/gpt2_text_prediction_demo/python/models.lst file. This file can be used as a parameter for Model Downloader and Converter to download and, if necessary, convert models to OpenVINO IR format (*.xml + *.bin).

An example of using the Model Downloader:

omz_downloader --list models.lst

An example of using the Model Converter:

omz_converter --list models.lst

Supported Models#

gpt-2

NOTE: Refer to the tables Intel’s Pre-Trained Models Device Support and Public Pre-Trained Models Device Support for the details on models inference support at different devices.

Running#

Running the application with the -h option yields the following usage message:

usage: gpt2_text_prediction_demo.py [-h] -m MODEL -v VOCAB --merges MERGES
                                    [-i INPUT]
                                    [--max_sample_token_num MAX_SAMPLE_TOKEN_NUM]
                                    [--top_k TOP_K] [--top_p TOP_P]
                                    [-d DEVICE]

Options:
  -h, --help            Show this help message and exit.
  -m MODEL, --model MODEL
                        Required. Path to an .xml file with a trained model
  -v VOCAB, --vocab VOCAB
                        Required. Path to the vocabulary file with tokens
  --merges MERGES       Required. Path to the merges file
  -i INPUT, --input INPUT
                        Optional. Input prompt
  --max_sample_token_num MAX_SAMPLE_TOKEN_NUM
                        Optional. Maximum number of tokens in generated sample
  --top_k TOP_K         Optional. Number of tokens with the highest
                        probability which will be kept for generation
  --top_p TOP_P         Optional. Maximum probability, tokens with such a
                        probability and lower will be kept for generation
  -d DEVICE, --device DEVICE
                        Optional. Target device to perform inference
                        on. Default value is CPU
  --dynamic_shape       Run model with dynamic input sequence. If not
                        provided, input sequence will be padded to
                        max_seq_len
  --max_seq_len MAX_SEQ_LEN
                        Optional. Maximum sequence length for processing. Default value is 1024

Demo Inputs#

The application reads and encodes text from input string, then performs transformations and uses it as model input.

Demo Outputs#

The application outputs predicted text, continuing input string for each input string.

Example Demo Cmd-Line#

You can use the following command to try the demo (assuming the used model from the Open Model Zoo, downloaded and converted with the Model Downloader):

    python3 gpt2_text_prediction_demo.py
            --model=<path_to_model>/gpt-2.xml
            --vocab=<models_dir>/models/public/gpt-2/gpt2/vocab.json
            --merges=<models_dir>/models/public/gpt-2/gpt2/merges.txt