gpt-2#

Use Case and High-Level Description#

The gpt-2 model is a member of the Generative Pre-trained Transformer (GPT) model family, pre-trained on a very large corpus of English data in a self-supervised fashion. The GPT architecture implements a deep neural network, specifically a transformer model, which uses attention in place of earlier recurrence- and convolution-based architectures. Attention mechanisms allow the model to selectively focus on the segments of input text it predicts to be most relevant. GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some text.
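The next-word objective and the "previous words only" restriction can be illustrated with a minimal sketch. It assumes plain Python lists in place of tensors, and the helper names `next_token_pairs` and `causal_attention_weights` are hypothetical, introduced only for illustration. It is not the model's actual implementation.

```python
import math


def next_token_pairs(token_ids):
    """Build (context, target) pairs: each target is predicted from the tokens before it."""
    return [(token_ids[:i], token_ids[i]) for i in range(1, len(token_ids))]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention_weights(scores):
    """Turn an L x L matrix of raw attention scores into row-wise weights.

    Position i only attends to positions j <= i, so weights for later
    positions are set to zero (assumption: a standard causal mask).
    """
    weights = []
    for i, row in enumerate(scores):
        visible = softmax(row[: i + 1])
        weights.append(visible + [0.0] * (len(row) - i - 1))
    return weights


# Toy values, for illustration only.
pairs = next_token_pairs([10, 11, 12, 13])
weights = causal_attention_weights([[0.1, 5.0, 5.0], [1.0, 1.0, 9.0], [0.0, 0.0, 0.0]])
```

Each row of `weights` sums to 1 over the visible positions, which is how attention concentrates on the most relevant earlier tokens without looking ahead.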

More details are provided in the paper, repository, and model card.

Specification#

| Metric           | Value           |
|------------------|-----------------|
| Type             | Text Prediction |
| GFlops           | 293.0489        |
| MParams          | 175.6203        |
| Source framework | PyTorch*        |

GFlops are calculated for an input shape of 1, 1024, which is suitable for long contexts.

Accuracy#

Perplexity obtained on the WikiText-2 raw character-level dataset for the converted model.

| Metric     | Value  |
|------------|--------|
| Perplexity | 29.00% |
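For readers unfamiliar with the metric, perplexity is conventionally the exponential of the average negative log-likelihood the model assigns to the reference tokens. The sketch below shows that definition on toy data. It assumes per-position logits are already available as plain lists, and the function names are hypothetical. The evaluation pipeline that produced the value above may differ in tokenization and windowing.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one position's vocabulary scores."""
    peak = max(logits)
    log_total = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_total for x in logits]


def perplexity(logits_per_position, target_ids):
    """exp(mean negative log-likelihood) of the target tokens.

    logits_per_position[t] holds the scores used to predict target_ids[t].
    """
    negative_log_likelihood = 0.0
    for logits, target in zip(logits_per_position, target_ids):
        negative_log_likelihood -= log_softmax(logits)[target]
    return math.exp(negative_log_likelihood / len(target_ids))


# Toy example: a model that is uniformly unsure over 4 tokens
# has a perplexity of (approximately) 4.
toy_logits = [[0.0, 0.0, 0.0, 0.0]] * 3
toy_perplexity = perplexity(toy_logits, [0, 1, 2])
```

A lower value means the model assigns higher probability to the text it is evaluated on.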

Input#

Original model#

Token ids, name: input, dynamic shape in the format B, L, where:

  • B - batch size

  • L - sequence length

Converted model#

Token ids, name: input, dynamic shape in the format B, L, where:

  • B - batch size

  • L - sequence length

Output#

Original model#

Prediction scores of language modeling head, name: output, dynamic shape B, L, 50257 in the format B, L, S, where:

  • B - batch size

  • L - sequence length

  • S - vocab size

Converted model#

Prediction scores of language modeling head, name: output, dynamic shape B, L, 50257 in the format B, L, S, where:

  • B - batch size

  • L - sequence length

  • S - vocab size
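As a rough illustration of how the B, L, S output can be consumed, the sketch below picks the most likely next token for each sequence by taking the highest score at the last position. It assumes the output has already been converted to nested Python lists and uses a toy vocabulary of 3 entries instead of 50257. The function name `greedy_next_tokens` is hypothetical, and greedy selection is only one of several possible decoding strategies.

```python
def greedy_next_tokens(output):
    """Return, for each batch item, the token id with the highest score at the last position.

    output is indexed as output[b][l][s], matching the B, L, S layout.
    """
    next_ids = []
    for sequence_scores in output:
        last_position = sequence_scores[-1]
        best_id = max(range(len(last_position)), key=last_position.__getitem__)
        next_ids.append(best_id)
    return next_ids


# Toy scores with B=2, L=2, S=3 (illustrative values only).
toy_output = [
    [[0.1, 0.9, 0.0], [2.0, 1.0, 0.5]],
    [[0.0, 0.0, 1.0], [0.1, 0.2, 0.3]],
]
next_ids = greedy_next_tokens(toy_output)
```

Appending each selected id to its sequence and running the model again yields one more token per step, which is the usual loop for text generation.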

Download a Model and Convert it into OpenVINO™ IR Format#

You can download models and, if necessary, convert them into OpenVINO™ IR format using the Model Downloader and other automation tools, as shown in the examples below.

An example of using the Model Downloader:

omz_downloader --name <model_name>

An example of using the Model Converter:

omz_converter --name <model_name>

Demo usage#

The model can be used in the following demos provided by the Open Model Zoo to show its capabilities: