gpt-2

Use Case and High-Level Description

The gpt-2 model is a member of the Generative Pre-trained Transformer (GPT) model family, pre-trained on a very large corpus of English data in a self-supervised fashion. The GPT architecture is a deep neural network, specifically a transformer, which uses attention in place of earlier recurrence- and convolution-based architectures. Attention mechanisms allow the model to selectively focus on the segments of input text it predicts to be most relevant. GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some text.
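The next-word objective can be illustrated with a minimal sketch that turns one token sequence into (context, target) training pairs. The function name `next_token_examples` and the token ids are hypothetical, chosen only for illustration; this is not the training code of the original model.

```python
def next_token_examples(token_ids):
    """Return (context, target) pairs for next-token prediction.

    For each position i >= 1, the context is every token before i
    and the target is the token at position i.
    """
    return [(token_ids[:i], token_ids[i]) for i in range(1, len(token_ids))]


# Hypothetical token ids standing in for a short tokenized sentence.
tokens = [464, 3797, 3332, 319]
for context, target in next_token_examples(tokens):
    print(context, "->", target)
```

A sequence of N tokens yields N - 1 prediction targets, which is why a single pass over a text provides a training signal at every position.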

More details are provided in the paper, repository and model card.

Specification

Metric            Value
Type              Text Prediction
GFlops            293.0489
MParams           175.6203
Source framework  PyTorch*

GFlops are calculated for an input shape of 1, 1024, which is suitable for long contexts.

Accuracy

Perplexity is obtained on the WikiText-2 raw character-level dataset for the converted model.

Metric      Value
Perplexity  29.00%
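For readers unfamiliar with the metric, perplexity is conventionally defined as the exponential of the average negative log-likelihood the model assigns to the correct next tokens. The sketch below follows that standard definition using plain Python lists. The function names are hypothetical, and the exact evaluation procedure used to produce the value above may differ (for example in how long texts are split into windows).

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of scores."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_sum for x in logits]


def perplexity(logits_per_position, target_ids):
    """exp(mean negative log-likelihood) of the target tokens.

    logits_per_position: one list of vocabulary scores per position.
    target_ids: the correct next-token id for each position.
    """
    if not target_ids or len(logits_per_position) != len(target_ids):
        raise ValueError("need one target id per position, and at least one")
    total_nll = 0.0
    for logits, target in zip(logits_per_position, target_ids):
        total_nll -= log_softmax(logits)[target]
    return math.exp(total_nll / len(target_ids))
```

As a sanity check, a model that scores every token of a 4-entry vocabulary equally has a perplexity of 4: it is as uncertain as a uniform choice among four options.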

Input

Original model

Token ids, name: input, dynamic shape in the format B, L, where:

  • B - batch size

  • L - sequence length

Converted model

Token ids, name: input, dynamic shape in the format B, L, where:

  • B - batch size

  • L - sequence length

Output

Original model

Prediction scores of the language modeling head, name: output, dynamic shape B, L, 50257 in the format B, L, S, where:

  • B - batch size

  • L - sequence length

  • S - vocab size

Converted model

Prediction scores of the language modeling head, name: output, dynamic shape B, L, 50257 in the format B, L, S, where:

  • B - batch size

  • L - sequence length

  • S - vocab size
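To show how an output of shape B, L, S is typically consumed, the sketch below picks the next token for each sequence in a batch by taking the highest score at the last position (greedy decoding). Nested Python lists stand in for the output tensor, and `greedy_next_tokens` is a hypothetical helper. Greedy selection is only one of several decoding strategies and is not necessarily what the demos use.

```python
def greedy_next_tokens(scores):
    """Pick the highest-scoring next token id for each sequence.

    scores: nested list of shape [B][L][S], mirroring the model output.
    Only the last position of each sequence predicts the next token.
    """
    next_ids = []
    for sequence_scores in scores:
        last_scores = sequence_scores[-1]
        best_id = max(range(len(last_scores)), key=last_scores.__getitem__)
        next_ids.append(best_id)
    return next_ids


# Toy output with B=2, L=2, S=3 (the real model has S=50257).
toy_scores = [
    [[0.1, 2.0, -1.0], [3.0, 0.0, 0.5]],
    [[0.0, 0.0, 1.0], [0.2, 0.9, 0.1]],
]
print(greedy_next_tokens(toy_scores))
```

The chosen id can be appended to its input sequence and the model run again to generate text one token at a time.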

Download a Model and Convert it into OpenVINO™ IR Format

You can download models and, if necessary, convert them into OpenVINO™ IR format using the Model Downloader and other automation tools, as shown in the examples below.

An example of using the Model Downloader:

omz_downloader --name <model_name>

An example of using the Model Converter:

omz_converter --name <model_name>

Demo usage

The model can be used in the following demos provided by the Open Model Zoo to show its capabilities: