Converting a PyTorch BERT-NER Model#

Danger

The code described here has been deprecated! Do not use it to avoid working with a legacy solution. It will be kept for some time to ensure backwards compatibility, but you should not use it in contemporary applications.

This guide describes a deprecated conversion method. The guide on the new and recommended method can be found in the Python tutorials.

The goal of this article is to present a step-by-step guide on how to convert PyTorch BERT-NER model to OpenVINO IR. First, you need to download the model and convert it to ONNX.

Downloading and Converting the Model to ONNX#

To download a pretrained model or train the model yourself, refer to the instructions in the BERT-NER model repository. The model with configuration files is stored in the out_base directory.

To convert the model to ONNX format, create and run the following script in the root directory of the model repository. If you download the pretrained model, you need to download bert.py to run the script. The instructions were tested with the commit-SHA: e5be564156f194f1becb0d82aeaf6e762d9eb9ed.

import torch

from bert import Ner

ner = Ner("out_base")

input_ids, input_mask, segment_ids, valid_positions = ner.preprocess('Steve went to Paris')
input_ids = torch.tensor([input_ids], dtype=torch.long, device=ner.device)
input_mask = torch.tensor([input_mask], dtype=torch.long, device=ner.device)
segment_ids = torch.tensor([segment_ids], dtype=torch.long, device=ner.device)
valid_ids = torch.tensor([valid_positions], dtype=torch.long, device=ner.device)

ner_model, tknizr, model_config = ner.load_model("out_base")

with torch.no_grad():
    logits = ner_model(input_ids, segment_ids, input_mask, valid_ids)
torch.onnx.export(ner_model,
                  (input_ids, segment_ids, input_mask, valid_ids),
                  "bert-ner.onnx",
                  input_names=['input_ids', 'segment_ids', 'input_mask', 'valid_ids'],
                  output_names=['output'],
                  dynamic_axes={
                      "input_ids": {0: "batch_size"},
                      "segment_ids": {0: "batch_size"},
                      "input_mask": {0: "batch_size"},
                      "valid_ids": {0: "batch_size"},
                      "output": {0: "output"}
                  },
                  opset_version=11,
                  )

The script generates ONNX model file bert-ner.onnx.

Converting an ONNX BERT-NER model to IR#

mo --input_model bert-ner.onnx --input "input_mask[1,128],segment_ids[1,128],input_ids[1,128]"

where 1 is batch_size and 128 is sequence_length.