This model is an encoder-decoder pair. The encoder is HRNetV2-W48 and the decoder is C1 (one convolution module followed by interpolation). HRNetV2-W48 is a semantic-segmentation model based on the architecture described in the paper High-Resolution Representations for Labeling Pixels and Regions. This PyTorch* implementation retains high-resolution representations throughout the model and is pretrained on the ADE20k dataset. For implementation details, see the Semantic Segmentation on MIT ADE20K dataset in PyTorch repository.
Metric | Value |
---|---|
Type | Segmentation |
GFLOPs | 81.9930 |
MParams | 66.4768 |
Source framework | PyTorch* |
Metric | Original model | Converted model |
---|---|---|
Pixel accuracy | 77.69% | 77.69% |
mean IoU | 33.02% | 33.02% |
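
The two metrics in the table above can be illustrated with a minimal NumPy sketch. It assumes the predictions and ground truth are already available as integer class-id masks of equal shape. The function names `pixel_accuracy` and `mean_iou` are hypothetical, and the per-image averaging shown here is not necessarily the exact protocol used to produce the reported values (dataset-level evaluation usually accumulates intersections and unions over all images first).

```python
import numpy as np


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted class id equals the ground truth."""
    return float((pred == target).mean())


def mean_iou(pred, target, num_classes):
    """Average per-class IoU, skipping classes absent from both masks."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            # Class c appears in neither mask, so it does not affect the mean.
            continue
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else 0.0


# Toy 2x2 masks with two classes (illustrative data only).
pred = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])

print(pixel_accuracy(pred, target))
print(mean_iou(pred, target, num_classes=2))
```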
Original model input: image, name - `image`, shape - [1x3x320x320], format is [BxCxHxW], where:
- `B` - batch size
- `C` - channel
- `H` - height
- `W` - width

Channel order is `RGB`. Mean values - [123.675, 116.28, 103.53], scale values - [58.395, 57.12, 57.375].
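
The preprocessing implied by the mean and scale values above might look like the following NumPy sketch. It assumes the common convention of subtracting the mean and then dividing by the scale per channel, and that the image has already been resized to 320x320 and decoded as RGB. The helper name `preprocess_rgb` is hypothetical and is not part of the original repository.

```python
import numpy as np

# Per-channel values in RGB order, taken from the input description.
MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
SCALE = np.array([58.395, 57.12, 57.375], dtype=np.float32)


def preprocess_rgb(image_hwc):
    """Turn a 320x320 RGB uint8 image (H, W, C) into a [1, 3, 320, 320] float32 array."""
    if image_hwc.shape != (320, 320, 3):
        raise ValueError(f"expected shape (320, 320, 3), got {image_hwc.shape}")
    # Assumed convention: (pixel - mean) / scale, broadcast over the channel axis.
    normalized = (image_hwc.astype(np.float32) - MEAN) / SCALE
    # Reorder H, W, C -> C, H, W, then add the batch dimension.
    chw = normalized.transpose(2, 0, 1)
    return chw[np.newaxis, ...]


# Dummy mid-gray image standing in for real data.
dummy = np.full((320, 320, 3), 128, dtype=np.uint8)
blob = preprocess_rgb(dummy)
print(blob.shape)
```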
Converted model input: image, name - `input.1`, shape - [1x3x320x320], format is [BxCxHxW], where:
- `B` - batch size
- `C` - channel
- `H` - height
- `W` - width

Channel order is `BGR`.
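
For the `input.1` layout described above, an image held in RGB order needs its channels reversed before it is laid out as [BxCxHxW]. The sketch below shows only that reordering. The description lists no mean or scale values for this input, so the sketch applies none; treating normalization as unnecessary here is an assumption. The helper name `rgb_to_bgr_blob` is hypothetical.

```python
import numpy as np


def rgb_to_bgr_blob(image_rgb_hwc):
    """Convert an (H, W, 3) RGB image into a [1, 3, H, W] float32 array in BGR order."""
    # Reverse the last axis: R, G, B -> B, G, R.
    bgr = image_rgb_hwc[:, :, ::-1]
    # Reorder H, W, C -> C, H, W, then add the batch dimension.
    chw = bgr.transpose(2, 0, 1)
    return np.ascontiguousarray(chw[np.newaxis, ...], dtype=np.float32)


# Dummy image with distinct channel values: R=1, G=2, B=3.
rgb = np.zeros((320, 320, 3), dtype=np.uint8)
rgb[..., 0], rgb[..., 1], rgb[..., 2] = 1, 2, 3

blob_bgr = rgb_to_bgr_blob(rgb)
print(blob_bgr.shape)
```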
Original model output: semantic-segmentation mask according to ADE20k classes, name - `softmax`, shape - `1, 150, 320, 320`, output data format is `B, C, H, W`, where:
- `B` - batch size
- `C` - predicted probabilities for each class in the [0, 1] range
- `H` - height
- `W` - width

Converted model output: semantic-segmentation mask according to ADE20k classes, name - `softmax`, shape - `1, 150, 320, 320`, output data format is `B, C, H, W`, where:
- `B` - batch size
- `C` - predicted probabilities for each class in the [0, 1] range
- `H` - height
- `W` - width

You can download models and, if necessary, convert them into Inference Engine format using the Model Downloader and other automation tools, as shown in the examples below.
An example of using the Model Downloader:
An example of using the Model Converter:
The original model is distributed under the following license: