This model is an encoder-decoder pair. The encoder is HRNetV2-W48 and the decoder is C1 (one convolution module followed by interpolation). HRNetV2-W48 is a semantic-segmentation model based on the architecture described in the paper High-Resolution Representations for Labeling Pixels and Regions. This PyTorch* implementation retains high-resolution representations throughout the model and is pretrained on the ADE20k dataset. For details about the implementation, see the Semantic Segmentation on MIT ADE20K dataset in PyTorch repository.
Metric | Value |
---|---|
Type | Segmentation |
GFLOPs | 81.9930 |
MParams | 66.4768 |
Source framework | PyTorch* |

Metric | Original model | Converted model |
---|---|---|
Pixel accuracy | 77.69% | 77.69% |
Mean IoU | 33.02% | 33.02% |
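
For readers unfamiliar with the two metrics in the accuracy table, the following is a minimal sketch of how pixel accuracy and mean IoU can be computed for a single pair of label maps. The function names are hypothetical, and the evaluation protocol behind the reported numbers may differ (for example, by ignoring unlabeled pixels or by accumulating intersections and unions over the whole dataset before averaging).

```python
import numpy as np


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label equals the ground-truth label."""
    pred = np.asarray(pred)
    target = np.asarray(target)
    return float((pred == target).mean())


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes present in either map."""
    pred = np.asarray(pred)
    target = np.asarray(target)
    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class appears in neither map, so it is skipped
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))


# Tiny illustrative label maps (not real model output).
pred = np.array([[0, 0], [1, 2]])
target = np.array([[0, 1], [1, 2]])
print(pixel_accuracy(pred, target))
print(mean_iou(pred, target, num_classes=3))
```

For this model, `num_classes` would be 150, matching the number of class channels in the output.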
Original model input:

Image, name - `image`, shape - `[1x3x320x320]`, format is `[BxCxHxW]`, where:

- `B` - batch size
- `C` - channel
- `H` - height
- `W` - width

Channel order is `RGB`. Mean values - [123.675, 116.28, 103.53], scale values - [58.395, 57.12, 57.375].
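
The mean and scale values above can be applied as in the following sketch. It assumes the usual convention of subtracting the per-channel mean and then dividing by the per-channel scale, and it assumes the image has already been resized to 320x320 and is in RGB order. The function name `preprocess_rgb` is hypothetical and is not part of the original repository.

```python
import numpy as np

# Per-channel values taken from the input description (RGB order).
MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
SCALE = np.array([58.395, 57.12, 57.375], dtype=np.float32)


def preprocess_rgb(image_hwc):
    """Convert a (320, 320, 3) RGB image into a (1, 3, 320, 320) float32 tensor."""
    x = np.asarray(image_hwc).astype(np.float32)
    x = (x - MEAN) / SCALE      # broadcasts over the last (channel) axis
    x = x.transpose(2, 0, 1)    # HWC -> CHW
    return x[np.newaxis, ...]   # add the batch dimension


# Synthetic gray image standing in for a real, already resized photo.
image = np.full((320, 320, 3), 128, dtype=np.uint8)
blob = preprocess_rgb(image)
print(blob.shape)
```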
Converted model input:

Image, name - `input.1`, shape - `[1x3x320x320]`, format is `[BxCxHxW]`, where:

- `B` - batch size
- `C` - channel
- `H` - height
- `W` - width

Channel order is `BGR`.
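
The sketch below illustrates the tensor layouts around the converted model without calling any inference API. It lays out a BGR image as `[1x3x320x320]` and reduces a `softmax` tensor of shape `1,150,320,320` (the output layout given in this document) to a per-pixel class index map. Both helper names are hypothetical. Because no mean or scale values are listed for the converted input, the sketch assumes the caller does not normalize the image; verify this against the conversion settings actually used.

```python
import numpy as np


def to_input_blob(image_bgr_hwc):
    """Lay out a (320, 320, 3) BGR image as a (1, 3, 320, 320) float32 tensor."""
    x = np.asarray(image_bgr_hwc)
    if x.shape != (320, 320, 3):
        raise ValueError(f"expected (320, 320, 3), got {x.shape}")
    return x.transpose(2, 0, 1)[np.newaxis, ...].astype(np.float32)


def to_class_map(softmax_output):
    """Pick the most probable of the 150 classes for every pixel."""
    probs = np.asarray(softmax_output)
    if probs.shape != (1, 150, 320, 320):
        raise ValueError(f"expected (1, 150, 320, 320), got {probs.shape}")
    return probs.argmax(axis=1)[0]  # shape (320, 320), values in 0..149


# Synthetic tensor standing in for a real network output:
# every pixel assigns probability 1 to class index 7.
fake_output = np.zeros((1, 150, 320, 320), dtype=np.float32)
fake_output[0, 7] = 1.0
labels = to_class_map(fake_output)
print(labels.shape, int(labels[0, 0]))
```

The resulting indices refer to positions along the class axis; mapping them to ADE20k class names requires the dataset's label list, which is not reproduced here.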
Original model output:

Semantic-segmentation mask according to ADE20k classes, name - `softmax`, shape - `1,150,320,320`, output data format is `B,C,H,W`, where:

- `B` - batch size
- `C` - predicted probabilities for each class in [0, 1] range
- `H` - height
- `W` - width

Converted model output:

Semantic-segmentation mask according to ADE20k classes, name - `softmax`, shape - `1,150,320,320`, output data format is `B,C,H,W`, where:

- `B` - batch size
- `C` - predicted probabilities for each class in [0, 1] range
- `H` - height
- `W` - width

The original model is distributed under the following license: