icnet-camvid-ava-sparse-60-0001#
Use Case and High-Level Description#
A trained model of ICNet for fast semantic segmentation, trained on the CamVid dataset from scratch using the TensorFlow* framework. The trained model has 60% sparsity (ratio of zeros within all the convolution kernel weights). For details about the original floating-point model, check out the ICNet for Real-Time Semantic Segmentation on High-Resolution Images.
The model input is a blob that consists of a single image of 1, 720, 960, 3
in the BGR
order. The pixel values are integers in the [0, 255] range.
The model output for icnet-camvid-ava-sparse-60-0001
is the predicted class index of each input pixel belonging to one of the 12 classes of the CamVid dataset:
Sky
Building
Pole
Road
Pavement
Tree
SignSymbol
Fence
Vehicle
Pedestrian
Bike
Unlabeled
Specification#
Metric |
Value |
---|---|
GFlops |
75.8180 |
MParams |
26.7043 |
Source framework |
TensorFlow* |
Accuracy#
The quality metrics were calculated on the CamVid validation dataset. The unlabeled
class had been ignored during metrics calculation.
Metric |
Value |
---|---|
mIoU |
75.79% |
IOU=TP/(TP+FN+FP)
, where:TP
- number of true positive pixels for given classFN
- number of false negative pixels for given classFP
- number of false positive pixels for given class
Input#
Image, name: data
, shape - 1, 720, 960, 3
, format is B, H, W, C
, where:
B
- batch sizeH
- heightW
- widthC
- channel
Channel order is BGR
.
Output#
Semantic segmentation class prediction map, shape - 1, 720, 960
, output data format is B, H, W
, where:
B
- batch sizeH
- horizontal coordinate of the input pixelW
- vertical coordinate of the input pixel
Output contains the class prediction result of each pixel.
Demo usage#
The model can be used in the following demos provided by the Open Model Zoo to show its capabilities:
Legal Information#
[*] Other names and brands may be claimed as the property of others.