icnet-camvid-ava-sparse-30-0001

Use Case and High-Level Description

An ICNet model for fast semantic segmentation, trained from scratch on the CamVid* dataset using the TensorFlow* framework. The trained model has 30% sparsity (the ratio of zeros among all convolution kernel weights). For details about the original floating-point model, see the paper "ICNet for Real-Time Semantic Segmentation on High-Resolution Images".
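The sparsity definition above (zeros divided by all convolution kernel weights) can be sketched as follows. This is a minimal illustration using NumPy; the `sparsity` helper and the stand-in kernels are hypothetical and are not taken from the model or its tooling.

```python
import numpy as np

def sparsity(kernels):
    """Return the fraction of exactly-zero entries across all weight arrays."""
    total = sum(k.size for k in kernels)
    zeros = sum(int(np.count_nonzero(k == 0)) for k in kernels)
    return zeros / total

# Hypothetical stand-in kernels (NOT the real model weights): two convolution
# kernels in which 30% of the entries are set to zero.
rng = np.random.default_rng(0)
kernels = [rng.standard_normal((3, 3, 10, 20)), rng.standard_normal((1, 1, 20, 5))]
for k in kernels:
    flat = k.reshape(-1)              # view on the same memory (array is contiguous)
    flat[: flat.size * 3 // 10] = 0.0  # zero out 30% of the entries

print(f"sparsity: {sparsity(kernels):.2%}")
```

The ratio is taken over all kernels together, not averaged per layer, which matches the wording "within all the convolution kernel weights".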

The model input is a blob containing a single image with shape 1x3x720x960 in BGR channel order. The pixel values are integers in the [0, 255] range.
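As a rough sketch of preparing such a blob, the snippet below rearranges an image from height-width-channel layout into the 1x3x720x960 layout. It assumes the image is already 720x960 and already in BGR order; resizing and the dtype expected by a particular runtime are not covered here. The helper name `to_model_input` is hypothetical.

```python
import numpy as np

def to_model_input(image_hwc_bgr):
    """Convert a 720x960x3 BGR image (H,W,C) into a 1x3x720x960 blob (B,C,H,W)."""
    if image_hwc_bgr.shape != (720, 960, 3):
        raise ValueError(f"expected shape (720, 960, 3), got {image_hwc_bgr.shape}")
    chw = image_hwc_bgr.transpose(2, 0, 1)            # H,W,C -> C,H,W
    return np.ascontiguousarray(chw[np.newaxis, ...])  # add the batch dimension

# Stand-in image with integer pixel values in [0, 255] (not real data).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(720, 960, 3), dtype=np.uint8)
blob = to_model_input(image)
print(blob.shape)
```

If the image was loaded in RGB order, reversing the last axis first (`image[..., ::-1]`) yields BGR.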

The model output for icnet-camvid-ava-sparse-30-0001 is a map of predicted class indices, one per input pixel, where each index identifies one of the 12 classes of the CamVid dataset.

Specification

Metric Value
GFlops 151.82
MParams 25.45
Source framework TensorFlow*

Accuracy

The quality metrics were calculated on the CamVid* validation dataset. The unlabeled class was ignored during metrics calculation.

Metric Value
mIoU 69.99%
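For reference, mean intersection over union (mIoU) with an ignored class can be sketched as below. This is an illustrative NumPy implementation, not the evaluation code used to produce the figure above. The index of the unlabeled class (`IGNORE_INDEX = 11`) and the choice to average only over classes that occur in the prediction or ground truth are assumptions.

```python
import numpy as np

NUM_CLASSES = 12
IGNORE_INDEX = 11  # assumption: the index of the unlabeled class is not stated here

def mean_iou(pred, target, num_classes=NUM_CLASSES, ignore_index=IGNORE_INDEX):
    """Mean IoU over classes, skipping pixels whose ground truth is ignore_index."""
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(target * num_classes + pred,
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)
    inter = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    classes = [c for c in range(num_classes) if c != ignore_index and union[c] > 0]
    return float(np.mean(inter[classes] / union[classes]))

# Tiny hand-made example (not CamVid data).
target = np.array([[0, 0, 1, 1], [2, 2, 11, 11]])
pred = np.array([[0, 1, 1, 1], [2, 0, 3, 5]])
print(mean_iou(pred, target))  # per-class IoU is 1/3, 2/3, 1/2, so the mean is 0.5
```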

Performance

Input

Image, shape - 1,3,720,960, format is B,C,H,W, where B is the batch size, C is the number of channels, H is the image height, and W is the image width.

Channel order is BGR.

Output

Semantic segmentation class prediction map, shape - 1,720,960, output data format is B,H,W, where B is the batch size, H is the image height, and W is the image width.

Output contains the class prediction result of each pixel.
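One possible way to consume the 1,720,960 prediction map is sketched below: it converts class indices to a color image and reports the share of pixels per class. The palette is randomly generated for illustration and is not the official CamVid color scheme; the helper names and the stand-in output are hypothetical. If a runtime returns the map as floating point, casting with `astype(np.int64)` first would be needed.

```python
import numpy as np

NUM_CLASSES = 12

def colorize(class_map, palette):
    """Map a 1xHxW array of class indices to an HxWx3 color image."""
    return palette[class_map[0]]

def class_pixel_share(class_map, num_classes=NUM_CLASSES):
    """Return the fraction of pixels assigned to each class."""
    counts = np.bincount(class_map.ravel(), minlength=num_classes)
    return counts / class_map.size

rng = np.random.default_rng(0)
# Hypothetical palette: one random color per class (not the CamVid colors).
palette = rng.integers(0, 256, size=(NUM_CLASSES, 3), dtype=np.uint8)
# Stand-in for a model output of shape 1,720,960 (not a real prediction).
class_map = rng.integers(0, NUM_CLASSES, size=(1, 720, 960))

color = colorize(class_map, palette)
share = class_pixel_share(class_map)
print(color.shape, share.round(3))
```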

Legal Information

[*] Other names and brands may be claimed as the property of others.