Use Case and High-Level Description

An ICNet model for fast semantic segmentation, trained from scratch on the CamVid* dataset using the TensorFlow* framework. The trained model has 60% sparsity (the ratio of zeros among all convolution kernel weights). For more details about the original floating-point model, see the paper.
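The sparsity figure above is a ratio of zero-valued weights to all convolution kernel weights. The following is a minimal sketch of that calculation using NumPy. The helper names `kernel_sparsity` and `toy_kernel` and the toy kernel shapes are assumptions made for illustration; they are not taken from the model or from any toolkit.

```python
import numpy as np


def kernel_sparsity(kernels):
    """Return the fraction of exactly-zero weights across all given kernels."""
    total = sum(k.size for k in kernels)
    zeros = sum(k.size - np.count_nonzero(k) for k in kernels)
    return zeros / total


def toy_kernel(shape, num_zeros):
    """Build a hypothetical kernel of ones with the first `num_zeros` weights set to zero."""
    flat = np.ones(int(np.prod(shape)), dtype=np.float32)
    flat[:num_zeros] = 0.0
    return flat.reshape(shape)


# Toy stand-ins for convolution kernels (NOT the real ICNet weights):
# 100 weights with 60 zeros, and 50 weights with 30 zeros.
kernels = [toy_kernel((5, 5, 2, 2), 60), toy_kernel((1, 1, 5, 10), 30)]
sparsity = kernel_sparsity(kernels)
print(f"sparsity: {sparsity:.2f}")
```

The ratio is taken over all kernels together, so a layer with many weights influences the result more than a small one.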

The model input is a blob that consists of a single image with shape "1x3x720x960" in BGR channel order. The pixel values are integers in the [0, 255] range.
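As an illustration of the input layout described above, the sketch below rearranges an already-resized 720x960 BGR image from height-width-channel order into a "1x3x720x960" blob. The function names `to_input_blob` and `rgb_to_bgr` are hypothetical, and the choice to leave the integer dtype unchanged is an assumption: the inference runtime you use may expect a different dtype.

```python
import numpy as np

MODEL_H, MODEL_W = 720, 960  # spatial size stated for the model input


def rgb_to_bgr(image_hwc):
    """Reverse the channel axis of an HxWx3 image (RGB -> BGR)."""
    return image_hwc[..., ::-1]


def to_input_blob(image_hwc_bgr):
    """Convert a 720x960x3 BGR image into a 1x3x720x960 (B,C,H,W) blob."""
    if image_hwc_bgr.shape != (MODEL_H, MODEL_W, 3):
        raise ValueError(f"expected shape {(MODEL_H, MODEL_W, 3)}, got {image_hwc_bgr.shape}")
    chw = image_hwc_bgr.transpose(2, 0, 1)  # HWC -> CHW
    return np.ascontiguousarray(chw[np.newaxis, ...])  # add the batch axis


# Synthetic all-blue BGR image standing in for a real frame.
image = np.zeros((MODEL_H, MODEL_W, 3), dtype=np.uint8)
image[..., 0] = 255  # channel 0 is blue in BGR order
blob = to_input_blob(image)
print(blob.shape, blob.dtype)
```

Resizing is left out on purpose; an image of a different size would need to be resized to 720x960 before this step.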

The model output for icnet-camvid-ava-sparse-60-0001 is the predicted class index for each input pixel, where each index identifies one of the 12 classes of the CamVid* dataset.


Metric Value
GFlops 151.82
MParams 25.45
Source framework TensorFlow*


The quality metrics were calculated on the CamVid* validation dataset. The 'unlabeled' class was ignored during metrics calculation.

Metric Value
mIoU 69.91%
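For readers unfamiliar with the metric, here is a minimal sketch of mean intersection over union (mIoU) that skips an ignored class, as described for 'unlabeled' above. It is not the evaluation code used to produce the reported number. The index of the 'unlabeled' class (`ignore_index=11`) is an assumption, and the real evaluation may aggregate over the dataset differently from this single-map example.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_index):
    """Average per-class IoU over pixels whose ground truth is not `ignore_index`."""
    pred = np.asarray(pred)
    target = np.asarray(target)
    valid = target != ignore_index  # drop pixels labeled with the ignored class
    ious = []
    for c in range(num_classes):
        if c == ignore_index:
            continue  # the ignored class does not contribute to the mean
        p = (pred == c) & valid
        t = (target == c) & valid
        union = np.count_nonzero(p | t)
        if union == 0:
            continue  # class absent from both prediction and ground truth
        ious.append(np.count_nonzero(p & t) / union)
    return float(np.mean(ious)) if ious else float("nan")


# Tiny hand-made maps; 11 stands in for the hypothetical 'unlabeled' index.
target = np.array([[0, 0, 1], [1, 2, 11]])
pred = np.array([[0, 1, 1], [1, 2, 0]])
miou = mean_iou(pred, target, num_classes=12, ignore_index=11)
print(f"mIoU: {miou:.4f}")
```

In this toy case the per-class IoU values are 1/2, 2/3, and 1 for classes 0, 1, and 2; the pixel labeled 11 in the ground truth is excluded even though the prediction assigns it a class.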



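Inputs

Image, shape - 1,3,720,960, format is B,C,H,W, where B is the batch size, C is the number of channels, H is the image height, and W is the image width.

Channel order is BGR.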


Outputs

Semantic segmentation class prediction map, shape - 1,720,960, output data format is B,H,W, where B is the batch size, H is the image height, and W is the image width. The map contains the class prediction result of each pixel.
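A class prediction map of this shape can be summarized or visualized with a few array operations. The sketch below assumes the map has already been retrieved from the inference runtime as a NumPy array; `class_pixel_counts`, `colorize`, and the palette values are hypothetical and do not reflect the official CamVid colors.

```python
import numpy as np

NUM_CLASSES = 12  # number of CamVid classes stated for this model


def class_pixel_counts(class_map):
    """Count pixels per class in a (1, H, W) map of class indices."""
    labels = np.asarray(class_map).astype(np.int64).reshape(-1)
    return np.bincount(labels, minlength=NUM_CLASSES)


def colorize(class_map, palette):
    """Map a (1, H, W) class index map to an (H, W, 3) color image."""
    labels = np.asarray(class_map).astype(np.int64)[0]  # drop the batch axis
    return palette[labels]  # look up one color per pixel


# Synthetic output: top half is class 3, bottom half is class 0.
class_map = np.zeros((1, 720, 960), dtype=np.int32)
class_map[0, :360, :] = 3

# Placeholder palette with one arbitrary color per class.
palette = np.arange(NUM_CLASSES * 3, dtype=np.uint8).reshape(NUM_CLASSES, 3)

counts = class_pixel_counts(class_map)
colored = colorize(class_map, palette)
print(counts)
print(colored.shape)
```

The cast to `int64` is a precaution in case the runtime returns the indices in a floating-point or narrower integer type.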

Legal Information

[*] Other names and brands may be claimed as the property of others.