# midasnet

## Use Case and High-Level Description

MidasNet is a monocular depth estimation model trained on a mixture of several datasets, as described in the paper "Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer".

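The model input is a blob consisting of a single image with shape 1, 3, 384, 384. The original model expects RGB channel order; the converted model expects BGR, as detailed in the Input section.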

The model output is an inverse depth map that is defined up to an unknown scale factor.


| Metric | Value |
| --- | --- |
| Type | Monodepth |
| GFLOPs | 207.25144 |
| MParams | 104.081 |
| Source framework | PyTorch* |

| Metric | Value |
| --- | --- |
| rmse | 0.07071 |

## Input

### Original Model

Image, name - image, shape - 1, 3, 384, 384, format is B, C, H, W, where:

• B - batch size

• C - channel

• H - height

• W - width

Channel order is RGB.

Mean values - [123.675, 116.28, 103.53]. Scale values - [51.525, 50.4, 50.625].
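The normalization above can be sketched in NumPy. This is a minimal illustration, assuming the usual convention of subtracting the per-channel mean and then dividing by the per-channel scale, and assuming the image has already been resized to 384x384. The function name `preprocess_rgb` is hypothetical and not part of any tool mentioned here.

```python
import numpy as np

# Mean and scale values quoted above, in RGB channel order.
MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
SCALE = np.array([51.525, 50.4, 50.625], dtype=np.float32)


def preprocess_rgb(image_hwc):
    """Turn a 384x384 RGB uint8 image (H, W, C) into a 1x3x384x384 float blob.

    Assumption: normalization is (pixel - mean) / scale, applied per channel.
    """
    if image_hwc.shape != (384, 384, 3):
        raise ValueError(f"expected shape (384, 384, 3), got {image_hwc.shape}")
    normalized = (image_hwc.astype(np.float32) - MEAN) / SCALE
    chw = normalized.transpose(2, 0, 1)  # H, W, C -> C, H, W
    return chw[np.newaxis, ...]          # add the batch dimension B


# A flat gray image stands in for real input data.
dummy = np.full((384, 384, 3), 128, dtype=np.uint8)
blob = preprocess_rgb(dummy)
print(blob.shape)  # (1, 3, 384, 384)
```

Resizing is left out on purpose so the sketch depends only on NumPy; in practice an image library would handle that step before normalization.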

### Converted Model

Image, name - image, shape - 1, 3, 384, 384, format is B, C, H, W, where:

• B - batch size

• C - channel

• H - height

• W - width

Channel order is BGR.
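As a rough illustration of the layout the converted model expects, the sketch below reorders an RGB image to BGR and arranges it as B, C, H, W. It assumes the image is already 384x384 and that no further normalization is needed on the caller's side, which this page does not state explicitly. The helper name `to_bgr_blob` is hypothetical.

```python
import numpy as np


def to_bgr_blob(image_rgb_hwc):
    """Reorder an RGB (H, W, C) image to BGR and lay it out as (1, C, H, W)."""
    bgr = image_rgb_hwc[..., ::-1]   # reverse the channel axis: RGB -> BGR
    chw = bgr.transpose(2, 0, 1)     # H, W, C -> C, H, W
    # Copy into contiguous memory, since the slicing above only creates views.
    return np.ascontiguousarray(chw[np.newaxis, ...])


rgb = np.zeros((384, 384, 3), dtype=np.uint8)
rgb[..., 0] = 255                    # pure red in RGB order
bgr_blob = to_bgr_blob(rgb)
print(bgr_blob.shape)                # (1, 3, 384, 384)
# After the swap, red sits in the last channel of the blob.
```

Images loaded with libraries that already return BGR data would skip the channel reversal and only need the transpose.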

## Output

### Original Model

Inverse depth map, name - inverse_depth, shape - 1, 384, 384, format is B, H, W, where:

• B - batch size

• H - height

• W - width

Inverse depth map is defined up to an unknown scale factor.
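Because the values carry no absolute scale, a common way to inspect the output is to rescale it to a fixed range before display. The sketch below is one such approach using min-max scaling; the function name `normalize_for_display` and the synthetic input are illustrative assumptions, not part of the model or its tooling.

```python
import numpy as np


def normalize_for_display(inverse_depth):
    """Min-max scale a (1, H, W) inverse depth map to uint8 values in [0, 255]."""
    disp = inverse_depth[0].astype(np.float32)  # drop the batch dimension
    lo, hi = float(disp.min()), float(disp.max())
    if hi - lo < 1e-12:
        # A constant map has no contrast to show.
        return np.zeros(disp.shape, dtype=np.uint8)
    scaled = (disp - lo) / (hi - lo)
    return (scaled * 255.0).round().astype(np.uint8)


# Synthetic stand-in for a real model output of shape 1, 384, 384.
fake_output = np.linspace(2.0, 10.0, 384 * 384, dtype=np.float32).reshape(1, 384, 384)
image = normalize_for_display(fake_output)
print(image.shape, image.min(), image.max())  # (384, 384) 0 255
```

Since the map holds inverse depth, larger (brighter) values correspond to points closer to the camera.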

### Converted Model

Inverse depth map, name - inverse_depth, shape - 1, 384, 384, format is B, H, W, where:

• B - batch size

• H - height

• W - width

Inverse depth map is defined up to an unknown scale factor.
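To compare a prediction with a reference map despite the unknown scale factor, one option is to fit a single least-squares scale first. The sketch below shows that idea: the scale s minimizing the squared error between s * pred and target is sum(pred * target) / sum(pred * pred). The helpers `align_scale` and `rmse` are hypothetical, and this is not necessarily how the rmse value reported for this model was obtained.

```python
import numpy as np


def align_scale(pred, target):
    """Return (s * pred, s), where s minimizes the squared error to target."""
    denom = float(np.sum(pred * pred))
    if denom == 0.0:
        raise ValueError("prediction is all zeros; scale is undefined")
    s = float(np.sum(pred * target)) / denom
    return s * pred, s


def rmse(a, b):
    """Root mean squared error between two arrays of the same shape."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Synthetic example: the reference equals the prediction times an unknown factor.
rng = np.random.default_rng(0)
pred = rng.uniform(0.1, 1.0, size=(1, 384, 384))
target = 3.5 * pred

aligned, s = align_scale(pred, target)
print(round(s, 6))  # 3.5
print(rmse(aligned, target) < 1e-9)  # True
```

Without the alignment step, the error between `pred` and `target` would be dominated by the arbitrary scale rather than by the shape of the depth map.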

You can download models and, if necessary, convert them into OpenVINO™ IR format using the Model Downloader and other automation tools, as shown in the examples below.

```sh
omz_downloader --name <model_name>
omz_converter --name <model_name>
```