MidasNet is a model for monocular depth estimation trained by mixing several datasets, as described in the paper "Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer" (https://arxiv.org/abs/1907.01341).

The model input is a blob that consists of a single image of shape `1,3,384,384` in `RGB` order. The model output is an inverse depth map that is defined up to an unknown scale factor.
## Specification

| Metric           | Value     |
|------------------|-----------|
| Type             | Monodepth |
| GFLOPs           | 207.25144 |
| MParams          | 104.081   |
| Source framework | PyTorch*  |
## Accuracy

| Metric | Value   |
|--------|---------|
| rmse   | 0.07071 |
## Input

### Original model

Image, name - `image`, shape - `1,3,384,384`, format is `B,C,H,W`, where:

- `B` - batch size
- `C` - channel
- `H` - height
- `W` - width

Channel order is `RGB`.
Mean values - [123.675, 116.28, 103.53]. Scale values - [51.525, 50.4, 50.625].
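The normalization above can be sketched in NumPy. This is a minimal illustration, not part of the model package: the `preprocess` helper is hypothetical, and resizing the source image to 384x384 is assumed to have happened already.

```python
import numpy as np

# Per-channel (RGB) mean and scale values from the model description.
MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
SCALE = np.array([51.525, 50.4, 50.625], dtype=np.float32)

def preprocess(image_rgb: np.ndarray) -> np.ndarray:
    """Turn an HxWx3 RGB uint8 image (already resized to 384x384)
    into the 1,3,384,384 B,C,H,W blob the original model expects."""
    assert image_rgb.shape == (384, 384, 3)
    blob = (image_rgb.astype(np.float32) - MEAN) / SCALE  # per-channel normalization
    blob = blob.transpose(2, 0, 1)[np.newaxis, ...]       # H,W,C -> B,C,H,W
    return blob
```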
### Converted model

Image, name - `image`, shape - `1,3,384,384`, format is `B,C,H,W`, where:

- `B` - batch size
- `C` - channel
- `H` - height
- `W` - width

Channel order is `BGR`.
## Output

### Original model

Inverse depth map, name - `inverse_depth`, shape - `1,384,384`, format is `B,H,W`, where:

- `B` - batch size
- `H` - height
- `W` - width

The inverse depth map is defined up to an unknown scale factor.
### Converted model

Inverse depth map, name - `inverse_depth`, shape - `1,384,384`, format is `B,H,W`, where:

- `B` - batch size
- `H` - height
- `W` - width

The inverse depth map is defined up to an unknown scale factor.
You can download models and, if necessary, convert them into Inference Engine format using the Model Downloader and other automation tools.
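A minimal sketch of the download and conversion steps using the Open Model Zoo command-line tools. This assumes the `omz_downloader` and `omz_converter` entry points from a recent OpenVINO development package are installed, and that `midasnet` is this model's name in the downloader configuration:

```shell
# Download the original PyTorch model (model name "midasnet" assumed).
omz_downloader --name midasnet

# Convert the downloaded model into Inference Engine (IR) format.
omz_converter --name midasnet
```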
The original model is released under the following license:
[*] Other names and brands may be claimed as the property of others.