MidasNet is a model for monocular depth estimation trained by mixing several datasets, as described in the paper "Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer" (https://arxiv.org/abs/1907.01341).
The model input is a blob that consists of a single image of shape "1x3x384x384" in RGB order.
The model output is an inverse depth map that is defined up to an unknown scale factor.
NOTE: The original model weights are hosted on Google Drive, where downloads are unreliable because of the size of the weights file. The weights can currently be downloaded with the OpenVINO Model Downloader instead.

| Metric | Value |
|---|---|
| Type | Monodepth |
| GFLOPs | 207.4915 |
| MParams | 104.0814 |
| Source framework | PyTorch* |

| Metric | Value |
|---|---|
| rmse | 7.5878 |

Input, original model:

Image, name - image, shape - 1,3,384,384, format is B,C,H,W, where:

- B - batch size
- C - channel
- H - height
- W - width

Channel order is RGB.

Mean values - [123.675, 116.28, 103.53]. Scale values - [51.525, 50.4, 50.625].
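
The normalization above can be sketched in NumPy as follows. This is a minimal illustration, not the model's reference pipeline: it assumes the image has already been resized to 384x384, that the mean is subtracted first and the scale values are then used as per-channel divisors, and the helper name `preprocess_rgb` is hypothetical.

```python
import numpy as np

# Values quoted in the text above, in RGB channel order.
MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
SCALE = np.array([51.525, 50.4, 50.625], dtype=np.float32)


def preprocess_rgb(image_hwc):
    """Turn a 384x384 RGB image (H,W,C, values 0..255) into a 1x3x384x384 blob.

    Assumption: normalization is (pixel - mean) / scale, applied per channel.
    """
    if image_hwc.shape != (384, 384, 3):
        raise ValueError("expected an image already resized to 384x384x3")
    normalized = (image_hwc.astype(np.float32) - MEAN) / SCALE
    chw = normalized.transpose(2, 0, 1)  # H,W,C -> C,H,W
    return chw[np.newaxis, ...]          # add batch axis -> B,C,H,W


# Usage with a synthetic mid-gray image standing in for a real photo.
blob = preprocess_rgb(np.full((384, 384, 3), 128, dtype=np.uint8))
print(blob.shape)
```

Broadcasting over the last axis keeps the per-channel arithmetic in one expression, and the layout change is done only after normalization so that the mean and scale vectors line up with the channel axis.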
Input, converted model:

Image, name - image, shape - 1,3,384,384, format is B,C,H,W, where:

- B - batch size
- C - channel
- H - height
- W - width

Channel order is BGR.
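
When an image is held in RGB order and H,W,C layout, the channel order and layout described above can be produced as in the sketch below. It only covers channel reordering and the B,C,H,W layout; whether any further normalization is needed for this input is not stated here, so none is applied. The helper name `rgb_to_bgr_blob` is hypothetical.

```python
import numpy as np


def rgb_to_bgr_blob(image_rgb_hwc):
    """Convert an H,W,C RGB image into a 1,C,H,W blob with BGR channel order."""
    bgr = image_rgb_hwc[:, :, ::-1]   # reverse the channel axis: RGB -> BGR
    chw = bgr.transpose(2, 0, 1)      # H,W,C -> C,H,W
    # Add the batch axis and make the memory contiguous for downstream APIs.
    return np.ascontiguousarray(chw[np.newaxis, ...], dtype=np.float32)


# Usage with a synthetic image where R=1, G=2, B=3 at every pixel.
rgb = np.zeros((384, 384, 3), dtype=np.uint8)
rgb[..., 0], rgb[..., 1], rgb[..., 2] = 1, 2, 3
bgr_blob = rgb_to_bgr_blob(rgb)
print(bgr_blob.shape)
```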
Output, original model:

Inverse depth map, name - inverse_depth, shape - 1,384,384, format is B,H,W, where:

- B - batch size
- H - height
- W - width

Inverse depth map is defined up to an unknown scale factor.
Output, converted model:

Inverse depth map, name - inverse_depth, shape - 1,384,384, format is B,H,W, where:

- B - batch size
- H - height
- W - width

Inverse depth map is defined up to an unknown scale factor.
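
Because the output is only defined up to an unknown scale factor, its raw values are not metric depths. One common way to inspect such a map is to rescale it to the range [0, 1] before display, as in the sketch below. This min-max rescaling is an illustrative choice rather than something prescribed by the model, and the helper name `normalize_inverse_depth` is hypothetical.

```python
import numpy as np


def normalize_inverse_depth(inverse_depth):
    """Min-max rescale a 1,H,W inverse depth map to an H,W array in [0, 1]."""
    disp = np.asarray(inverse_depth, dtype=np.float32)[0]  # drop the batch axis
    lo, hi = float(disp.min()), float(disp.max())
    if hi - lo < 1e-6:
        # A constant map carries no relative depth information.
        return np.zeros_like(disp)
    return (disp - lo) / (hi - lo)


# Usage with synthetic data standing in for a real model output.
fake_output = np.linspace(5.0, 50.0, 384 * 384).reshape(1, 384, 384)
vis = normalize_inverse_depth(fake_output)
print(vis.shape)
```

Min-max rescaling cancels any positive global scale factor, so two outputs that differ only by such a factor yield the same normalized map. Larger values correspond to larger inverse depth, that is, to points closer to the camera.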
The original model is released under the following license:
[*] Other names and brands may be claimed as the property of others.