MidasNet is a model for monocular depth estimation trained by mixing several datasets, as described in the paper "Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer": https://arxiv.org/abs/1907.01341
The model input is a blob that consists of a single image of "1x3x384x384" in `RGB` order.
The model output is an inverse depth map that is defined up to an unknown scale factor.
NOTE: The model weights were originally stored on Google Drive, where downloads are unreliable because of the size of the weights. The weights were additionally uploaded to https://download.01.org/opencv/public_models, and the OpenVINO Model Downloader downloads them from this location.
Metric | Value |
---|---|
Type | Monodepth |
GFLOPs | 207.4915 |
MParams | 104.0814 |
Source framework | PyTorch* |

Metric | Value |
---|---|
rmse | 7.5878 |
Image, name - `image`, shape - `1,3,384,384`, format is `B,C,H,W` where:

- `B` - batch size
- `C` - channel
- `H` - height
- `W` - width

Channel order is `RGB`.
Mean values - [123.675, 116.28, 103.53]. Scale values - [51.525, 50.4, 50.625].
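The input layout and normalization values above can be illustrated with a minimal NumPy sketch. It assumes the usual convention that each channel is normalized as `(pixel - mean) / scale`, and that the image has already been resized to 384x384. The function name `preprocess_rgb` is hypothetical and is not part of any OpenVINO API.

```python
import numpy as np

# Per-channel values from the text above, in RGB order.
MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
SCALE = np.array([51.525, 50.4, 50.625], dtype=np.float32)


def preprocess_rgb(image_hwc: np.ndarray) -> np.ndarray:
    """Turn a 384x384 RGB image (H,W,C, values 0..255) into a 1,3,384,384 blob.

    Assumption: normalization is (pixel - mean) / scale per channel.
    """
    if image_hwc.shape != (384, 384, 3):
        raise ValueError(f"expected shape (384, 384, 3), got {image_hwc.shape}")
    # Broadcasting applies MEAN and SCALE along the last (channel) axis.
    normalized = (image_hwc.astype(np.float32) - MEAN) / SCALE
    # H,W,C -> C,H,W, then add the batch dimension B=1.
    chw = normalized.transpose(2, 0, 1)
    return chw[np.newaxis, ...]
```

Resizing is left out on purpose so that the sketch depends only on NumPy; any image library can produce the 384x384 RGB array that the function expects.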
Image, name - `image`, shape - `1,3,384,384`, format is `B,C,H,W` where:

- `B` - batch size
- `C` - channel
- `H` - height
- `W` - width

Channel order is `BGR`.
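For an input that expects `BGR` channel order, an image held in RGB order only needs its channel axis reversed before it is laid out as `B,C,H,W`. The sketch below shows this with NumPy. The function name `rgb_to_bgr_blob` is hypothetical, and because no mean or scale values are listed for this input, the sketch applies none; that is an assumption, not a statement about the model.

```python
import numpy as np


def rgb_to_bgr_blob(image_rgb_hwc: np.ndarray) -> np.ndarray:
    """Convert a 384x384 RGB image (H,W,C) into a 1,3,384,384 BGR blob."""
    if image_rgb_hwc.shape != (384, 384, 3):
        raise ValueError(f"expected shape (384, 384, 3), got {image_rgb_hwc.shape}")
    # Reversing the last axis swaps R and B: RGB -> BGR.
    bgr = image_rgb_hwc[..., ::-1]
    # H,W,C -> C,H,W, then add the batch dimension B=1.
    chw = bgr.transpose(2, 0, 1)
    # Copy into a contiguous float32 array (no normalization is applied here).
    return np.ascontiguousarray(chw[np.newaxis, ...], dtype=np.float32)
```

Images loaded with a library that already returns BGR data can skip the channel reversal and only need the transpose and batch steps.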
Inverse depth map, name - `inverse_depth`, shape - `1,384,384`, format is `B,H,W` where:

- `B` - batch size
- `H` - height
- `W` - width

Inverse depth map is defined up to an unknown scale factor.
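Because the output is defined only up to an unknown scale factor, its absolute values are not directly meaningful. One common way to make the map comparable or displayable is a min-max rescale, sketched below with NumPy. This is an illustrative post-processing step rather than something the model documentation prescribes, and the function name `normalize_inverse_depth` is hypothetical.

```python
import numpy as np


def normalize_inverse_depth(inverse_depth: np.ndarray) -> np.ndarray:
    """Min-max rescale a 1,384,384 inverse depth map to an H,W array in [0, 1].

    Larger output values correspond to larger inverse depth (closer surfaces).
    """
    # Drop the batch dimension: B,H,W -> H,W.
    disp = np.asarray(inverse_depth, dtype=np.float32)[0]
    lo, hi = float(disp.min()), float(disp.max())
    if hi - lo < 1e-12:
        # A constant map carries no relative depth information.
        return np.zeros_like(disp)
    return (disp - lo) / (hi - lo)
```

Multiplying the model output by any positive constant leaves the result of this rescale unchanged, which is why it suits a map with an unknown scale.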
The original model is released under the following license:
[*] Other names and brands may be claimed as the property of others.