Towards General Purpose and Geometry Preserving Single-View Depth Estimation
Single-view depth estimation plays a crucial role in scene understanding for AR applications and 3D modelling, as it allows the geometry of a scene to be recovered. However, this is only possible if the inverse depth estimates are unbiased, i.e. either absolute or Up-to-Scale (UTS). In recent years, great progress has been made in general-purpose single-view depth estimation. Nevertheless, the latest general-purpose models were trained with ranking losses or on Up-to-Shift-Scale (UTSS) data. As a result, they produce UTSS predictions that cannot be used to reconstruct scene geometry. In this work, we strive to build a general-purpose single-view UTS depth estimation model. Following Ranftl et al., we train our model on a mixture of datasets and test it on several previously unseen datasets. We show that our method outperforms previous state-of-the-art UTS models. We also train several lightweight models following the proposed training scheme and demonstrate that our ideas are applicable to computationally efficient depth estimation.
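To illustrate the distinction between UTS and UTSS predictions that the abstract relies on, here is a minimal NumPy sketch (not taken from the paper; function names and the least-squares alignment are assumptions) showing how predicted inverse depth would be aligned to ground truth with a scale only (UTS) versus with both a scale and a shift (UTSS):

```python
import numpy as np

def align_uts(pred, gt):
    """Scale-only (UTS) alignment: find s minimizing ||s * pred - gt||^2.

    pred, gt: 1D arrays of predicted / ground-truth inverse depth.
    """
    s = (pred * gt).sum() / (pred * pred).sum()
    return s * pred

def align_utss(pred, gt):
    """Shift-and-scale (UTSS) alignment: find (s, t) minimizing ||s * pred + t - gt||^2."""
    A = np.stack([pred, np.ones_like(pred)], axis=1)  # design matrix [pred, 1]
    (s, t), *_ = np.linalg.lstsq(A, gt, rcond=None)
    return s * pred + t
```

A UTS prediction matches the true inverse depth up to a single unknown scale, so relative scene geometry is preserved; a UTSS prediction additionally carries an unknown per-image shift, which distorts the reconstructed 3D structure whenever that shift cannot be recovered.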