Rethinking Monocular Depth Estimation with Adversarial Training
Monocular depth estimation is an extensively studied computer vision problem with a wide variety of applications. This work introduces a novel paradigm for monocular depth estimation with deep networks that incorporates an adversarial loss. We describe several deep learning architectures that combine a structured loss term with conditional generative adversarial networks. In this framework, the generator learns a mapping from an RGB image to its corresponding depth map, while the discriminator learns to distinguish estimated depth maps from ground truth. We benchmark this approach on the NYUv2 and Make3D datasets and observe that adding adversarial training significantly reduces relative error, achieving state-of-the-art performance on Make3D. These results suggest that adversarial training is a powerful technique for improving the depth estimation performance of deep networks.
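As a rough illustration of the framework described above, the following minimal sketch shows how a conditional adversarial objective can be combined with a per-pixel loss. It is not the authors' implementation: the L1 term stands in for the structured loss term, which the abstract does not specify, and the weight `lam`, the function names, and the flat-list representation of depth maps are all assumptions made for illustration. The relative error shown is the commonly used mean absolute relative error, assumed here to match the metric the abstract refers to.

```python
import math


def bce(probs, target):
    """Mean binary cross-entropy of probabilities against a constant 0/1 target."""
    eps = 1e-7
    total = 0.0
    for p in probs:
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= target * math.log(p) + (1.0 - target) * math.log(1.0 - p)
    return total / len(probs)


def discriminator_loss(d_real, d_fake):
    """d_real: discriminator outputs on (RGB, ground-truth depth) pairs.
    d_fake: discriminator outputs on (RGB, estimated depth) pairs."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)


def generator_loss(d_fake, pred_depth, gt_depth, lam=10.0):
    """Adversarial term plus a weighted per-pixel L1 term.
    The L1 term and the default lam are assumptions, not taken from the paper."""
    adversarial = bce(d_fake, 1.0)  # generator wants estimates judged as real
    l1 = sum(abs(p - g) for p, g in zip(pred_depth, gt_depth)) / len(gt_depth)
    return adversarial + lam * l1


def mean_relative_error(pred_depth, gt_depth):
    """Mean of |prediction - ground truth| / ground truth over all pixels."""
    return sum(abs(p - g) / g for p, g in zip(pred_depth, gt_depth)) / len(gt_depth)
```

In this sketch the discriminator is conditioned on the RGB image only implicitly, through whatever network produces `d_real` and `d_fake`; in practice the two losses would be minimized in alternation, updating the discriminator and the generator in turn.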