Depth estimation using an improved stereo network

Wanpeng XU; Ling ZOU; Lingda WU; Yue QI; Zhaoyong QIAN

doi:10.1631/FITEE.2000676

Front. Inform. Technol. Electron. Eng, 2022, Vol. 23, Issue 5: 777-789. DOI: 10.1631/FITEE.2000676

Original Article

Abstract

Self-supervised depth estimation approaches achieve results comparable to those of fully supervised approaches by employing view synthesis between the target and reference images in the training data. ResNet, which serves as the backbone network, has structural deficiencies when applied to downstream tasks because it was originally designed for classification problems. Low-texture areas also degrade performance. To address these problems, we propose a set of improvements that lead to superior predictions. First, we boost the information flow in the network and improve its ability to learn spatial structures by modifying the network structure. Second, we use a binary mask to remove the pixels in low-texture areas between the target and reference images so that the image is reconstructed more accurately. Finally, we randomly input the target and reference images to expand the dataset, and we pre-train the model on ImageNet so that it obtains a favorable general feature representation. We demonstrate state-of-the-art performance on the Eigen split of the KITTI driving dataset using stereo pairs.
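The binary-mask step described in the abstract can be illustrated with a small sketch. The abstract does not state the exact masking criterion, so the rule used here is an assumption: a pixel is kept only when the image synthesized from the reference view matches the target better than the unwarped reference does, which tends to discard low-texture pixels where both errors are equally small. The function names `photometric_error` and `masked_reconstruction_loss` are hypothetical, and a plain L1 difference stands in for whatever photometric error the paper actually uses.

```python
import numpy as np

def photometric_error(a, b):
    """Per-pixel mean absolute difference over the channel axis (H, W, C) -> (H, W)."""
    return np.abs(a - b).mean(axis=-1)

def masked_reconstruction_loss(target, reference, synthesized):
    """Average the reconstruction error over pixels selected by a binary mask.

    Assumed criterion (not taken from the paper): keep a pixel only if the
    synthesized view has a strictly lower error than the unwarped reference.
    """
    err_synth = photometric_error(target, synthesized)  # error after view synthesis
    err_ident = photometric_error(target, reference)    # error without any warping
    mask = (err_synth < err_ident).astype(np.float64)   # 1 = keep, 0 = remove
    kept = mask.sum()
    # Guard against an empty mask so the division is always defined.
    loss = (mask * err_synth).sum() / max(kept, 1.0)
    return loss, mask
```

In a training loop, the returned `mask` would multiply the per-pixel loss so that removed pixels contribute no gradient; here it is returned alongside the scalar loss only so that the selection can be inspected.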

Keywords

Monocular depth estimation / Self-supervised / Image reconstruction

Cite this article

Wanpeng XU, Ling ZOU, Lingda WU, Yue QI, Zhaoyong QIAN. Depth estimation using an improved stereo network. Front. Inform. Technol. Electron. Eng, 2022, 23(5): 777-789. https://doi.org/10.1631/FITEE.2000676