High Quality Monocular Video Depth Estimation Based on Mask Guided Refinement

doi:10.15918/j.jbit1004-0579.2024.092

Journal of Beijing Institute of Technology, 2025, Vol. 34, Issue 1: 18-27.

Huixiao Pan, Qiang Zhao

Abstract

Depth maps are central to many computer vision applications, including augmented reality and autonomous driving. Obtaining clear and accurate depth from video remains a significant challenge: existing monocular video depth estimation models tend to produce blurred or inaccurate depth at object edges and in low-texture regions. To address this issue, we propose a monocular depth estimation architecture guided by semantic segmentation masks, which introduces semantic information into the model to correct these ambiguous depth regions. Experimental results show that the proposed method improves the accuracy of edge depth, demonstrating its effectiveness.
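The intuition behind mask guidance can be illustrated with a toy post-filter. This is only a sketch under stated assumptions, not the architecture proposed in the paper, which is a learned model whose details are not given in this abstract. The function name `mask_guided_refine`, the `radius` parameter, and the median filter are all hypothetical choices made for illustration. Each depth value is replaced by the median of nearby values that share its segmentation label, so depth is never averaged across an object boundary.

```python
from statistics import median


def mask_guided_refine(depth, mask, radius=1):
    """Toy mask-guided filter (illustrative only, not the paper's model).

    depth: 2D list of floats, the possibly blurred depth map.
    mask:  2D list of ints, one semantic segment label per pixel.
    Each output pixel is the median of the depth values inside a
    (2*radius+1) square window that carry the same label as that pixel.
    """
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            label = mask[y][x]
            # Collect window neighbours belonging to the same segment.
            same_segment = [
                depth[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
                if mask[j][i] == label
            ]
            # The pixel itself always matches, so the list is never empty.
            out[y][x] = median(same_segment)
    return out


# A single image row whose depth is smeared across the object boundary
# between segment 0 (near, depth 1.0) and segment 1 (far, depth 3.0).
blurred = [[1.0, 1.0, 1.5, 2.5, 3.0, 3.0]]
labels = [[0, 0, 0, 1, 1, 1]]
print(mask_guided_refine(blurred, labels, radius=2))
# prints [[1.0, 1.0, 1.0, 3.0, 3.0, 3.0]]
```

In this toy case the smeared values 1.5 and 2.5 on either side of the boundary snap back to their segment's depth, which is the kind of edge correction the abstract attributes to semantic guidance.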

Keywords

monocular video depth estimation / depth refinement / edge depth accuracy / semantic segmentation

Cite this article

Huixiao Pan, Qiang Zhao. High Quality Monocular Video Depth Estimation Based on Mask Guided Refinement. Journal of Beijing Institute of Technology, 2025, 34(1): 18-27. DOI: 10.15918/j.jbit1004-0579.2024.092