NeOR: neural exploration with feature-based visual odometry and tracking-failure-reduction policy

Ziheng Zhu, Jialing Liu, Kaiqi Chen, Qiyi Tong, Ruyu Liu


Optoelectronics Letters ›› 2025, Vol. 21 ›› Issue (5) : 290-297. DOI: 10.1007/s11801-025-4034-8

Abstract

Embodied visual exploration is critical for building intelligent visual agents. This paper presents neural exploration with feature-based visual odometry and tracking-failure-reduction policy (NeOR), a framework for embodied visual exploration that combines the efficient exploration capability of deep reinforcement learning (DRL)-based exploration policies with feature-based visual odometry (VO) for more accurate mapping and positioning. An improved local policy is also proposed to reduce tracking failures of feature-based VO in weakly textured scenes through a refined multi-discrete action space, keyframe fusion, and an auxiliary task. Experimental results demonstrate that NeOR achieves better mapping and positioning accuracy than other entirely learning-based exploration frameworks and improves the robustness of feature-based VO by significantly reducing tracking failures in weakly textured scenes.
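
To make the overall architecture concrete, the following is a minimal sketch of an exploration loop in which a learned exploration policy proposes actions while a feature-based VO module supplies the poses used for mapping. All class and method names here (GlobalPolicy.select_goal, LocalPolicy.act, LocalPolicy.recover, FeatureVO.track, OccupancyMap.update, and the env interface) are hypothetical placeholders for illustration, not the interfaces of NeOR or of any particular SLAM library.

# Minimal sketch of a NeOR-style exploration loop (hypothetical interfaces).
import numpy as np

def explore(env, global_policy, local_policy, vo, occ_map, max_steps=500):
    """Run one exploration episode, fusing policy actions with feature-based VO poses."""
    obs = env.reset()
    pose = np.zeros(3)  # (x, y, yaw) estimate maintained from VO tracking
    for _ in range(max_steps):
        goal = global_policy.select_goal(occ_map, pose)    # long-term exploration goal
        action = local_policy.act(obs, goal, pose)         # low-level (multi-discrete) action
        obs = env.step(action)

        ok, new_pose = vo.track(obs["rgb"], obs["depth"])  # feature-based tracking
        if ok:
            pose = new_pose
        else:
            # Tracking failure (e.g. a weakly textured view): run a recovery
            # behaviour rather than updating the map with an unreliable pose.
            action = local_policy.recover(obs)
            obs = env.step(action)
            ok, new_pose = vo.track(obs["rgb"], obs["depth"])
            if ok:
                pose = new_pose

        occ_map.update(obs["depth"], pose)                 # project depth into the 2D map
    return occ_map

Note that this sketch handles tracking failures reactively with a recovery branch; the tracking-failure-reduction policy described in the abstract instead aims to avoid such failures in the first place, via a refined multi-discrete action space, keyframe fusion, and an auxiliary task.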

Cite this article

Ziheng Zhu, Jialing Liu, Kaiqi Chen, Qiyi Tong, Ruyu Liu. NeOR: neural exploration with feature-based visual odometry and tracking-failure-reduction policy. Optoelectronics Letters, 2025, 21(5): 290‒297. https://doi.org/10.1007/s11801-025-4034-8

