A Hybrid of RRT and TD3 Deep Reinforcement Learning Algorithm for UAV Path Planning in 3D Partially Unknown Environments

Yanxi HE, Jie QI, Nailong WU

Journal of Donghua University (English Edition), 2025, Vol. 42, Issue (6): 639-649. DOI: 10.19884/j.1672-5220.202407004
Information Technology and Artificial Intelligence

Abstract

To guide an unmanned aerial vehicle (UAV) flying in complex three-dimensional (3D) environments with unknown obstacles, a novel UAV path planning algorithm named IRRT-C2TD3 is proposed. The algorithm combines the rapidly-exploring random tree star (RRT∗) algorithm with the twin delayed deep deterministic policy gradients (TD3) algorithm, a deep reinforcement learning algorithm. By employing exploration strategies from reinforcement learning, IRRT-C2TD3 improves the RRT∗ algorithm. IRRT-C2TD3 is a two-stage path planning algorithm comprising pre-planning and real-time planning. It re-plans paths by generating them on the basis of geometric connections toward the goal and smoothing them with cubic B-spline curves. By designing the network architecture and the reward function of the TD3 algorithm, real-time planning in unknown environments is achieved on the basis of the pre-planned path from the first stage. Simulation results show that IRRT-C2TD3 demonstrates better path planning performance in 3D partially unknown environments than the RRT-C2TD3, M-C2TD3 and MOD-RRT∗ algorithms.
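The cubic B-spline smoothing mentioned in the abstract can be illustrated with a short sketch. The following Python snippet is a minimal, hypothetical example rather than the authors' implementation: it assumes a list of 3D waypoints produced by a sampling-based planner and uses SciPy's parametric spline routines splprep and splev with degree k = 3 to return a smooth, densely sampled path.

```python
# Minimal sketch (not the authors' code): smooth a 3D waypoint path with a
# cubic B-spline, as the abstract describes for the path re-planning step.
# The waypoints and parameters below are hypothetical.
import numpy as np
from scipy.interpolate import splprep, splev

def smooth_path_bspline(waypoints, num_samples=200, smoothing=0.0):
    """Fit a cubic B-spline through 3D waypoints and resample it densely.

    waypoints   : (N, 3) array-like of x, y, z points from a sampling-based planner
    num_samples : number of points returned on the smoothed curve
    smoothing   : splprep's s parameter; 0 interpolates the waypoints, larger
                  values trade closeness to the waypoints for a smoother curve
    """
    pts = np.asarray(waypoints, dtype=float)
    # splprep expects a list of coordinate arrays: [x, y, z]; k=3 gives a cubic spline.
    tck, _ = splprep([pts[:, 0], pts[:, 1], pts[:, 2]], k=3, s=smoothing)
    u = np.linspace(0.0, 1.0, num_samples)
    x, y, z = splev(u, tck)
    return np.column_stack([x, y, z])

# Hypothetical waypoints from a start at (0, 0, 0) to a goal at (10, 10, 5).
raw_path = [(0, 0, 0), (2, 1, 1), (3, 4, 2), (6, 5, 3), (8, 9, 4), (10, 10, 5)]
smoothed = smooth_path_bspline(raw_path)
print(smoothed.shape)  # (200, 3)
```

Increasing the smoothing parameter s relaxes the fit to the raw waypoints, a common way to trade path fidelity for curvature a UAV can track; how the paper tunes this step is not specified on this page.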

Keywords

3D path planning / deep reinforcement learning / rapidly-exploring random tree (RRT) / UAV

Cite this article

Yanxi HE, Jie QI, Nailong WU. A Hybrid of RRT and TD3 Deep Reinforcement Learning Algorithm for UAV Path Planning in 3D Partially Unknown Environments. Journal of Donghua University (English Edition), 2025, 42(6): 639-649. DOI: 10.19884/j.1672-5220.202407004

References

[1] LIU Z H, LIU Q, XU W J, et al. Robot learning towards smart robotic manufacturing: a review[J]. Robotics and Computer-Integrated Manufacturing, 2022, 77: 102360.
[2] HU Z J, GAO X G, WAN K F, et al. Relevant experience learning: a deep reinforcement learning method for UAV autonomous motion planning in complex unknown environments[J]. Chinese Journal of Aeronautics, 2021, 34(12): 187-204.
[3] SENTHILNATH J, KANDUKURI M, DOKANIA A, et al. Application of UAV imaging platform for vegetation analysis based on spectral-spatial methods[J]. Computers and Electronics in Agriculture, 2017, 140: 8-24.
[4] SHAHSAVANI H. An aeromagnetic survey carried out using a rotary-wing UAV equipped with a low-cost magneto-inductive sensor[J]. International Journal of Remote Sensing, 2021, 42(23): 8805-8818.
[5] TANG G, TANG C Q, CLARAMUNT C, et al. Geometric A-star algorithm: an improved A-star algorithm for AGV path planning in a port environment[J]. IEEE Access, 2021, 9: 59196-59210.
[6] HART P E, NILSSON N J, RAPHAEL B. A formal basis for the heuristic determination of minimum cost paths[J]. IEEE Transactions on Systems Science and Cybernetics, 1968, 4(2): 100-107.
[7] XU H Q, XING H X, LIU Y. Path planning of UAV by combining improved ant colony system and dynamic window algorithm[J]. Journal of Donghua University (English Edition), 2023, 40(6): 676-683.
[8] KARAMAN S, FRAZZOLI E. Sampling-based algorithms for optimal motion planning[J]. The International Journal of Robotics Research, 2011, 30(7): 846-894.
[9] LAVALLE S M, KUFFNER J J Jr. Randomized kinodynamic planning[J]. International Journal of Robotics Research, 2001, 20(5): 378-400.
[10] GAMMELL J D, SRINIVASA S S, BARFOOT T D. Informed RRT∗: optimal sampling-based path planning focused via direct sampling of an admissible ellipsoidal heuristic[C]//2014 IEEE/RSJ International Conference on Intelligent Robots and Systems. New York: IEEE, 2014: 2997-3004.
[11] LI Y J, WEI W, GAO Y, et al. PQ-RRT∗: an improved path planning algorithm for mobile robots[J]. Expert Systems with Applications, 2020, 152: 113425.
[12] WANG J K, LI T G, LI B P, et al. GMR-RRT∗: sampling-based path planning using Gaussian mixture regression[J]. IEEE Transactions on Intelligent Vehicles, 2022, 7(3): 690-700.
[13] ESHTEHARDIAN S A, KHODAYGAN S. A continuous RRT∗-based path planning method for non-holonomic mobile robots using B-spline curves[J]. Journal of Ambient Intelligence and Humanized Computing, 2023, 14(7): 8693-8702.
[14] SUN Z Y, SHEN B, PAN A Q, et al. A modified self-adaptive sparrow search algorithm for robust multi-UAV path planning[J]. Journal of Donghua University (English Edition), 2024, 41(6): 630-643.
[15] QI J, YANG H, SUN H X. MOD-RRT∗: a sampling-based algorithm for robot path planning in dynamic environment[J]. IEEE Transactions on Industrial Electronics, 2021, 68(8): 7244-7251.
[16] VASHISTH A, RÜCKIN J, MAGISTRI F, et al. Deep reinforcement learning with dynamic graphs for adaptive informative path planning[J]. IEEE Robotics and Automation Letters, 2024, 9(9): 7747-7754.
[17] LI W J, YUE M, SHANGGUAN J Y, et al. Navigation of mobile robots based on deep reinforcement learning: reward function optimization and knowledge transfer[J]. International Journal of Control, Automation and Systems, 2023, 21(2): 563-574.
[18] LEE M H, MOON J. Deep reinforcement learning-based model-free path planning and collision avoidance for UAVs: a soft actor-critic with hindsight experience replay approach[J]. ICT Express, 2023, 9(3): 403-408.
[19] ANDRYCHOWICZ M, WOLSKI F, RAY A, et al. Hindsight experience replay[EB/OL]. (2018-02-23)[2024-07-20]. https://arxiv.org/abs/1707.01495.
[20] WANG J K, CHI W Z, LI C M, et al. Neural RRT∗: learning-based optimal path planning[J]. IEEE Transactions on Automation Science and Engineering, 2020, 17(4): 1748-1758.
[21] WANG J K, JIA X, ZHANG T Y, et al. Deep neural network enhanced sampling-based path planning in 3D space[J]. IEEE Transactions on Automation Science and Engineering, 2022, 19(4): 3434-3443.
[22] URAIN J, LE A T, LAMBERT A, et al. Learning implicit priors for motion optimization[C]//2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). New York: IEEE, 2022: 7672-7679.
[23] LIU B L, JIANG G D, ZHAO F, et al. Collision-free motion generation based on stochastic optimization and composite signed distance field networks of articulated robot[J]. IEEE Robotics and Automation Letters, 2023, 8(11): 7082-7089.
[24] FUJIMOTO S, HOOF H, MEGER D. Addressing function approximation error in actor-critic methods[C]//International Conference on Machine Learning. [S.l.]: PMLR, 2018: 1587-1596.
[25] MIENYE I D, SUN Y X. A survey of ensemble learning: concepts, algorithms, applications, and prospects[J]. IEEE Access, 2022, 10: 99129-99149.
[26] YANG C P, ZHAO Y Q, CAI X, et al. Path planning algorithm for unmanned surface vessel based on multiobjective reinforcement learning[J]. Computational Intelligence and Neuroscience, 2023, 2023(1): 2146314.
[27] HUANG S Q, WU X R, HUANG G M. Deep reinforcement learning-based multi-objective 3D path planning for vehicles[C]//Proceedings of 2023 Chinese Intelligent Systems Conference. Singapore: Springer, 2023: 867-875.
[28] LIU X F, ZHANG P, FANG H, et al. Multiobjective reactive power optimization based on improved particle swarm optimization with ε-greedy strategy and Pareto archive algorithm[J]. IEEE Access, 2021, 9: 65650-65659.
[29] QU C Z, GAI W D, ZHONG M Y, et al. A novel reinforcement learning based grey wolf optimizer algorithm for unmanned aerial vehicles (UAVs) path planning[J]. Applied Soft Computing, 2020, 89: 106099.
[30] JOHNSON D. The triangular distribution as a proxy for the beta distribution in risk analysis[J]. Journal of the Royal Statistical Society: Series D (The Statistician), 1997, 46(3): 387-398.

Funding

National Natural Science Foundation of China (62173084)

Foundation of Shanghai Committee of Science and Technology, China (23ZR1401800)

Foundation of Shanghai Committee of Science and Technology, China (22JC1401403)
