Path Planning of Lunar Surface Sampling Manipulator for Chang'E-5 Mission

doi:10.15982/j.issn.2096-9287.2021.20210095


Journal of Deep Space Exploration, 2021, Vol. 8, Issue 6: 564-571. DOI: 10.15982/j.issn.2096-9287.2021.20210095

Topic: Lunar and Planetary TT&C Technology


HU Xiaodong, ZHANG Kuan, XIE Yuan, ZHANG Hui, LU Hao, LIU Chuankai, CHEN Xiang, ZHAO Huanzhou, XIE Jianfeng


Abstract

To address the problem of precisely controlling the sampling manipulator in the Chang'E-5 lunar surface sampling mission, a path planning method based on deep reinforcement learning is proposed. A multi-constraint reward function is designed for the deep reinforcement learning algorithm so that the planned motion path satisfies three constraints: safety, speed, and reachability. This enables precise control of the sampling manipulator. On the premise of ensuring mission safety, the space-ground interaction time is greatly shortened and the control of the manipulator is more stable. Experimental results show that the method achieves high accuracy and robustness and can serve as a reference for subsequent on-orbit sampling tasks.
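
As a rough illustration of what a reward combining safety, speed, and reachability terms could look like, here is a minimal sketch. It is not the reward function used in the paper, which this abstract does not specify. The function name, the weights, the safety margin, the goal tolerance, the terminal bonus, and the use of a single obstacle-clearance distance are all assumptions made for the example.

```python
import math


def multi_constraint_reward(tip, goal, obstacle_clearance,
                            safe_margin=0.05, goal_tol=0.01,
                            w_reach=1.0, w_safe=1.0, w_speed=0.01,
                            terminal_bonus=10.0):
    """Hypothetical per-step reward for a manipulator path planner.

    tip, goal          -- end-effector and target positions (x, y, z), metres
    obstacle_clearance -- smallest distance from the arm to any obstacle, metres
    All default values are illustrative assumptions, not values from the paper.
    Returns (reward, done).
    """
    dist = math.dist(tip, goal)

    # Reachability: the closer the tip is to the goal, the less negative the term.
    r_reach = -w_reach * dist

    # Safety: no penalty outside the margin, growing linearly inside it.
    r_safe = -w_safe * max(0.0, safe_margin - obstacle_clearance) / safe_margin

    # Speed: a constant per-step cost, so shorter paths accumulate less penalty.
    r_speed = -w_speed

    reward = r_reach + r_safe + r_speed

    if obstacle_clearance <= 0.0:
        # Collision ends the episode with a large penalty.
        return reward - terminal_bonus, True
    if dist <= goal_tol:
        # Reaching the target ends the episode with a large bonus.
        return reward + terminal_bonus, True
    return reward, False


if __name__ == "__main__":
    # One step with the tip 0.5 m from the goal and ample clearance.
    print(multi_constraint_reward((0.5, 0.0, 0.0), (1.0, 0.0, 0.0), 0.2))
```

In a sketch like this, the relative weights decide the trade-off between the three constraints: a larger safety weight pushes the learned path away from obstacles, while a larger per-step cost favours shorter paths.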

Keywords

Lunar surface sampling / manipulator / path planning / deep reinforcement learning.

Cite this article


HU Xiaodong, ZHANG Kuan, XIE Yuan, ZHANG Hui, LU Hao, LIU Chuankai, CHEN Xiang, ZHAO Huanzhou, XIE Jianfeng. Path Planning of Lunar Surface Sampling Manipulator for Chang'E-5 Mission. Journal of Deep Space Exploration, 2021, 8(6): 564-571. https://doi.org/10.15982/j.issn.2096-9287.2021.20210095

