Learning-based locomotion control fusing multimodal perception for a bipedal humanoid robot

Chao Ji , Diyuan Liu , Wei Gao , Shiwu Zhang

Biomimetic Intelligence and Robotics ›› 2025, Vol. 5 ›› Issue (1): 100213. DOI: 10.1016/j.birob.2025.100213

Research Article


Abstract

The ability of bipedal humanoid robots to walk adaptively on varied terrain is a critical challenge for practical applications, and it has drawn substantial attention from academic and industrial research communities in recent years. Traditional model-based locomotion control methods involve high modeling complexity, especially in complex terrain environments, making locomotion stability difficult to ensure. Reinforcement learning offers an end-to-end solution for locomotion control in humanoid robots. However, this approach typically relies solely on proprioceptive sensing to generate control policies, often resulting in increased body collisions during practical deployment. Excessive collisions can damage the biped robot hardware, and, more critically, the absence of multimodal input such as vision limits the robot’s ability to perceive environmental context and adjust its gait trajectory promptly. This lack of multimodal perception also hampers stability and robustness during tasks. In this paper, visual information is incorporated into the locomotion control problem of humanoid robots, and a three-stage multi-objective constrained policy distillation optimization algorithm is proposed. Expert policies that satisfy gait-aesthetics requirements are first trained for different terrains through reinforcement learning, and these expert policies are then distilled into a single student policy through policy distillation. Experimental results demonstrate a significant reduction in collision rates when utilizing a control policy that integrates multimodal perception, especially on challenging terrains such as stairs, thresholds, and mixed surfaces. This advancement supports the practical deployment of bipedal humanoid robots.
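The distillation step described above (terrain-specific experts compressed into one student that also receives terrain context) can be illustrated with a minimal numerical sketch. Everything here is an assumption for illustration, not the paper's implementation: the experts are random linear policies, the "visual" modality is reduced to a one-hot terrain code, and the student is trained by gradient descent on the KL divergence between teacher and student action distributions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mean_kl(p, q, eps=1e-8):
    """Mean KL(p || q) over a batch of categorical action distributions."""
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))

n_obs, n_act, n_terrains = 8, 4, 3

# Hypothetical per-terrain experts: fixed linear maps from proprioceptive
# observations to action logits (stand-ins for the terrain-specific RL
# expert policies described in the abstract).
experts = [rng.normal(size=(n_obs, n_act)) for _ in range(n_terrains)]

# The student sees proprioception plus a one-hot terrain code -- a crude
# stand-in for the visual modality that tells it which terrain it is on.
W = np.zeros((n_obs + n_terrains, n_act))

def student_probs(obs, terrain_onehot):
    return softmax(np.hstack([obs, terrain_onehot]) @ W)

lr, batch = 0.5, 32
kl_start = None
for step in range(3000):
    t = rng.integers(n_terrains)                  # sample a terrain / expert
    obs = rng.normal(size=(batch, n_obs))         # batch of observations
    onehot = np.zeros((batch, n_terrains))
    onehot[:, t] = 1.0
    teacher = softmax(obs @ experts[t])           # expert action distribution
    student = student_probs(obs, onehot)
    if kl_start is None:
        kl_start = mean_kl(teacher, student)
    # Gradient of mean KL(teacher || student) w.r.t. the student logits is
    # (student - teacher) / batch; chain through the linear map.
    grad = np.hstack([obs, onehot]).T @ ((student - teacher) / batch)
    W -= lr * grad

kl_end = mean_kl(teacher, student_probs(obs, onehot))
print(f"mean KL before: {kl_start:.3f}, after: {kl_end:.3f}")
```

The point of the sketch is the supervision signal: the student never sees a reward, only the experts' action distributions, and the extra terrain input is what lets one policy imitate several terrain-specialized teachers at once.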

Keywords

Bipedal humanoid robot / Deep reinforcement learning / Multimodal perception

Cite this article

Chao Ji, Diyuan Liu, Wei Gao, Shiwu Zhang. Learning-based locomotion control fusing multimodal perception for a bipedal humanoid robot. Biomimetic Intelligence and Robotics, 2025, 5(1): 100213. DOI: 10.1016/j.birob.2025.100213


1 CRediT authorship contribution statement

Chao Ji: Writing - original draft, Visualization, Validation, Methodology, Formal analysis, Data curation. Diyuan Liu: Software. Wei Gao: Methodology. Shiwu Zhang: Writing - review & editing, Methodology, Funding acquisition.

2 Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

3 Acknowledgments

This work was supported by the National Natural Science Foundation of China (U21A20119, 62103395, and 51975550).

