Leveraging large language models for comprehensive locomotion control in humanoid robots design

Shilong Sun, Chiyao Li, Zida Zhao, Haodong Huang, Wenfu Xu

Biomimetic Intelligence and Robotics, 2024, Vol. 4, Issue 4: 100187. DOI: 10.1016/j.birob.2024.100187

Research Article


Abstract

This paper investigates the utilization of large language models (LLMs) for the comprehensive control of humanoid robot locomotion. Traditional reinforcement learning (RL) approaches for robot locomotion are resource-intensive and rely heavily on manually designed reward functions. To address these challenges, we propose a method that employs LLMs as the primary designer to handle key aspects of locomotion control, such as trajectory planning, inverse kinematics solving, and reward function design. By using user-provided prompts, LLMs generate and optimize code, reducing the need for manual intervention. Our approach was validated through simulations in Unity, demonstrating that LLMs can achieve human-level performance in humanoid robot control. The results indicate that LLMs can simplify and enhance the development of advanced locomotion control systems for humanoid robots.
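The workflow the abstract describes — prompting an LLM to draft locomotion-control code, including the reward function used for RL training — can be illustrated with a minimal sketch. The function below is hypothetical (the paper's actual generated code is not reproduced here); it only shows the typical shape of an LLM-authored humanoid-locomotion reward, combining velocity tracking, posture, and energy terms, with all coefficients and state fields assumed for illustration.

```python
import math

def locomotion_reward(state, action, target_velocity=1.0):
    """Hypothetical LLM-generated reward for humanoid walking.

    state: dict with 'forward_velocity' (m/s), 'torso_pitch' (rad),
           and 'base_height' (m); action: iterable of joint torques.
    Returns a scalar reward balancing three illustrative design goals.
    """
    # 1. Track the commanded forward velocity (Gaussian kernel).
    vel_err = state["forward_velocity"] - target_velocity
    r_velocity = math.exp(-4.0 * vel_err ** 2)

    # 2. Keep the torso upright and the base near walking height (0.9 m assumed).
    r_posture = math.exp(-10.0 * state["torso_pitch"] ** 2)
    r_height = math.exp(-20.0 * (state["base_height"] - 0.9) ** 2)

    # 3. Penalize actuation effort to discourage jerky, energetic motion.
    effort = sum(t ** 2 for t in action)
    r_energy = -0.001 * effort

    # Weighted sum; the shaping weights are illustrative, not from the paper.
    return 0.5 * r_velocity + 0.3 * r_posture + 0.2 * r_height + r_energy
```

In the paper's scheme, a function of this form would be emitted by the LLM from a user prompt and then iteratively refined based on training feedback, rather than hand-tuned by an engineer.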

Keywords

Humanoid robots / Large language models / Locomotion control / Reinforcement learning

Cite this article

Shilong Sun, Chiyao Li, Zida Zhao, Haodong Huang, Wenfu Xu. Leveraging large language models for comprehensive locomotion control in humanoid robots design. Biomimetic Intelligence and Robotics, 2024, 4(4): 100187. DOI: 10.1016/j.birob.2024.100187


CRediT authorship contribution statement

Shilong Sun: Writing - review & editing, Supervision, Funding acquisition, Formal analysis, Conceptualization. Chiyao Li: Writing - original draft, Methodology. Zida Zhao: Validation, Investigation, Formal analysis, Data curation. Haodong Huang: Validation, Software, Formal analysis, Conceptualization. Wenfu Xu: Writing - review & editing, Supervision, Funding acquisition.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the Guangdong Basic and Applied Basic Research Foundation, China (2024A1515012041), the Shenzhen Higher Education Stability Support Plan, China (GXWD20231130195340002), the Program of Shenzhen Peacock Innovation Team, Guangdong, China (KQTD20210811090146075), and the Basic Research Program of Shenzhen, China (JCYJ20220818102415034).

